\section{Introduction}
The primary medium for music distribution is rapidly changing from physical
-media to digital media. In 2016 the \gls{IFPI} stated that about $43\%$ of
-music revenue arises from digital distribution. Another $39\%$ arises from the
+media to digital media. In 2016 the \gls{IFPI} stated that about $50\%$ of
+music revenue arises from digital distribution. Another $34\%$ arises from the
physical sale and the remaining $16\%$ is made through performance and
synchronisation revenues. The overtake of digital formats on physical formats
took place somewhere in 2015. Moreover, ever since twenty years the music
available. However, a temporal alignment of the lyrics is not and creating it
involves manual labour.
-A lot of the current day musical distribution goes via non-official channels
+A lot of current-day music distribution goes via non-official channels
such as YouTube\footnote{\url{https://youtube.com}} in which fans of the
performers often accompany the music with synchronized lyrics. This means that
there is an enormous treasure of lyrics-annotated music available. However, the
that music has different properties than speech. Music uses a wider spectral
bandwidth in which events happen. Music contains more tonality and rhythm.
Multivariate Gaussian classifiers were used to discriminate the categories with
-an average performance of $90\%$~\cite{saunders_real-time_1996}.
+an average accuracy of $90\%$~\cite{saunders_real-time_1996}.
Williams and Ellis were inspired by the aforementioned research and tried to
separate the singing segments from the instrumental segments~%
to detect a singing voice~\cite{berenzweig_using_2002}. Nwe et al.\ showed that
there is not much difference in accuracy when using different features founded
in speech processing. They tested several features and found accuracies differ
-less that a few percent. Moreover, they found that others have tried to tackle
+by less than a few percent. Moreover, they found that others have tried to tackle
the problem using myriads of different approaches such as using \gls{ZCR},
\gls{MFCC} and \gls{LPCC} as features and \glspl{HMM} or \glspl{GMM} as
classifiers~\cite{nwe_singing_2004}.