+comprehensible. The vocals produced by \emph{Cannibal Corpse} border on
+regular shouting.
+
+The second band is called \emph{Disgorge} and makes even more violent-sounding
+music. The growls of the lead singer sound like a coffee grinder and are more
+shallow. In the spectrograms it is clearly visible that overtones are
+produced during some parts of the growling. The lyrics are completely
+incomprehensible, and therefore some parts were not annotated with the actual
+lyrics because it was not possible to make out what was being sung.
+
+Lastly, a band from Moscow named \emph{Who Dies in
+Siberian Slush} is chosen. This band is a little odd compared to the previous
+\gls{dm} bands because it plays \gls{dom}. \gls{dom} is characterized by a very
+slow tempo and low-tuned guitars. The vocalist has a very characteristic growl
+and performs in several Moscow-based bands. This band also stands out because it
+uses pianos and synthesizers. The droning synthesizers often operate in the
+same frequency range as the vocals.
+
+\section{\gls{MFCC} Features}
+The waveforms themselves are not very suitable as features because of their
+high dimensionality and correlation. Therefore we use \glspl{MFCC}, which are
+often used as feature vectors.\todo{cite which papers use this} The actual
+conversion is done using the \emph{python\_speech\_features}%
+\footnote{\url{https://github.com/jameslyons/python_speech_features}} package.
+
+\gls{MFCC} features are inspired by nature and are built up incrementally in
+several steps.
+\begin{enumerate}
+ \item The first step in the process is converting the time representation
+ of the signal to a spectral representation using a sliding window with
+ overlap. The width of the window and the step size are two important
+ parameters in the system. In classical phonetic analysis, a window size
+ of $25\,\mathrm{ms}$ with a step of $10\,\mathrm{ms}$ is often chosen
+ because such a window is small enough to contain only subphone entities.
+ Sung sounds last far longer than $25\,\mathrm{ms}$, so this window size
+ is arguably very small for singing.
+ \item The standard \gls{FT} gives a spectral representation that has
+ linearly scaled frequencies. This scale is converted to the \gls{MS}
+ using triangular overlapping windows.
+ \item
+\end{enumerate}
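+
+To make the first two steps concrete, consider a window width $w$ and a step
+size $s$. With the classical values $w = 25\,\mathrm{ms}$ and
+$s = 10\,\mathrm{ms}$, consecutive windows overlap by
+$w - s = 15\,\mathrm{ms}$ and the signal yields roughly $1/s = 100$ frames per
+second. For the frequency conversion, a commonly used definition of the
+\gls{MS} is given below. It is an assumption that this variant applies here;
+other definitions exist and their constants differ.
+\begin{equation}
+ m(f) = 2595 \log_{10}\left(1 + \frac{f}{700}\right)
+\end{equation}
+Here $f$ is the frequency in Hz and $m(f)$ the corresponding value in mel.
+Under this definition $m(1000) \approx 1000$, and equal steps in mel
+correspond to increasingly large steps in Hz at higher frequencies.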
+
+
+\todo{Explain why MFCC and which parameters}
+
+\section{\gls{ANN} Classifier}
+\todo{Spectrals might be enough, no decorrelation}
+
+\section{Model training}
+
+\section{Experiments}
+
+\section{Results}
+