\newacronym{HMM}{HMM}{Hidden Markov Model}
\newacronym{HTK}{HTK}{\acrlong{HMM} Toolkit}
\newacronym{IFPI}{IFPI}{International Federation of the Phonographic Industry}
-\newacronym{LPCC}{LPCC}{\acrlong{LPC} derivec cepstrum}
+\newacronym{LPCC}{LPCC}{\acrlong{LPC} derived cepstrum}
\newacronym{LPC}{LPC}{Linear Prediction Coefficients}
\newacronym{MFCC}{MFCC}{\acrlong{MFC} coefficient}
\newacronym{MFC}{MFC}{Mel-frequency cepstrum}
However, the model does not cope very well with different singing techniques or
with data that contains a lot of atmospheric noise and accompaniment.
+From the results we conclude that the model generalizes well over the training
+set, even with few hidden nodes. The models with 3 or 5 hidden nodes score
+slightly worse than the larger models, but there is hardly any difference
+between the performance of the models with 8 and 13 nodes. Moreover, contrary
+to expectations, the window size does not seem to affect the performance much.
+
\subsection{Future research}
\paragraph{Forced aligment: }
Future interesting research includes doing the actual forced alignment. This
\section{Results}
\subsection{\emph{Singing} voice detection}
+Table~\ref{tbl:singing} shows the results for the singing-voice detection.
+Figure~\ref{fig:bclass} shows a segment of a song with the classifier output
+plotted underneath the audio signal to illustrate the performance.
\begin{table}[H]
\centering
13h & 0.89 (0.28) & 0.89 (0.29) & 0.88 (0.30)\\
\bottomrule
\end{tabular}
- \caption{Binary classification results (accuracy (loss))}
+ \caption{Binary classification results (accuracy
+ (loss))}\label{tbl:singing}
\end{table}
\begin{figure}[H]
\centering
\includegraphics[width=.6\linewidth]{bclass}.
- \caption{Plotting the classifier under the audio signal}
+ \caption{Plotting the classifier under the audio signal}\label{fig:bclass}
\end{figure}
\subsection{\emph{Singer} voice detection}
+Table~\ref{tbl:singer} shows the results for the singer-voice detection.
\begin{table}[H]
\centering
13h & 0.87 (0.37) & 0.87 (0.38) & 0.86 (0.39)\\
\bottomrule
\end{tabular}
- \caption{Multiclass classification results (accuracy (loss))}
+ \caption{Multiclass classification results (accuracy
+ (loss))}\label{tbl:singer}
\end{table}
\subsection{Alien data}