From: Mart Lubbers
Date: Wed, 7 Jun 2017 14:12:29 +0000 (+0200)
Subject: new classifier picture and extended results
X-Git-Url: https://git.martlubbers.net/?a=commitdiff_plain;h=246fb7cc5cccaab6e8c0e6e250d6c4827bccb389;p=asr1617.git

new classifier picture and extended results
---

diff --git a/conclusion.tex b/conclusion.tex
index 8f53277..6bd8087 100644
--- a/conclusion.tex
+++ b/conclusion.tex
@@ -15,7 +15,7 @@ little worse than their bigger brothers but there is hardly any difference
 between the performance of a model with 8 or 13 nodes. Moreover, contrary than
 expected the window size does not seem to be doing much in the performance.
 
-\subsection{Future research}
+\section{Future research}
 \paragraph{Forced aligment: }
 Future interesting research includes doing the actual forced alignment. This
 probably requires entirely different models. The models used for real speech
diff --git a/img/bclass.png b/img/bclass.png
index 836fe1b..951910d 100644
Binary files a/img/bclass.png and b/img/bclass.png differ
diff --git a/results.tex b/results.tex
index d6cf238..bb05b6b 100644
--- a/results.tex
+++ b/results.tex
@@ -1,9 +1,12 @@
 \section{\emph{Singing}-voice detection}
-Table~\ref{tbl:singing} shows the results for the singing-voice detection.
+Table~\ref{tbl:singing} shows the results for the singing-voice detection. The
+performance is given by the accuracy (and loss). The accuracy is the percentage
+of correctly classified samples.
+
 Figure~\ref{fig:bclass} shows an example of a segment of a song with the
-classifier plotted underneath to illustrate the performance. The performance is
-given by the accuracy and loss. The accuracy is the percentage of correctly
-classified samples.
+classifier plotted underneath. For this illustration the $13$ node model is
+used with a analysis window size and step of $40$ and $100ms$ respectively. The
+output is smoothed using a hanning window.
 
 \begin{table}[H]
 	\centering
@@ -19,16 +22,13 @@ classified samples.
 		& 0.89 (0.28) & 0.89 (0.29) & 0.88 (0.30)\\
 		\bottomrule
 	\end{tabular}
-	\caption{Binary classification results (accuracy
-	(loss))}\label{tbl:singing}
+	\caption{Binary classification results (accuracy (loss))}%
+	\label{tbl:singing}
 \end{table}
 
-Plotting the classifier under a segment of the data results in
-Figure~\ref{fig:bclass}.
-
 \begin{figure}[H]
 	\centering
-	\includegraphics[width=.7\linewidth]{bclass}.
+	\includegraphics[width=1\linewidth]{bclass}
 	\caption{Plotting the classifier under the audio signal}\label{fig:bclass}
 \end{figure}