From: Mart Lubbers <mart@martlubbers.net>
Date: Wed, 7 Jun 2017 14:12:29 +0000 (+0200)
Subject: new classifier picture and extended results
X-Git-Url: https://git.martlubbers.net/?a=commitdiff_plain;h=246fb7cc5cccaab6e8c0e6e250d6c4827bccb389;p=asr1617.git

new classifier picture and extended results
---

diff --git a/conclusion.tex b/conclusion.tex
index 8f53277..6bd8087 100644
--- a/conclusion.tex
+++ b/conclusion.tex
@@ -15,7 +15,7 @@ little worse than their bigger brothers but there is hardly any difference
 between the performance of a model with 8 or 13 nodes. Moreover, contrary than
 expected the window size does not seem to be doing much in the performance.
 
-\subsection{Future research}
+\section{Future research}
 \paragraph{Forced aligment: }
 Future interesting research includes doing the actual forced alignment. This
 probably requires entirely different models. The models used for real speech
diff --git a/img/bclass.png b/img/bclass.png
index 836fe1b..951910d 100644
Binary files a/img/bclass.png and b/img/bclass.png differ
diff --git a/results.tex b/results.tex
index d6cf238..bb05b6b 100644
--- a/results.tex
+++ b/results.tex
@@ -1,9 +1,12 @@
 \section{\emph{Singing}-voice detection}
-Table~\ref{tbl:singing} shows the results for the singing-voice detection.
+Table~\ref{tbl:singing} shows the results for the singing-voice detection. The
+performance is given by the accuracy (and loss). The accuracy is the percentage
+of correctly classified samples.
+
 Figure~\ref{fig:bclass} shows an example of a segment of a song with the
-classifier plotted underneath to illustrate the performance. The performance is
-given by the accuracy and loss. The accuracy is the percentage of correctly
-classified samples.
+classifier plotted underneath. For this illustration the $13$-node model is
+used with an analysis window size and step of $40$ and $100$\,ms, respectively.
+The output is smoothed using a Hann window.
 
 \begin{table}[H]
 	\centering
@@ -19,16 +22,13 @@ classified samples.
 		 & 0.89 (0.28) & 0.89 (0.29) & 0.88 (0.30)\\
 		\bottomrule
 	\end{tabular}
-	\caption{Binary classification results (accuracy
-		(loss))}\label{tbl:singing}
+	\caption{Binary classification results (accuracy (loss))}%
+	\label{tbl:singing}
 \end{table}
 
-Plotting the classifier under a segment of the data results in
-Figure~\ref{fig:bclass}.
-
 \begin{figure}[H]
 	\centering
-	\includegraphics[width=.7\linewidth]{bclass}.
+	\includegraphics[width=1\linewidth]{bclass}
 	\caption{Plotting the classifier under the audio signal}\label{fig:bclass}
 \end{figure}