From: Mart Lubbers
Date: Mon, 29 May 2017 14:57:39 +0000 (+0200)
Subject: elaborate on sections
X-Git-Url: https://git.martlubbers.net/?a=commitdiff_plain;h=0309ddbfdcaa0d8ccbeef78b5b06ff76894b173d;p=asr1617.git

elaborate on sections
---

diff --git a/conclusion.tex b/conclusion.tex
index 9d5176f..d1dd736 100644
--- a/conclusion.tex
+++ b/conclusion.tex
@@ -1,18 +1,33 @@
-\section{Conclusion}
+\section{Conclusion \& Future Research}
 This research shows that existing techniques for singing-voice detection
 designed for regular singing voices also work respectably on extreme singing
 styles like grunting. With a standard \gls{ANN} classifier using \gls{MFCC}
-features a performance of $85\%$ can be achieved. When applying smoothing this
-can be increased until\todo{results}.
+features a performance of $85\%$ can be achieved which is similar to the same
+techniques on regular singing. This means that it might be suitable as a
+pre-processing step for lyrics forced alignment.
+
+Future interesting research includes doing the actual forced alignment. This
+probably requires entirely different models. The models used for real speech
+are probably not suitable because the acoustic properties of a regular singing
+voice is very different from a growling voice, let alone speech.
+
+Secondly, it would be interesting if a model could be trained that could
+discriminate a singing voice for all styles of singing including growling.
+Moreover, it is possible to investigate the performance of detecting growling
+on regular singing-voice trained models and the other way around.
 
 %Discussion section
 \section{Discussion}
-Singing-voice detection can be seen as a crude way of
-genre-discrimination.\todo{finish}
+The dataset used is not very big. Only three albums are annotated and used
+as training data. The albums chosen do represent the ends of the spectrum and
+therefore the resulting model can be very general. However, it could also mean
+that the model is able to recognize three islands in the entire space of
+grunting. This does not seem the case since the results show that totally alien
+data also has a good performance.
 
-\todo{Novelty}
-\todo{Weaknesses}
-\todo{Dataset is not very varied but\ldots}
+The model clearly has trouble with pauses between singing.
 
-\todo{Doom metal}
-%Conclusion section
+\emph{Singing}-voice detection and \emph{singer}-voice Singing-voice detection
+can be seen as a crude way of genre-discrimination. Therefore it be
+generalizable to extensive genre recognition
+might.
diff --git a/methods.tex b/methods.tex
index 0a226a1..0e557df 100644
--- a/methods.tex
+++ b/methods.tex
@@ -211,6 +211,7 @@ batch size of $32$.
 
 \section{Results}
 \subsection{\emph{Singing} voice detection}
+
 \begin{table}[H]
 \centering
 \begin{tabular}{rccc}
@@ -234,6 +235,7 @@ batch size of $32$.
 \end{figure}
 
 \subsection{\emph{Singer} voice detection}
+
 \begin{table}[H]
 \centering
 \begin{tabular}{rccc}
@@ -249,3 +251,5 @@ batch size of $32$.
 \end{tabular}
 \caption{Multiclass classification results (accuracy (loss))}
 \end{table}
+
+\subsection{Alien data}