+\newacronym{HMM}{HMM}{Hidden Markov Model}
+\newacronym{HTK}{HTK}{\acrlong{HMM} Toolkit}
+\newacronym{FA}{FA}{Forced alignment}
+\newacronym{MFC}{MFC}{Mel-frequency cepstrum}
+\newacronym{MFCC}{MFCC}{\acrlong{MFC} coefficient}
+Berenzweig and Ellis use acoustic classifiers from speech recognition as a
+detector for singing lines. They achieve 80\% accuracy on forty 15-second
+excerpts. They also mention earlier work on signal features that discriminate
+between speech and music. Neural net
+In 2014, Dzhambazov et al.\ applied state-of-the-art segmentation methods to
+polyphonic Turkish music; this might be interesting to use for heavy metal.
+They note that Fujihara (2011) has a similar \gls{FA} system. This method uses
+phone-level segmentation and the first 12 \glspl{MFCC}. They first perform
+vocal/non-vocal detection, then melody extraction, then alignment. They compare
+their results with Mesaros \& Virtanen (2008).
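+
+For reference, the first 12 \glspl{MFCC} mentioned above are commonly computed
+as a discrete cosine transform of the log mel filterbank energies. The sketch
+below follows the convention documented for \gls{HTK}; the symbols $m_j$ (log
+energy of filterbank channel $j$) and $N$ (number of channels) are introduced
+here for illustration only and may not match the exact configuration used by
+Dzhambazov et al.
+\begin{equation}
+  c_i = \sqrt{\frac{2}{N}} \sum_{j=1}^{N} m_j
+        \cos\left(\frac{\pi i}{N}\left(j - \frac{1}{2}\right)\right),
+  \qquad i = 1, \dots, 12
+\end{equation}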