%&proposal
\begin{document}
\maketitle

My proposed research consists of two questions, of which the thesis will answer
at least one.

The first topic is singing voice detection. Singing voice detection has been
applied to a wide range of musical styles, from unconventional ones such as
Beijing opera to conventional pop music. Moreover, the problem has been
tackled with many different approaches, such as HMMs with different acoustic
model types and machine-learned feature sets. I would like to explore how
HMM-based techniques perform on extreme heavy metal styles, to see how well
they can detect growling and how the classifier might be adapted to perform
better. Initially the classifier will be a binary classifier that distinguishes
growling from non-growling. Later on, more classes might be added, such as
musical genres.
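
One common way to set up such a binary HMM classifier, given here only as a
sketch and not as a fixed design decision, is to train one model per class and
to label a segment by comparing likelihoods. The symbols below are introduced
purely for illustration: $X = (x_1, \ldots, x_N)$ is the sequence of feature
vectors of a segment, $\lambda_{g}$ and $\lambda_{n}$ are HMMs trained on
growling and non-growling audio respectively, and $\theta$ is a decision
threshold.
\[
	\hat{c}(X) =
	\left\{
	\begin{array}{ll}
		\mbox{growling} & \mbox{if } \log p(X \mid \lambda_{g}) - \log p(X \mid \lambda_{n}) > \theta,\\
		\mbox{non-growling} & \mbox{otherwise.}
	\end{array}
	\right.
\]
Setting $\theta = 0$ gives a plain maximum-likelihood decision, while other
values could be tuned on held-out data.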

Singing voice detection is often used as a preprocessing step for forced
alignment of song lyrics. If time permits, I would like to explore forced
alignment using existing phone models on extreme heavy metal styles. The
features will probably need to be changed to improve performance, since
growling is very different from regular singing and speaking.

The data for this will come from my personal collection of audio CDs. For
singing voice detection, the data of a single band can be used, and a test set
can even be held out, since the band has made over 15 full studio albums, each
with a running time between 30 and 60 minutes. The segmentation of the training
data can be done by hand. Later, for the lyrics alignment, the labels for the
segments can be found online on song lyric websites. Validation of the
alignment is a bit tricky, however, since there is no gold standard other than
my own.

\paragraph{Planning}\strut\\
This results in the following rough month-by-month outline, shown in
Table~\ref{tbl:outline}.
A possible pitfall lies in preparing the data, since that requires manual
segmentation. Segmentation is expected to take around twice the playing time,
but that might be an overestimate.
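As a rough illustration of what this means in hours, assume (purely for the
sake of the estimate) that exactly 15 albums are segmented in full. The total
playing time $T$ then lies between
\[
	T_{\min} = 15 \times 30\,\mathrm{min} = 7.5\,\mathrm{h}
	\qquad\mbox{and}\qquad
	T_{\max} = 15 \times 60\,\mathrm{min} = 15\,\mathrm{h},
\]
so that the expected segmentation effort of $2T$ would lie between
$15\,\mathrm{h}$ and $30\,\mathrm{h}$. If only a subset of the albums needs to
be annotated, these figures shrink proportionally.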

\begin{table}[ht]
	\centering
	\begin{tabular}{cl}
		\toprule
		Month & Description\\
		\midrule
		March & Preparing the data\\
		      & Preparing an experiment platform\\
		      & Literature research\\
		April & Running the experiments\\
		      & Tuning the parameters\\
		      & Exploring the possibilities for forced alignment\\
		May   & Writing up the thesis\\
		      & Possibly doing forced alignment\\
		June  & Finishing the thesis\\
		      & Wrapping up\\
		\bottomrule
	\end{tabular}
	\caption{Rough month-by-month outline}\label{tbl:outline}
\end{table}

\end{document}