-All these steps combined results in thirteen tab separated features per line in
-a file for every source file. Technical info about the processing steps is
-given in the following sections.
+
+\gls{MFCC} features are nature-inspired and built incrementally in several
+steps.
+\begin{enumerate}
+ \item The first step converts the time representation of the signal to a
+ spectral representation using a sliding window with overlap. The width
+ of the window and the step size are two important parameters of the
+ system. In classical phonetic analysis, a window size of
+ $25\,\mathrm{ms}$ with a step of $10\,\mathrm{ms}$ is often chosen
+ because such a window is short enough to contain only subphone
+ entities. Sung sounds are sustained for much longer than
+ $25\,\mathrm{ms}$, so this window size is arguably very small for
+ singing.
+ \item The standard \gls{FT} gives a spectral representation that has
+ linearly scaled frequencies. This scale is converted to the \gls{MS}
+ using triangular overlapping windows.
+ \item
+\end{enumerate}
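+
+The first two steps can be made concrete with a short worked example. The
+formulas below are a sketch under stated assumptions and are not taken from
+the processing pipeline itself: the frame count assumes that no padding is
+applied at the signal boundaries, and the mel mapping is one commonly used
+closed form, since the exact variant is not specified here. For a signal of
+duration $T$, a window width $w$ and a step size $s$, the number of frames is
+\[
+	N = \left\lfloor \frac{T - w}{s} \right\rfloor + 1,
+\]
+so one second of audio with $w = 25\,\mathrm{ms}$ and $s = 10\,\mathrm{ms}$
+yields $\lfloor 975 / 10 \rfloor + 1 = 98$ frames. A frequency $f$ in hertz
+is commonly mapped to $m$ mels by
+\[
+	m = 2595 \log_{10}\left(1 + \frac{f}{700}\right),
+\]
+which is close to linear for low frequencies and logarithmic for high
+frequencies. Under this definition $1000\,\mathrm{Hz}$ corresponds to
+approximately $1000$ mels.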
+