\begin{enumerate}[label=\alph*.]
% 3a
\item
The \emph{Levenshtein} algorithm for edit distance is a very useful
tool for detecting spelling variants, but there are situations
where it will not work out of the box. One such case is when the
variants come from different scripts: transliteration between scripts
often introduces extra letters.

For example, the Russian form of
\emph{Muhammad} is transliterated as \emph{Mukhammed}. The \emph{kh} is a
construction that is not used in English, but it sounds a
lot like the \emph{ch} in the Scottish \emph{loch}. Such added
characters inflate the edit distance. We can possibly
overcome this problem by using a broader notion of character and
looking at phonemes, for example.
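As a rough illustration, assume the standard unit-cost formulation of
the Levenshtein distance; the symbols $d$, $a$ and $b$ are our own
notation and not part of the question. For strings $a$ and $b$, let
$d(i,j)$ be the distance between the first $i$ characters of $a$ and
the first $j$ characters of $b$, with $d(i,0) = i$ and $d(0,j) = j$.
Then
\[
d(i,j) = \min \left\{
\begin{array}{l}
d(i-1,j) + 1 \\
d(i,j-1) + 1 \\
d(i-1,j-1) + [a_i \neq b_j]
\end{array}
\right.
\]
where $[a_i \neq b_j]$ is $1$ if the two characters differ and $0$
otherwise. On the character level, \emph{Muhammad} and
\emph{Mukhammed} are at distance $2$: one insertion (\emph{k}) and one
substitution (\emph{a} by \emph{e}). If we instead assume that
\emph{kh} is treated as a single unit equivalent to \emph{h}, only the
vowel substitution remains and the distance drops to $1$.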

\emph{Viterbi} on the other hand


% 3b
\item


% 3c
\item

\end{enumerate}