\begin{enumerate}
% Question 2a
\item This can be achieved by adding disfluency rules to the \textsc{CFG}.
This has to be done for every rule that can possibly produce
disfluencies. Most likely only the lowest level of rules (the lexical
rules that rewrite a category as a word) needs such disfluency
structures. For example, for the rule that rewrites a \texttt{Noun} as
a word it would look like this:

\begin{lstlisting}
Noun -> TrueNoun | EditNoun TrueNoun
TrueNoun -> flight | ...

EditNoun -> TrueNoun EditWord
EditWord -> uh | ...
\end{lstlisting}

With feature structures this can be generalized and made less
ambiguous. Features can, for example, force the \emph{Reparandum} to be
of the same \texttt{CAT} as the \emph{Repair}, and any further
constraints on disfluencies can also be expressed with features.
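
As a rough sketch of what such a feature-based rule might look like,
assume a PATR-style notation in which \texttt{<A F>} denotes feature
\texttt{F} of constituent \texttt{A}. The names \texttt{X},
\texttt{Reparandum}, \texttt{Repair} and \texttt{HEAD} are illustrative
assumptions and are not taken from a particular grammar:

\begin{lstlisting}
# hypothetical rule schema; one rule instead of one EditX per category
X -> Reparandum EditWord Repair
    <Reparandum CAT> = <Repair CAT>
    <X CAT>          = <Repair CAT>
    <X HEAD>         = <Repair HEAD>
\end{lstlisting}

The first constraint rules out analyses in which the abandoned material
and its correction belong to different categories, which is where the
reduction in ambiguity would come from.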

% Question 2b
\item Standard \textsc{CKY} parsing only works for grammars in
\emph{Chomsky Normal Form} (\textsc{CNF}). This means that the tree
returned may not exactly represent the original \textsc{CFG}, since the
grammar possibly had to be converted to \textsc{CNF} first. Adapting
\textsc{CKY} in a fundamental way so that it correctly parses repair
structures would be very difficult, though not strictly impossible. It
basically means that the innermost loop needs built-in functionality
that mimics a grammar recognizing such structures and acts
accordingly. Even if this is theoretically possible, the result is a
different algorithm with a hard-coded sub-grammar inside it.
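
For comparison, the grammar-based route from the first item leaves
\textsc{CKY} itself untouched. A sketch of the \texttt{Noun} rules after
conversion to \textsc{CNF}, assuming that only the unit production
\texttt{Noun -> TrueNoun} has to be eliminated:

\begin{lstlisting}
# CNF version: Noun -> TrueNoun is replaced by copies of TrueNoun's rules
Noun     -> flight | ... | EditNoun TrueNoun
TrueNoun -> flight | ...
EditNoun -> TrueNoun EditWord
EditWord -> uh | ...

# CKY table for "flight uh flight" (spans as [start,end])
[0,1] Noun, TrueNoun    [1,2] EditWord    [2,3] Noun, TrueNoun
[0,2] EditNoun          [1,3] (empty)
[0,3] Noun
\end{lstlisting}

The resulting tree has \texttt{Noun} directly above \texttt{flight} where
the original grammar had an intermediate \texttt{TrueNoun} node, which
illustrates why the returned tree need not match the original
\textsc{CFG}.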

% Question 2c
\item This is similar to the previous sub-question: while it is possible to
make the \emph{Predictor} smarter so that it adds disfluency structures
to the chart, doing so would change the \emph{Earley} algorithm
significantly. The change would also be specific to certain disfluency
structures, which possibly makes the parser unusable for languages that
do not have such structures. Note that it is easier to add this to an
\emph{Earley} parser than to a \textsc{CKY} parser. For
an \emph{Earley} parser it just means hard-coding some extra grammar
rules in the \emph{Predictor}. For \textsc{CKY} it means translating
those rules into specific operations on the table, which might not be
trivial.
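
One possible reading of such a hard-coded \emph{Predictor} is sketched
below; it is not a worked-out algorithm. Whenever a category \texttt{X}
is expected at input position $k$, the modified \emph{Predictor} would
add two extra dotted items that are not part of the grammar, next to the
ordinary predictions for \texttt{X}. The dot is written as
\texttt{.}, and \texttt{EditX} and \texttt{EditWord} are hypothetical
names chosen to mirror the rules of the first item:

\begin{lstlisting}
# extra items added to chart[k] when X is expected at position k
X     -> . EditX X       [k,k]
EditX -> . X EditWord    [k,k]
\end{lstlisting}

The second item expects \texttt{X} again at the same position, so the
\emph{Predictor} must skip items that are already in the chart. A
standard \emph{Earley} parser already does this to cope with left
recursion.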
\end{enumerate}