exam/q2.tex

\begin{enumerate}
        % Question 2a
        \item This can be achieved by adding disfluency rules to the \textsc{CFG}.
                This has to be done for every rule that can possibly produce
                disfluencies. Most likely only the lowest level of rules (the
                lexical rules that rewrite a category as a word) needs such
                disfluency structures. For example, for the rule that rewrites a
                \texttt{Noun} as a word it would look like this:

                \begin{lstlisting}
Noun -> TrueNoun | EditNoun TrueNoun
TrueNoun -> flight | ...

EditNoun -> TrueNoun EditWord
EditWord -> uh | ...
                \end{lstlisting}

                With feature structures this can be generalized and made less
                ambiguous. Features can, for example, force the \emph{Reparandum}
                to be of the same \texttt{CAT} as the \emph{Repair}, and any
                further constraints on disfluencies can be expressed with
                features as well.
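
                As a rough sketch of this generalization, the lexical rules above
                could be collapsed into a single rule schema. The notation
                (square brackets for features, \texttt{?c} for a shared variable)
                and the names \texttt{Word}, \texttt{TrueWord} and
                \texttt{EditPart} are assumptions made for illustration only:

                \begin{lstlisting}
Word[CAT=?c]     -> TrueWord[CAT=?c] | EditPart[CAT=?c] TrueWord[CAT=?c]
EditPart[CAT=?c] -> TrueWord[CAT=?c] EditWord

TrueWord[CAT=Noun] -> flight | ...
EditWord           -> uh | ...
                \end{lstlisting}

                Because \texttt{?c} is shared within each rule, the word inside
                \texttt{EditPart} (the \emph{Reparandum}) and the
                \texttt{TrueWord} that follows it (the \emph{Repair}) must carry
                the same \texttt{CAT} value.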

        % Question 2b
        \item Standard \textsc{CKY} parsing only works for grammars in
                \emph{Chomsky Normal Form} (\textsc{CNF}). This means that the tree
                returned will not exactly represent the \textsc{CFG}, since the
                grammar may have had to be converted to \textsc{CNF}. Adapting
                \textsc{CKY} in a fundamental way so that it correctly parses
                repair structures would be very difficult. It basically means
                that, in the innermost loop, you have to build in functionality
                similar to the grammar that recognizes such structures and act
                accordingly. While this is probably theoretically possible, it
                results in a different algorithm with a hard-coded sub-grammar
                built into it.
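
                For comparison, the grammar-level approach from the first
                sub-question leaves the algorithm itself untouched and only needs
                the \textsc{CNF} conversion. A sketch for the example rules,
                assuming the elided alternatives stay as they are: the unit
                production \texttt{Noun -> TrueNoun} is removed by copying the
                words of \texttt{TrueNoun} up to \texttt{Noun}.

                \begin{lstlisting}
Noun     -> flight | ... | EditNoun TrueNoun
TrueNoun -> flight | ...
EditNoun -> TrueNoun EditWord
EditWord -> uh | ...
                \end{lstlisting}

                Filling the \textsc{CKY} table by hand for the (made-up) input
                \texttt{flight uh flight} then gives:

                \begin{lstlisting}
[0,1] flight           : Noun, TrueNoun
[1,2] uh               : EditWord
[2,3] flight           : Noun, TrueNoun
[0,2] flight uh        : EditNoun   (TrueNoun EditWord)
[1,3] uh flight        : (empty)
[0,3] flight uh flight : Noun       (EditNoun TrueNoun)
                \end{lstlisting}

                A plain \texttt{flight} is now analyzed as \texttt{Noun -> flight},
                without the intermediate \texttt{TrueNoun} node of the original
                grammar.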

        % Question 2c
        \item Similar to the previous sub-question: while it is possible to make the
                \emph{Predictor} smarter and add disfluency structures to the chart,
                it would change the \emph{Earley} algorithm significantly. The change
                would also be specific to certain disfluency structures, which
                possibly makes the algorithm unusable for languages that do not
                have such structures. Note that it is easier to add this to an
                \emph{Earley} parser than to a \textsc{CKY} parser. For
                an \emph{Earley} parser it just means hard-coding some extra grammar
                rules in the \emph{Predictor}. For \textsc{CKY} it means translating
                the rules into specific operations on the table, which might not be
                trivial.
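
                A pseudocode sketch of such a change, following the usual
                textbook formulation of the \emph{Predictor}. The helper
                \texttt{DISFLUENCY-RULES-FOR} is hypothetical and stands for the
                hard-coded rules, here modelled on the \texttt{EditNoun} example
                from the first sub-question:

                \begin{lstlisting}
procedure PREDICTOR((A -> alpha . B beta, [i,j]))
    # standard step: predict every grammar rule for B
    for each (B -> gamma) in GRAMMAR-RULES-FOR(B, grammar) do
        ENQUEUE((B -> . gamma, [j,j]), chart[j])
    # assumed extension: also predict the hard-coded disfluency rules,
    # e.g. for B = Noun:     Noun     -> EditNoun TrueNoun
    #      for B = EditNoun: EditNoun -> TrueNoun EditWord
    for each (B -> delta) in DISFLUENCY-RULES-FOR(B) do
        ENQUEUE((B -> . delta, [j,j]), chart[j])
                \end{lstlisting}

                This sketch does not touch the \emph{Scanner} or the
                \emph{Completer}. It assumes that \texttt{TrueNoun} exists in the
                grammar and that the lexicon already tags words such as
                \texttt{uh} as \texttt{EditWord}.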
\end{enumerate}