\begin{enumerate}[label=\alph*.]
% 2a
\item
Using the existing (English-language) infrastructure to process foreign
queries might work better than one might expect. Many languages share
linguistic structures with English, such as word order. Moreover, many
specialised domain words and proper names in foreign languages are
borrowed from English.

Of course, there are also major problems. A big one is that certain
(question) words are left untranslated. Take, for example, the Dutch
query \emph{Stierf Michael Jackson in 2009?} (\emph{Did Michael Jackson
die in 2009?}). When we put this query into the engine, it will
recognise that something happened to \emph{MJ} in 2009, but it will not
know whether that something is what the user wanted to ask about, which
leads to confusion.

Moreover, there are several seemingly simple structural divergences
(Section 25.1.2), such as date notation, that can cause major problems
when the query is not translated.

In conclusion, using no translation might yield surprisingly good
results when the language is similar to English. However, when the
difference is bigger, the question classification in particular will be
wrong, and that will result in strange answers.

% 2b
\item
Translated material is hardly ever exactly the same as the original
material: it has either more details than the original query, fewer
details, or wrong details.

More details can occur because of lexical gaps (Section 25.1.3). Some
language might have developed in a region where there are hardly any
fish, such as a desert, so there was no need for specialised fishing
vocabulary. Such a language may have only one word for fish, whereas
English has many. When that word is translated into English, the
translation may have to pick one of the more specific English words,
and in this way extra details can be inserted. Of course, this also
works the other way around. A popular but dubious claim is that some
Inu{\"\i}t language has over a hundred words for snow. When such a
specialised word is used, it might not be possible to translate it
correctly into English at all, and we therefore lose detail.

% 2c
\item
The quality of the knowledge extraction depends heavily on the user's
language because of the aforementioned lexical gaps. However, these
lexical gaps might be bridged with a suitable translation system.
\end{enumerate}