\section{Conclusion}
Although the research question is answered, the underlying goal of the project
is not achieved. The application is an intuitive system that allows users to
manage RSS crawlers. With the application it is easy to generate, change, test
and remove crawlers. However, while testing with real-world data we stumbled
upon a problem: a lack of RSS feeds and the misuse of RSS feeds.


\section{Discussion}
\label{sec:discuss}

\begin{itemize}
\item No low-level details were addressed; these are left for future research.
\item RSS turned out not to be that great of a source.
\item \textbf{Expand technique to HTML, reuse interface, defining patterns}\\
The interface for managing the crawlers works very intuitively, and
this system could therefore be extended with a dedicated HTML crawler
generation module. The current method for extracting the information is not
very suitable for HTML, but thanks to the modularity of the program a module
that incorporates another technique can easily be added to the application.
\item \textbf{Combine RSS and HTML}\\
The gap between HTML and RSS could be bridged by software that converts
HTML pages to RSS feeds, which can then be fed to the existing
application. When HTML sites have a certain structure, namely that of
news articles created by a CMS, they can be converted to RSS by flattening
the structure and creating the specified fields of information of RSS
entries. In this way the current application can also be used to process
possibly complicated HTML sources.
\end{itemize}
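
The conversion from CMS-generated HTML to RSS described in the last item could
look roughly like the sketch below. It is a minimal illustration under stated
assumptions, not part of the implemented application. It assumes a
hypothetical page layout in which every news article is an \texttt{article}
element that contains an \texttt{h2} heading with a link and a \texttt{p}
element with a summary. The names \texttt{ArticleParser} and
\texttt{html\_to\_rss} are made up for this example, and only the Python
standard library is used.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class ArticleParser(HTMLParser):
    """Collects title, link and description from each <article> element.

    Assumed (hypothetical) CMS layout:
    <article><h2><a href="...">title</a></h2><p>summary</p></article>
    """

    def __init__(self):
        super().__init__()
        self.entries = []    # finished entries, one dict per article
        self.current = None  # entry being built, None outside <article>
        self.field = None    # field that text is currently appended to

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.current = {"title": "", "link": "", "description": ""}
        elif self.current is None:
            return  # ignore everything outside an article
        elif tag == "h2":
            self.field = "title"
        elif tag == "a" and self.field == "title":
            self.current["link"] = dict(attrs).get("href") or ""
        elif tag == "p":
            self.field = "description"

    def handle_endtag(self, tag):
        if tag == "article" and self.current is not None:
            # Flatten: keep only the stripped text of the three fields.
            entry = {k: v.strip() for k, v in self.current.items()}
            self.entries.append(entry)
            self.current = None
            self.field = None
        elif tag in ("h2", "p"):
            self.field = None

    def handle_data(self, data):
        if self.current is not None and self.field is not None:
            self.current[self.field] += data


def html_to_rss(html, feed_title, feed_link):
    """Return an RSS 2.0 document (as a string) built from the HTML page."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = "Generated from HTML"

    for entry in parser.entries:
        item = ET.SubElement(channel, "item")
        for field in ("title", "link", "description"):
            ET.SubElement(item, field).text = entry[field]
    return ET.tostring(rss, encoding="unicode")
```

The resulting string could then be offered to the existing application as if
it were an ordinary RSS feed. Real CMS output will differ per site, so the
tag names checked in \texttt{handle\_starttag} would have to be adapted for
each source.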