\section{Conclusion}


\section{Discussion}
\label{sec:discuss}

\begin{itemize}
	\item No low-level details; left for future research
	\item RSS is not that great of a source
	\item \textbf{Expand technique to HTML, reuse interface, defining patterns}\\
		The interface for managing the crawlers is intuitive, so the system
		could be extended with a dedicated module for generating HTML
		crawlers. The current extraction method is not well suited to HTML,
		but because the program is modular, a module that implements another
		technique can easily be added to the application.
	\item \textbf{Combine RSS and HTML}\\
		The gap between HTML and RSS could be bridged by software that
		converts HTML pages to RSS feeds, which can then be fed to the
		existing application. HTML sites with a regular structure, such as
		news articles generated by a CMS, can be converted to RSS by
		flattening the structure and filling in the fields that an RSS entry
		specifies. In this way the current application could also process
		possibly complicated HTML sources.
\end{itemize}
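
The HTML to RSS conversion suggested above can be illustrated with a minimal
sketch. It is not part of the existing application. It assumes a hypothetical
CMS layout in which every news item is an \texttt{article} element containing
a linked heading and one summary paragraph; the class and function names are
illustrative only, and a real site would need its own extraction rules.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class ArticleExtractor(HTMLParser):
    """Collect title, link and description from each <article> element.

    Assumed (hypothetical) layout:
    <article><h2><a href="...">title</a></h2><p>summary</p></article>
    """

    def __init__(self):
        super().__init__()
        self.articles = []
        self._current = None  # fields of the article being parsed
        self._field = None    # field that incoming text is appended to

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self._current = {"title": "", "link": "", "description": ""}
        elif self._current is not None:
            if tag == "a" and not self._current["link"]:
                # The first link inside the article is taken as its URL.
                self._current["link"] = dict(attrs).get("href") or ""
                self._field = "title"
            elif tag == "p":
                self._field = "description"

    def handle_endtag(self, tag):
        if tag == "article" and self._current is not None:
            entry = {k: v.strip() for k, v in self._current.items()}
            self.articles.append(entry)
            self._current = None
            self._field = None
        elif tag in ("a", "p"):
            self._field = None

    def handle_data(self, data):
        if self._current is not None and self._field:
            self._current[self._field] += data


def html_to_rss(html, feed_title, feed_link):
    """Flatten the extracted articles into an RSS 2.0 document (as a string)."""
    parser = ArticleExtractor()
    parser.feed(html)
    parser.close()

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = "Generated from HTML"
    for article in parser.articles:
        item = ET.SubElement(channel, "item")
        for field in ("title", "link", "description"):
            ET.SubElement(item, field).text = article[field]
    return ET.tostring(rss, encoding="unicode")
```

The returned string could be served as a feed and handed to the existing
application like any other RSS source. Only the extraction class would have
to change per site; the RSS side stays the same.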