From: Mart Lubbers
Date: Mon, 2 Feb 2015 13:33:39 +0000 (+0100)
Subject: t'
X-Git-Url: https://git.martlubbers.net/?a=commitdiff_plain;h=912e1a4aad43860711eede97d2c8ceeb26058af0;p=bsc-thesis1415.git

t'
---

diff --git a/thesis2/4.discussion.tex b/thesis2/4.discussion.tex
index 9fb2fcb..7ab6b21 100644
--- a/thesis2/4.discussion.tex
+++ b/thesis2/4.discussion.tex
@@ -2,7 +2,7 @@
 \begin{center}
 \textit{Is it possible to shorten the feedback loop for repairing and adding
 crawlers by making a system that can create, add and maintain crawlers for
- RSS feeds}
+ RSS feeds}
 \end{center}
 
 The short answer to the problem statement made in the introduction is yes. We
@@ -65,14 +65,14 @@ extraction.
 
 % combine RSS HTML
 Another use or improvement could be combining the forces of HTML and RSS. Some
-specifically structured HTML sources could be converted to a RSS feed and still
-get procces by the application. In this way, with an extra intermediate step,
-the extraction techniques can still be used. HTML sources most likely have to
-be generated because there has to be a very consistent structure in the data.
-Websites with such great structure are usually generated from a CMS. This will
-enlarge the domain for the application significantly since almost all websites
-use CMS to publish their data. When conversion between HTML and RSS feeds is
-not possible but one has a technique to extract patterns in a similar way then
-this application it is also possible to embed it in the current application.
-Due to the modularity of the application extending the application is very
-easy.
+specifically structured HTML sources could be converted to a tidy RSS feed and
+still get proccesed by the application. In this way, with an extra intermediate
+step, the extraction techniques can still be used. HTML sources most likely
+have to be generated because there has to be a very consistent structure in the
+data. Websites with such great structure are usually generated from a CMS.
+This will enlarge the domain for the application significantly since almost all
+websites use CMS to publish their data. When conversion between HTML and RSS
+feeds is not possible but one has a technique to extract patterns in a similar
+way then this application it is also possible to embed it in the current
+application. Due to the modularity of the application extending the
+application is very easy.
diff --git a/thesis2/Makefile b/thesis2/Makefile
index 29934ec..56f23d7 100644
--- a/thesis2/Makefile
+++ b/thesis2/Makefile
@@ -1,5 +1,5 @@
 SHELL:=/bin/bash
-VERSION:=0.6
+VERSION:=0.7
 
 all: thesis
 
diff --git a/thesis2/thesis.tex b/thesis2/thesis.tex
index 7e7039c..3fe318e 100644
--- a/thesis2/thesis.tex
+++ b/thesis2/thesis.tex
@@ -37,8 +37,8 @@
 texcl=false,
 }
 
-\newcommand{\cvartitle}{Non IT configurable adaptive data mining solution used
-in transforming raw data to structured data}
+\newcommand{\cvartitle}{Adaptable crawler generation system for leisure
+activity RSS feeds}
 % Setup hyperlink formatting
 \hypersetup{
 pdftitle={\cvartitle},