intro updated

author Mart Lubbers <mart@martlubbers.net>

Wed, 20 Aug 2014 18:27:47 +0000 (20:27 +0200)

committer Mart Lubbers <mart@martlubbers.net>

Wed, 20 Aug 2014 18:27:47 +0000 (20:27 +0200)
diff --git a/thesis/appendices.tex b/thesis/appendices.tex

index fd0132c..366452f 100644 (file)
--- a/thesis/appendices.tex
+++ b/thesis/appendices.tex
@@ -1,7 +1,7 @@
  \section{Input application}
-\lstinputlisting[style=custompy,title=Python front/back-end]
-       {../program/hypfront/hyper.py}
-\lstinputlisting[style=customhtml,title=HTML landing page]
-       {../program/hypfront/index.html}
-\lstinputlisting[style=customjs,title=Javascript frontend]
-       {../program/hypfront/contextmenu_o.js}
+%\lstinputlisting[style=custompy,title=Python front/back-end]
+%      {../program/hypfront/hyper.py}
+%\lstinputlisting[style=customhtml,title=HTML landing page]
+%      {../program/hypfront/index.html}
+%\lstinputlisting[style=customjs,title=Javascript frontend]
+%      {../program/hypfront/contextmenu_o.js}
diff --git a/thesis/introduction.tex b/thesis/introduction.tex

index 553f1dc..edcdf76 100644 (file)
--- a/thesis/introduction.tex
+++ b/thesis/introduction.tex
@@ -1,57 +1,60 @@
  \section{Introduction}
-Within the entertainment business there is no consistent style of informing
-people about the events. Different venues display their, often incomplete,
-information in entirely different ways. Because of this, converting raw
-information from venues to structured consistent data is a challenging and,
-a relevant problem. 
+People search the internet for information about their favourite
+theater show, music group or movie. All the information is scattered around on
+the websites, mailing lists, newsfeeds and other sources owned by the venues.
+This makes the search for certain events an energy-consuming task. The venues
+do not have a consistent way of presenting the information and to get all the
+details different sources must be consulted. Because of this, converting raw
+information from venues into structured consistent data is a relevant problem.
  
  \section{HyperLeap}
-Hyperleap\footnote{\url{http://hyperleap.nl/}} is a small company that is
-specialized in infotainment (information + entertainment) and administrates
-several websites which bundle information about entertainment in an ordered way
-and as complete as possible.  Right now, most of the input data is added to the
-database by by hand which is very labor intensive. Therefore Hyperleap is
-looking for a smart solution to automate a part of the data injection in the
-database, the crux however is that the system must not be too complicated from
-the outside and be usable for a non IT professional(NIP).
+Hyperleap\footnote{\url{http://hyperleap.nl/}} is a small company based in
+Nijmegen that specializes in bundling information from different sources
+into a consistent information source about entertainment (infotainment). It
+administrates several websites for several entertainment categories. Hyperleap
+differentiates itself from other companies with the same business goals because
+Hyperleap has the most complete information most of the time.
+Right now, most of the data in the database is added in one of two ways.
+The first method is entering the data into the database by hand: an employee
+scans the raw inputs gathered from websites and has to separate the entries and
+match them to existing events or create new events. This process is very
+labour intensive and therefore costly.
+The second way of adding information to the database is by crawlers programmed
+specifically for certain websites. Because a programmer is needed to program
+all the separate crawlers individually, this is a costly business. This way of
+gathering information is also very error-prone, because when a source
+changes its structure, for example the layout, the crawler no longer
+functions. When this happens the programmer has to adapt the crawler to the
+new changes, and this takes valuable time.
  
  \section{Research question and practical goals}
-This brings up the main research question: \textit{How can we make an adaptive,
-autonomous and programmable data mining program that can be set up by a NIP
-which is able to transform raw data into structured data.}\\
+The goal of the project is to create a software solution that enables an
+employee with no particular programming background to train or retrain crawlers
+for RSS\footnote{\url{http://www.rssboard.org/rss-specification}} or
+Atom\footnote{\url{http://tools.ietf.org/html/rfc5023}} publishing feeds. This
+is done in such a way that the information is categorized and put into the
+database. The software will notify the administrator of the program when a
+source has changed so that the new data can be added to the crawler's training
+set or the crawler for that source can be retrained from scratch.
  
-In practice the goal and aim of the project is to create an application that
-can, with NIP input, give computer parsable patterns which a separate crawler
-can periodically crawl. The NIP has to be able to enter the information about
-the data source in a user friendly interface which sends the information
-together with the data source to the data processing application. The
-data processing application then in turn processes the data into a extraction
-pattern which is sent to the crawler. The crawler can visit sources specified
-by the NIP accompanied by the extraction pattern created by the data processing
-application. This work flow is described in graph~\ref{fig:ig1}.
+This brings up the main research question: 
+\begin{center}
+       \textit{How can we make an adaptive, autonomous and programmable data mining
+       program that can be set up by someone without programming experience and that
+is capable of transforming raw data into structured data?}
+\end{center}
  
-\begin{figure}[H]
-       \centering
-       \caption{Work flow within the applications}
-       \label{fig:ig1}
-       \includegraphics[width=150mm]{./dots/graph3.png}
-\end{figure}
+In practice this means that the end product is a software solution which
+performs the previously described tasks.
  
-In this way the NIP can train the crawler to periodically crawl different data
-sources without too much technical knowledge. The main goal of this project is
-to extract the underlying structure rather then to extract the substructures.
-The project is in principle a continuation of a past project done by Wouter
-Roelofs\cite{Roelofs2009} which was also supervised by Franc Grootjen and
-Alessandro Paula, however it was never taken out of the experimental phase. The
-techniques described by Roelofs et al. are more focussed on extracting data
-from substructures so it can be an addition to the current project.\\
-
-As a very important side note, the crawler needs to notify the administrators if
-a source has become problematic to crawl, in this way the NIP can very easily
-retrain the application to fit the latest structural patterns.
  
  \section{Scientific relevance}
  Currently the techniques for conversion from non structured data to structured
-data are static and mainly only usable by IT specialists. There is a great need
-of data mining in non structured data because the data within companies and on
-the internet is piling up and are usually left to catch dust.
+data are static and mainly only usable by computer science experts. There is a
+great need for data mining in non structured data because the data within
+companies and on the internet is piling up and is usually left to gather dust.
+
+The project is a continuation of the past project done by Roelofs et
+al.\cite{Roelofs2009}. The techniques described by Roelofs et al. are more
+focussed on extracting data from already isolated data so it can be an addition
+to the current project.
diff --git a/thesis/methods.tex b/thesis/methods.tex

index 475d80b..6fab641 100644 (file)
--- a/thesis/methods.tex
+++ b/thesis/methods.tex
@@ -1,3 +1,52 @@
+\section{Software architecture}
+\begin{figure}
+  \centering
+
+  \begin{sequencediagram}
+               \newthread{u}{:User}
+               \newinst{i}{:Input}
+               \newinst{p}{:Data processing}
+               \newinst{s}{:Source}
+               \newthread{c}{:Crawler}
+               \newinst{d}{:Database}
+
+               \begin{sdblock}{Training}{}
+                       \begin{messcall}
+                               {u}{initiate}{i}
+                               \begin{call}
+                                       {i}{fetch source}
+                                       {s}{source data}
+                               \end{call}
+                               \begin{call}
+                                       {i}{ask for markings}
+                                       {u}{marked data}
+                               \end{call}
+                               \begin{messcall}
+                                       {i}{marked data}{p}
+                                       \begin{messcall}
+                                               {p}{processed crawler pattern}{c}
+                                       \end{messcall}
+                               \end{messcall}
+                       \end{messcall}
+               \end{sdblock}
+
+               \begin{sdblock}{Crawl}{Correct}
+                       \begin{call}
+                               {c}{visit source}
+                               {s}{source data}
+                       \end{call}
+                       \begin{messcall}
+                               {c}{processed data}{d}
+                       \end{messcall}
+               \end{sdblock}
+
+  \end{sequencediagram}
+
+       \caption{Workflow of the application}
+\end{figure}
+
+
+
  The program can be divided into three components: input, data processing and
  the crawler. The applications have separate tasks within the workflow, the
  input application defines together with the NIP the patterns for the source,
diff --git a/thesis/pgf-umlsd.sty b/thesis/pgf-umlsd.sty

new file mode 100644 (file)

index 0000000..99847db
--- /dev/null
+++ b/thesis/pgf-umlsd.sty
@@ -0,0 +1,329 @@
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Start of pgf-umlsd.sty
+%
+% Some macros for UML Sequence Diagrams.
+% Home page of project: http://pgf-umlsd.googlecode.com/
+% Author: Xu Yuan <xuyuan.cn@gmail.com>, Southeast University, China
+% Contributor: Nobel Huang <nobel1984@gmail.com>, Southeast University, China
+%
+% History:
+% v0.7 2012/03/05
+%      - unify interface of call and callself
+%      - non-instantaneous message
+%      - bugfix: conflits with tikz library backgrounds
+% v0.6 2011/07/27
+%      - Fix Issue 6 reported by frankmorgner@gmail.com
+%        - diagram without a thread
+%        - allows empty diagram
+%      - New manual
+% v0.5 2009/09/30 Fix Issue 2 reported by vlado.handziski
+%      - Nested callself is supported
+%      - Rename sdloop and sdframe to sdblock
+% v0.4 2008/12/08  Fix Issue 1 reported by MathStuf:
+%      Nested sdloop environment hides outer loop
+% v0.3 2008/11/10 in Berlin, fix for the PGF cvs version:
+%      - the list items in \foreach are not evaluated by default now,
+%      the `evaluate' opinion should be used
+% v0.2 2008/03/20 create project at http://pgf-umlsd.googlecode.com/
+%      - use `shadows' library
+%      Thanks for Dr. Ludger Humbert's <humbert@uni-wuppertal.de> feedback!
+%      - reduce the parameter numbers, the user can write the content
+%      of instance (such as no colon)
+%      - the user can redefine the `inststyle'
+%      - new option: switch underlining of the instance text
+%      - new option: switch rounded corners
+% v0.1 2008/01/25 first release at http://www.fauskes.net/pgftikzexamples/
+%
+
+\NeedsTeXFormat{LaTeX2e}[1999/12/01]
+\ProvidesPackage{pgf-umlsd}[2011/07/27 v0.6 Some LaTeX macros for UML
+Sequence Diagrams.]
+
+\RequirePackage{tikz}
+\usetikzlibrary{arrows,shadows}
+
+\RequirePackage{ifthen}
+
+% Options
+% ? the instance name under line ?
+\newif\ifpgfumlsdunderline\pgfumlsdunderlinetrue
+\DeclareOption{underline}{\pgfumlsdunderlinetrue}
+\DeclareOption{underline=true}{\pgfumlsdunderlinetrue}
+\DeclareOption{underline=false}{\pgfumlsdunderlinefalse}
+% ? the instance box with rounded corners ?
+\newif\ifpgfumlsdroundedcorners\pgfumlsdroundedcornersfalse
+\DeclareOption{roundedcorners}{\pgfumlsdroundedcornerstrue}
+\DeclareOption{roundedcorners=true}{\pgfumlsdroundedcornerstrue}
+\DeclareOption{roundedcorners=false}{\pgfumlsdroundedcornersfalse}
+\ProcessOptions
+
+% new counters
+\newcounter{preinst}
+\newcounter{instnum}
+\newcounter{threadnum}
+\newcounter{seqlevel} % level
+\newcounter{callevel}
+\newcounter{callselflevel}
+\newcounter{blocklevel}
+
+% new an instance
+% Example:
+% \newinst[edge distance]{var}{name:class}
+\newcommand{\newinst}[3][0.2]{
+  \stepcounter{instnum}
+  \path (inst\thepreinst.east)+(#1,0) node[inststyle] (inst\theinstnum)
+  {\ifpgfumlsdunderline
+    \underline{#3}
+  \else
+  #3
+  \fi};
+  \path (inst\theinstnum)+(0,-0.5*\unitfactor) node (#2) {};
+  \tikzstyle{instcolor#2}=[]
+  \stepcounter{preinst}
+}
+
+% new an instance thread
+% Example:
+% \newthread[color]{var}{name:class}
+\newcommand{\newthread}[3][gray!30]{
+  \newinst{#2}{#3}
+  \stepcounter{threadnum}
+  \node[below of=inst\theinstnum,node distance=0.8cm] (thread\thethreadnum) {};
+  \tikzstyle{threadcolor\thethreadnum}=[fill=#1]
+  \tikzstyle{instcolor#2}=[fill=#1]
+}
+
+% draw running (thick) line, should not call directly
+\newcommand*{\drawthread}[2]{
+  \begin{pgfonlayer}{umlsd@threadlayer}
+    \draw[threadstyle] (#1.west) -- (#1.east) -- (#2.east) -- (#2.west) -- cycle;
+  \end{pgfonlayer}
+}
+
+% a function call
+% Example:
+% \begin{call}[height]{caller}{function}{callee}{return}
+% \end{call}
+\newenvironment{call}[5][1]{
+\ifthenelse{\equal{#2}{#4}}
+{
+  \begin{callself}[#1]{#2}{#3}{#5}
+}
+{
+  \begin{callanother}[#1]{#2}{#3}{#4}{#5}
+}
+}
+{
+\ifthenelse{\equal{\f\thecallevel}{\t\thecallevel}}
+{
+  \end{callself}
+}
+{
+  \end{callanother}
+}
+}
+
+% function call to another instance
+% interal use only
+\newenvironment*{callanother}[5][1]{
+  \stepcounter{seqlevel}
+  \stepcounter{callevel} % push
+  \path
+  (#2)+(0,-\theseqlevel*\unitfactor-0.7*\unitfactor) node (cf\thecallevel) {}
+  (#4.\threadbias)+(0,-\theseqlevel*\unitfactor-0.7*\unitfactor) node (ct\thecallevel) {};
+  
+  \draw[->,>=triangle 60] ({cf\thecallevel}) -- (ct\thecallevel)
+  node[midway, above] {#3};
+  \def\l\thecallevel{#1}
+  \def\f\thecallevel{#2}
+  \def\t\thecallevel{#4}
+  \def\returnvalue{#5}
+  \tikzstyle{threadstyle}+=[instcolor#2]
+}
+{
+  \addtocounter{seqlevel}{\l\thecallevel}
+  \path
+  (\f\thecallevel)+(0,-\theseqlevel*\unitfactor-0.7*\unitfactor) node (rf\thecallevel) {}
+  (\t\thecallevel.\threadbias)+(0,-\theseqlevel*\unitfactor-0.7*\unitfactor) node (rt\thecallevel) {};
+  \draw[dashed,->,>=angle 60] ({rt\thecallevel}) -- (rf\thecallevel)
+  node[midway, above]{\returnvalue};
+  \drawthread{ct\thecallevel}{rt\thecallevel}
+  \addtocounter{callevel}{-1} % pop
+}
+
+% a function that does not need to call others
+% internal use only
+% Example:
+% \begin{callself}[height]{caller}{function}{return}
+% \end{callself}
+\newenvironment*{callself}[4][1]{
+  \stepcounter{seqlevel}
+  \stepcounter{callevel} % push
+  \stepcounter{callselflevel}
+
+  \path
+  (#2)+(\thecallselflevel*0.1-0.1,-\theseqlevel*\unitfactor-0.7*\unitfactor) node (sc\thecallevel) {}
+  ({sc\thecallevel}.east)+(0,-0.33*\unitfactor) node (scb\thecallevel) {};
+
+  \draw[->,>=triangle 60] ({sc\thecallevel}.east) -- ++(0.8,0)
+  node[near start, above right] {#3} -- ++(0,-0.33*\unitfactor)
+  -- (scb\thecallevel); 
+  \def\l\thecallevel{#1}
+  \def\f\thecallevel{#2}
+  \def\t\thecallevel{#2}
+  \def\returnvalue{#4}
+  \tikzstyle{threadstyle}+=[instcolor#2]
+}{
+  \addtocounter{seqlevel}{\l\thecallevel}
+  \path (\f\thecallevel)+(\thecallselflevel*0.1-0.1,-\theseqlevel*\unitfactor-0.33*\unitfactor) node
+  (sct\thecallevel) {};
+
+  \draw[dashed,->,>=angle 60] ({sct\thecallevel}.east) node
+  (sce\thecallevel) {} -- ++(0.8,0) -- node[midway, right]{\returnvalue} ++(0,-0.33*\unitfactor) -- ++(-0.8,0);
+  \drawthread{scb\thecallevel}{sce\thecallevel}
+  \addtocounter{callevel}{-1} % pop
+  \addtocounter{callselflevel}{-1}
+}
+
+% message between threads
+% Example:
+% \mess[delay]{sender}{message content}{receiver}
+\newcommand{\mess}[4][0]{
+  \stepcounter{seqlevel}
+  \path
+  (#2)+(0,-\theseqlevel*\unitfactor-0.7*\unitfactor) node (mess from) {};
+  \addtocounter{seqlevel}{#1}
+  \path
+  (#4)+(0,-\theseqlevel*\unitfactor-0.7*\unitfactor) node (mess to) {};
+  \draw[->,>=angle 60] (mess from) -- (mess to) node[midway, above]
+  {#3};
+
+  \node (#3 from) at (mess from) {};
+  \node (#3 to) at (mess to) {};
+}
+
+\newenvironment{messcall}[4][1]{
+  \stepcounter{seqlevel}
+  \stepcounter{callevel} % push
+  \path
+  (#2)+(0,-\theseqlevel*\unitfactor-0.7*\unitfactor) node (cf\thecallevel) {}
+  (#4.\threadbias)+(0,-\theseqlevel*\unitfactor-0.7*\unitfactor) node (ct\thecallevel) {};
+  
+  \draw[->,>=angle 60] ({cf\thecallevel}) -- (ct\thecallevel)
+  node[midway, above] {#3};
+  \def\l\thecallevel{#1}
+  \def\f\thecallevel{#2}
+  \def\t\thecallevel{#4}
+  \tikzstyle{threadstyle}+=[instcolor#2]
+}
+{
+  \addtocounter{seqlevel}{\l\thecallevel}
+  \path
+  (\f\thecallevel)+(0,-\theseqlevel*\unitfactor-0.7*\unitfactor) node (rf\thecallevel) {}
+  (\t\thecallevel.\threadbias)+(0,-\theseqlevel*\unitfactor-0.3*\unitfactor) node (rt\thecallevel) {};
+  \drawthread{ct\thecallevel}{rt\thecallevel}
+  \addtocounter{callevel}{-1} % pop
+}
+
+% In the situation of multi-threads, some objects are called at the
+% same time. Currently, we have to adjust the bias of thread line
+% manually. Possible parameters are: center, west, east
+\newcommand{\setthreadbias}[1]{\global\def\threadbias{#1}}
+
+% This function makes the call earlier.
+\newcommand{\prelevel}{\addtocounter{seqlevel}{-1}}
+
+% This function makes the call later.
+\newcommand{\postlevel}{\addtocounter{seqlevel}{+1}}
+
+% a block box with caption
+% \begin{sdblock}[caption background color]{caption}{comments}
+% \end{sdblock}
+\newenvironment{sdblock}[3][white]{
+  \stepcounter{seqlevel}
+  \stepcounter{blocklevel} % push
+  \coordinate (blockbeg\theblocklevel) at (0,-\theseqlevel*\unitfactor-\unitfactor);
+  \stepcounter{seqlevel}
+  \def\blockcolor\theblocklevel{#1}
+  \def\blockname\theblocklevel{#2}
+  \def\blockcomm\theblocklevel{#3}
+  \begin{pgfinterruptboundingbox}
+}{
+  \coordinate (blockend) at (0,-\theseqlevel*\unitfactor-2*\unitfactor);
+  \path (current bounding box.east)+(0.2,0) node (boxeast) {}
+  (current bounding box.west |- {blockbeg\theblocklevel}) + (-0.2,0)
+  node (nw) {};
+  \path (boxeast |- blockend) node (se) {};
+
+  % % title
+  \node[blockstyle] (blocktitle) at (nw) {\blockname\theblocklevel};
+  \path (blocktitle.south east) + (0,0.2) node (set) {}
+  (blocktitle.south east) + (-0.2,0) node (seb) {}
+  (blocktitle.north east) + (0.2,0) node (comm) {};
+  \draw[fill=\blockcolor\theblocklevel] (blocktitle.north west) -- (blocktitle.north east) --
+  (set.center) -- (seb.center) -- (blocktitle.south west) -- cycle;
+  \node[blockstyle] (blocktitle) at (nw) {\blockname\theblocklevel};
+  \node[blockcommentstyle] (blockcomment) at (comm) {\blockcomm\theblocklevel};
+
+  \coordinate (se) at (current bounding box.south east);
+  \end{pgfinterruptboundingbox}
+
+  \draw (se) rectangle (nw);
+
+  \addtocounter{blocklevel}{-1} % pop
+  \stepcounter{seqlevel}
+}
+
+% the environment of sequence diagram
+\newenvironment{sequencediagram}{
+  % declare layers
+  \pgfdeclarelayer{umlsd@background}
+  \pgfdeclarelayer{umlsd@threadlayer}
+  \pgfsetlayers{umlsd@background,umlsd@threadlayer,main}
+
+  \begin{tikzpicture}
+    \setlength{\unitlength}{1cm}
+    \tikzstyle{sequence}=[coordinate]
+    \tikzstyle{inststyle}=[rectangle, draw, anchor=west, minimum
+    height=0.8cm, minimum width=1.6cm, fill=white, 
+    drop shadow={opacity=1,fill=black}]
+    \ifpgfumlsdroundedcorners
+    \tikzstyle{inststyle}+=[rounded corners=3mm]
+    \fi
+    \tikzstyle{blockstyle}=[anchor=north west]
+    \tikzstyle{blockcommentstyle}=[anchor=north west, font=\small]
+    \tikzstyle{dot}=[inner sep=0pt,fill=black,circle,minimum size=0.2pt]
+    \global\def\unitfactor{0.6}
+    \global\def\threadbias{center}
+    % reset counters
+    \setcounter{preinst}{0}
+    \setcounter{instnum}{0}
+    \setcounter{threadnum}{0}
+    \setcounter{seqlevel}{0}
+    \setcounter{callevel}{0}
+    \setcounter{callselflevel}{0}
+    \setcounter{blocklevel}{0}
+
+    % origin
+    \node[coordinate] (inst0) {};
+}
+{
+  \begin{pgfonlayer}{umlsd@background}
+    \ifnum\c@instnum > 0
+    \foreach \t [evaluate=\t] in {1,...,\theinstnum}{
+      \draw[dotted] (inst\t) -- ++(0,-\theseqlevel*\unitfactor-2.2*\unitfactor);
+    }
+    \fi
+    \ifnum\c@threadnum > 0
+    \foreach \t [evaluate=\t] in {1,...,\thethreadnum}{
+      \path (thread\t)+(0,-\theseqlevel*\unitfactor-0.1*\unitfactor) node (threadend) {};
+      \tikzstyle{threadstyle}+=[threadcolor\t]
+      \drawthread{thread\t}{threadend}
+    }
+    \fi
+  \end{pgfonlayer}
+\end{tikzpicture}}
+
+
+%%% End of pgf-umlsd.sty
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\ No newline at end of file
diff --git a/thesis/thesis.tex b/thesis/thesis.tex

index 98ff026..69f8159 100644 (file)
--- a/thesis/thesis.tex
+++ b/thesis/thesis.tex
@@ -1,11 +1,15 @@
-\documentclass{scrbook}
+\documentclass[hidelinks]{scrbook}
  
-\usepackage{lipsum}
-\usepackage{graphicx}
-\usepackage{float}
-\usepackage{listings}
-\usepackage{hyperref}
+\usepackage{lipsum} % Dummy text
+\usepackage{graphicx} % Images
+\usepackage{float} % Better placement float figures
+\usepackage{listings} % Source code formatting
+\usepackage{hyperref} % Hyperlinks
+\usepackage{tikz} % Sequence diagrams
+\usepackage{pgf-umlsd}
+\usepgflibrary{arrows}
  
+% Set listings settings
  \lstset{
         basicstyle=\scriptsize,
         breaklines=true,
@@ -13,8 +17,6 @@
         numberstyle=\tiny,
         tabsize=2
  }
-
-
  \lstdefinestyle{custompy}{
         language=python,
         keepspaces=true,
@@ -28,7 +30,14 @@
         language=java
  }
  
+% Setup hyperlink formatting
+\hypersetup{
+       pdftitle={Non IT configurable adaptive data mining solution used in transforming raw data to structured data},
+       pdfauthor={Mart Lubbers},
+       pdfsubject={Artificial Intelligence},
+}
  
+% Describe the frontpage
  \author{Mart Lubbers\\s4109053}
  \title{Non IT configurable adaptive data mining solution used in transforming raw
  data to structured data} 
@@ -41,7 +50,6 @@ data to structured data}
                 RU && Hyperleap
         \end{tabular}
         }
-
  \date{\today}
  
  \begin{document}
@@ -49,6 +57,7 @@ data to structured data}
  \tableofcontents
  \newpage
  
+% Surrogate abstract
  \chapter*{
         \centering 
         \begin{normalsize}
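
Note on the sequence diagram added in thesis/methods.tex: the sketch below is a minimal standalone document, not part of this commit, that exercises the same pgf-umlsd macros on the crawl phase only. It assumes the pgf-umlsd.sty added here is on the TeX search path. The instance names and labels are copied from the diff, while the `article` class and the `[2]` edge distances are illustrative choices.

```latex
\documentclass{article}
\usepackage{tikz}
\usepackage{pgf-umlsd} % the pgf-umlsd.sty added in this commit

\begin{document}
\begin{sequencediagram}
  % \newthread[color]{var}{name:class} starts an active lifeline
  \newthread{c}{:Crawler}
  % \newinst[edge distance]{var}{name:class} adds a passive instance
  \newinst[2]{s}{:Source}
  \newinst[2]{d}{:Database}

  % call: solid arrow to the callee, dashed arrow back with the return label
  \begin{call}{c}{visit source}{s}{source data}
  \end{call}

  % messcall: one-way message, no return arrow is drawn
  \begin{messcall}{c}{processed data}{d}
  \end{messcall}
\end{sequencediagram}
\end{document}
```

`call` takes a fourth argument for the return label, whereas `messcall` takes only sender, message and receiver. This is why the training block in methods.tex uses `call` for the two round trips (fetching the source, asking for markings) and `messcall` for the hand-offs to the data processing application and the crawler.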