From 5cc599b4cbfcaef87ebdb72bd0f352ea616beda4 Mon Sep 17 00:00:00 2001
From: Mart Lubbers
Date: Wed, 29 Oct 2014 16:29:20 +0100
Subject: [PATCH] part of algo explained

---
 thesis2/2.methods.tex | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/thesis2/2.methods.tex b/thesis2/2.methods.tex
index 477190d..3f67cb3 100644
--- a/thesis2/2.methods.tex
+++ b/thesis2/2.methods.tex
@@ -23,3 +23,25 @@ Generate xml
 
 \subsection{Interface}
 \subsection{Algorithm}
+\subsection{Preprocessing}
+When the crawler receives the data, it is embedded as POST data in an HTTP
+request. The POST data consists of several fields with information about the
+feed and a container that holds the table with the embedded user markers.
+The entries are then extracted and processed line by line.
+
+The line processing converts the raw HTML data of a table row to a plain
+string. The string is stripped of all HTML tags and is accompanied by a
+list of marker items.
+
+Entries that do not contain any markers are excluded from the next
+processing step. All data, including the entries without user markers, is
+stored in the object for possible later reference, for example for editing
+the patterns.
+
+In the last step, the entries with markers are processed to build
+node-lists. Node-lists are essentially strings in which the user markers
+are replaced by patterns, so that the variable, isolated data is not part
+of the node-lists.
+
+\subsection{Directed acyclic graphs}
+
-- 
2.20.1
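The extraction of the POST fields described in the preprocessing step could be sketched as follows. This is a minimal illustration only; the field names `name`, `url`, and `table` are assumptions, not the crawler's actual keys, and the crawler may use a different HTTP layer entirely.

```python
# Hypothetical sketch of parsing the crawler's POST body.
# Field names are assumptions; only the general mechanism is illustrated.
from urllib.parse import parse_qs

# A URL-encoded POST body: feed metadata plus the container with the table.
body = ("name=paradiso"
        "&url=http%3A%2F%2Fexample.com%2Ffeed"
        "&table=%3Ctr%3E%3Ctd%3E...%3C%2Ftd%3E%3C%2Ftr%3E")

# parse_qs maps each field to a list of values; we take the first of each.
fields = {key: values[0] for key, values in parse_qs(body).items()}

print(fields["name"])   # -> paradiso
print(fields["table"])  # -> <tr><td>...</td></tr>
```

The `table` field then holds the raw HTML rows that the subsequent line-by-line processing works on.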
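The line-processing step, stripping a table row of its HTML tags to obtain a plain string, could look roughly like this. This is a sketch under the assumption of well-formed row markup; the thesis does not specify the parser used.

```python
# Hypothetical sketch of stripping HTML tags from one table row.
from html.parser import HTMLParser

class RowStripper(HTMLParser):
    """Collects only the text content of a row, dropping all HTML tags."""
    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for every text fragment between tags.
        self.parts.append(data)

    def text(self):
        return "".join(self.parts)

row = "<tr><td>2014-10-29</td><td>Concert at <b>Doornroosje</b></td></tr>"
stripper = RowStripper()
stripper.feed(row)
print(stripper.text())  # -> 2014-10-29Concert at Doornroosje
```

Alongside this stripped string, the real crawler also records the list of marker items that the user placed in the row, which this sketch omits.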
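The node-list construction, replacing user markers with patterns so that the variable data drops out, could be sketched as below. The `{tag:value}` marker syntax and the `<tag>` placeholder form are assumptions made for illustration; the thesis does not fix a concrete notation here.

```python
# Hypothetical sketch of building a node-list from a marked-up entry.
# The {tag:value} marker syntax is an assumption, not the thesis's notation.
import re

MARKER = re.compile(r"\{(\w+):[^}]*\}")

def to_node_list(entry):
    """Split an entry on markers: literal text is kept verbatim,
    marked spans are replaced by pattern placeholders."""
    nodes, pos = [], 0
    for m in MARKER.finditer(entry):
        if m.start() > pos:
            nodes.append(entry[pos:m.start()])  # literal text node
        nodes.append("<%s>" % m.group(1))       # pattern placeholder
        pos = m.end()
    if pos < len(entry):
        nodes.append(entry[pos:])               # trailing literal text
    return nodes

print(to_node_list("{date:2014-10-29} Concert, {time:20:00}"))
# -> ['<date>', ' Concert, ', '<time>']
```

Because the variable values (`2014-10-29`, `20:00`) are discarded, node-lists from different entries of the same feed become comparable, which is what the directed-acyclic-graph construction in the next subsection builds on.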