bash ./dots/compileall
thesis:
- latex thesis.tex
- latex thesis.tex
+ pdflatex thesis.tex
+ pdflatex thesis.tex
# bibtex thesis.aux
- latex thesis.tex
- dvipdfm thesis.dvi
+ pdflatex thesis.tex
clean:
rm -vf *.aux *.bbl *.blg *.dvi *.log *.out *.pdf *.toc
#!/bin/bash
for f in dots/*.dot
do
- dot -Tps "$f" > "$f.ps"
+ dot -Tpng -o "dots/$(basename -s .dot "$f").png" "$f"
done
digraph finite_state_machine {
+ graph [ dpi = 300 ];
node [shape = doublecircle]; 2
node [shape = circle]; 0 1
0 -> 1 [label = "a"];
1 -> 2 [label = "b"];
- 0 -> 2 [label = "c"];j
+ 1 -> 2 [label = "c"];
}
--- /dev/null
+digraph finite_state_machine {
+ graph [ dpi = 300 ];
+ rankdir = "LR";
+ node [shape = doublecircle]; 5
+ node [shape = circle]; 0 1 2 3 4 6 7 8 9
+ 0 -> 1 [label = "what"];
+ 1 -> 2 [label = "space"];
+ 2 -> 3 [label = "hyphen"];
+ 3 -> 4 [label = "space"];
+ 4 -> 5 [label = "when"];
+ 4 -> 6 [label = "when"];
+ 6 -> 7 [label = "space"];
+ 7 -> 8 [label = "hyphen"];
+ 8 -> 9 [label = "space"];
+ 9 -> 5 [label = "where"];
+}
\section{Research question}
The main research question is: \textit{How can we make an adaptive, autonomous
and programmable data mining program that can be set up by a non IT
-professional which is able to transform raw data into structured data.}\\
+professional (NIP) which is able to transform raw data into structured data.}\\
The practical goal and aim of the project is to make a crawler (web or other
document types) that can autonomously gather information after it has been
Directed acyclic graphs (DAG) and finite state automata (FSA) have a lot in
common concerning pattern recognition and information extraction. By feeding
words into an algorithm a DAG can be generated so that it matches certain
-patters present in the given words.
+patterns present in the given words. Figure~\ref{fig:mg1}, for example, shows
+an FSA that matches the words \textit{ab} and \textit{ac}.
+\begin{figure}[H]
+ \centering
+ \caption{Example DAG/FSA}
+ \label{fig:mg1}
+ \includegraphics[width=15mm]{./dots/graph1.png}
+\end{figure}
+
+With this FSA we can test whether a word fits the constraints that the FSA
+describes. With a little adaptation we can also extract dynamic information
+from semi-structured data.\\
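The membership test described above can be sketched in a few lines of Python. This is only an illustration, not the thesis implementation: the transition table mirrors the edges of the example graph (0 to 1 on a, 1 to 2 on b, 1 to 2 on c, with state 2 accepting), and the names TRANSITIONS, START, ACCEPTING and accepts are assumptions introduced here.

```python
# Hypothetical encoding of the example automaton:
# state 0 --a--> 1, state 1 --b--> 2, state 1 --c--> 2; state 2 is accepting.
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 2,
    (1, "c"): 2,
}
START = 0
ACCEPTING = {2}


def accepts(word):
    """Return True if reading word from START ends in an accepting state."""
    state = START
    for symbol in word:
        key = (state, symbol)
        if key not in TRANSITIONS:
            # No outgoing edge for this symbol: the word does not fit.
            return False
        state = TRANSITIONS[key]
    return state in ACCEPTING
```

Under this encoding, accepts("ab") and accepts("ac") hold, while a prefix such as "a" or a longer word such as "abc" is rejected.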
+
+\section{NIP input}
+
+\section{Back to DAGs and FSAs}
+Nodes in this data structure can be single letters but also bigger
+constructions. The example in Figure~\ref{fig:mg2} describes different
+separator patterns for event data with its three components: what, when, where.
+In this example the nodes with the labels \textit{what, when, where} can also
+be complete subgraphs. In this way data on a larger scale
+\begin{figure}[H]
+ \centering
+ \caption{Example event data}
+ \label{fig:mg2}
+ \includegraphics[width=\linewidth]{./dots/graph2.png}
+\end{figure}
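The event data graph has two edges labelled when leaving state 4, so it cannot be walked with a single current state. A minimal Python sketch, assuming the input has already been split into tokens such as what, space and hyphen, tracks a set of possible states instead. The names EVENT_EDGES, EVENT_ACCEPTING and matches_event are hypothetical and only mirror the edges of the graph.

```python
# Hypothetical encoding of the event data graph; each (state, label) pair maps
# to a set of target states because state 4 has two edges labelled "when".
EVENT_EDGES = {
    (0, "what"): {1},
    (1, "space"): {2},
    (2, "hyphen"): {3},
    (3, "space"): {4},
    (4, "when"): {5, 6},
    (6, "space"): {7},
    (7, "hyphen"): {8},
    (8, "space"): {9},
    (9, "where"): {5},
}
EVENT_ACCEPTING = {5}


def matches_event(tokens):
    """Return True if the token sequence can end in an accepting state."""
    states = {0}
    for token in tokens:
        # Follow every edge with this label from every state we might be in.
        states = set().union(*(EVENT_EDGES.get((s, token), set()) for s in states))
        if not states:
            return False
    return bool(states & EVENT_ACCEPTING)
```

With this encoding both the short pattern (what, space, hyphen, space, when) and the long pattern that continues with space, hyphen, space, where are matched, while a sequence that stops halfway through a separator is rejected.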
+
+
\section{Algorithm}
+Hello World
\usepackage{lipsum}
\usepackage{graphicx}
+\usepackage{float}
\author{Mart Lubbers\\s4109053}
\title{Non IT configurable adaptive data mining solution used in transforming raw data to structured data}
Radboud University Nijmegen\\
\vspace{15mm}
\begin{tabular}{cp{5em}c}
- Franc Grootjen && Alessandro Paula\\
- RU && Hyperleap
+ Franc Grootjen && Alessandro Paula\\
+ RU && Hyperleap
\end{tabular}
}