From 5c8d867391f3b2affc9c7132bea6a7b67283d5d3 Mon Sep 17 00:00:00 2001 From: Mart Lubbers Date: Tue, 18 Nov 2014 10:23:25 +0100 Subject: [PATCH] another graph update --- thesis2/.gitignore | 2 +- thesis2/1.introduction.tex | 70 ++++++++++++++++++++++++++------------ 2 files changed, 50 insertions(+), 22 deletions(-) diff --git a/thesis2/.gitignore b/thesis2/.gitignore index f019367..ebea2d2 100644 --- a/thesis2/.gitignore +++ b/thesis2/.gitignore @@ -5,7 +5,7 @@ *.toc *.dvi *.bbl -*.bbg +*.blg *.pdf *.ps *.dot diff --git a/thesis2/1.introduction.tex b/thesis2/1.introduction.tex index 129e38d..c3c3588 100644 --- a/thesis2/1.introduction.tex +++ b/thesis2/1.introduction.tex @@ -290,6 +290,7 @@ techniques of recognizing data in text can still be used to interpret the isolated extracted parts from this project. \section{Directed Acyclic Graphs} +\paragraph{Normal graphs} A graph($G$) is a mathematical structure to describe relations between nodes with edges. A standard graph is defined as the ordered pair: $G=(V,E)$. In this ordered pair $V$ is the set of nodes and $E$ is set of undirected edges @@ -309,28 +310,55 @@ Figure~\ref{fig:graphexample} is specified as: } \end{figure} +\paragraph{Directed graphs} +Directed graphs are special kind of graphs. A directed graph $G$ is also +defined by the ordered pair: $G=(V,E)$. $V$ still defines the nodes and $E$ +still the edges but the inherent difference is that the edges are ordered +tuples in stead of not ordered. Adding this property gives the edges a +direction. Every edge has a specific start and end and are therefore called +directed edges. A directed graph would look the same as the graph in +Figure~\ref{fig:graphexample} but then the normal edges would be replaced by +directional arrows that specifically go from one node to the other. + +\paragraph{Directed acyclic graphs} +Directed Acyclic Graphs(DAGs) are a special kind of directed graphs. DAGs +are also defined as $G=(V,E)$ but with a restriction on $E$. Namely that cycles +are not allowed. Figure~\ref{fig:dagexample} shows two graphs. The left graph +contains a cycle and the right graph does not. Only the right graph is a valid +DAG. A cycle is defined by a sequence of edges where nodes are visited more +then once. Adding the property of non-cyclicity to graphs lowers the +computational complexity of checking if a node sequence is present in the +graph to $\mathcal{O}(L)$ where $L$ is the length of the sequence. -Directed graphs are special kinds of graphs. A directed graph is described as -the ordered pair: $G=(V,A)$. Where $A$ is a set of ordered pairs that we will -call \textit{directed edges}. Directed edges can be visualized as arrows in a -graph whereas undirected edges are just lines connecting the nodes with no -particular direction. - -Directed Acyclic Graphs(DAGs) are again a special kind of directed graph. DAGs -don't allow cycles to be created. Every node is only reachable at maximum once -when starting from an arbitrary point in the graph. DAGs have the nice property -of checking if a sequence appears in the graph, checking if a sequence is -present in a graph only has a computational complexity of $\mathcal{O}(L)$ -where $L$ is the length of the sequence. - -The type of graph used in the project is another special king of DAG called -Directed Acyclic Word Graphs(DAWGs). This type of graph can be used to store -large dictionaries of words. A DAWGs nodes basically represent indices of a -word and the edges describe the character added when traversing. For example -the graph in Figure~\ref{fig:2.1.1} can describe a dictionary that holds the -words: \textit{abd, bad, bae}. Testing if a word is present in the DAWG is, -just as for a DAG, falls als in the computational complexity class of -$\mathcal{O}(L)$ meaning that it grows linearly with the length of the word. +\begin{figure}[H] + \caption{Example DAG} + \label{fig:dagexample} + \centering + \digraph[]{dagexample}{ + rankdir=LR + n01 -> n02 + n02 -> n03 + n03 -> n01 + n11 -> n12 + n12 -> n13 + n12 -> n14 + } +\end{figure} + +\paragraph{Directed Acyclic Word Graphs} +The type of graph used in the project is a special kind of DAG called +Directed Acyclic Word Graphs(DAWGs). A DAWG can be defined by the ordered pair +$G=(V,E)$ and is the same as a directed graph except for the edges. An edge in +a DAWG is instead of a tuple a triple and consist of a starting point, an end +point and a label. Because of the property of labeled edges data can be stored +in a DAWG. When traversing a DAWG and saving all the edgelabels one can +construct words. Because of properties of graphs a DAWG can store big sets of +words in a small amount of storage because it can re-use edges to specify +transitions. For example the graph in Figure~\ref{fig:2.1.1} can describe a +dictionary that holds the words: \textit{abd, bad, bae}. Testing if a word is +present in the DAWG is, just as for a DAG, falls in the computational +complexity class of $\mathcal{O}(L)$ meaning that it grows linearly with the +length of the word. \begin{figure}[H] \caption{Example DAWG} -- 2.20.1