From: Mart Lubbers Date: Fri, 12 Dec 2014 15:50:11 +0000 (+0100) Subject: thing X-Git-Url: https://git.martlubbers.net/?a=commitdiff_plain;h=48449970789af8e6214d048fbb96e68724f6bd62;p=bsc-thesis1415.git thing --- diff --git a/thesis2/3.methods.tex b/thesis2/3.methods.tex index 8670cd0..fc5bd6d 100644 --- a/thesis2/3.methods.tex +++ b/thesis2/3.methods.tex @@ -42,9 +42,22 @@ then added as a word. The nodelists are then sent to the actual algorithm to be converted to a graph representation. \section{Minimizing DAWGs} -The first algorithm to generate DAG's was proposed by Hopcroft et -al\cite{Hopcroft1971}. The algorithm they described wasn't incremental and had -a complexity of $\mathcal{O}(N\log{N})$. \cite{Daciuk2000} et al. later +As a representation of the patterns we use slightly altered DAWGs. Normally +DAWGs have as edgelabels letters from an alphabet, in our case the DAWGs +alphabet contains all letters, whitespace and punctuation but also the +specified user markers. DAWGs are a graph but by using them as an automaton we +can check if a word is accepted by the automaton, or in other words, if the +word matches the specified pattern. The first algorithm to generate DAWGs from +node-lists was proposed by Hopcroft et al\cite{Hopcroft1971}. It is an +incremental approach in generating the graph. Meaning that entry by entry the +graph was built. The only constraint that the algorithm has is that the entries +must be sorted lexicographically. Later on Daciuk et al.\cite{Daciuk2000} +improved on the original algorithm and their algorithm is the algorithm we used +to minimize or optimize our DAWGs. + +Pseudocode for the algorithm can be found in Listing~\ref{pseudodawg}. + + extended the algorithm and created an incremental one without increasing the computational complexity. The non incremental algorithm from Daciuk et al. is used to convert the nodelists to a graph. diff --git a/thesis2/5.appendices.tex b/thesis2/5.appendices.tex index 20fc864..dca472e 100644 --- a/thesis2/5.appendices.tex +++ b/thesis2/5.appendices.tex @@ -1,4 +1,12 @@ \section{Algorithm} +\begin{listing} + \label{pseudodawg} + \caption{Graph minimization algorithm} + \begin{minted}[mathescape=true,linenos=true]{text} + register:=$\emptyset$ + \end{minted} +\end{listing} + \section{Schemes} \subsection{scheme.xsd} \label{scheme.xsd} diff --git a/thesis2/version/mart_thesis_0.4.tar b/thesis2/version/mart_thesis_0.4.tar new file mode 100644 index 0000000..7a528d7 Binary files /dev/null and b/thesis2/version/mart_thesis_0.4.tar differ