From 0a699fc381a5ba0b3bf24f3976384038dc829a7e Mon Sep 17 00:00:00 2001
From: Mart Lubbers
Date: Tue, 14 Jun 2016 13:30:03 +0200
Subject: [PATCH] update'

---
 deliverables/report/pars.tex | 27 ++++++++++++++++++++++-----
 deliverables/report/todo.txt |  2 --
 2 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/deliverables/report/pars.tex b/deliverables/report/pars.tex
index 4883e1a..56fe1f7 100644
--- a/deliverables/report/pars.tex
+++ b/deliverables/report/pars.tex
@@ -37,10 +37,27 @@ can be lexed as one token such as literal characters.
 As said, the lexer uses a \Yard{} parser. The parser takes a list of characters
 and returns a list of \CI{Token}s. A token is a \CI{TokenValue} accompanied with
 a position and the \emph{ADT} used is show in Listing~\ref{lst:lextoken}.
-Parser combinators make it very easy to account for arbitrary white space and it
-is much less elegant to do this in a regular way. By choosing to lex with
-parser combinators the speed of the phase decreases. However, since the parsers
-are very easy this increase is very small.
+Parser combinators make it very easy to account for arbitrary white space and
+it is much less elegant to do this in a regular way.
+Listing~\ref{lst:lexerwhite} shows the way we handle white space and recalculate
+positions. By choosing to lex with parser combinators the speed of the phase
+decreases. However due to the simplicity of the lexer this is barely
+measurable.
+
+\begin{lstlisting}[
+	language=Clean,
+	label={lst:lexerwhite},
+	caption={Lexer whitespace handling}]
+lexProgram :: Int Int -> Parser Char [Token]
+lexProgram line column = lexToken >>= \t->case t of
+	LexEOF = pure []
+	LexNL = lexProgram (line+1) 1
+	(LexSpace l c) = lexProgram (line+l) (column+c)
+	(LexItemError e) = fail
+		PositionalError line column ("LexerError: " +++ e)
+	(LexToken c t) = lexProgram line (column+c)
+		>>= \rest->pure [({line=line,col=column}, t):rest]
+\end{lstlisting}
 
 \begin{lstlisting}[
 	language=Clean,
@@ -86,7 +103,7 @@ Listing~\ref{lst:fundecl}.
 
 The \CI{FunDecl} is one of the most complex parsers and is composed of as complex
 subparsers. A \CI{FunDecl} parser exactly one complete function.
-First we do the non consuming \CI{peekPos} to make sure the positionality of
+First we do the non consuming \CI{peekPos} to make sure the position of
 the function is guaranteed in the \AST{}, after that the identifier is parsed
 which is basically a parser that transforms an \CI{IdToken} into a \CI{String}.
 
diff --git a/deliverables/report/todo.txt b/deliverables/report/todo.txt
index 9aed6bb..d8a5be4 100644
--- a/deliverables/report/todo.txt
+++ b/deliverables/report/todo.txt
@@ -1,5 +1,3 @@
-mart parse: Uitleggen over tokens lexen dat dat eigenlijk ook whitespace is.
-mart parse: tikfouten
 pim parse: Uitleggen over YARD
 pim sem: sem opschonen appendix, die pagina landscape(package: lscape, \begin{lscape})
 mart ext: plaats extensions in desbetreffende section en maak intro paragraaf
-- 
2.20.1