1 \documentclass[../thesis.tex
]{subfiles
}
4 \ifSubfilesClassLoaded{
8 \chapter{DSL embedding techniques
}%
9 \label{chp:dsl_embedding_techniques
}
10 An
\gls{EDSL
} is a language embedded in a host language created for a specific domain
\todo{citation needed?
}.
11 \glspl{EDSL
} can have one or more backends or views.
12 Commonly used views are pretty printing, compiling, simulating, verifying and proving the program.
13 Embedding techniques can be characterised by at least the following properties:
15 \item Extendibility of constructs
16 \item Extendibility of views
17 \item Type safety, the type of a term in the language is typed in the host language
18 \item Support for well typed variables
19 \item Intensional analyses
20 \item Support for partial views, not all views need to support all constructs
23 There are two main techniques for creating embedding
\glspl{DSL
}, deep---also called tagged---embedding and shallow---also called tagless---embedding.
24 In the following sections both techniques are explained.
25 As an example, a simple language with integers, booleans and some arithmetic operators is used as a running example.
27 \section{Deep embedding
}
28 A deep
\gls{EDSL
} is a language represented as data in the host language.
29 Views are functions that transform
\emph{something
} to the datatype or the other way around.
30 Definition~
\ref{lst:exdeep
} shows an implementation for the example
\gls{DSL
}.
32 \begin{lstHaskell
}[label=
{lst:exdeep
},caption=
{A deeply embedded expression
\gls{DSL
}.
}]
33 data Value = I Int | B Bool
41 Implementing a printer for the language is straightforward:
43 \begin{lstHaskell
}[caption=
{A printer for the deeply embedded expression
\gls{DSL
}.
}]
44 print :: Expr -> String
45 print (Lit i) = show i
46 print (Plus l r) = "(" ++ print l ++ "+" ++ print r ++ ")"
47 print (Eq l r) = "(" ++ print l ++ "==" ++ print r ++ ")"
50 Adding a construct, for example subtraction reveals the Achilles' heel of deep embedding, namely that we need to revisit the original data type
\emph{and
} all the existing views.
51 I.e.\ we need to add
\haskellinline{| Sub Expr Expr
} to the
\haskellinline{Expr
} data type and
\haskellinline{print (Sub l r) = ...
} to the
\haskellinline{print
} view.
52 This limitation can be overcome by lifting the views to classes (See
\cref{chp:classy_deep_embedding
}).
54 Implementing an evaluator for the language is possible without touching any original code, we just add a function operating on the
\haskellinline{Expr
} data type.
55 To store variables, it has an extra environment argument
\todo{cite while
}.
56 Here another downside of basic deep embedding arises immediately, the expressions are not typed, and therefore there has to be some type checking in the evaluation code.
61 eval (Plus l r) = case (eval l, eval r) of
62 (Lit (I l), Lit (I r)) -> I (l+r))
63 (l, r) -> error $ "Cannot add " ++ show l ++ " to " ++ show r
64 eval (Eq l r) = case (eval l, eval r) of
65 (Lit (I l), Lit (I r)) -> B (l==r)
66 (Lit (B l), Lit (B r)) -> B (l==r)
67 (l, r) -> error $ "Cannot compare " ++ show l ++ " to " ++ show r
70 Luckily this problem can be overcome by switching from regular
\glspl{ADT
} to
\glspl{GADT
}, resulting in the following data type and evaluator.
74 Lit :: Show a => a -> Expr a
75 Plus :: Num a => Expr a -> Expr a -> Expr a
76 Eq :: Eq a => Expr a -> Expr a -> Expr Bool
80 eval (Plus l r) = eval l + eval r
81 eval (Eq l r) = eval l == eval r
84 Deep embedding has the advantage that it is easy to build and views are easy to add.
85 On the downside, the expressions created with this language are not necessarily type-safe.
86 In the given language it is possible to create an expression such as
\cleaninline{Plus (LitI
4) (LitB True)
} that adds a boolean to an integer.
87 Extending the
\gls{ADT
} is easy and convenient but extending the views accordingly is tedious since it has to be done individually for all views.
89 The first downside of this type of
\gls{EDSL
} can be overcome by using
\glspl{GADT
}~
\cite{cheney_first-class_2003
}.
90 Example~
\ref{lst:exdeepgadt
} shows the same language, but type-safe with a
\gls{GADT
}.
91 \glspl{GADT
} are not supported in the current version of
\gls{CLEAN
} and therefore the syntax is hypothetical.
92 However, it has been shown that
\glspl{GADT
} can be simulated using bimaps or projection pairs~
\cite{cheney_first-class_2003
}.
93 Unfortunately the lack of extendability remains a problem.
94 If a language construct is added, no compile time guarantee can be given that all views support it.
96 \begin{lstClean
}[label=
{lst:exdeepgadt
},caption=
{A deeply embedded expression
\gls{DSL
} using
\glspl{GADT
}.
}]
99 | E.e: Var String -> Expr e
100 | Plus (Expr Int) (Expr Int) -> Expr Int
101 | E.e: Eq (Expr e) (Expr e) -> Expr Bool & == e
104 \subsection{Shallow embedding
}
105 In a shallowly
\gls{EDSL
} all language constructs are expressed as functions in the host language.
106 An evaluator view for the example language then can be implemented as the code shown in Definition~
\ref{lst:exshallow
}.
107 Note that much of the internals of the language can be hidden using monads.
109 \begin{lstClean
}[label=
{lst:exshallow
}, caption=
{A minimal shallow
\gls{EDSL
}.
}]
110 :: Env :== (String -> Int)
111 :: DSL a :== (Env -> a)
116 Var :: String -> DSL Int
117 Var i =
\e->retrEnv e i
119 Plus :: (DSL Int) (DSL Int) -> DSL Int
120 Plus x y =
\e->x e + y e
122 Eq :: (DSL a) (DSL a) -> DSL Bool | == a
123 Eq x y =
\e->x e == y e
126 The advantage of shallowly embedding a language in a host language is its extendability.
127 It is very easy to add functionality because the compile time checks of the host language guarantee whether or not the functionality is available when used.
128 Moreover, the language is type safe as it is directly typed in the host language, i.e.\
\cleaninline{Lit True +. Lit
4} is rejected.
130 The downside of this method is extending the language with views.
131 It is nearly impossible to add views to a shallowly embedded language.
132 The only way of achieving this is by reimplementing all functions so that they run all backends at the same time.
133 This will mean that every component will have to implement all views rendering it slow for multiple views and complex to implement.
135 \subsection{Generalised algebraic data types
}
137 \section{Shallow embedding
}
139 \subsection{Tagless-final embedding
}
143 \input{subfilepostamble
}