1 \documentclass[../thesis.tex
]{subfiles
}
4 \ifSubfilesClassLoaded{
8 \mychapter{chp:dsl_embedding_techniques
}{DSL embedding techniques
}%
9 An
\gls{EDSL
} is a language embedded in a host language created for a specific domain
\todo{citation needed?
}.
10 \glspl{EDSL
} can have one or more backends or views.
11 Commonly used views are pretty printing, compiling, simulating, verifying and proving the program.
12 Embedding techniques can be characterised by at least the following properties:
14 \item Extendibility of constructs
15 \item Extendibility of views
16 \item Type safety, the type of a term in the language is typed in the host language
17 \item Support for well typed variables
18 \item Intensional analyses
19 \item Support for partial views, not all views need to support all constructs
22 There are two main techniques for creating embedding
\glspl{DSL
}, deep---also called tagged---embedding and shallow---also called tagless---embedding.
23 In the following sections both techniques are explained.
24 As an example, a simple language with integers, booleans and some arithmetic operators is used as a running example.
26 \section{Deep embedding
}
27 A deep
\gls{EDSL
} is a language represented as data in the host language.
28 Views are functions that transform
\emph{something
} to the datatype or the other way around.
29 Definition~
\ref{lst:exdeep
} shows an implementation for the example
\gls{DSL
}.
31 \begin{lstHaskell
}[label=
{lst:exdeep
},caption=
{A deeply embedded expression
\gls{DSL
}.
}]
32 data Value = I Int | B Bool
40 Implementing a printer for the language is straightforward:
42 \begin{lstHaskell
}[caption=
{A printer for the deeply embedded expression
\gls{DSL
}.
}]
43 print :: Expr -> String
44 print (Lit i) = show i
45 print (Plus l r) = "(" ++ print l ++ "+" ++ print r ++ ")"
46 print (Eq l r) = "(" ++ print l ++ "==" ++ print r ++ ")"
49 Adding a construct, for example subtraction reveals the Achilles' heel of deep embedding, namely that we need to revisit the original data type
\emph{and
} all the existing views.
50 I.e.\ we need to add
\haskellinline{| Sub Expr Expr
} to the
\haskellinline{Expr
} data type and
\haskellinline{print (Sub l r) = ...
} to the
\haskellinline{print
} view.
51 This limitation can be overcome by lifting the views to classes (See
\cref{chp:classy_deep_embedding
}).
53 Implementing an evaluator for the language is possible without touching any original code, we just add a function operating on the
\haskellinline{Expr
} data type.
54 To store variables, it has an extra environment argument
\todo{cite while
}.
55 Here another downside of basic deep embedding arises immediately, the expressions are not typed, and therefore there has to be some type checking in the evaluation code.
60 eval (Plus l r) = case (eval l, eval r) of
61 (Lit (I l), Lit (I r)) -> I (l+r))
62 (l, r) -> error ("Can't add " ++ show l ++ " to " ++ show r)
63 eval (Eq l r) = case (eval l, eval r) of
64 (Lit (I l), Lit (I r)) -> B (l==r)
65 (Lit (B l), Lit (B r)) -> B (l==r)
66 (l, r) -> error ("Can't compare " ++ show l ++ " to " ++ show r)
69 Luckily this problem can be overcome by switching from regular
\glspl{ADT
} to
\glspl{GADT
}, resulting in the following data type and evaluator.
73 Lit :: Show a => a -> Expr a
74 Plus :: Num a => Expr a -> Expr a -> Expr a
75 Eq :: Eq a => Expr a -> Expr a -> Expr Bool
79 eval (Plus l r) = eval l + eval r
80 eval (Eq l r) = eval l == eval r
83 Deep embedding has the advantage that it is easy to build and views are easy to add.
84 On the downside, the expressions created with this language are not necessarily type-safe.
85 In the given language it is possible to create an expression such as
\cleaninline{Plus (LitI
4) (LitB True)
} that adds a boolean to an integer.
86 Extending the
\gls{ADT
} is easy and convenient but extending the views accordingly is tedious since it has to be done individually for all views.
88 The first downside of this type of
\gls{EDSL
} can be overcome by using
\glspl{GADT
}~
\cite{cheney_first-class_2003
}.
89 Example~
\ref{lst:exdeepgadt
} shows the same language, but type-safe with a
\gls{GADT
}.
90 \glspl{GADT
} are not supported in the current version of
\gls{CLEAN
} and therefore the syntax is hypothetical (See
\todo{insert link to appendix
}).
91 However, it has been shown that
\glspl{GADT
} can be simulated using bimaps or projection pairs~
\cite{cheney_first-class_2003
}.
92 Unfortunately the lack of extendability remains a problem.
93 If a language construct is added, no compile time guarantee can be given that all views support it.
95 \begin{lstClean
}[label=
{lst:exdeepgadt
},caption=
{A deeply embedded expression
\gls{DSL
} using
\glspl{GADT
}.
}]
98 | E.e: Var String -> Expr e
99 | Plus (Expr Int) (Expr Int) -> Expr Int
100 | E.e: Eq (Expr e) (Expr e) -> Expr Bool & == e
103 \subsection{Shallow embedding
}
104 In a shallowly
\gls{EDSL
} all language constructs are expressed as functions in the host language.
105 An evaluator view for the example language then can be implemented as the code shown in Definition~
\ref{lst:exshallow
}.
106 Note that much of the internals of the language can be hidden using monads.
108 \begin{lstClean
}[label=
{lst:exshallow
}, caption=
{A minimal shallow
\gls{EDSL
}.
}]
109 :: Env :== (String -> Int)
110 :: DSL a :== (Env -> a)
115 Var :: String -> DSL Int
116 Var i =
\e->retrEnv e i
118 Plus :: (DSL Int) (DSL Int) -> DSL Int
119 Plus x y =
\e->x e + y e
121 Eq :: (DSL a) (DSL a) -> DSL Bool | == a
122 Eq x y =
\e->x e == y e
125 The advantage of shallowly embedding a language in a host language is its extendability.
126 It is very easy to add functionality because the compile time checks of the host language guarantee whether or not the functionality is available when used.
127 Moreover, the language is type safe as it is directly typed in the host language, i.e.\
\cleaninline{Lit True +. Lit
4} is rejected.
129 The downside of this method is extending the language with views.
130 It is nearly impossible to add views to a shallowly embedded language.
131 The only way of achieving this is by reimplementing all functions so that they run all backends at the same time.
132 This will mean that every component will have to implement all views rendering it slow for multiple views and complex to implement.
134 \subsection{Generalised algebraic data types
}
136 \section{Shallow embedding
}
138 \subsection{Tagless-final embedding
}\label{ssec:tagless
}
142 \input{subfilepostamble
}