Hence, defining the constructs is as simple as creating the following algebraic data type\footnote{All data types and functions are subscripted to indicate the evolution.}.
\begin{lstHaskellLhstex}
-data Expr_0 = Lit_0 Int
- | Add_0 Expr_0 Expr_0
+data Expr_0
+ = Lit_0 Int
+ | Add_0 Expr_0 Expr_0
\end{lstHaskellLhstex}
Semantics are defined as functions on the \haskelllhstexinline{Expr_0} data type.
So, adding subtraction to the language results in the following revised data type.
\begin{lstHaskellLhstex}
-data Expr_0 = Lit_0 Int
- | Add_0 Expr_0 Expr_0
- | Sub_0 Expr_0 Expr_0
+data Expr_0
+ = Lit_0 Int
+ | Add_0 Expr_0 Expr_0
+ | Sub_0 Expr_0 Expr_0
\end{lstHaskellLhstex}
Extending the DSL with language constructs exposes the Achilles' heel of deep embedding.
\begin{lstHaskellLhstex}
class Expr_t s where
- lit_t :: Int -> s
- add_t :: s -> s -> s
+ lit_t :: Int -> s
+ add_t :: s -> s -> s
\end{lstHaskellLhstex}
Semantics become data types\footnotemark{} implementing these type classes, resulting in the following instance for the evaluator.
newtype Eval_t = E_t Int
instance Expr_t Eval_t where
- lit_t v = E_t v
- add_t (E_t e1) (E_t e2) = E_t (e1 + e2)
+ lit_t v = E_t v
+ add_t (E_t e1) (E_t e2) = E_t (e1 + e2)
\end{lstHaskellLhstex}
Adding constructs---e.g.\ subtraction---just results in an extra type class and corresponding instances.
\begin{lstHaskellLhstex}
class Sub_t s where
- sub_t :: s -> s -> s
+ sub_t :: s -> s -> s
instance Sub_t Eval_t where
- sub_t (E_t e1) (E_t e2) = E_t (e1 - e2)
+ sub_t (E_t e1) (E_t e2) = E_t (e1 - e2)
\end{lstHaskellLhstex}
Finally, adding semantics such as a printer over the language is achieved by providing a data type representing the semantics accompanied by instances for the language constructs.
newtype Printer_t = P_t String
instance Expr_t Printer_t where
- lit_t i = P_t (show i)
- add_t (P_t e1) (P_t e2) = P_t ("(" ++ e1 ++ "+" ++ e2 ++ ")")
+ lit_t i = P_t (show i)
+ add_t (P_t e1) (P_t e2) = P_t ("(" ++ e1 ++ "+" ++ e2 ++ ")")
instance Sub_t Printer_t where
- sub_t (P_t e1) (P_t e2) = P_t ("(" ++ e1 ++ "-" ++ e2 ++ ")")
+ sub_t (P_t e1) (P_t e2) = P_t ("(" ++ e1 ++ "-" ++ e2 ++ ")")
\end{lstHaskellLhstex}
\section{Lifting the backends}%
\begin{lstHaskellLhstex}
class Eval_1 v where
- eval_1 :: v -> Int
+ eval_1 :: v -> Int
-data Expr_1 = Lit_1 Int
- | Add_1 Expr_1 Expr_1
+data Expr_1
+ = Lit_1 Int
+ | Add_1 Expr_1 Expr_1
\end{lstHaskellLhstex}
Implementing the semantics type class instances for the \haskelllhstexinline{Expr_1} data type is an elementary exercise.
\begin{lstHaskellLhstex}
instance Eval_1 Expr_1 where
- eval_1 (Lit_1 v) = v
- eval_1 (Add_1 e1 e2) = eval_1 e1 + eval_1 e2
+ eval_1 (Lit_1 v) = v
+ eval_1 (Add_1 e1 e2) = eval_1 e1 + eval_1 e2
\end{lstHaskellLhstex}
Subtraction can now be defined in a separate data type, leaving the original data type intact.
data Sub_1 = Sub_1 Expr_1 Expr_1
instance Eval_1 Sub_1 where
- eval_1 (Sub_1 e1 e2) = eval_1 e1 - eval_1 e2
+ eval_1 (Sub_1 e1 e2) = eval_1 e1 - eval_1 e2
\end{lstHaskellLhstex}
\section{Existential data types}%
It contains an existentially quantified~\citep{mitchell_abstract_1988} type with type class constraints~\citep{laufer_combining_1994,laufer_type_1996} for all semantics type classes~\citep[Chp.~6.4.6]{ghc_team_ghc_2021} to allow it to house not just subtraction but any future extension.
\begin{lstHaskellLhstex}
-data Expr_2 = Lit_2 Int
- | Add_2 Expr_2 Expr_2
- | forall x. Eval_2 x => Ext_2 x
+data Expr_2
+ = Lit_2 Int
+ | Add_2 Expr_2 Expr_2
+ | forall x. Eval_2 x => Ext_2 x
\end{lstHaskellLhstex}
The implementation of the extension case in the semantics type classes is in most cases just a matter of calling the function for the argument as can be seen in the semantics instances shown below.
\begin{lstHaskellLhstex}
instance Eval_2 Expr_2 where
- eval_2 (Lit_2 v) = v
- eval_2 (Add_2 e1 e2) = eval_2 e1 + eval_2 e2
- eval_2 (Ext_2 x) = eval_2 x
+ eval_2 (Lit_2 v) = v
+ eval_2 (Add_2 e1 e2) = eval_2 e1 + eval_2 e2
+ eval_2 (Ext_2 x) = eval_2 x
\end{lstHaskellLhstex}
Adding language construct extensions in different data types does mean that an extra \haskelllhstexinline{Ext_2} tag is introduced when using the extension.
\begin{lstHaskellLhstex}
class (Eval_2 x, Print_2 x) => Semantics_2 x
-data Expr_2 = Lit_2 Int
- | Add_2 Expr_2 Expr_2
- | forall x. Semantics_2 x => Ext_2 x
+data Expr_2
+ = Lit_2 Int
+ | Add_2 Expr_2 Expr_2
+ | forall x. Semantics_2 x => Ext_2 x
\end{lstHaskellLhstex}
The class alias removes the need for the programmer to visit the main data type when adding additional semantics.
First the data types for the language constructs are parametrised by the type variable \haskelllhstexinline{d} as follows.
\begin{lstHaskellLhstex}
-data Expr_3 d = Lit_3 Int
- | Add_3 (Expr_3 d) (Expr_3 d)
- | forall x. Ext_3 (d x) x
+data Expr_3 d
+ = Lit_3 Int
+ | Add_3 (Expr_3 d) (Expr_3 d)
+ | forall x. Ext_3 (d x) x
-data Sub_3 d = Sub_3 (Expr_3 d) (Expr_3 d)
+data Sub_3 d = Sub_3 (Expr_3 d) (Expr_3 d)
\end{lstHaskellLhstex}
The \haskelllhstexinline{d} type variable is inhabited by an explicit dictionary for the semantics, i.e.\ a witness to the class instance.
newtype EvalDict_3 v = EvalDict_3 (v -> Int)
class HasEval_3 d where
- getEval_3 :: d v -> v -> Int
+ getEval_3 :: d v -> v -> Int
instance HasEval_3 EvalDict_3 where
- getEval_3 (EvalDict_3 e) = e
+ getEval_3 (EvalDict_3 e) = e
\end{lstHaskellLhstex}
The instances for the type classes change as well according to the change in the datatype.
\begin{lstHaskellLhstex}
instance HasEval_3 d => Eval_3 (Expr_3 d) where
- eval_3 (Lit_3 v) = v
- eval_3 (Add_3 e1 e2) = eval_3 e1 + eval_3 e2
- eval_3 (Ext_3 d x) = getEval_3 d x
+ eval_3 (Lit_3 v) = v
+ eval_3 (Add_3 e1 e2) = eval_3 e1 + eval_3 e2
+ eval_3 (Ext_3 d x) = getEval_3 d x
instance HasEval_3 d => Eval_3 (Sub_3 d) where
- eval_3 (Sub_3 e1 e2) = eval_3 e1 - eval_3 e2
+ eval_3 (Sub_3 e1 e2) = eval_3 e1 - eval_3 e2
\end{lstHaskellLhstex}
Because the \haskelllhstexinline{Ext_3} constructor from \haskelllhstexinline{Expr_3} now contains a value of type \haskelllhstexinline{d}, the smart constructor for \haskelllhstexinline{Sub_3} must somehow come up with this value.
\begin{lstHaskellLhstex}
class GDict a where
- gdict :: a
+ gdict :: a
\end{lstHaskellLhstex}
This type class has individual instances for all semantics dictionaries, linking the class instance to the witness value.
\begin{lstHaskellLhstex}
instance Eval_3 v => GDict (EvalDict_3 v) where
- gdict = EvalDict_3 eval_3
+ gdict = EvalDict_3 eval_3
\end{lstHaskellLhstex}
With these instances, the semantics function can be retrieved from the \haskelllhstexinline{Ext_3} constructor and in the smart constructors they can be generated as follows:
\begin{lstHaskellLhstex}
class Print_3 v where
- print_3 :: v -> String
+ print_3 :: v -> String
newtype PrintDict_3 v = PrintDict_3 (v -> String)
class HasPrint_3 d where
- getPrint_3 :: d v -> v -> String
+ getPrint_3 :: d v -> v -> String
instance HasPrint_3 PrintDict_3 where
- getPrint_3 (PrintDict_3 e) = e
+ getPrint_3 (PrintDict_3 e) = e
instance Print_3 v => GDict (PrintDict_3 v) where
- gdict = PrintDict_3 print_3
+ gdict = PrintDict_3 print_3
\end{lstHaskellLhstex}
Then the instances for \haskelllhstexinline{Print_3} of all the language constructs can be defined.
\begin{lstHaskellLhstex}
instance HasPrint_3 d => Print_3 (Expr_3 d) where
- print_3 (Lit_3 v) = show v
- print_3 (Add_3 e1 e2) = "(" ++ print_3 e1 ++ "+" ++ print_3 e2 ++ ")"
- print_3 (Ext_3 d x) = getPrint_3 d x
+ print_3 (Lit_3 v) = show v
+ print_3 (Add_3 e1 e2) = "(" ++ print_3 e1 ++ "+" ++ print_3 e2 ++ ")"
+ print_3 (Ext_3 d x) = getPrint_3 d x
instance HasPrint_3 d => Print_3 (Sub_3 d) where
- print_3 (Sub_3 e1 e2) = "(" ++ print_3 e1 ++ "-" ++ print_3 e2 ++ ")"
+ print_3 (Sub_3 e1 e2) = "(" ++ print_3 e1 ++ "-" ++ print_3 e2 ++ ")"
\end{lstHaskellLhstex}
\section{Transformation semantics}
\begin{lstHaskellLhstex}
class Opt_3 v where
- opt_3 :: v -> v
+ opt_3 :: v -> v
newtype OptDict_3 v = OptDict_3 (v -> v)
class HasOpt_3 d where
- getOpt_3 :: d v -> v -> v
+ getOpt_3 :: d v -> v -> v
instance HasOpt_3 OptDict_3 where
- getOpt_3 (OptDict_3 e) = e
+ getOpt_3 (OptDict_3 e) = e
instance Opt_3 v => GDict (OptDict_3 v) where
- gdict = OptDict_3 opt_3
+ gdict = OptDict_3 opt_3
\end{lstHaskellLhstex}
The implementation of the optimizer for the \haskelllhstexinline{Expr_3} data type is no complicated task.
\begin{lstHaskellLhstex}
instance HasOpt_3 d => Opt_3 (Expr_3 d) where
- opt_3 (Lit_3 v) = Lit_3 v
- opt_3 (Add_3 e1 e2) = case (opt_3 e1, opt_3 e2) of
- (Lit_3 0, e2p ) -> e2p
- (e1p, Lit_3 0) -> e1p
- (e1p, e2p ) -> Add_3 e1p e2p
- opt_3 (Ext_3 d x) = Ext_3 d (getOpt_3 d x)
+ opt_3 (Lit_3 v) = Lit_3 v
+ opt_3 (Add_3 e1 e2) = case (opt_3 e1, opt_3 e2) of
+ (Lit_3 0, e2p ) -> e2p
+ (e1p, Lit_3 0) -> e1p
+ (e1p, e2p ) -> Add_3 e1p e2p
+ opt_3 (Ext_3 d x) = Ext_3 d (getOpt_3 d x)
\end{lstHaskellLhstex}
Replicating this for the \haskelllhstexinline{Opt_3} instance of \haskelllhstexinline{Sub_3} seems a clear-cut task at first glance.
\begin{lstHaskellLhstex}
instance HasOpt_3 d => Opt_3 (Sub_3 d) where
- opt_3 (Sub_3 e1 e2) = case (opt_3 e1, opt_3 e2) of
- (e1p, Lit_3 0) -> e1p
- (e1p, e2p ) -> Sub_3 e1p e2p
+ opt_3 (Sub_3 e1 e2) = case (opt_3 e1, opt_3 e2) of
+ (e1p, Lit_3 0) -> e1p
+ (e1p, e2p ) -> Sub_3 e1p e2p
\end{lstHaskellLhstex}
Unsurprisingly, this code is rejected by the compiler.
This changes the \haskelllhstexinline{Sub_3} data type as follows.
\begin{lstHaskellLhstex}
-data Sub_4 d = Sub_4 (Expr_4 d) (Expr_4 d)
- | SubLoop_4 (Expr_4 d)
+data Sub_4 d
+ = Sub_4 (Expr_4 d) (Expr_4 d)
+ | SubLoop_4 (Expr_4 d)
instance HasEval_4 d => Eval_4 (Sub_4 d) where
- eval_4 (Sub_4 e1 e2) = eval_4 e1 - eval_4 e2
- eval_4 (SubLoop_4 e1) = eval_4 e1
+ eval_4 (Sub_4 e1 e2) = eval_4 e1 - eval_4 e2
+ eval_4 (SubLoop_4 e1) = eval_4 e1
\end{lstHaskellLhstex}
With this loopback case in the toolbox, the following \haskelllhstexinline{Sub} instance optimises away subtraction with zero literals.
\begin{lstHaskellLhstex}
instance HasOpt_4 d => Opt_4 (Sub_4 d) where
- opt_4 (Sub_4 e1 e2) = case (opt_4 e1, opt_4 e2) of
- (e1p, Lit_4 0) -> SubLoop_4 e1p
- (e1p, e2p ) -> Sub_4 e1p e2p
- opt_4 (SubLoop_4 e) = SubLoop_4 (opt_4 e)
+ opt_4 (Sub_4 e1 e2) = case (opt_4 e1, opt_4 e2) of
+ (e1p, Lit_4 0) -> SubLoop_4 e1p
+ (e1p, e2p ) -> Sub_4 e1p e2p
+ opt_4 (SubLoop_4 e) = SubLoop_4 (opt_4 e)
\end{lstHaskellLhstex}
\subsection{Pattern matching}
As a result, the \haskelllhstexinline{Typeable} type class constraints are added to the quantified type variable \haskelllhstexinline{x} of the \haskelllhstexinline{Ext_4} constructor and to \haskelllhstexinline{d}s in the smart constructors.
\begin{lstHaskellLhstex}
-data Expr_4 d = Lit_4 Int
- | Add_4 (Expr_4 d) (Expr_4 d)
- | forall x. Typeable x => Ext_4 (d x) x
+data Expr_4 d
+ = Lit_4 Int
+ | Add_4 (Expr_4 d) (Expr_4 d)
+ | forall x. Typeable x => Ext_4 (d x) x
\end{lstHaskellLhstex}
First let us add negation to the language by defining a datatype representing it.
Negation elimination requires the removal of negation constructors, so a convolution constructor is defined as well.
\begin{lstHaskellLhstex}
-data Neg_4 d = Neg_4 (Expr_4 d)
- | NegLoop_4 (Expr_4 d)
+data Neg_4 d
+ = Neg_4 (Expr_4 d)
+ | NegLoop_4 (Expr_4 d)
neg_4 :: (Typeable d, GDict (d (Neg_4 d))) => Expr_4 d -> Expr_4 d
neg_4 e = Ext_4 gdict (Neg_4 e)
\begin{lstHaskellLhstex}
instance HasEval_4 d => Eval_4 (Neg_4 d) where
- eval_4 (Neg_4 e) = negate (eval_4 e)
- eval_4 (NegLoop_4 e) = eval_4 e
+ eval_4 (Neg_4 e) = negate (eval_4 e)
+ eval_4 (NegLoop_4 e) = eval_4 e
instance HasPrint_4 d => Print_4 (Neg_4 d) where
- print_4 (Neg_4 e) = "(~" ++ print_4 e ++ ")"
- print_4 (NegLoop_4 e) = print_4 e
+ print_4 (Neg_4 e) = "(~" ++ print_4 e ++ ")"
+ print_4 (NegLoop_4 e) = print_4 e
\end{lstHaskellLhstex}
The \haskelllhstexinline{Opt_4} instance contains the interesting bit.
\begin{lstHaskellLhstex}
instance (Typeable d, GDict (d (Neg_4 d)), HasOpt_4 d) => Opt_4 (Neg_4 d) where
- opt_4 (Neg_4 (Add_4 e1 e2))
- = NegLoop_4 (Add_4 (opt_4 (neg_4 e1)) (opt_4 (neg_4 e2)))
- opt_4 (Neg_4 (Ext_4 d x))
- = case fromDynamic (toDyn (getOpt_4 d x)) of
- Just (Neg_4 e) -> NegLoop_4 e
- _ -> Neg_4 (Ext_4 d (getOpt_4 d x))
- opt_4 (Neg_4 e) = Neg_4 (opt_4 e)
- opt_4 (NegLoop_4 e) = NegLoop_4 (opt_4 e)
+ opt_4 (Neg_4 (Add_4 e1 e2))
+ = NegLoop_4 (Add_4 (opt_4 (neg_4 e1)) (opt_4 (neg_4 e2)))
+ opt_4 (Neg_4 (Ext_4 d x))
+ = case fromDynamic (toDyn (getOpt_4 d x)) of
+ Just (Neg_4 e) -> NegLoop_4 e
+ _ -> Neg_4 (Ext_4 d (getOpt_4 d x))
+ opt_4 (Neg_4 e) = Neg_4 (opt_4 e)
+ opt_4 (NegLoop_4 e) = NegLoop_4 (opt_4 e)
\end{lstHaskellLhstex}
Loopback cases do make cross-extensional pattern matching less modular in general.
data OptPrintDict_4 v = OPD_4 (OptDict_4 v) (PrintDict_4 v)
instance HasOpt_4 OptPrintDict_4 where
- getOpt_4 (OPD_4 v _) = getOpt_4 v
+ getOpt_4 (OPD_4 v _) = getOpt_4 v
instance HasPrint_4 OptPrintDict_4 where
- getPrint_4 (OPD_4 _ v) = getPrint_4 v
+ getPrint_4 (OPD_4 _ v) = getPrint_4 v
instance (Opt_4 v, Print_4 v) => GDict (OptPrintDict_4 v) where
- gdict = OPD_4 gdict gdict
+ gdict = OPD_4 gdict gdict
\end{lstHaskellLhstex}
And this allows us to write \haskelllhstexinline{print_4 (opt_4 e1)} resulting in \verb|"((~42)+(~38))"| when \haskelllhstexinline{e1} represents $(\sim(42+38))-0$ and is thus defined as follows.
e3 = neg_4 (Lit_4 42 `sub_4` Lit_4 38) `Add_4` Lit_4 1
\end{lstHaskellLhstex}
-\section{Chaining (reprise)}\label{sec:classy_reprise}
-\todo{\ldots}
-
\section{Generalised algebraic data types}%
Generalised algebraic data types (GADTs) are enriched data types that allow the type instantiation of the constructor to be explicitly defined~\citep{cheney_first-class_2003,hinze_fun_2003}.
Leveraging GADTs, deeply embedded DSLs can be made statically type safe even when different value types are supported.
\begin{lstHaskellLhstex}
data Expr_g d a where
- Lit_g :: Show a => a -> Expr_g d a
- Add_g :: (Eq a, Num a) => Expr_g d a -> Expr_g d a -> Expr_g d a
- Ext_g :: Typeable x => d x -> x a -> Expr_g d a
+ Lit_g :: Show a => a -> Expr_g d a
+ Add_g :: (Eq a, Num a) => Expr_g d a -> Expr_g d a -> Expr_g d a
+ Ext_g :: Typeable x => d x -> x a -> Expr_g d a
data Neg_g d a where
- Neg_g :: (Typeable a, Num a) => Expr_g d a -> Neg_g d a
- NegLoop_g :: Expr_g d a -> Neg_g d a
+ Neg_g :: (Typeable a, Num a) => Expr_g d a -> Neg_g d a
+ NegLoop_g :: Expr_g d a -> Neg_g d a
data Not_g d a where
- Not_g :: Expr_g d Bool -> Not_g d Bool
- NotLoop_g :: Expr_g d a -> Not_g d a
+ Not_g :: Expr_g d Bool -> Not_g d Bool
+ NotLoop_g :: Expr_g d a -> Not_g d a
\end{lstHaskellLhstex}
The smart constructors for the language extensions inherit the class constraints of their data types and include a \haskelllhstexinline{Typeable} constraint on the \haskelllhstexinline{d} type variable for it to be usable in the \haskelllhstexinline{Ext_g} constructor as can be seen in the smart constructor for \haskelllhstexinline{Neg_g}:
\begin{lstHaskellLhstex}
neg_g :: (Typeable d, GDict (d (Neg_g d)), Typeable a, Num a) =>
- Expr_g d a -> Expr_g d a
+ Expr_g d a -> Expr_g d a
neg_g e = Ext_g gdict (Neg_g e)
not_g :: (Typeable d, GDict (d (Not_g d))) =>
- Expr_g d Bool -> Expr_g d Bool
+ Expr_g d Bool -> Expr_g d Bool
not_g e = Ext_g gdict (Not_g e)
\end{lstHaskellLhstex}
\begin{lstHaskellLhstex}
class Eval_g v where
- eval_g :: v a -> a
+ eval_g :: v a -> a
class Print_g v where
- print_g :: v a -> String
+ print_g :: v a -> String
class Opt_g v where
- opt_g :: v a -> v a
+ opt_g :: v a -> v a
\end{lstHaskellLhstex}
Now that the shape of the type classes has changed, the dictionary data types and the type classes need to be adapted as well.
\begin{lstHaskellLhstex}
newtype EvalDict_g v = EvalDict_g (forall a. v a -> a)
class HasEval_g d where
- getEval_g :: d v -> v a -> a
+ getEval_g :: d v -> v a -> a
instance HasEval_g EvalDict_g where
- getEval_g (EvalDict_g e) = e
+ getEval_g (EvalDict_g e) = e
\end{lstHaskellLhstex}
The \haskelllhstexinline{GDict} type class is general enough, so the instances can remain the same.
\begin{lstHaskellLhstex}
instance Eval_g v => GDict (EvalDict_g v) where
- gdict = EvalDict_g eval_g
+ gdict = EvalDict_g eval_g
\end{lstHaskellLhstex}
Finally, the implementations for the instances can be ported without complication show using the optimisation instance of \haskelllhstexinline{Not_g}:
\begin{lstHaskellLhstex}
instance (Typeable d, GDict (d (Not_g d)), HasOpt_g d) => Opt_g (Not_g d) where
- opt_g (Not_g (Ext_g d x))
- = case fromDynamic (toDyn (getOpt_g d x)) :: Maybe (Not_g d Bool) of
- Just (Not_g e) -> NotLoop_g e
- _ -> Not_g (Ext_g d (getOpt_g d x))
- opt_g (Not_g e) = Not_g (opt_g e)
- opt_g (NotLoop_g e) = NotLoop_g (opt_g e)
+ opt_g (Not_g (Ext_g d x))
+ = case fromDynamic (toDyn (getOpt_g d x)) :: Maybe (Not_g d Bool) of
+ Just (Not_g e) -> NotLoop_g e
+ _ -> Not_g (Ext_g d (getOpt_g d x))
+ opt_g (Not_g e) = Not_g (opt_g e)
+ opt_g (NotLoop_g e) = NotLoop_g (opt_g e)
\end{lstHaskellLhstex}
\section{Conclusion}%
This extension case is overloaded by the language construct using a data type containing the class dictionary.
As a result, orthogonal extension is possible for language constructs and semantics using only little syntactic overhead or type annotations.
The basic technique only requires---well established through history and relatively standard---existential data types.
-However, if needed, the technique generalises to GADTs as well, adding rank-2 types to the list of type system requirements as well.
+However, if needed, the technique generalises to \glspl{GADT} as well, adding rank-2 types to the list of type system requirements as well.
Finally, the abstract syntax tree remains observable which makes it suitable for intensional analyses, albeit using occasional dynamic typing for truly cross-extensional transformations.
Defining reusable expressions overloaded in semantics or using multiple semantics on a single expression requires some boilerplate still, getting around this remains future work.
\label{sec:cde:appendix}
\begin{lstHaskellLhstex}[caption={Data type definitions.}]
data Sub_g d a where
- Sub_g :: (Eq a, Num a) => Expr_g d a -> Expr_g d a -> Sub_g d a
- SubLoop_g :: Expr_g d a -> Sub_g d a
+ Sub_g :: (Eq a, Num a) => Expr_g d a -> Expr_g d a -> Sub_g d a
+ SubLoop_g :: Expr_g d a -> Sub_g d a
data Eq_g d a where
- Eq_g :: (Typeable a, Eq a) => Expr_g d a -> Expr_g d a -> Eq_g d Bool
- EqLoop_g :: Expr_g d a -> Eq_g d a
+ Eq_g :: (Typeable a, Eq a) => Expr_g d a -> Expr_g d a -> Eq_g d Bool
+ EqLoop_g :: Expr_g d a -> Eq_g d a
\end{lstHaskellLhstex}
\begin{lstHaskellLhstex}[caption={Smart constructions.}]
sub_g :: (Typeable d, GDict (d (Sub_g d)), Eq a, Num a) =>
- Expr_g d a -> Expr_g d a -> Expr_g d a
+ Expr_g d a -> Expr_g d a -> Expr_g d a
sub_g e1 e2 = Ext_g gdict (Sub_g e1 e2)
eq_g :: (Typeable d, GDict (d (Eq_g d)), Eq a, Typeable a) =>
- Expr_g d a -> Expr_g d a -> Expr_g d Bool
+ Expr_g d a -> Expr_g d a -> Expr_g d Bool
eq_g e1 e2 = Ext_g gdict (Eq_g e1 e2)
\end{lstHaskellLhstex}
newtype PrintDict_g v = PrintDict_g (forall a.v a -> String)
class HasPrint_g d where
- getPrint_g :: d v -> v a -> String
+ getPrint_g :: d v -> v a -> String
instance HasPrint_g PrintDict_g where
- getPrint_g (PrintDict_g e) = e
+ getPrint_g (PrintDict_g e) = e
newtype OptDict_g v = OptDict_g (forall a.v a -> v a)
class HasOpt_g d where
- getOpt_g :: d v -> v a -> v a
+ getOpt_g :: d v -> v a -> v a
instance HasOpt_g OptDict_g where
- getOpt_g (OptDict_g e) = e
+ getOpt_g (OptDict_g e) = e
\end{lstHaskellLhstex}
\begin{lstHaskellLhstex}[caption={\texorpdfstring{\haskelllhstexinline{GDict}}{GDict} instances}]
instance Print_g v => GDict (PrintDict_g v) where
- gdict = PrintDict_g print_g
+ gdict = PrintDict_g print_g
instance Opt_g v => GDict (OptDict_g v) where
- gdict = OptDict_g opt_g
+ gdict = OptDict_g opt_g
\end{lstHaskellLhstex}
\begin{lstHaskellLhstex}[caption={Evaluator instances}]
instance HasEval_g d => Eval_g (Expr_g d) where
- eval_g (Lit_g v) = v
- eval_g (Add_g e1 e2) = eval_g e1 + eval_g e2
- eval_g (Ext_g d x) = getEval_g d x
+ eval_g (Lit_g v) = v
+ eval_g (Add_g e1 e2) = eval_g e1 + eval_g e2
+ eval_g (Ext_g d x) = getEval_g d x
instance HasEval_g d => Eval_g (Sub_g d) where
- eval_g (Sub_g e1 e2) = eval_g e1 - eval_g e2
- eval_g (SubLoop_g e) = eval_g e
+ eval_g (Sub_g e1 e2) = eval_g e1 - eval_g e2
+ eval_g (SubLoop_g e) = eval_g e
instance HasEval_g d => Eval_g (Neg_g d) where
- eval_g (Neg_g e) = negate (eval_g e)
- eval_g (NegLoop_g e) = eval_g e
+ eval_g (Neg_g e) = negate (eval_g e)
+ eval_g (NegLoop_g e) = eval_g e
instance HasEval_g d => Eval_g (Eq_g d) where
- eval_g (Eq_g e1 e2) = eval_g e1 == eval_g e2
- eval_g (EqLoop_g e) = eval_g e
+ eval_g (Eq_g e1 e2) = eval_g e1 == eval_g e2
+ eval_g (EqLoop_g e) = eval_g e
instance HasEval_g d => Eval_g (Not_g d) where
- eval_g (Not_g e) = not (eval_g e)
- eval_g (NotLoop_g e) = eval_g e
+ eval_g (Not_g e) = not (eval_g e)
+ eval_g (NotLoop_g e) = eval_g e
\end{lstHaskellLhstex}
\begin{lstHaskellLhstex}[caption={Printer instances}]
instance HasPrint_g d => Print_g (Expr_g d) where
- print_g (Lit_g v) = show v
- print_g (Add_g e1 e2) = "(" ++ print_g e1 ++ "+" ++ print_g e2 ++ ")"
- print_g (Ext_g d x) = getPrint_g d x
+ print_g (Lit_g v) = show v
+ print_g (Add_g e1 e2) = "(" ++ print_g e1 ++ "+" ++ print_g e2 ++ ")"
+ print_g (Ext_g d x) = getPrint_g d x
instance HasPrint_g d => Print_g (Sub_g d) where
- print_g (Sub_g e1 e2) = "(" ++ print_g e1 ++ "-" ++ print_g e2 ++ ")"
- print_g (SubLoop_g e) = print_g e
+ print_g (Sub_g e1 e2) = "(" ++ print_g e1 ++ "-" ++ print_g e2 ++ ")"
+ print_g (SubLoop_g e) = print_g e
instance HasPrint_g d => Print_g (Neg_g d) where
- print_g (Neg_g e) = "(negate " ++ print_g e ++ ")"
- print_g (NegLoop_g e) = print_g e
+ print_g (Neg_g e) = "(negate " ++ print_g e ++ ")"
+ print_g (NegLoop_g e) = print_g e
instance HasPrint_g d => Print_g (Eq_g d) where
- print_g (Eq_g e1 e2) = "(" ++ print_g e1 ++ "==" ++ print_g e2 ++ ")"
- print_g (EqLoop_g e) = print_g e
+ print_g (Eq_g e1 e2) = "(" ++ print_g e1 ++ "==" ++ print_g e2 ++ ")"
+ print_g (EqLoop_g e) = print_g e
instance HasPrint_g d => Print_g (Not_g d) where
- print_g (Not_g e) = "(not " ++ print_g e ++ ")"
- print_g (NotLoop_g e) = print_g e
+ print_g (Not_g e) = "(not " ++ print_g e ++ ")"
+ print_g (NotLoop_g e) = print_g e
\end{lstHaskellLhstex}
\begin{lstHaskellLhstex}[caption={Optimisation instances}]
instance HasOpt_g d => Opt_g (Expr_g d) where
- opt_g (Lit_g v) = Lit_g v
- opt_g (Add_g e1 e2) = case (opt_g e1, opt_g e2) of
- (Lit_g 0, e2p ) -> e2p
- (e1p, Lit_g 0) -> e1p
- (e1p, e2p ) -> Add_g e1p e2p
- opt_g (Ext_g d x) = Ext_g d (getOpt_g d x)
+ opt_g (Lit_g v) = Lit_g v
+ opt_g (Add_g e1 e2) = case (opt_g e1, opt_g e2) of
+ (Lit_g 0, e2p ) -> e2p
+ (e1p, Lit_g 0) -> e1p
+ (e1p, e2p ) -> Add_g e1p e2p
+ opt_g (Ext_g d x) = Ext_g d (getOpt_g d x)
instance HasOpt_g d => Opt_g (Sub_g d) where
- opt_g (Sub_g e1 e2) = case (opt_g e1, opt_g e2) of
- (e1p, Lit_g 0) -> SubLoop_g e1p
- (e1p, e2p ) -> Sub_g e1p e2p
- opt_g (SubLoop_g e) = SubLoop_g (opt_g e)
+ opt_g (Sub_g e1 e2) = case (opt_g e1, opt_g e2) of
+ (e1p, Lit_g 0) -> SubLoop_g e1p
+ (e1p, e2p ) -> Sub_g e1p e2p
+ opt_g (SubLoop_g e) = SubLoop_g (opt_g e)
instance (Typeable d, GDict (d (Neg_g d)), HasOpt_g d) => Opt_g (Neg_g d) where
- opt_g (Neg_g (Add_g e1 e2))
- = NegLoop_g (Add_g (opt_g (neg_g e1)) (opt_g (neg_g e2)))
- opt_g (Neg_g (Ext_g d x))
- = case fromDynamic (toDyn (getOpt_g d x)) of
- Just (Neg_g e) -> NegLoop_g e
- _ -> Neg_g (Ext_g d (getOpt_g d x))
- opt_g (Neg_g e) = Neg_g (opt_g e)
- opt_g (NegLoop_g e) = NegLoop_g (opt_g e)
+ opt_g (Neg_g (Add_g e1 e2))
+ = NegLoop_g (Add_g (opt_g (neg_g e1)) (opt_g (neg_g e2)))
+ opt_g (Neg_g (Ext_g d x))
+ = case fromDynamic (toDyn (getOpt_g d x)) of
+ Just (Neg_g e) -> NegLoop_g e
+ _ -> Neg_g (Ext_g d (getOpt_g d x))
+ opt_g (Neg_g e) = Neg_g (opt_g e)
+ opt_g (NegLoop_g e) = NegLoop_g (opt_g e)
instance HasOpt_g d => Opt_g (Eq_g d) where
- opt_g (Eq_g e1 e2) = Eq_g (opt_g e1) (opt_g e2)
- opt_g (EqLoop_g e) = EqLoop_g (opt_g e)
+ opt_g (Eq_g e1 e2) = Eq_g (opt_g e1) (opt_g e2)
+ opt_g (EqLoop_g e) = EqLoop_g (opt_g e)
+\end{lstHaskellLhstex}
+
+\section{Chaining semantics (reprise)}\label{sec:classy_reprise}
+\todo{Verbeteren}
+One of the unique selling points of this novel \gls{DSL} embedding technique is that it, in its basic form, does not require advanced type system extensions.
+However, while generalising the technique to \glspl{GADT} arguably unleashes a cesspool of \emph{unsafe} compiler extensions.
+If we are willing to work with extensions, much of the boilerplate can be either generated or omitted entirely.
+
+The \gls{DSL} datatype is parametrised by a type variable providing a witness to the view on the language.
+Using constraint kinds, we can make these witnesses explicit, tying into \gls{HASKELL}'s type system immediately.
+Furthermore, when resorting to the \GHCmod{DataKinds} and \GHCmod{PolyKinds} extensions, this constraint does not necessarily has to be a single constraint, using a \haskelllhstexinline{Record} auxiliary type, we can encode list of witnesses.
+The data type for this list of witnesses is \haskelllhstexinline{Record}.
+Record is parametrised by two type variables, the first type variable (\haskelllhstexinline{dt}) is the data type on which the constraints can be applied.
+The second type variable (\haskelllhstexinline{clist}) is the list of constraints itself.
+It is not just a list of \haskelllhstexinline{Constraint} but it is a list containing constraint constructors that will, when given a type of kind \haskelllhstexinline{k}, produce a constraint.
+This means that when \haskelllhstexinline{Cons} is pattern matched, the type class constraint for \haskelllhstexinline{c dt} can be solved by the compiler.
+
+\begin{lstHaskellLhstex}[caption={Data type for a list of constraints}]
+data Record (dt :: k) (clist :: [k -> Constraint]) where
+ Nil :: Record dt '[]
+ Cons :: c dt => Record dt cs -> Record dt (c ': cs)
+\end{lstHaskellLhstex}
+
+To incorporate this type in the \haskelllhstexinline{Expr} type, the \haskelllhstexinline{Ext} constructor changes as follows:
+
+\begin{lstHaskellLhstex}[caption={Data type for a list of constraints}]
+data Expr c
+ = Lit Int
+ | Add (Expr c) (Expr c)
+ | Ext (Record x c) x
+\end{lstHaskellLhstex}
+
+Furthermore, we define a type class that allows us to extract explicit dictionaries \haskelllhstexinline{Dict} from these records if the constraint can is present in the list.
+
+\begin{lstHaskellLhstex}[caption={Membership functions for constraints}]
+class c `In` cs where
+ project :: Record dt cs -> Dict (c dt)
+instance {-# OVERLAPPING #-} c `In` (c ': cs) where
+ project (Cons _) = Dict
+instance {-# OVERLAPPING #-} c `In` cs => c `In` (b ': cs) where
+ project (Cons xs) = project xs
+\end{lstHaskellLhstex}
+
+Finally, creating these \haskelllhstexinline{Record} witnesses is a chore so this can be automated as well using a \haskelllhstexinline{CreateRecord} type class that will create a record structure cell by cell if and only if all type class constraints are available.
+
+\begin{lstHaskellLhstex}[caption={Membership functions for constraints}]
+class CreateRecord dt c where
+ createRecord :: Record dt c
+instance CreateRecord d '[] where
+ createRecord = Nil
+instance (c (d c0), CreateRecord (d c0) cs) =>
+ CreateRecord (d c0) (c ': cs) where
+ createRecord = Cons createRecord
+\end{lstHaskellLhstex}
+
+The class constraints for the interpretation instances can now be greatly simplified, as shown in the evaluation instance for \haskelllhstexinline{Expr}.
+The implementation remains the same, only that for the extension case, a trick needs to be applied to convince the compiler of the correct instances.
+Using \haskelllhstexinline{`In`}'s \haskelllhstexinline{project} function, a dictionary can be brought into scope.
+This dictionary can then subsequently be used to apply the type class function on the extension using the \haskelllhstexinline{withDict} function from the \haskelllhstexinline{Data.Constraint} library\footnote{\haskelllhstexinline{withDict :: Dict c -> (c => r) -> r}}.
+The \GHCmod{ScopedTypeVariables} extension is used to make sure the existentially quantified type variable for the extension is matched to the type of the dictionary.
+
+\begin{lstHaskellLhstex}[caption={Evaluation instance for the main data type}]
+class Eval v where eval :: v -> Int
+
+instance Eval `In` s => Eval (Expr s) where
+ eval (Lit i) = i
+ eval (Add l r) = eval l + eval r
+ eval (Ext r (e :: x)) = withDict (project r :: Dict (Eval x)) eval e
+\end{lstHaskellLhstex}
+
+Smart constructors need to be adapted as well, as can be seen from the smart constructor \haskelllhstexinline{subst}.
+
+\begin{lstHaskellLhstex}[caption={Substitution smart constructor}]
+subst :: (Typeable c, CreateRecord (Subt c) c) => Expr c -> Expr c -> Expr c
+subst l r = Ext createRecord (l `Subt` r)
+\end{lstHaskellLhstex}
+
+Finally, defining terms in the language is can be done immediately if the interpretations are known.
+For example, if we want to print the term $~(~(42+(38-4)))$, we can define it as follows.
+
+\begin{lstHaskellLhstex}[caption={Substitution smart constructor}]
+e0 :: Expr '[Print,Opt]
+e0 = neg (neg (Lit 42 `Add` (Lit 38 `subt` Lit 4)))
+\end{lstHaskellLhstex}
+
+It is also possible to define terms in the \gls{DSL} as being overloaded in the interpretation.
+This does require enumerating all the \haskelllhstexinline{CreateRecord} type classes for every extension.
+At the call site, the concrete list of constraints must be known.
+
+\begin{lstHaskellLhstex}[caption={Substitution smart constructor}]
+e1 :: (Typeable c
+ , CreateRecord (Neg c) c
+ , CreateRecord (Subst c) c)
+ => Expr c
+e1 = neg (neg (Lit 42 `Add` (Lit 38 `subt` Lit 4)))
\end{lstHaskellLhstex}
\end{subappendices}
\pagenumbering{arabic}
}{}
-\chapter{DSL embedding techniques}%
+\chapter{\texorpdfstring{\Acrshort{DSL}}{DSL} embedding techniques}%
\label{chp:dsl_embedding_techniques}%
An \gls{EDSL} is a language embedded in a host language created for a specific domain\todo{citation needed?}.
-\glspl{EDSL} can have one or more backends or views.
-Commonly used views are pretty printing, compiling, simulating, verifying and proving the program.
-Embedding techniques can be characterised by at least the following properties:
-\begin{enumerate}
- \item Extendibility of constructs
- \item Extendibility of views
- \item Type safety, the type of a term in the language is typed in the host language
- \item Support for well typed variables
- \item Intensional analyses
- \item Support for partial views, not all views need to support all constructs
-\end{enumerate}
-
-There are two main techniques for creating embedding \glspl{DSL}, deep---also called tagged---embedding and shallow---also called tagless---embedding.
+Properties such as referential transparency, minimal syntax and \glspl{ADT} make \gls{FP} languages excellent candidates for hosting \glspl{EDSL}.
+Terms in an \glspl{EDSL} can have multiple interpretations---also called backends or views---i.e.\ preferably, a term in the \gls{DSL} is just an interface.
+Commonly used intepretations are printing, compiling, simulating, optimising, verifying, proving the program\etc.
+There are two main flavours of embedding \glspl{DSL}, deep---also called tagged---embedding and shallow---also called tagless---embedding.
+
+\Cref{sec:deep_embedding} shows the basics of deep embedding.
+\Cref{sec:shallow_embedding} shows the basics of shallow embedding including tagless embedding.
+\Cref{sec:compare_embedding} compares the embedding technique.
+
+Both flavours have their strengths and weaknesses and both flavours can be improved in order to mitigate (some of the) downsides.
+
+\begin{table}[ht]
+ \begin{threeparttable}[b]
+ \caption{Comparison of embedding techniques, adapted from \citet[Sec.~3.6]{sun_compositional_2022}}%
+ \label{tbl:dsl_comparison}
+ \begin{tabular}{lllllll}
+ \toprule
+ & Shallow & Deep & Hybrid & Poly & Comp. & Classy\\
+ \midrule
+ Transcoding free & yes & yes & no & yes & yes & yes\\
+ Linguistic reuse & yes & no & partly\tnote{1} & yes & yes & no?\\
+ Extend constructs & yes & no & partly\tnote{1} & yes & yes & yes\\
+ Extend interpretations & no & yes & yes & yes & yes & yes\\
+ Transformations & no & yes & yes & maybe\tnote{2} & maybe\tnote{2} & yes\\
+ Modular dependencies & no & maybe & maybe & yes & yes & yes\\
+ Nested pattern matching & no & yes & yes & no & maybe & maybe\tnote{3}\\
+ Type safe & yes & maybe & no & yes & yes & yes\\
+ \bottomrule
+ \end{tabular}
+ \begin{tablenotes}
+ \item [1] Only in the shallowly embedded part.
+ \item [2] Transformations require some ingenuity and are sometimes awkward to write.
+ \item [3] It requires some---safe---form of dynamic typing.
+ \end{tablenotes}
+ \end{threeparttable}
+\end{table}
+
In the following sections both techniques are explained.
As an example, a simple language with integers, booleans and some arithmetic operators is used as a running example.
-\section{Deep embedding}
-A deep \gls{EDSL} is a language represented as data in the host language.
-Views are functions that transform \emph{something} to the datatype or the other way around.
+\section{Deep embedding}\label{sec:deep_embedding}
+In a deeply embedded \gls{DSL}, the language terms are represented as data type{(s)} in the host language.
+Therefore, interpretations of the terms are functions that operate on these data types.
Definition~\ref{lst:exdeep} shows an implementation for the example \gls{DSL}.
\begin{lstHaskell}[label={lst:exdeep},caption={A deeply embedded expression \gls{DSL}.}]
data Value = I Int | B Bool
data Expr
- = Lit Value
- | Plus Expr Expr
- | Eq Expr Expr
+ = Lit Value
+ | Plus Expr Expr
+ | Eq Expr Expr
deriving Show
\end{lstHaskell}
-Implementing a printer for the language is straightforward:
+Implementing a printer for the language is straightforward, we just define a function that transforms the term to a string.
\begin{lstHaskell}[caption={A printer for the deeply embedded expression \gls{DSL}.}]
print :: Expr -> String
print (Eq l r) = "(" ++ print l ++ "==" ++ print r ++ ")"
\end{lstHaskell}
-Adding a construct, for example subtraction reveals the Achilles' heel of deep embedding, namely that we need to revisit the original data type \emph{and} all the existing views.
-I.e.\ we need to add \haskellinline{| Sub Expr Expr} to the \haskellinline{Expr} data type and \haskellinline{print (Sub l r) = ...} to the \haskellinline{print} view.
+Adding a construct---for example subtraction---reveals the Achilles' heel of deep embedding, namely that we need to revisit the original data type \emph{and} all the existing views.
+I.e.\ we need to add \haskellinline{| Sub Expr Expr} to the \haskellinline{Expr} data type.
+Furthermore, we need to add \haskellinline{print (Sub l r) = ...} to the \haskellinline{print} view in order to not end up with a partial function.
This limitation can be overcome by lifting the views to classes (See \cref{chp:classy_deep_embedding}).
Implementing an evaluator for the language is possible without touching any original code, we just add a function operating on the \haskellinline{Expr} data type.
-To store variables, it has an extra environment argument\todo{cite while}.
+To store variables, it has an extra environment argument.
Here another downside of basic deep embedding arises immediately, the expressions are not typed, and therefore there has to be some type checking in the evaluation code.
+Luckily this problem can be overcome by switching from regular \glspl{ADT} to \glspl{GADT}, resulting in the following data type and evaluator.
-\begin{lstHaskell}[]
+\begin{lstHaskell}[caption={An evaluator for the deeply embedded expression \gls{DSL}.}]
eval :: Expr -> Value
eval (Lit i) = i
eval (Plus l r) = case (eval l, eval r) of
(l, r) -> error ("Can't compare " ++ show l ++ " to " ++ show r)
\end{lstHaskell}
-Luckily this problem can be overcome by switching from regular \glspl{ADT} to \glspl{GADT}, resulting in the following data type and evaluator.
+\subsection{Deep embedding with \texorpdfstring{\acrshortpl{GADT}}{GADTs}}
+Deep embedding has the advantage that it is easy to build and views are easy to add.
+On the downside, the expressions created with this language are not necessarily type-safe.
+In the given language it is possible to create an expression such as \haskellinline{Plus (LitI 4) (LitB True)} that adds a boolean to an integer.
+Extending the \gls{ADT} is easy and convenient but extending the views accordingly is tedious since it has to be done individually for all views.
-\begin{lstHaskell}[]
+The first downside of this type of \gls{EDSL} can be overcome by using \glspl{GADT}~\citep{cheney_first-class_2003}.
+\Cref{lst:exdeepgadt} shows the same language, but type-safe with a \gls{GADT}.
+\glspl{GADT} are not supported in the current version of \gls{CLEAN} and therefore the syntax is hypothetical (See \todo{insert link to appendix}).
+However, it has been shown that \glspl{GADT} can be simulated using bimaps or projection pairs~\citep{cheney_first-class_2003}.
+Unfortunately the lack of extendability remains a problem.
+If a language construct is added, no compile time guarantee can be given that all views support it.
+
+\begin{lstHaskell}[label={lst:exdeepgadt},caption={A deeply embedded expression \gls{DSL} using \glspl{GADT}.}]
data Expr a where
Lit :: Show a => a -> Expr a
Plus :: Num a => Expr a -> Expr a -> Expr a
eval (Eq l r) = eval l == eval r
\end{lstHaskell}
-Deep embedding has the advantage that it is easy to build and views are easy to add.
-On the downside, the expressions created with this language are not necessarily type-safe.
-In the given language it is possible to create an expression such as \cleaninline{Plus (LitI 4) (LitB True)} that adds a boolean to an integer.
-Extending the \gls{ADT} is easy and convenient but extending the views accordingly is tedious since it has to be done individually for all views.
-
-The first downside of this type of \gls{EDSL} can be overcome by using \glspl{GADT}~\citep{cheney_first-class_2003}.
-Example~\ref{lst:exdeepgadt} shows the same language, but type-safe with a \gls{GADT}.
-\glspl{GADT} are not supported in the current version of \gls{CLEAN} and therefore the syntax is hypothetical (See \todo{insert link to appendix}).
-However, it has been shown that \glspl{GADT} can be simulated using bimaps or projection pairs~\citep{cheney_first-class_2003}.
-Unfortunately the lack of extendability remains a problem.
-If a language construct is added, no compile time guarantee can be given that all views support it.
-
-\begin{lstClean}[label={lst:exdeepgadt},caption={A deeply embedded expression \gls{DSL} using \glspl{GADT}.}]
-:: Expr a
- = Lit a -> Expr a
- | E.e: Var String -> Expr e
- | Plus (Expr Int) (Expr Int) -> Expr Int
- | E.e: Eq (Expr e) (Expr e) -> Expr Bool & == e
-\end{lstClean}
-
\section{Shallow embedding}
In a shallowly \gls{EDSL} all language constructs are expressed as functions in the host language.
-An evaluator view for the example language then can be implemented as the code shown in Definition~\ref{lst:exshallow}.
+An evaluator view for the example language then can be implemented as the code shown in \cref{lst:exshallow}.
Note that much of the internals of the language can be hidden using monads.
-\begin{lstClean}[label={lst:exshallow}, caption={A minimal shallow \gls{EDSL}.}]
-:: Env :== (String -> Int)
-:: DSL a :== (Env -> a)
+\begin{lstHaskell}[label={lst:exshallow}, caption={A minimal shallow \gls{EDSL}.}]
+type Env = String -> Int
+type DSL a = Env -> a
Lit :: a -> DSL a
Lit x = \e->x
Var :: String -> DSL Int
Var i = \e->retrEnv e i
-Plus :: (DSL Int) (DSL Int) -> DSL Int
+Plus :: DSL Int -> DSL Int -> DSL Int
Plus x y = \e->x e + y e
-Eq :: (DSL a) (DSL a) -> DSL Bool | == a
+Eq :: Eq a => DSL a -> DSL a -> DSL Bool
Eq x y = \e->x e == y e
-\end{lstClean}
+\end{lstHaskell}
-The advantage of shallowly embedding a language in a host language is its extendability.
+One of the advantages of shallowly embedding a language in a host language is its extendability.
It is very easy to add functionality because the compile time checks of the host language guarantee whether or not the functionality is available when used.
Moreover, the language is type safe as it is directly typed in the host language, i.e.\ \cleaninline{Lit True +. Lit 4} is rejected.
+Another advantage is the intimate link with the host language, allowing for a lot more linguistic reuse such as the support of implicit sharing~\cite{krishnamurthi_linguistic_2001}.
The downside of this method is extending the language with views.
It is nearly impossible to add views to a shallowly embedded language.
\subsection{Tagless-final embedding}\label{ssec:tagless}
+
\section{Comparison}
\todo{cite compositional DSLs}
\documentclass[tikz]{standalone}
-\usetikzlibrary{positioning}
+\usetikzlibrary{positioning,shapes.geometric,shapes.symbols,shadows,fit}
\begin{document}
-\begin{tikzpicture}[nodes={draw,minimum width=7.5em}]
- \node (dsl) {domain-specific language};
- \node (sta) [below=of dsl,xshift=-3.75em] {standalone};
- \node (emb) [right=of sta] {embedded};
- \node (het) [below=of emb,xshift=-3.75em] {heterogeneous};
- \node (hom) [right=of het] {homogeneous};
- \draw [->] (dsl) -- (sta);
- \draw [->] (dsl) -- (emb);
- \draw [->] (emb) -- (het);
- \draw [->] (emb) -- (hom);
+\begin{tikzpicture}[%
+ node distance=1em,
+ task/.style={regular polygon,regular polygon sides=3,text width=2.5em,draw},
+ share/.style={cylinder,shape border rotate=90,aspect=.3,text width=2.5em,text height=1em,draw},
+ client/.style={cloud,aspect=2}
+ ]
+ \node (C1) [client,inner sep=0pt,draw] {Client\textsubscript{1}};
+ \node (C2) [client,inner sep=0pt,draw,right=of C1] {Client\textsubscript{2}};
+ \node (CD) [right=of C2] {\ldots};
+ \node (CN) [client,inner sep=0pt,draw,right=of CD] {Client\textsubscript{n}};
+
+ % line between server and browser
+ \draw [dotted] ([xshift=-5em,yshift=-.8em]C1.south west) node[left,above]{browser} node[left,below]{server(s)} -- ([xshift=3em,yshift=-.8em]CN.south east);
+
+ \node[task,inner sep=-4.5pt,below=of C1,fill=white,xshift=4.5em,yshift=-4.5em] (Tn) {Task\textsubscript{n}};
+ \node[task,inner sep=-4.5pt,below=of C1,fill=white,xshift=3em,yshift=-3.0em] (Td) {\ldots};
+ \node[task,inner sep=-4.5pt,below=of C1,fill=white,xshift=1.5em,yshift=-1.5em] (T2) {Task\textsubscript{2}};
+ \node[task,inner sep=-4.5pt,below=of C1,fill=white] (T1) {Task\textsubscript{1}};
+
+ \node[share,node distance=6em,right=of T1,fill=white,xshift=4.5em,yshift=-4.5em] (Sn) {SDS\textsubscript{n}};
+ \node[share,node distance=6em,right=of T1,fill=white,xshift=3em,yshift=-3.0em] (Sd) {\ldots};
+ \node[share,node distance=6em,right=of T1,fill=white,xshift=1.5em,yshift=-1.5em] (S2) {SDS\textsubscript{2}};
+ \node[share,node distance=6em,right=of T1,fill=white] (S1) {SDS\textsubscript{1}};
+
+ % line between client and server
+ \draw [dotted] ([xshift=-5em,yshift=-11em]C1.south west) node[left,above]{server(s)} node[left,below]{device(s)} -- ([xshift=3em,yshift=-11em]CN.south east);
+
+ % device 1
+ \node[task,text width=1.5em,inner sep=-4.5pt,node distance=12em,below=of C1,double copy shadow={shadow xshift=.4em,shadow yshift=-.4em},fill=white] (D1T1) {};
+ \node[share,text width=.3em,text height=.8ex,right=of D1T1,double copy shadow={shadow xshift=.4em,shadow yshift=-.4em},fill=white] (D1S1) {};
+
+ \node[draw,fit={(D1S1)(D1T1)([shift={(1.2em,-1.2em)}]D1S1.south east)},label=below:{\scriptsize mTask device\textsubscript{1}}] {};
+
+ % device 2
+ \node[task,text width=1.5em,inner sep=-4.5pt,node distance=12em,below=of C2,double copy shadow={shadow xshift=.4em,shadow yshift=-.4em},fill=white] (D2T1) {};
+ \node[share,text width=.3em,text height=.8ex,right=of D2T1,double copy shadow={shadow xshift=.4em,shadow yshift=-.4em},fill=white] (D2S1) {};
+
+ \node[draw,fit={(D2S1)(D2T1)([shift={(1.2em,-1.2em)}]D2S1.south east)},label=below:{\scriptsize mTask device\textsubscript{2}}] (D2) {};
+
+ % device ...
+ \node[right=of D2] {\ldots};
+
+ % device n
+ \node[task,text width=1.5em,inner sep=-4.5pt,node distance=12em,below=of CN,double copy shadow={shadow xshift=.4em,shadow yshift=-.4em},fill=white] (DNT1) {};
+ \node[share,text width=.3em,text height=.8ex,right=of DNT1,double copy shadow={shadow xshift=.4em,shadow yshift=-.4em},fill=white] (DNS1) {};
+
+ \node[draw,fit={(DNS1)(DNT1)([shift={(1.2em,-1.2em)}]DNS1.south east)},label=below:{\scriptsize mTask device\textsubscript{n}}] (DN) {};
+
+ % iTask server
+ \node[fit={(C1)([shift={(1em,-1.2em)}]DN.south east)},draw,label={iTask server}] {};
+
+ % Arrows
+ \draw [<->] (C1) -- (T1);
+ \draw [<->] (C2) -- (T1);
+ \draw [<->] (CN) -- (T1);
+
+ \draw [<->] (T1) -- (S1);
+ \draw [<->] (T1) -- (S2);
+ \draw [<->] (T1) -- (Sn);
+
+ \draw [<->] (Tn) -- node [midway,above,sloped] {\scriptsize\texttt{liftmTask}} (D1T1);
+ \draw [<->] (Sn) -- node [midway,above,sloped] {\scriptsize\texttt{liftsds}} (D1S1);
+
+ \draw [<->] (D1T1) -- (D1S1);
+
\end{tikzpicture}
\end{document}