updates
[phd-thesis.git] / appendix / clean_for_haskell_programmers.tex
1 \documentclass[../thesis.tex]{subfiles}
2
3 \begin{document}
4
5 \ifSubfilesClassLoaded{
6 \author{%
7 Mart Lubbers\\
8 \texttt{mart@cs.ru.nl}
9 \and
10 Peter Achten\\
11 \texttt{peter@cs.ru.nl}
12 }
13 \title{Clean for Haskell Programmers}
14 \date{\today}
15
16 \stopthumb{}%
17 \setcounter{chapter}{1}
18
19 \pagenumbering{arabic}
20 \maketitle
21 \tableofcontents
22 }{
23 \chapter{\glsentrytext{CLEAN} for \glsentrytext{HASKELL} Programmers}%
24 \label{chp:clean_for_haskell_programmers}
25 }
26
27 This note is meant to give people who are familiar with the functional programming language \gls{HASKELL} a consise overview of \gls{CLEAN} language elements and how they differ from \gls{HASKELL}.
28 The goal is to support the reader when reading \gls{CLEAN} code.
29 Table~\ref{tbl:syn_clean_haskell} shows frequently occuring \gls{CLEAN} language elements on the left side and their \gls{HASKELL} equivalent on the right side.
30 Obviously, this summary is not exhaustive.
31 Some \gls{CLEAN} language elements that are not easily translatable to \gls{HASKELL} and thus do not occur in the summary follow below.
32 We hope you enjoy these notes and that it aids you in reading \gls{CLEAN} programs.
33
34 While \gls{CLEAN} and \gls{HASKELL} were both conceived around 1987 and have similar syntax, there are some subtle differences in syntax and functionality.
35 This section describes some of the history of \gls{CLEAN} and provides a crash course in \gls{CLEAN} pecularities written for \gls{HASKELL} programmers.
36 It is based on the
37
38 \Gls{CLEAN}---acronym for Clean \acrlong{LEAN}~\cite{barendregt_towards_1987}---, was originally designed as a \gls{GRS} core language but quickly served as an intermediate language for other functional languages~\cite{brus_clean_1987}.
39 In the early days it has also been called \emph{Concurrent} \gls{CLEAN}~\cite{nocker_concurrent_1991} but these days the language has no support for this anymore.
40 Fast forward thirty years, \gls{CLEAN} is now a robust language with state-of-the-art features and is actually used in industry as well as academia---albeit in select areas of the world~\cite{plasmeijer_clean_2021}.
41
42 Initially, when it was used mostly as an intermediate language, it had a fairly spartan syntax.
43 However, over the years, the syntax got friendlier and it currently it looks a lot like \gls{HASKELL}.
44 In the past, a \emph{double-edged} fronted even existed that allowed \gls{CLEAN} to be extended with \gls{HASKELL98} syntax and vice versa, however this frontend is no longer maintained~\cite{groningen_exchanging_2010}.
45 This chapter therefore gives a brief syntactical and functional comparison, a complete specification of the \gls{CLEAN} language can be found in the latest language report~\cite{plasmeijer_clean_2021}.
46 Many of this is based on work by Achten although that was based on \gls{CLEAN} 2.1 and \gls{HASKELL98}~\cite{achten_clean_2007}.
47 When \gls{HASKELL} is mentioned we actually mean \gls{GHC}'s \gls{HASKELL} and by \gls{CLEAN} we mean \gls{CLEAN} 3.1's \gls{ITASK} compiler.
48
49 \section{Features}
50 \subsection{Modules}
51 \Gls{CLEAN} has separate implementation and definition modules.
52 The definition module contains the class definitions, instances, function types and type definitions (possibly abstract).
53 Implementation modules contain the function implementations as well.
54 This means that only what is defined in the definition module is exported in \gls{CLEAN}.
55 This differs greatly from \gls{HASKELL}, as there is only a module file there.
56 Choosing what is exported in \gls{HASKELL} is done using the \haskellinline{module Mod(...)} syntax.
57
58 \subsection{Strictness}
59 In \gls{CLEAN}, by default, all expressions are evaluated lazily.
60 Types can be annotated with a strictness attribute (\cleaninline{!}), resulting in the values being evaluated to head-normal form before the function is entered.
61 In \gls{HASKELL}, in patterns, strictness can be enforced using \haskellinline{!}\requiresGHCmod{BangPatterns}.
62 Within functions the strict let (\cleaninline{#!}) can be used to force evaluate an expression, in \gls{HASKELL} \haskellinline{seq} or \haskellinline{\$!} is used for this.
63
64 \subsection{Uniqueness typing}
65 Types in \gls{CLEAN} may be \emph{unique}, which means that they may not be shared\todo{cite}.
66 The uniqueness type system allows the compiler to generate efficient code because unique data structures can be destructively updated.
67 Furthermore, uniqueness typing serves as a model for side effects as well.
68 \Gls{CLEAN} uses the \emph{world-as-value} paradigm where \cleaninline{World} represents the external environment and is always unique.
69 A program with side effects is characterised by a \cleaninline{Start :: *World -> *World} start function.
70 In \gls{HASKELL}, interaction with the world is done using the \haskellinline{IO} monad.
71 The \haskellinline{IO} monad could very well be---and actually is---implemented in \gls{CLEAN} using a state monad with the \cleaninline{World} as a state.
72 Besides marking types as unique, it is also possible to mark them with uniqueness attributes variables \cleaninline{u:} and define constraints on them.
73 For example, to make sure that an argument of a function is at least as unique as another argument.
74 Finally, using \cleaninline{.} (a dot), it is possible to state that several variables are equally unique.
75 Uniqueness is propagated automatically in function types but must be marked manually in data types.
76 Examples can be seen in \cref{lst:unique_examples}.
77
78 \begin{lstClean}[label={lst:unique_examples},caption={Examples of uniqueness annotations in \gls{CLEAN}.}]
79 f :: *a -> *a // f works on unique values only
80 f :: .a -> .a // f works on unique and non-unique values
81 f :: v:a u:b -> u:b, [v<=u] // f works when a is less unique than b
82 \end{lstClean}
83 %f :: (Int, *World) -> *World // The uniqueness is propagated automatically (i.e. *(Int, *World)))
84 %:: T = T *(Int, *World) // Writing :: T = T (Int, *World) won't work
85 %:: T = T (Int -> *(*World -> *World)) // Writing :: T = T (Int *World -> *World) won't work
86
87 \subsection{Expressions}
88 \todo[inline]{Matches pattern expression \texttt{=:}}
89
90 \todo[inline]{Let before}
91
92 \subsection{Generics}
93 Polytypic functions~\cite{jeuring_polytypic_1996}---also known as generic or kind-indexed fuctions---are built into \gls{CLEAN}~\cite[Chp.~7.1]{plasmeijer_clean_2021}\cite{alimarine_generic_2005} whereas in \gls{HASKELL} they are implemented as a library~\cite[Chp.~6.19.1]{ghc_team_ghc_2021}.
94 The implementation of generics in \gls{CLEAN} is very similar to that of Generic H$\forall$skell~\cite{hinze_generic_2003}.
95 %When calling a generic function, the kind must always be specified and depending on the kind, the function may require more arguments.
96
97 For example, defining a generic equality is done as in \cref{lst:generic_eq}.
98 \lstinputlisting[language=Clean,firstline=4,label={lst:generic_eq},caption={Generic equality function in \gls{CLEAN}.}.]{lst/generic_eq.icl}
99
100 Metadata about the types is available using the \cleaninline{of} syntax that gives the function access to metadata records, as can be seen in \cref{lst:generic_print} showing a generic print function. This abundance of metadata allows for very complex generic functions that near the expression level of template metaprogramming\ifSubfilesClassLoaded{}{ (See \cref{chp:first-class_datatypes})}.
101 \lstinputlisting[language=Clean,firstline=4,label={lst:generic_print},caption={Generic print function in \gls{CLEAN}.}]{lst/generic_print.icl}
102
103 \subsection{\glsentrytext{GADT}s}
104 GADTs are enriched data types that allow the type instantiation of the constructor to be explicitly defined~\cite{cheney_first-class_2003,hinze_fun_2003}.
105 While \glspl{GADT} are not natively supported in \gls{CLEAN}, they can be simulated using embedding-projection pairs or equivalence types~\cite[Sec.~2.2]{cheney_lightweight_2002}.
106 To illustrate this, \cref{lst:gadt_clean} shows an example \gls{GADT} that would be implemented in \gls{HASKELL} as done in \cref{lst:gadt_haskell}\requiresGHCmod{GADTs}.
107
108 \lstinputlisting[language=Clean,firstline=4,label={lst:gadt_clean},caption={Expression \gls{GADT} using equivalence types in \gls{CLEAN}.}]{lst/expr_gadt.icl}
109 \lstinputlisting[language=Haskell,style=haskell,firstline=4,label={lst:gadt_haskell},caption={Expression \gls{GADT} in \gls{HASKELL}.}]{lst/expr_gadt.hs}
110
111 \section{Syntax}
112 \begin{longtable}{p{.45\linewidth}p{.5\linewidth}}
113 \caption[]{Syntactical differences between \gls{CLEAN} and \gls{HASKELL}.}%
114 \label{tbl:syn_clean_haskell}\\
115 \toprule
116 \gls{CLEAN} & \gls{HASKELL}\\
117 \midrule
118 \endfirsthead%
119 \caption[]{(continued)}\\
120 \toprule
121 \gls{CLEAN} & \gls{HASKELL}\\
122 \midrule
123 \endhead%
124
125 \midrule
126 \multicolumn{2}{c}{Comments}\\
127 \midrule
128 \cleaninline{// single line} & \haskellinline{-- single line}\\
129 \cleaninline{/* multi line /* nested */ */} & \haskellinline{\{- multi line \{- nested -\} \}}\\
130
131 \midrule
132 \multicolumn{2}{c}{Imports}\\
133 \midrule
134 \cleaninline{import Mod => qualified f1, :: t} & \haskellinline{import qualified Mod (f1, t)}\\
135 & \haskellinline{import Mod hiding (f1, t)}\\
136 \cleaninline{/* multi line /* nested */ */} & \haskellinline{\{- multi line \{- nested -\} \}}\\
137
138 \midrule
139 \multicolumn{2}{c}{Basic types}\\
140 \midrule
141 \cleaninline{42 :: Int} & \haskellinline{42 :: Int}\\
142 \cleaninline{True :: Bool} & \haskellinline{True :: Bool}\\
143 \cleaninline{toInteger 42 :: Integer} & \haskellinline{42 :: Integer}\\
144 \cleaninline{38.0 :: Real} & \haskellinline{38.0 :: Float -- or Double}\\
145 \cleaninline{\"Hello\" +++ \"World\" :: String}\footnote{Strings are represented as unboxed character arrays.}
146 & \haskellinline{\"Hello\" ++ \"World\" :: String}\footnote{Strings are represented as lists of characters by default but may be overloaded as well if \GHCmod{OverloadedStrings} is enabled.}\\
147 \cleaninline{['Hello'] :: [Char]} & \haskellinline{\"Hello\" :: String}\\
148 \cleaninline{?t} & \haskellinline{Maybe t}\\
149 \cleaninline{(?None, ?Just e)} & \haskellinline{(Nothing, Just e)}\\
150
151 \midrule
152 \multicolumn{2}{c}{Type definitions}\\
153 \midrule
154 \cleaninline{:: T a0 ... :== t} & \haskellinline{type T a0 ... = t}\\
155 \cleaninline{:: T a0 ... } & \haskellinline{data T a1 ...}\\
156 \quad\cleaninline{= C1 f0 ... fn \| ... \| Cn f0 ... fn} & \quad\haskellinline{= C1 f0 ... fn \| ... \| Cn f0 ... fn}\\
157 \cleaninline{:: T a0 ...} & \haskellinline{data T a0 ...}\\
158 \quad\cleaninline{= \{ f0 :: t0, ..., fn :: tn \} } & \quad\haskellinline{= T \{ f0 :: t0, ..., fn :: tn \} }\\
159 \cleaninline{:: T a0 ... =: t} & \haskellinline{newtype T a0 ... = t}\\
160 \cleaninline{:: T = E.t: Box t \& C t} & \haskellinline{data T = forall t.C t => Box t}\requiresGHCmod{ExistentialQuantification}\\
161
162 \midrule
163 \multicolumn{2}{c}{Function types}\\
164 \midrule
165 \cleaninline{f0 :: a0 a1 ... -> t}
166 & \haskellinline{f0 :: (c0 v0, c1 v1, c2 v2) =>}\\
167 \quad\cleaninline{\| c0 v0 \& c1, c2 v1}
168 & \quad\haskellinline{a0 -> a1 ... -> t}\\
169 \cleaninline{(+) infixl 6 :: Int Int -> Int} & \haskellinline{infixl 6 +}\\
170 & \haskellinline{(+) :: Int -> Int -> Int}\\
171 \cleaninline{qid :: (A.a: a -> a) -> (Bool, Int)}
172 & \haskellinline{qid :: (forall a: a -> a) -> (Bool, Int)}\requiresGHCmod{RankNTypes}\\
173 \cleaninline{qid id = (id True, id 42)} &
174 \haskellinline{qid id = (id True, id 42)}\\
175
176 \midrule
177 \multicolumn{2}{c}{Type classes}\\
178 \midrule
179 \cleaninline{class f a :: t} & \haskellinline{class f a where f :: t}\\
180 \cleaninline{class C a \| C0, ... , Cn a}\footnote{In contrast to the \gls{HASKELL} variant, this does not require an instance.} & \haskellinline{class (C0 a, ..., Cn, a) => C a}\\
181 \cleaninline{class C s ~m where ...} & \haskellinline{class C s m \| m -> s where ...}\requiresGHCmod{MultiParamTypeClasses}\\
182 \cleaninline{instance C t \| C0, ..., Cn a} & \haskellinline{instance (C0 a, ..., Cn a) => C t}\\
183 \quad\cleaninline{where ...} & \quad\haskellinline{where ...}\\
184
185 \midrule
186 \multicolumn{2}{c}{As pattern}\\
187 \midrule
188 \cleaninline{x=:p} & \haskellinline{x@p}\\
189
190 \midrule
191 \multicolumn{2}{c}{Lists}\\
192 \midrule
193 \cleaninline{[1,2,3]} & \haskellinline{[1,2,3]}\\
194 \cleaninline{[x:xs]} & \haskellinline{x:xs}\\
195 \cleaninline{[e \\\\ e <- xs \| p e]} & \haskellinline{[e \| e <- xs, p e]}\\
196 \cleaninline{[l \\\\ l <- xs, r <- ys]} & \haskellinline{[l \| l <- xs, r <- ys]}\\
197 \cleaninline{[(l, r) \\\\ l <- xs \& r <- ys]} & \haskellinline{[(l, r) \| (l, r) <- zip xs ys]}\\
198 & or \haskellinline{[(l, r) \| l <- xs \| r <- ys]}\requiresGHCmod{ParallelListComp}\\
199
200 \midrule
201 \multicolumn{2}{c}{Lambda expressions}\\
202 \midrule
203 \cleaninline{\\a0 a1 ...->e} or \cleaninline{\\....e} or \cleaninline{\\...=e} & \haskellinline{\\a0 a1 ...->e}\\
204
205 \midrule
206 \multicolumn{2}{c}{Case distinction}\\
207 \midrule
208 \cleaninline{if p e0 e1} & \haskellinline{if p then e0 else e1}\\
209 \cleaninline{case e of p0 -> e0; ...} & \haskellinline{case e of p0 -> e0; ...}\\
210 \quad or \cleaninline{case e of p0 = e0; ...}\\
211 \cleaninline{f p0 ... pn} & \haskellinline{f p0 ... pn}\\
212 \quad\cleaninline{\| c = t} & \quad\haskellinline{\| c = t}\\
213 \quad\cleaninline{\| otherwise = t} or \cleaninline{= t} & \quad\haskellinline{\| otherwise = t}\\
214
215 \midrule
216 \multicolumn{2}{c}{Record expressions}\\
217 \midrule
218 \cleaninline{:: R = \{ f :: t \}} & \haskellinline{data R = R \{ f :: t \}}\\
219 \cleaninline{r = \{ f = e \}} & \haskellinline{r = R \{e\}}\\
220 \cleaninline{r.f} & \haskellinline{f r}\\
221 \cleaninline{r!f}\footnote{This operator allows for field selection from unique records.} & \haskellinline{(\\v->(f v, v)) r}\\
222 \cleaninline{\{r \& f = e \}} & \haskellinline{r \{ f = e \}}\\
223
224 \midrule
225 \multicolumn{2}{c}{Record patterns}\\
226 \midrule
227 \cleaninline{:: R0 = \{ f0 :: R1 \}} & \haskellinline{data R0 = R0 \{ f0 :: R1 \}}\\
228 \cleaninline{:: R1 = \{ f1 :: t \}} & \haskellinline{data R1 = R1 \{ f1 :: t \}}\\
229 \cleaninline{g \{ f0 \} = e f0} & \haskellinline{g (R0 \{f0=x\}) = e x}\\
230 & or \haskellinline{g (R0 \{f0\}) = e f0}\requiresGHCmod{RecordPuns}\\
231 \cleaninline{g \{ f0 = \{f1\} \} = e f1} & \haskellinline{g (R0 \{f0=R1 \{f1=x\}\}) = e x}\\
232
233 \midrule
234 \multicolumn{2}{c}{Arrays}\\
235 \midrule
236 \cleaninline{:: A :== \{t\}} & \haskellinline{type A = Array Int t}\\
237 \cleaninline{a = \{v0, ..., vn\}} & \haskellinline{a = array (0, n+1)}\\
238 & \quad\haskellinline{[(0, v0), ..., (n, vn)]}\\
239 \cleaninline{a = \{e \\\\ p <-: a\}} & \haskellinline{a = array (0, length a-1)}\\
240 & \quad\haskellinline{[e \| (i, a) <- [0..] `zip` a]}\\
241 \cleaninline{a.[i]} & \haskellinline{a!i}\\
242 \cleaninline{a![i]}\footnote{This operator allows for field selection from unique arrays.} & \haskellinline{(\v->(v!i, v)) a}\\
243 \cleaninline{\{ a \& [i] = e\}} & \haskellinline{a//[(i, e)]}\\
244
245 \midrule
246 \multicolumn{2}{c}{Dynamics}\\
247 \midrule
248 \cleaninline{f :: a -> Dynamic \| TC a} & \haskellinline{f :: Typeable a => a -> Dynamic}\\
249 \cleaninline{f e = dynamic e} & \haskellinline{f e = toDyn (e)}\\
250 \cleaninline{g :: Dynamic -> t} & \haskellinline{g :: Dynamic -> t}\\
251 \cleaninline{g (e :: t) = e0} & \haskellinline{g d = case fromDynamic d}\\
252 \cleaninline{g e = e1} & \quad\haskellinline{Just e -> e0}\\
253 & \quad\haskellinline{Nothing -> e1}\\
254
255 \bottomrule
256 \end{longtable}
257
258 \input{subfilepostamble}
259 \end{document}