\documentclass[../thesis.tex]{subfiles}

\input{subfilepreamble}

\setcounter{chapter}{4}

\begin{document}
\input{subfileprefix}
\chapter{The implementation of \texorpdfstring{\gls{MTASK}}{mTask}}%
\label{chp:implementation}
\begin{chapterabstract}
This chapter shows the implementation of the \gls{MTASK} system by:
\begin{itemize}
\item elaborating on the implementation and architecture of the \gls{RTS} of \gls{MTASK};
\item giving details of the implementation of \gls{MTASK}'s \gls{TOP} engine that executes the \gls{MTASK} tasks on the microcontroller;
\item showing the implementation of the byte code compiler for \gls{MTASK}'s \gls{TOP} language;
\item explaining the machinery used to automatically serialise and deserialise data to and from the device.
\end{itemize}
\end{chapterabstract}

\todo[inline]{This chapter is the roughest of them all.}
The \gls{MTASK} system targets resource-constrained edge devices with little memory, limited processing power, and slow communication.
Such edge devices are often powered by microcontrollers.
They usually have flash-based program memory that wears out fairly quickly.
For example, the flash memory of the popular ATmega328P powering the \gls{ARDUINO} UNO is rated for only 10000 write cycles.
While this sounds like a lot, if new tasks are sent to the device every minute or so, a lifetime of only seven days is guaranteed.
Hence, for dynamic applications, storing the program in the \gls{RAM} of the device and interpreting this code is necessary, saving precious write cycles of the program memory.
In the \gls{MTASK} system, this is done by the \gls{MTASK} \gls{RTS}.

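The arithmetic behind this estimate can be made explicit. The following sketch is a plain illustration of the calculation, not part of the \gls{MTASK} code base; with 10000 rated cycles and one write per minute it yields just under seven days.

```c
#include <assert.h>

/* Guaranteed lifetime in whole days of flash memory rated for
 * `rated_cycles` write cycles when one write occurs every
 * `minutes_per_write` minutes. Illustration only. */
static int flash_lifetime_days(long rated_cycles, long minutes_per_write)
{
    long total_minutes = rated_cycles * minutes_per_write;
    return (int)(total_minutes / (60L * 24L)); /* 1440 minutes per day */
}
```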
\section{\texorpdfstring{\Glsxtrlong{RTS}}{Run time system}}
The \gls{RTS} is a customisable domain-specific \gls{OS} that takes care of the execution of tasks, but also of low-level mechanisms such as communication, multitasking, and memory management.
Once a device is programmed with the \gls{MTASK} \gls{RTS}, it can continuously receive new tasks without the need for reprogramming.
The \gls{OS} is written in portable \ccpp{} and contains only a small device-specific portion.
In order to keep the abstraction level high and the hardware requirements low, much of the high-level functionality of the \gls{MTASK} language is implemented not in terms of lower-level constructs from the \gls{MTASK} language but in terms of \ccpp{} code.

As most microcontroller software runs solely in a cyclic executive instead of under an \gls{OS}, the \gls{RTS} of the \gls{MTASK} system is implemented as such as well.
It consists of a loop function with a relatively short execution time, similar to the one in \gls{ARDUINO}, that gets called repeatedly.
The event loop consists of three distinct phases.
After completing the three phases, the device goes to sleep for as long as possible (see \cref{chp:green_computing_mtask} for more details on task scheduling).

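The shape of this cyclic executive can be sketched in \ccpp{} as follows. The phase functions are placeholders for the purpose of the illustration, not the actual \gls{RTS} \gls{API}:

```c
/* Sketch of the mTask RTS event loop: three phases, then sleep.
 * All function names here are illustrative placeholders. */
static int phases_run = 0;

static void phase_communication(void) { phases_run++; } /* process messages  */
static void phase_execution(void)     { phases_run++; } /* rewrite due tasks */
static void phase_memory(void)        { phases_run++; } /* reclaim memory    */
static void sleep_until_next_task(void) { }             /* power saving      */

/* One iteration of the cyclic executive; called repeatedly,
 * like Arduino's loop(). */
static void rts_loop(void)
{
    phase_communication();
    phase_execution();
    phase_memory();
    sleep_until_next_task();
}
```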
\subsection{Communication}
In the first phase, the communication channels are processed.
The exact communication method is a customisable device-specific option baked into the \gls{RTS}.
The interface is deliberately kept simple and consists of two layers: a link interface and a communication interface.
Besides opening, closing and cleaning up, the link interface has only three functions, shown in \cref{lst:link_interface}.
Implementing this link interface is therefore very simple, yet it allows for more advanced link features such as buffering.
There are implementations of this interface for serial and \gls{WIFI} connections using \gls{ARDUINO}, and for \gls{TCP} connections on Linux.

\begin{lstArduino}[caption={Link interface of the \gls{MTASK} \gls{RTS}.},label={lst:link_interface}]
bool link_input_available(void);
uint8_t link_read_byte(void);
void link_write_byte(uint8_t b);
\end{lstArduino}

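As an illustration of how little is needed to satisfy this interface, the following sketch implements the three functions over an in-memory ring buffer. A real implementation would read from a serial port or \gls{TCP} socket instead; only the three signatures come from the listing above, the buffering strategy is an assumption:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative loopback implementation of the link interface using a
 * small ring buffer; wrap-around is handled with modular arithmetic. */
#define LINK_BUF_SIZE 64
static uint8_t link_buf[LINK_BUF_SIZE];
static uint8_t link_head = 0, link_tail = 0;

bool link_input_available(void)
{
    return link_head != link_tail;
}

uint8_t link_read_byte(void)
{
    uint8_t b = link_buf[link_tail];
    link_tail = (uint8_t)((link_tail + 1) % LINK_BUF_SIZE);
    return b;
}

void link_write_byte(uint8_t b)
{
    link_buf[link_head] = b;
    link_head = (uint8_t)((link_head + 1) % LINK_BUF_SIZE);
}
```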
The communication interface abstracts away from this link interface and is typed instead.
It contains only two functions, as seen in \cref{lst:comm_interface}.
There are implementations for direct communication and for communication via an \gls{MQTT} broker.
Both use the automatic serialisation and deserialisation shown in \cref{sec:ccodegen}.

\begin{lstArduino}[caption={Communication interface of the \gls{MTASK} \gls{RTS}.},label={lst:comm_interface}]
struct MTMessageTo receive_message(void);
void send_message(struct MTMessageFro msg);
\end{lstArduino}

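To give an idea of how the typed layer could sit on top of the byte-oriented link layer, the sketch below parses a hypothetical message header of a one-byte tag and a 16-bit big-endian length. This framing is invented for the illustration; the real \cleaninline{MTMessageTo} layout is generated from the \gls{CLEAN} types (see \cref{sec:ccodegen}):

```c
#include <stdint.h>

/* Hypothetical wire framing: tag byte followed by a big-endian
 * 16-bit payload length. Not the actual mTask message format. */
struct message_header {
    uint8_t  tag;     /* message kind, e.g. Task or SdsUpdate */
    uint16_t length;  /* payload size in bytes                */
};

struct message_header parse_header(const uint8_t *buf)
{
    struct message_header h;
    h.tag    = buf[0];
    h.length = (uint16_t)(((uint16_t)buf[1] << 8) | buf[2]);
    return h;
}
```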
Processing the received messages from the communication channels happens synchronously, and the channels are exhausted completely before moving on to the next phase.
There are several possible messages that can be received from the server:

\begin{description}
\item[SpecRequest]
is a message instructing the device to send its specification and is usually sent immediately after connecting.
The \gls{RTS} responds with a \texttt{Spec} answer containing the specification.
\item[TaskPrep]
tells the device a (big) task is on its way.
Especially on faster connections, it may happen that the communication buffers overflow because a big message is sent while the \gls{RTS} is busy executing tasks.
This message allows the \gls{RTS} to postpone execution for a while, until the big task has been received.
The server sends the big task once the device acknowledges the preparation by sending a \texttt{TaskPrepAck} message.
\item[Task]
contains a new task, its peripheral configuration, the \glspl{SDS}, and the byte code.
The new task is immediately copied to the task storage but is only initialised during the next phase, after which a \texttt{TaskAck} is sent.
Tasks are stored below the stack, and since the stack is only used in the middle phase, execution, it is no problem that it moves.
\item[SdsUpdate]
notifies the device of the new value for a lowered \gls{SDS}.
The old value of the lowered \gls{SDS} is immediately replaced with the new one.
No acknowledgement is required.
\item[TaskDel]
instructs the device to delete a running task.
Tasks are automatically deleted when they become stable.
However, a task may also be deleted when the surrounding task on the server is deleted, for example when the task is on the left-hand side of a step combinator and the condition to step holds.
The device acknowledges by sending a \texttt{TaskDelAck}.
\item[Shutdown]
tells the device to reset.
\end{description}

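The reactions to the message types listed above can be summarised in a single dispatch, sketched below. The enumeration values and the returned descriptions are invented for the illustration; the real \gls{RTS} derives its message types from the \gls{CLEAN} definitions:

```c
#include <string.h>

/* Illustrative dispatch over the server messages described above.
 * Handler behaviour is summarised as a string instead of real code. */
enum mt_message_tag {
    MSG_SPEC_REQUEST, MSG_TASK_PREP, MSG_TASK,
    MSG_SDS_UPDATE, MSG_TASK_DEL, MSG_SHUTDOWN
};

static const char *handle_message(enum mt_message_tag tag)
{
    switch (tag) {
    case MSG_SPEC_REQUEST: return "send Spec";
    case MSG_TASK_PREP:    return "send TaskPrepAck";
    case MSG_TASK:         return "store task, send TaskAck";
    case MSG_SDS_UPDATE:   return "overwrite SDS value";
    case MSG_TASK_DEL:     return "delete task, send TaskDelAck";
    case MSG_SHUTDOWN:     return "reset";
    }
    return "unknown";
}
```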
\subsection{Execution}
The second phase performs one execution step for every task that is ready for it.
Tasks are kept in a priority queue ordered by the time at which a task needs to be executed; the \gls{RTS} selects all tasks that can be scheduled (see \cref{sec:scheduling} for more details).
Execution of a task is always an interplay between the interpreter and the \emph{rewriter}.

When a new task is received, it is not yet initialised, i.e.\ the pointer to its current task tree is still null.
The byte code of the main expression is then interpreted, which always produces a task tree: a tree in which each node represents a task combinator and the leaves are basic tasks.
Execution of a task consists of continuously rewriting this task tree until its value is stable.

Rewriting is a destructive process, i.e.\ the rewriting is done in place.
The rewriting engine uses the interpreter when needed, e.g.\ to calculate the step continuations.
The rewriter and the interpreter use the same stack to store intermediate values.
Rewriting steps are small, so that interleaving them results in seemingly parallel execution.
In this phase, new task tree nodes may be allocated.
Both rewriting and initialisation are atomic operations in the sense that no processing on \glspl{SDS} is done other than the \gls{SDS} operations of the task itself.
The host is notified if a task value has changed after a rewrite step.

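The interplay described above can be sketched as follows. The types and function names are illustrative placeholders, not the actual \gls{RTS} definitions; initialisation interprets the main expression when the tree pointer is still null, and a rewrite step notifies the server when the value changed:

```c
#include <stddef.h>

/* Sketch of one execution step for a task; names are illustrative. */
struct task {
    void *tree;    /* current task tree, NULL if not initialised */
    int   value;   /* current observable task value              */
};

static void *interpret_main(struct task *t) { (void)t; static int node; return &node; }
static void  rewrite_step(struct task *t)   { t->value++; } /* in-place rewrite stub */
static int   notifications = 0;
static void  notify_server(struct task *t)  { (void)t; notifications++; }

static void execute_task(struct task *t)
{
    if (t->tree == NULL) {
        t->tree = interpret_main(t);  /* initialisation produces the tree */
        return;
    }
    int old = t->value;
    rewrite_step(t);                  /* destructive, in place */
    if (t->value != old)
        notify_server(t);             /* host learns of new task value */
}
```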
Take for example the blink task, for which the code is shown in \cref{lst:blink_code}.

\begin{lstClean}[caption={Code for a blink program.},label={lst:blink_code}]
fun \blink=(\st->delay (lit 500) >>|. writeD d3 st >>=. blink o Not)
In {main = blink true}
\end{lstClean}

On receiving this task, the task tree is still null and the initial expression \cleaninline{blink true} is evaluated by the interpreter.
This results in the task tree shown in \cref{fig:blink_tree}.
Rewriting always starts at the top of the tree and traverses to the leaves, the basic tasks that do the actual work.
The first basic task encountered is the \cleaninline{delay} task, which yields no value until the time, \qty{500}{\ms} in this case, has passed.
When the \cleaninline{delay} task has yielded a stable value, the task continues with the right-hand side of the \cleaninline{>>\|.} combinator.
That right-hand side is a \cleaninline{>>=.} combinator with a \cleaninline{writeD} task at the left-hand side, which becomes stable after one rewrite step in which it writes the value to the given pin.
When \cleaninline{writeD} becomes stable, the written value is the task value that is observed by the right-hand side of the \cleaninline{>>=.} combinator.
This calls the interpreter to evaluate the expression, now that the argument of the function is known.
The result of the function is basically the original task tree again, but now with the state inverted.

\begin{figure}
\centering
\includestandalone{blinktree}
\caption{The task tree for a blink task in \cref{lst:blink_code} in \gls{MTASK}.}%
\label{fig:blink_tree}
\end{figure}

\subsection{Memory management}
The third and final phase is memory management.
The \gls{MTASK} \gls{RTS} is designed to run on systems with as little as \qty{2}{\kibi\byte} of \gls{RAM}.
Aggressive memory management is therefore vital.
Not all firmware for microprocessors supports heaps and---when it does---allocation often leaves holes when memory is not freed in a \emph{last in, first out} order.
The \gls{RTS} therefore uses a chunk of memory in the global data segment with its own memory manager tailored to the needs of \gls{MTASK}.
The size of this block can be changed in the configuration of the \gls{RTS} if necessary.
On an \gls{ARDUINO} UNO---equipped with \qty{2}{\kibi\byte} of \gls{RAM}---the maximum viable size is about \qty{1500}{\byte}.
The self-managed memory uses a layout similar to the memory layout of \gls{C} programs, except that the heap and the stack are swapped (see \cref{fig:memory_layout}).

\begin{figure}
\centering
\includestandalone{memorylayout}
\caption{Memory layout in the \gls{MTASK} \gls{RTS}.}\label{fig:memory_layout}
\end{figure}

A task is stored below the stack, and its complete state is a \gls{CLEAN} record containing, most importantly: the task id; a pointer to the task tree in the heap (null if not initialised yet); the current task value; the configuration of \glspl{SDS}; the configuration of peripherals; the byte code; and some scheduling information.

In memory, task data grows from the bottom up and the interpreter stack is located directly on top of it, growing in the same direction.
As a consequence, the stack moves when a new task is received.
This never happens during execution because communication is always processed before execution.
Values in the interpreter are always stored on the stack.
Compound data types are stored unboxed and flattened.
Task trees grow from the top down, as in a heap.
This approach allows for flexible ratios, i.e.\ many tasks and small trees or few tasks and big trees.

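The two-region layout with a flexible ratio can be sketched as a pair of bump allocators over one fixed block: task data grows upwards from the bottom, task tree nodes grow downwards from the top, and allocation fails when the regions would meet. This is a simplified illustration, not the actual memory manager; the block size mirrors the roughly \qty{1500}{\byte} viable on an \gls{ARDUINO} UNO:

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative two-ended allocator over one fixed memory block. */
#define MEM_SIZE 1500
static uint8_t mem[MEM_SIZE];
static size_t bottom = 0;        /* first free byte for task data      */
static size_t top = MEM_SIZE;    /* start of the tree region, grows down */

static void *alloc_task_data(size_t n)
{
    if (bottom + n > top)        /* would collide with the tree region */
        return NULL;
    void *p = &mem[bottom];
    bottom += n;
    return p;
}

static void *alloc_tree_node(size_t n)
{
    if (n > top - bottom)        /* would collide with the task region */
        return NULL;
    top -= n;
    return &mem[top];
}
```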
Stable tasks and unreachable task tree nodes are removed.
If a task is to be removed, tasks with higher memory addresses are moved down.
For task trees---stored in the heap---the \gls{RTS} already marks tasks and task trees as trash during rewriting, so the heap can be compacted in a single pass.
This is possible because there is no sharing or cycles in task trees and nodes contain pointers to their parent.

\todo[inline]{Extend the memory figure here?}

\section{Compiler}
\subsection{Instruction set}
The instruction set is a fairly standard stack machine instruction set, extended with special \gls{TOP} instructions for creating task tree nodes.
All instructions are housed in a \gls{CLEAN} \gls{ADT} and serialised to their byte representation using a generic function.
Type synonyms (\cref{lst:type_synonyms}) are used to provide insight into the arguments of the instructions.
Labels are always two bytes long; all other arguments are one byte long.

\begin{lstClean}[caption={Type synonyms for instruction arguments.},label={lst:type_synonyms}]
:: ArgWidth :== UInt8      :: ReturnWidth :== UInt8
:: Depth    :== UInt8      :: Num         :== UInt8
:: SdsId    :== UInt8      :: JumpLabel   =:  JL UInt16
\end{lstClean}

\Cref{lst:instruction_type} shows an excerpt of the \gls{CLEAN} type that represents the instruction set.
Shorthand instructions, for example, are omitted for brevity.
Detailed semantics for the instructions are given in \cref{chp:bytecode_instruction_set}.
One notable instruction is the \cleaninline{MkTask} instruction: it allocates and initialises a task tree node and pushes a pointer to it on the stack.

\begin{lstClean}[caption={The type housing the instruction set in \gls{MTASK}.},label={lst:instruction_type}]
:: BCInstr
	//Jumps
	= BCJumpF JumpLabel | BCLabel JumpLabel | BCJumpSR ArgWidth JumpLabel
	| BCReturn ReturnWidth ArgWidth | BCTailcall ArgWidth ArgWidth JumpLabel
	//Arguments
	| BCArgs ArgWidth ArgWidth
	//Task node creation and refinement
	| BCMkTask BCTaskType | BCTuneRateMs | BCTuneRateSec
	//Stack ops
	| BCPush String255 | BCPop Num | BCRot Depth Num | BCDup | BCPushPtrs
	//Casting
	| BCItoR | BCItoL | BCRtoI | ...
	//Arith
	| BCAddI | BCSubI | ...
	...

:: BCTaskType
	= BCStableNode ArgWidth | BCUnstableNode ArgWidth
	//Pin io
	| BCReadD | BCWriteD | BCReadA | BCWriteA | BCPinMode
	//Interrupts
	| BCInterrupt
	//Repeat
	| BCRepeat
	//Delay
	| BCDelay | BCDelayUntil
	//Parallel
	| BCTAnd | BCTOr
	//Step
	| BCStep ArgWidth JumpLabel
	//Sds ops
	| BCSdsGet SdsId | BCSdsSet SdsId | BCSdsUpd SdsId JumpLabel
	//Rate limiter
	| BCRateLimit
	//Peripherals
	//DHT
	| BCDHTTemp UInt8 | BCDHTHumid UInt8
	...
\end{lstClean}


\subsection{Compiler infrastructure}
The byte code compiler interpretation of the \gls{MTASK} language is implemented as a monad stack containing a writer monad and a state monad.
The writer monad is used to generate code snippets locally without having to store them in the monadic values.
The state monad accumulates the code and stores the stateful data the compiler requires.
\Cref{lst:compiler_state} shows the data type for the state, storing:
the function the compiler currently is in;
the code of the main expression;
the context (see \cref{ssec:step});
the code for the functions;
the next fresh label;
a list of all the used \glspl{SDS}, either local \glspl{SDS} containing the initial value (\cleaninline{Left}) or lifted \glspl{SDS} (see \cref{sec:liftsds}) containing a reference to the associated \gls{ITASK} \gls{SDS};
and finally a list of the peripherals used.

\begin{lstClean}[label={lst:compiler_state},caption={The type for the \gls{MTASK} byte code compiler.}]
:: BCInterpret a :== StateT BCState (WriterT [BCInstr] Identity) a
:: BCState =
	{ bcs_infun      :: JumpLabel
	, bcs_mainexpr   :: [BCInstr]
	, bcs_context    :: [BCInstr]
	, bcs_functions  :: Map JumpLabel BCFunction
	, bcs_freshlabel :: JumpLabel
	, bcs_sdses      :: [Either String255 MTLens]
	, bcs_hardware   :: [BCPeripheral]
	}
:: BCFunction =
	{ bcf_instructions :: [BCInstr]
	, bcf_argwidth     :: UInt8
	, bcf_returnwidth  :: UInt8
	}
\end{lstClean}

Executing the compiler is done by providing an initial state.
After compilation, several post-processing steps are applied to make the code suitable for the microprocessor.
First, all \cleaninline{BCReturn} instructions in tail positions are replaced by \cleaninline{BCTailcall} instructions to implement tail call elimination.
Furthermore, all byte code is concatenated, resulting in one big program.
Many instructions have commonly used arguments, so shorthands are introduced to reduce the program size.
For example, the \cleaninline{BCArg} instruction is often called with arguments \numrange{0}{2} and can be replaced by the \cleaninline{BCArg0}--\cleaninline{BCArg2} shorthands.
Furthermore, redundant instructions (e.g.\ a pop directly after a push) are removed as well, in order not to burden the code generation with these intricacies.
Finally, the labels are resolved to represent actual program addresses instead of freshly generated identifiers.
After the byte code is ready, the lifted \glspl{SDS} are resolved to provide an initial value for them.
The byte code, \gls{SDS} specification and peripheral specifications together form the result of the process, ready to be sent to the device.

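The shorthand substitution can be illustrated with a small peephole pass over a flat byte array. The opcode values below are invented for the sketch and do not correspond to the real \gls{MTASK} instruction encoding; a \cleaninline{BCArg}-style opcode with a small argument is folded into a one-byte shorthand:

```c
#include <stddef.h>
#include <stdint.h>

/* Invented opcodes for the illustration; OP_ARG takes one argument byte. */
enum { OP_ARG = 0x10, OP_ARG0 = 0x11, OP_ARG1 = 0x12, OP_ARG2 = 0x13 };

/* Rewrites `code` (len bytes) in place, folding OP_ARG n (n <= 2) into
 * the shorthand OP_ARGn; returns the new length. */
static size_t shorten_args(uint8_t *code, size_t len)
{
    size_t out = 0;
    for (size_t in = 0; in < len; ) {
        if (code[in] == OP_ARG && in + 1 < len && code[in + 1] <= 2) {
            code[out++] = (uint8_t)(OP_ARG0 + code[in + 1]);
            in += 2;                       /* argument byte is dropped  */
        } else if (code[in] == OP_ARG && in + 1 < len) {
            code[out++] = code[in++];      /* keep opcode and argument  */
            code[out++] = code[in++];
        } else {
            code[out++] = code[in++];      /* any other single byte     */
        }
    }
    return out;
}
```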
\section{Compilation rules}
This section describes the compilation rules, i.e.\ the translation from abstract syntax to byte code.
The compilation scheme consists of three schemes\slash{}functions.
When something is surrounded by double vertical bars, e.g.\ $\stacksize{a_i}$, it denotes the number of stack cells required to store it.

Some schemes have a \emph{context} $r$ as an argument which contains information about the location of the arguments in scope.
More information is given in the schemes requiring such arguments.

\newcommand{\cschemeE}[2]{\mathcal{E}\llbracket#1\rrbracket~#2}
\newcommand{\cschemeF}[1]{\mathcal{F}\llbracket#1\rrbracket}
\newcommand{\cschemeS}[3]{\mathcal{S}\llbracket#1\rrbracket~#2~#3}
\begin{table}
\centering
\begin{tabularx}{\linewidth}{l X}
\toprule
Scheme & Description\\
\midrule
$\cschemeE{e}{r}$ & Produces the value of expression $e$ given the context $r$ and pushes it on the stack.
The result can be a basic value or a pointer to a task.\\
$\cschemeF{e}$ & Generates the byte code for functions.\\
$\cschemeS{e}{r}{w} $ & Generates the function for the step continuation given the context $r$ and the width $w$ of the left-hand side task value.\\
\bottomrule
\end{tabularx}
\end{table}

\subsection{Expressions}
Almost all expression constructs are compiled using $\mathcal{E}$.
The argument of $\mathcal{E}$ is the context (see \cref{ssec:functions}).
Values are always placed on the stack; tuples and other compound data types are unpacked.
Function calls, function arguments and tasks are also compiled using $\mathcal{E}$, but their compilation is explained later.

\begin{align*}
\cschemeE{\text{\cleaninline{lit}}~e}{r} & = \text{\cleaninline{BCPush (bytecode e)}};\\
\cschemeE{e_1\mathbin{\text{\cleaninline{+.}}}e_2}{r} & = \cschemeE{e_1}{r};
	\cschemeE{e_2}{r};
	\text{\cleaninline{BCAdd}};\\
{} & \text{\emph{Similar for other binary operators}}\\
\cschemeE{\text{\cleaninline{Not}}~e}{r} & =
	\cschemeE{e}{r};
	\text{\cleaninline{BCNot}};\\
{} & \text{\emph{Similar for other unary operators}}\\
\cschemeE{\text{\cleaninline{If}}~e_1~e_2~e_3}{r} & =
	\cschemeE{e_1}{r};
	\text{\cleaninline{BCJmpF}}\enskip l_{else};
	\cschemeE{e_2}{r}; \text{\cleaninline{BCJmp}}\enskip l_{endif};\\
{} & \mathbin{\phantom{=}} \text{\cleaninline{BCLabel}}\enskip l_{else}; \cschemeE{e_3}{r}; \text{\cleaninline{BCLabel}}\enskip l_{endif};\\
{} & \text{\emph{Where $l_{else}$ and $l_{endif}$ are fresh labels}}\\
\cschemeE{\text{\cleaninline{tupl}}~e_1~e_2}{r} & =
	\cschemeE{e_1}{r};
	\cschemeE{e_2}{r};\\
{} & \text{\emph{Similar for other unboxed compound data types}}\\
\cschemeE{\text{\cleaninline{first}}~e}{r} & =
	\cschemeE{e}{r};
	\text{\cleaninline{BCPop}}\enskip w;\\
{} & \text{\emph{Where $w$ is the width of the left value and}}\\
{} & \text{\emph{similar for other unboxed compound data types}}\\
\cschemeE{\text{\cleaninline{second}}\enskip e}{r} & =
	\cschemeE{e}{r};
	\text{\cleaninline{BCRot}}\enskip w_1\enskip (w_1+w_2);
	\text{\cleaninline{BCPop}}\enskip w_2;\\
{} & \text{\emph{Where $w_1$ is the width of the left value and $w_2$ that of the right value;}}\\
{} & \text{\emph{similar for other unboxed compound data types}}\\
\end{align*}


Translating $\mathcal{E}$ to \gls{CLEAN} code is very straightforward: it basically means executing the monad.
Almost always, the type of the interpretation is not used, i.e.\ it is a phantom type.
To still have the functions return the correct type, the \cleaninline{tell`}\footnote{\cleaninline{tell` :: [BCInstr] -> BCInterpret a}} helper is used.
This function is similar to the writer monad's \cleaninline{tell} function but is cast to the correct type.
\Cref{lst:imp_arith} shows the implementation for the arithmetic and conditional expressions.
Note that $r$, the context, is not an explicit argument but is stored in the state.
\begin{lstClean}[caption={Interpretation implementation for the arithmetic and conditional classes.},label={lst:imp_arith}]
instance expr BCInterpret where
	lit t = tell` [BCPush (toByteCode{|*|} t)]
	(+.) a b = a >>| b >>| tell` [BCAdd]
	...
	If c t e = freshlabel >>= \elselabel->freshlabel >>= \endiflabel->
		c >>| tell` [BCJumpF elselabel] >>|
		t >>| tell` [BCJump endiflabel,BCLabel elselabel] >>|
		e >>| tell` [BCLabel endiflabel]
\end{lstClean}

\subsection{Functions}\label{ssec:functions}
Compiling functions occurs in $\mathcal{F}$, which generates byte code for the complete program by iterating over the functions and ending with the main expression.
When compiling the body of a function, the arguments of the function are added to the context so that their addresses can be determined when they are referenced.
The main expression is a special case of $\mathcal{F}$ since it neither has arguments nor a continuation.
Therefore, it is just compiled using $\mathcal{E}$.

\begin{align*}
\cschemeF{main=m} & =
	\cschemeE{m}{[]};\\
\cschemeF{f~a_0 \ldots a_n = b~\text{\cleaninline{In}}~m} & =
	\text{\cleaninline{BCLabel}}~f; \cschemeE{b}{[\langle f, i\rangle, i\in \{(\Sigma^n_{i=0}\stacksize{a_i})..0\}]};\\
{} & \mathbin{\phantom{=}} \text{\cleaninline{BCReturn}}~\stacksize{b}~n; \cschemeF{m};\\
\end{align*}

A function call starts by pushing the stack and frame pointer and making space for the program counter (\cref{lst:funcall_pushptrs}), followed by evaluating the arguments in reverse order (\cref{lst:funcall_args}).
On executing \cleaninline{BCJumpSR}, the program counter is set and the interpreter jumps to the function (\cref{lst:funcall_jumpsr}).
When the function returns, the return value overwrites the old pointers and the arguments.
This occurs right after a \cleaninline{BCReturn} (\cref{lst:funcall_ret}).
Putting the arguments on top of the pointers and not reserving space for the return value uses little space and facilitates tail call optimisation.

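The pointer bookkeeping of this calling convention can be sketched on a flat value stack as follows. The layout is simplified to one value per stack slot, and the names are illustrative rather than the actual interpreter code; the return value overwrites the saved pointers and arguments, exactly as described above:

```c
#include <stdint.h>

/* Simplified sketch of the interpreter's calling convention. */
#define STACK_SIZE 64
static int32_t stack[STACK_SIZE];
static int sp = 0;  /* next free slot */
static int fp = 0;  /* frame pointer  */

/* BCPushPtrs: save the frame pointer and the return address, then let
 * the arguments be pushed on top. */
static void bc_push_ptrs(int return_addr)
{
    stack[sp++] = fp;           /* saved frame pointer   */
    stack[sp++] = return_addr;  /* saved program counter */
    fp = sp;                    /* arguments start here  */
}

/* BCReturn for a one-cell return value: drop `argwidth` argument cells
 * and the two saved cells, push the value, resume at the saved counter. */
static int bc_return(int32_t value, int argwidth)
{
    int ret_pc = (int)stack[fp - 1];
    fp = (int)stack[fp - 2];
    sp = sp - argwidth - 2;     /* pop args and saved pointers  */
    stack[sp++] = value;        /* overwrite with return value  */
    return ret_pc;
}
```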
\begin{figure}
\begin{subfigure}{.24\linewidth}
	\centering
	\includestandalone{memory1}
	\caption{\cleaninline{BCPushPtrs}}\label{lst:funcall_pushptrs}
\end{subfigure}
\begin{subfigure}{.24\linewidth}
	\centering
	\includestandalone{memory2}
	\caption{Arguments}\label{lst:funcall_args}
\end{subfigure}
\begin{subfigure}{.24\linewidth}
	\centering
	\includestandalone{memory3}
	\caption{\cleaninline{BCJumpSR}}\label{lst:funcall_jumpsr}
\end{subfigure}
\begin{subfigure}{.24\linewidth}
	\centering
	\includestandalone{memory4}
	\caption{\cleaninline{BCReturn}}\label{lst:funcall_ret}
\end{subfigure}
\caption{The stack layout during function calls.}%
\end{figure}

Calling a function and referencing function arguments are extensions to $\mathcal{E}$, as shown below.
Arguments may be at different places on the stack at different times (see \cref{ssec:step}) and therefore the exact location always has to be determined from the context using \cleaninline{findarg}\footnote{\cleaninline{findarg [l`:r] l = if (l == l`) 0 (1 + findarg r l)}}.
Compiling argument $a_{f^i}$, the $i$th argument of function $f$, consists of traversing all positions in the current context.
Arguments wider than one stack cell are fetched in reverse to preserve the order.

\begin{align*}
\cschemeE{f(a_0, \ldots, a_n)}{r} & =
	\text{\cleaninline{BCPushPtrs}}; \cschemeE{a_n}{r}; \cschemeE{a_{\ldots}}{r}; \cschemeE{a_0}{r}; \text{\cleaninline{BCJumpSR}}~n~f;\\
\cschemeE{a_{f^i}}{r} & =
	\text{\cleaninline{BCArg}~findarg}(r, f, i)~\text{for all}~i\in\{w\ldots v\};\\
{} & v = \Sigma^{i-1}_{j=0}\stacksize{a_{f^j}}~\text{ and }~ w = v + \stacksize{a_{f^i}}\\
\end{align*}

Translating the compilation schemes for functions to \gls{CLEAN} is not as straightforward as for the other schemes due to the nature of the shallow embedding.
The \cleaninline{fun} class has a single function with a single argument.
This argument is a \gls{CLEAN} function that---when given a callable \gls{CLEAN} function representing the \gls{MTASK} function---will produce \cleaninline{main} and a callable function.
To compile this, the argument must be called with a function representing a function call in \gls{MTASK}.
\Cref{lst:fun_imp} shows the implementation for this as \gls{CLEAN} code.
To uniquely identify the function, a fresh label is generated.
The function is then called with the \cleaninline{callFunction} helper function that generates the instructions corresponding to calling the function.
That is, it pushes the pointers, compiles the arguments, and writes the \cleaninline{JumpSR} instruction.
The resulting structure (\cleaninline{g In m}) contains a function representing the \gls{MTASK} function (\cleaninline{g}) and the \cleaninline{main} structure to continue with.
To get the actual function, \cleaninline{g} must be called with representations for the arguments, i.e.\ using \cleaninline{findarg} for all arguments.
The arguments are added to the context and \cleaninline{liftFunction} is called with the label, the argument width and the compiler.
This function executes the compiler, decorates the instructions with a label and places them in the function dictionary together with metadata such as the argument width.
After lifting the function, the context is cleared again and compilation continues with the rest of the program.

\begin{lstClean}[label={lst:fun_imp},caption={The backend implementation for functions.}]
instance fun (BCInterpret a) BCInterpret | type a where
	fun def = {main=freshlabel >>= \funlabel->
		let (g In m) = def \a->callFunction funlabel (toByteWidth a) [a]
		    argwidth = toByteWidth (argOf g)
		in  addToCtx funlabel zero argwidth
		>>| infun funlabel
			(liftFunction funlabel argwidth
				(g (retrieveArgs funlabel zero argwidth)
				) ?None)
		>>| clearCtx >>| m.main
	}

argOf :: ((m a) -> b) a -> UInt8 | toByteWidth a
callFunction :: JumpLabel UInt8 [BCInterpret b] -> BCInterpret c | ...
liftFunction :: JumpLabel UInt8 (BCInterpret a) (?UInt8) -> BCInterpret ()
\end{lstClean}

\subsection{Tasks}\label{ssec:scheme_tasks}
Task trees are created with the \cleaninline{BCMkTask} instruction, which allocates a node and pushes a pointer to it on the stack.
It pops arguments from the stack according to the given task type.
The following extension of $\mathcal{E}$ shows this compilation scheme (except for the step combinator, which is explained in \cref{ssec:step}).

\begin{align*}
\cschemeE{\text{\cleaninline{rtrn}}~e}{r} & =
	\cschemeE{e}{r};
	\text{\cleaninline{BCMkTask BCStable}}_{\stacksize{e}};\\
\cschemeE{\text{\cleaninline{unstable}}~e}{r} & =
	\cschemeE{e}{r};
	\text{\cleaninline{BCMkTask BCUnstable}}_{\stacksize{e}};\\
\cschemeE{\text{\cleaninline{readA}}~e}{r} & =
	\cschemeE{e}{r};
	\text{\cleaninline{BCMkTask BCReadA}};\\
\cschemeE{\text{\cleaninline{writeA}}~e_1~e_2}{r} & =
	\cschemeE{e_1}{r};
	\cschemeE{e_2}{r};
	\text{\cleaninline{BCMkTask BCWriteA}};\\
\cschemeE{\text{\cleaninline{readD}}~e}{r} & =
	\cschemeE{e}{r};
	\text{\cleaninline{BCMkTask BCReadD}};\\
\cschemeE{\text{\cleaninline{writeD}}~e_1~e_2}{r} & =
	\cschemeE{e_1}{r};
	\cschemeE{e_2}{r};
	\text{\cleaninline{BCMkTask BCWriteD}};\\
\cschemeE{\text{\cleaninline{delay}}~e}{r} & =
	\cschemeE{e}{r};
	\text{\cleaninline{BCMkTask BCDelay}};\\
\cschemeE{\text{\cleaninline{rpeat}}~e}{r} & =
	\cschemeE{e}{r};
	\text{\cleaninline{BCMkTask BCRepeat}};\\
\cschemeE{e_1\text{\cleaninline{.\|\|.}}e_2}{r} & =
	\cschemeE{e_1}{r};
	\cschemeE{e_2}{r};
	\text{\cleaninline{BCMkTask BCOr}};\\
\cschemeE{e_1\text{\cleaninline{.&&.}}e_2}{r} & =
	\cschemeE{e_1}{r};
	\cschemeE{e_2}{r};
	\text{\cleaninline{BCMkTask BCAnd}};\\
\end{align*}

This translates to \gls{CLEAN} code simply by writing the correct \cleaninline{BCMkTask} instruction, as exemplified in \cref{lst:imp_ret}.

\begin{lstClean}[caption={The backend implementation for \cleaninline{rtrn}.},label={lst:imp_ret}]
instance rtrn BCInterpret
where
	rtrn m = m >>| tell` [BCMkTask (bcstable m)]
\end{lstClean}

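On the device side, the effect of \cleaninline{BCMkTask} for a binary combinator can be sketched as follows: two task tree pointers are popped and a freshly allocated node pointing at them is pushed. The node layout, pool allocator, and the \cleaninline{tasktype} value are simplifications for the illustration, not the actual \gls{RTS} structures:

```c
#include <stddef.h>

/* Illustrative task tree node with a tag and two children. */
struct tree_node {
    int tasktype;                    /* e.g. AND, OR, STABLE, ...  */
    struct tree_node *left, *right;  /* children, NULL for leaves  */
};

#define NODE_POOL 16
static struct tree_node pool[NODE_POOL];
static int next_node = 0;

static struct tree_node *stack_[8];  /* value stack of tree pointers */
static int sp_ = 0;

/* Sketch of BCMkTask BCTAnd: pop two subtrees, push the new node. */
static void mk_task_and(void)
{
    struct tree_node *n = &pool[next_node++];
    n->tasktype = 1;                 /* AND; value is illustrative  */
    n->right = stack_[--sp_];        /* popped in reverse order     */
    n->left  = stack_[--sp_];
    stack_[sp_++] = n;               /* push pointer to new node    */
}
```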
\subsection{Sequential combinator}\label{ssec:step}
The \cleaninline{step} construct is a special type of task because the task value of the left-hand side may change over time.
Therefore, the continuation tasks on the right-hand side \emph{observe} this task value and act upon it.
In the compilation scheme, all continuations are first converted to a single function that has two arguments: the stability of the task and its value.
This function either returns a pointer to a task tree or fails (denoted by $\bot$).
It is special because in the generated function the task value of a task can actually be inspected.
Furthermore, it is a lazy node in the task tree: the right-hand side may yield a new task tree after several rewrite steps (i.e.\ it is allowed to create infinite task trees using step combinators).
The function is generated using the $\mathcal{S}$ scheme, which requires two arguments: the context $r$ and the width of the left-hand side, so that it can determine the position of the stability flag, which is added as an argument to the function.
The resulting function is basically a list of if-then-else constructions that check all predicates one by one.
Some optimisation is possible here but has currently not been implemented.

\begin{align*}
\cschemeE{t_1\text{\cleaninline{>>*.}}t_2}{r} & =
	\cschemeE{a_{f^i}}{r}, \langle f, i\rangle\in r;
	\text{\cleaninline{BCMkTask}}~\text{\cleaninline{BCStable}}_{\stacksize{r}}; \cschemeE{t_1}{r};\\
{} & \mathbin{\phantom{=}} \text{\cleaninline{BCMkTask}}~\text{\cleaninline{BCAnd}}; \text{\cleaninline{BCMkTask}}~(\text{\cleaninline{BCStep}}~(\cschemeS{t_2}{(r + [\langle l_s, i\rangle])}{\stacksize{t_1}}));\\
\end{align*}

\begin{align*}
\cschemeS{[]}{r}{w} & =
	\text{\cleaninline{BCPush}}~\bot;\\
\cschemeS{\text{\cleaninline{IfValue}}~f~t:cs}{r}{w} & =
	\text{\cleaninline{BCArg}} (\stacksize{r} + w);
	\text{\cleaninline{BCIsNoValue}};\\
{} & \mathbin{\phantom{=}} \cschemeE{f}{r};
	\text{\cleaninline{BCAnd}};\\
{} & \mathbin{\phantom{=}} \text{\cleaninline{BCJmpF}}~l_1;\\
{} & \mathbin{\phantom{=}} \cschemeE{t}{r};
	\text{\cleaninline{BCJmp}}~l_2;\\
{} & \mathbin{\phantom{=}} \text{\cleaninline{BCLabel}}~l_1;
	\cschemeS{cs}{r}{w};\\
{} & \mathbin{\phantom{=}} \text{\cleaninline{BCLabel}}~l_2;\\
{} & \text{\emph{Where $l_1$ and $l_2$ are fresh labels}}\\
{} & \text{\emph{Similar for \cleaninline{IfStable} and \cleaninline{IfUnstable}}}\\
\end{align*}

527 First the context is evaluated.
528 The context contains arguments from functions and steps that need to be preserved after rewriting.
529 The evaluated context is combined with the left-hand side task value by means of a \cleaninline{.&&.} combinator to store it in the task tree so that it is available after a rewrite.
530 This means that the task tree is be transformed as follows:
531
532 \begin{lstClean}
533 t1 >>= \v1->t2 >>= \v2->t3 >>= ...
534 //is transformed to
535 t1 >>= \v1->rtrn v1 .&&. t2 >>= \v2->rtrn (v1, v2) .&&. t3 >>= ...
536 \end{lstClean}
537
538 The translation to \gls{CLEAN} is given in \cref{lst:imp_seq}.
539
\begin{lstClean}[caption={Backend implementation for the step class.},label={lst:imp_seq}]
instance step BCInterpret where
    (>>*.) lhs cont
        //Fetch a fresh label and fetch the context
        = freshlabel >>= \funlab->gets (\s->s.bcs_context)
        //Generate code for lhs
        >>= \ctx->lhs
        //Possibly add the context
        >>| tell` (if (ctx =: []) []
            //The context is just the arguments up till now in reverse
            (  [BCArg (UInt8 i)\\i<-reverse (indexList ctx)]
            ++ map BCMkTask (bcstable (UInt8 (length ctx)))
            ++ [BCMkTask BCTAnd]
            ))
        //Increase the context
        >>| addToCtx funlab zero lhswidth
        //Lift the step function
        >>| liftFunction funlab
            //Width of the arguments is the width of the lhs plus the
            //stability plus the context
            (one + lhswidth + (UInt8 (length ctx)))
            //Body label ctx width continuations
            (toContFun funlab (UInt8 (length ctx)))
            //Return width (always 1, a task pointer)
            (Just one)
        >>| modify (\s->{s & bcs_context=ctx})
        >>| tell` [BCMkTask (instr rhswidth funlab)]

toContFun :: JumpLabel UInt8 -> BCInterpret a
toContFun steplabel contextwidth
    = foldr tcf (tell` [BCPush fail]) cont
where
    tcf (IfStable f t)
        = If ((stability >>| tell` [BCIsStable]) &. f val)
            (t val >>| tell` [])
            ...
    stability = tell` [BCArg (lhswidth + contextwidth)]
    val = retrieveArgs steplabel zero lhswidth
\end{lstClean}

\subsection{\texorpdfstring{\Glspl{SDS}}{Shared data sources}}
The compilation scheme for \gls{SDS} definitions is a trivial extension to $\mathcal{F}$ since no code is generated, as can be seen below.

\begin{align*}
\cschemeF{\text{\cleaninline{sds}}~x=i~\text{\cleaninline{In}}~m} & =
\cschemeF{m};\\
\end{align*}

The \gls{SDS} access tasks have a compilation scheme similar to other tasks (see \cref{ssec:scheme_tasks}).
The \cleaninline{getSds} task just pushes a task tree node with the \gls{SDS} identifier embedded.
The \cleaninline{setSds} task evaluates the value, lifts that value to a task tree node and creates an \gls{SDS} set node.

\begin{align*}
\cschemeE{\text{\cleaninline{getSds}}~s}{r} & =
\text{\cleaninline{BCMkTask}} (\text{\cleaninline{BCSdsGet}} s);\\
\cschemeE{\text{\cleaninline{setSds}}~s~e}{r} & =
\cschemeE{e}{r};
\text{\cleaninline{BCMkTask BCStable}}_{\stacksize{e}};\\
{} & \mathbin{\phantom{=}} \text{\cleaninline{BCMkTask}} (\text{\cleaninline{BCSdsSet}} s);\\
\end{align*}

While no code is generated for the definition, the byte code compiler stores the \gls{SDS} data in the \cleaninline{bcs_sdses} field of the compilation state.
The \glspl{SDS} are typed as functions in the host language, so an argument for this function must be created that represents the \gls{SDS} on evaluation.
For this, a \cleaninline{BCInterpret} is created that emits this identifier.
When passing it to the function, the initial value of the \gls{SDS} is returned.
This initial value is stored as a byte code encoded value in the state and the compiler continues with the rest of the program.

Compiling \cleaninline{getSds} is a matter of executing the \cleaninline{BCInterpret} representing the \gls{SDS}, which yields the identifier that can be embedded in the instruction.
Setting the \gls{SDS} is similar: the identifier is retrieved, and the value is put in a task tree so that the resulting task can remember the value it has written.
Lifted \glspl{SDS} are compiled in a very similar way (see \cref{sec:liftsds}).

% VimTeX: SynIgnore on
\begin{lstClean}[caption={Backend implementation for the SDS classes.},label={lst:comp_sds}]
:: Sds a = Sds Int
instance sds BCInterpret where
    sds def = {main = freshsds >>= \sdsi->
        let sds = modify (\s->{s & bcs_sdses=put sdsi
                    (Left (toByteCode t)) s.bcs_sdses})
                >>| pure (Sds sdsi)
            (t In e) = def sds
        in e.main}
    getSds f = f >>= \(Sds i)-> tell` [BCMkTask (BCSdsGet (fromInt i))]
    setSds f v = f >>= \(Sds i)->v >>| tell`
        (  map BCMkTask (bcstable (byteWidth v))
        ++ [BCMkTask (BCSdsSet (fromInt i))])
\end{lstClean}
% VimTeX: SynIgnore off
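On the device, the embedded identifier indexes a store maintained by the \gls{RTS}. The following is a minimal sketch, assuming fixed-width 32-bit values and hypothetical names; the actual \gls{RTS} stores byte code encoded values of varying width and also publishes changes back to the server:

```c
#include <stdint.h>

/* Minimal sketch of an RTS-side SDS store.  The real mTask RTS stores
 * byte-code-encoded values of varying width; here every SDS is assumed
 * to hold a single 32-bit value. */
#define MAX_SDS 16

static int32_t sds_store[MAX_SDS];

/* BCSdsGet: fetch the current value for the embedded identifier. */
int32_t sds_get(uint8_t id) { return sds_store[id]; }

/* BCSdsSet: overwrite the value; the written value is also returned so
 * the resulting task can remember what it has written. */
int32_t sds_set(uint8_t id, int32_t v) { sds_store[id] = v; return v; }
```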

\section{\texorpdfstring{\Gls{C}}{C} code generation}\label{sec:ccodegen}
\todo[inline]{This is still very rough}
All communication between the \gls{ITASK} server and the \gls{MTASK} device is type-parametrised.
From the structural representation of the type, a \gls{CLEAN} parser and printer is constructed using generic programming.
Furthermore, a \ccpp{} parser and printer is generated for use on the \gls{MTASK} device.
The technique for generating the \ccpp{} parser and printer is very similar to template metaprogramming and requires a generic programming library or compiler support that includes a lot of metadata in the record and constructor nodes.
Using generic programming in the \gls{MTASK} system, both serialisation and deserialisation on the microcontroller and the server are automatically generated.

\subsection{Server}
On the server, off-the-shelf generic programming techniques are used to derive the serialisation and deserialisation functions (see \cref{lst:ser_deser_server}).
Serialisation is a simple conversion from a value of the type to a string.
Deserialisation is slightly different in order to support streaming\footnotemark{}.
\footnotetext{%
Here the \cleaninline{*!} variant of the generic interface is chosen that has fewer uniqueness constraints for the compiler-generated adaptors\citep{alimarine_generic_2005,hinze_derivable_2001}.%
}
Given a list of available characters, a tuple is always returned.
The right-hand side of the tuple contains the remaining characters, the unparsed input.
The left-hand side contains either an error or a maybe value.
If the value is a \cleaninline{?None}, there was no full value to parse.
If the value is a \cleaninline{?Just}, the data field contains a value of the requested type.

\begin{lstClean}[caption={Serialisation and deserialisation functions in \gls{CLEAN}.},label={lst:ser_deser_server}]
generic toByteCode a :: a -> String
generic fromByteCode a *! :: [Char] -> (Either String (? a), [Char])
\end{lstClean}
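The same streaming discipline can be illustrated with a hand-written \ccpp{} fragment (a hypothetical helper, not the generated code): when too few bytes are available, nothing is consumed and the caller simply retries once more input has arrived.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

/* Sketch of streaming deserialisation for a 16-bit value: given the
 * available input, either a complete value is parsed (true is returned
 * and *consumed advances past it) or the caller must wait for more
 * bytes (false is returned and nothing is consumed).  The remaining
 * input stays untouched for the next parse.  Big-endian byte order is
 * an assumption made for this sketch. */
bool parse_u16(const uint8_t *buf, size_t len, uint16_t *out,
               size_t *consumed)
{
    if (len < 2)
        return false;                       /* ?None: no full value yet */
    *out = (uint16_t)((buf[0] << 8) | buf[1]);
    *consumed += 2;                         /* ?Just: value complete    */
    return true;
}
```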

\subsection{Client}
The \gls{RTS} of the \gls{MTASK} system runs on resource-constrained microcontrollers and is implemented in portable \ccpp{}.
In order to still achieve some type safety, the communication between the server and the client is automated, i.e.\ the serialisation and deserialisation code in the \gls{RTS} is generated.
The technique used for this is very similar to the technique shown in \cref{chp:first-class_datatypes}.
However, instead of template metaprogramming, a feature \gls{CLEAN} lacks, generic programming is used, again as a two-stage rocket.
In contrast to many other generic programming systems, \gls{CLEAN} allows for access to much of the metadata of the compiler.
For example, the \cleaninline{Cons}, \cleaninline{Object}, \cleaninline{Field}, and \cleaninline{Record} generic constructors are enriched with their arity, names, types, \etc.
Furthermore, constructors can access the metadata of the objects and fields of their parent records.
Using this metadata, generic functions can be created that generate \ccpp{} type definitions, parsers and printers for any first-order \gls{CLEAN} type.
The exact details of this technique can be found later in a paper that is in preparation\todo{mention?}.

\Glspl{ADT} are converted to tagged unions, newtypes to typedefs, records to structs, and arrays to dynamically allocated size-parametrised arrays.
For example, the \gls{CLEAN} types in \cref{lst:ser_clean} are translated to the \ccpp{} types seen in \cref{lst:ser_c}.

\begin{lstClean}[caption={Simple \glspl{ADT} in \gls{CLEAN}.},label={lst:ser_clean}]
:: T a = A a | B NT {#Char}
:: NT =: NT Real
\end{lstClean}

\begin{lstArduino}[caption={Simple \glspl{ADT} in \ccpp{}.},label={lst:ser_c}]
typedef double Real;
typedef char Char;

typedef Real NT;
enum T_c {A_c, B_c};

struct Char_HshArray { uint32_t size; Char *elements; };
struct T {
    enum T_c cons;
    struct { void *A;
        struct { NT f0; struct Char_HshArray f1; } B;
    } data;
};
\end{lstArduino}
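For illustration, a \gls{CLEAN} value such as \cleaninline{B (NT 1.5) "ab"} maps onto these generated types as follows (a sketch with a hypothetical helper; in the \gls{RTS} such values are built by the generated parser through its allocation parameter):

```c
#include <stdint.h>

/* Generated types from the example above. */
typedef double Real;
typedef char Char;
typedef Real NT;
enum T_c {A_c, B_c};
struct Char_HshArray { uint32_t size; Char *elements; };
struct T {
    enum T_c cons;
    struct { void *A;
        struct { NT f0; struct Char_HshArray f1; } B;
    } data;
};

/* Build the Clean value B (NT 1.5) "ab" with the generated C types:
 * the constructor tag selects the B branch of the union and the fields
 * f0 and f1 hold the constructor arguments. */
struct T mk_example(void)
{
    static Char chars[] = {'a', 'b'};
    struct T v;
    v.cons = B_c;
    v.data.B.f0 = 1.5;
    v.data.B.f1.size = 2;
    v.data.B.f1.elements = chars;
    return v;
}
```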

For each of these generated types, two functions are created: a typed printer and a typed parser (see \cref{lst:ser_pp}).
The parser functions are parametrised by a read function, an allocation function and parse functions for all type variables.
This allows these functions to be used in environments where the communication is parametrised and the memory is managed manually, such as in the \gls{MTASK} \gls{RTS}.

\begin{lstArduino}[caption={Printer and parser for the \glspl{ADT} in \ccpp{}.},label={lst:ser_pp}]
struct T parse_T(uint8_t (*get)(), void *(*alloc)(size_t), void *(*parse_0)(uint8_t (*)(), void *(*)(size_t)));

void print_T(void (*put)(uint8_t), struct T r, void (*print_0)(void (*)(uint8_t), void *));
\end{lstArduino}
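To illustrate the printer interface, the following sketch shows a hypothetical printer for the newtype \cleaninline{NT} together with a \cleaninline{put} function that collects the emitted bytes in a buffer; in the \gls{RTS}, \cleaninline{put} would write to the communication channel instead. The big-endian byte order is an assumption of this sketch; the generated code fixes its own encoding.

```c
#include <stdint.h>
#include <string.h>

typedef double Real;
typedef Real NT;

/* Collect emitted bytes in a buffer so the printer can be exercised
 * without real communication hardware. */
static uint8_t outbuf[16];
static size_t out_len = 0;
static void buf_put(uint8_t b) { outbuf[out_len++] = b; }

/* Sketch of a printer for the newtype NT: a typedef carries no
 * constructor tag, so only the payload is emitted, byte by byte,
 * most significant byte first. */
void print_NT(void (*put)(uint8_t), NT r)
{
    uint64_t bits;
    memcpy(&bits, &r, sizeof bits);     /* reinterpret the double */
    for (int i = 7; i >= 0; i--)
        put((uint8_t)(bits >> (8 * i)));
}
```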
\todo[inline]{expand, but not too much}

\section{Conclusion}
It is not straightforward to execute \gls{MTASK} tasks on resource-constrained \gls{IOT} edge devices.
To achieve this, the terms in the \gls{DSL} are compiled to domain-specific byte code.
This byte code is sent for interpretation to the lightweight \gls{RTS} of the edge device.
First the expression is evaluated.
The result of this evaluation, a run time representation of the task, is a task tree.
This task tree is rewritten according to rewrite rules until a stable value is observed.
Rewriting multiple tasks at the same time is achieved by interleaving the rewrite steps, resulting in seemingly parallel execution of the tasks.
All communication is automated.

\todo[inline]{conclusion}

\input{subfilepostamble}
\end{document}