top/imp.tex

   1 \documentclass[../thesis.tex]{subfiles}
   2
   3 \input{subfilepreamble}
   4
   5 \setcounter{chapter}{4}
   6
   7 \begin{document}
   8 \input{subfileprefix}
   9 \chapter{The implementation of \texorpdfstring{\gls{MTASK}}{mTask}}%
  10 \label{chp:implementation}
  11 \begin{chapterabstract}
  12         This chapter shows the implementation of the \gls{MTASK} system by:
  13         \begin{itemize}
  14                 \item elaborating on the implementation and architecture of the \gls{RTS} of \gls{MTASK};
  15                 \item showing the implementation of the byte code compiler for the \gls{MTASK} language;
  16                 \item and explaining the machinery used to automatically serialise and deserialise data to-and-fro the device.
  17         \end{itemize}
  18 \end{chapterabstract}
  19
  20 The \gls{MTASK} system targets resource-constrained edge devices that have little memory, processor speed, and communication.
  21 Such edge devices are often powered by microcontrollers, tiny computers specifically designed for embedded applications.
  22 The microcontrollers usually have flash-based program memory which wears out fairly quickly.
  23 For example, the flash memory of the popular atmega328p powering the \gls{ARDUINO} UNO is rated for 10000 write cycles.
  24 While this sounds like a lot, if new tasks are sent to the device every minute or so, a lifetime of only seven days is guaranteed.
  25 Hence, for dynamic applications, storing the program in the \gls{RAM} of the device and interpreting this code is necessary in order to save precious write cycles of the program memory.
  26 In the \gls{MTASK} system, the \gls{MTASK} \gls{RTS}, a domain-specific \gls{OS}, is responsible for interpreting the programs.
  27
  28 \section{\texorpdfstring{\Glsxtrlong{RTS}}{Run time system}}
  29 The \gls{RTS} is a customisable domain-specific \gls{OS} that takes care of the execution of tasks.
  30 Furthermore, it also takes care of low-level mechanisms such as the communication, multitasking, and memory management.
  31 Once a device is programmed with the \gls{MTASK} \gls{RTS}, it can continuously receive new tasks without the need for reprogramming.
  32 The \gls{OS} is written in portable \ccpp{} and only contains a small device-specific portion.
  33 In order to keep the abstraction level high and the hardware requirements low, much of the high-level functionality of the \gls{MTASK} language is implemented not in terms of lower-level constructs from \gls{MTASK} language but in terms of \ccpp{} code.
  34
  35 Most microcontrollers software consists of a cyclic executive instead of an \gls{OS}, the \gls{RTS} of the \gls{MTASK} system is implemented as such also.
  36 It consists of a loop function with a relatively short execution time, similar to the one in \gls{ARDUINO}, that gets called repeatedly.
  37 The event loop consists of three distinct phases.
  38 After doing the three phases, the devices goes to sleep for as long as possible (see \cref{chp:green_computing_mtask} for more details on task scheduling).
  39
  40 \subsection{Communication phase}
  41 In the first phase, the communication channels are processed.
  42 The exact communication method is a customisable device-specific option baked into the \gls{RTS}.
  43 The interface is kept deliberately simple and consists of two layers: a link interface and a communication interface.
  44 Besides opening, closing and cleaning up, the link interface has three functions that are shown in \cref{lst:link_interface}.
  45 Consequently, implementing this link interface is very simple but it is still possible to implement more advanced link features such as buffering.
  46 There are implementations for this interface for serial or \gls{WIFI} connections using \gls{ARDUINO}, and \gls{TCP} connections for Linux.
  47
  48 \begin{lstArduino}[caption={Link interface of the \gls{MTASK} \gls{RTS}.},label={lst:link_interface}]
  49 bool    link_input_available(void);
  50 uint8_t link_read_byte(void);
  51 void    link_write_byte(uint8_t b);
  52 \end{lstArduino}
  53
  54 The communication interface abstracts away from this link interface and is typed instead.
  55 It contains only two functions as seen in \cref{lst:comm_interface}.
  56 There are implementations for direct communication, or communication using an \gls{MQTT} broker.
  57 Both use the automatic serialisation and deserialisation shown in \cref{sec:ccodegen}.
  58
  59 \begin{lstArduino}[caption={Communication interface of the \gls{MTASK} \gls{RTS}.},label={lst:comm_interface}]
  60 struct MTMessageTo receive_message(void);
  61 void send_message(struct MTMessageFro msg);
  62 \end{lstArduino}
  63
  64 Processing the received messages from the communication channels happens synchronously and the channels are exhausted completely before moving on to the next phase.
  65 There are several possible messages that can be received from the server:
  66
  67 \begin{description}
  68         \item[SpecRequest]
  69                 is a message instructing the device to send its specification and it is received immediately after connecting.
  70                 The \gls{RTS} responds with a \texttt{Spec} answer containing the specification.
  71         \item[TaskPrep]
  72                 tells the device a task is on its way.
  73                 Especially on faster connections, it may be the case that the communication buffers overflow because a big message is sent while the \gls{RTS} is busy executing tasks.
  74                 This message allows the \gls{RTS} to postpone execution for a while, until the larger task has been received.
  75                 The server sends the task only after the device acknowledged the preparation by by sending a \texttt{TaskPrepAck} message.
  76         \item[Task]
  77                 contains a new task, its peripheral configuration, the \glspl{SDS}, and the byte code.
  78                 The new task is immediately copied to the task storage but is only initialised during the next phase.
  79                 The device acknowledges the task by sending a \texttt{TaskAck} message.
  80         \item[SdsUpdate]
  81                 notifies the device of the new value for a lowered \gls{SDS}.
  82                 The old value of the lowered \gls{SDS} is immediately replaced with the new one.
  83                 There is no acknowledgement required.
  84         \item[TaskDel]
  85                 instructs the device to delete a running task.
  86                 Tasks are automatically deleted when they become stable.
  87                 However, a task may also be deleted when the surrounding task on the server is deleted, for example when the task is on the left-hand side of a step combinator and the condition to step holds.
  88                 The device acknowledges the deletion by sending a \texttt{TaskDelAck}.
  89         \item[Shutdown]
  90                 tells the device to reset.
  91 \end{description}
  92
  93 \subsection{Execution phase}
  94 The second phase performs one execution step for all tasks that wish for it.
  95 Tasks are ordered in a priority queue ordered by the time a task needs to execute, the \gls{RTS} selects all tasks that can be scheduled, see \cref{sec:scheduling} for more details.
  96 Execution of a task is always an interplay between the interpreter and the rewriter.
  97
  98 When a new task is received, the main expression is evaluated to produce a task tree.
  99 A task tree is a tree structure in which each node represents a task combinator and the leaves are basic tasks.
 100 If a task is not initialized yet, i.e.\ the pointer to the current task tree is still null, the byte code of the main function is interpreted.
 101 The main expression always produces a task tree.
 102 Execution of a task consists of continuously rewriting the task until its value is stable.
 103
 104 Rewriting is a destructive process, i.e.\ the rewriting is done in place.
 105 The rewriting engine uses the interpreter when needed, e.g.\ to calculate the step continuations.
 106 The rewriter and the interpreter use the same stack to store intermediate values.
 107 Rewriting steps are small so that interleaving results in seemingly parallel execution.
 108 In this phase new task tree nodes may be allocated.
 109 Both rewriting and initialization are atomic operations in the sense that no processing on \glspl{SDS} is done other than \gls{SDS} operations from the task itself.
 110 The host is notified if a task value is changed after a rewrite step by sending a \texttt{TaskReturn} message.
 111
 112 Take for example a blink task for which the code is shown in \cref{lst:blink_code}.
 113
 114 \begin{lstClean}[caption={Code for a blink program.},label={lst:blink_code}]
 115 fun \blink=(\st->delay (lit 500) >>|. writeD d3 st >>=. blink o Not)
 116 In {main = blink true}
 117 \end{lstClean}
 118
 119 On receiving this task, the task tree is still null and the initial expression \cleaninline{blink true} is evaluated by the interpreter.
 120 This results in the task tree shown in \cref{fig:blink_tree}.
 121 Rewriting always starts at the top of the tree and traverses to the leaves, the basic tasks that do the actual work.
 122 The first basic task encountered is the \cleaninline{delay} task, that yields no value until the time, \qty{500}{\ms} in this case, has passed.
 123 When the \cleaninline{delay} task yielded a stable value after a number of rewrites, the task continues with the right-hand side of the \cleaninline{>>\|.} combinator.
 124 This combinator has a \cleaninline{writeD} task at the left-hand side that becomes stable after one rewrite step in which it writes the value to the given pin.
 125 When \cleaninline{writeD} becomes stable, the written value is the task value that is observed by the right-hand side of the \cleaninline{>>=.} combinator.
 126 This will call the interpreter to evaluate the expression, now that the argument of the function is known.
 127 The result of the function is the original task tree again, but now with the state inversed.
 128
 129 \begin{figure}
 130         \centering
 131         \includestandalone{blinktree}
 132         \caption{The task tree for a blink task in \cref{lst:blink_code} in \gls{MTASK}.}%
 133         \label{fig:blink_tree}
 134 \end{figure}
 135
 136 \subsection{Memory management}
 137 The third and final phase is memory management.
 138 The \gls{MTASK} \gls{RTS} is designed to run on systems with as little as \qty{2}{\kibi\byte} of \gls{RAM}.
 139 Aggressive memory management is therefore vital.
 140 Not all firmwares for microprocessors support heaps and---when they do---allocation often leaves holes when not used in a \emph{last in first out} strategy.
 141 The \gls{RTS} uses a chunk of memory in the global data segment with its own memory manager tailored to the needs of \gls{MTASK}.
 142 The size of this block can be changed in the configuration of the \gls{RTS} if necessary.
 143 On an \gls{ARDUINO} UNO---equipped with \qty{2}{\kibi\byte} of \gls{RAM}---the maximum viable size is about \qty{1500}{\byte}.
 144 The self-managed memory uses a similar layout as the memory layout for \gls{C} programs only the heap and the stack are switched (see \cref{fig:memory_layout}).
 145
 146 \begin{figure}
 147         \centering
 148         \includestandalone{memorylayout}
 149         \caption{Memory layout in the \gls{MTASK} \gls{RTS}.}\label{fig:memory_layout}
 150 \end{figure}
 151
 152 A task is stored below the stack and its complete state is a \gls{CLEAN} record contain most importantly the task id, a pointer to the task tree in the heap (null if not initialised yet), the current task value, the configuration of \glspl{SDS}, the configuration of peripherals, the byte code and some scheduling information.
 153
 154 In memory, task data grows from the bottom up and the interpreter stack is located directly on top of it growing in the same direction.
 155 As a consequence, the stack moves when a new task is received.
 156 This never happens within execution because communication is always processed before execution.
 157 Values in the interpreter are always stored on the stack.
 158 Compound data types are stored unboxed and flattened.
 159 Task trees grow from the top down as in a heap.
 160 This approach allows for flexible ratios, i.e.\ many tasks and small trees or few tasks and big trees.
 161
 162 Stable tasks, and unreachable task tree nodes are removed.
 163 If a task is to be removed, tasks with higher memory addresses are moved down.
 164 For task trees---stored in the heap---the \gls{RTS} already marks tasks and task trees as trash during rewriting so the heap can be compacted in a single pass.
 165 This is possible because there is no sharing or cycles in task trees and nodes contain pointers pointers to their parent.
 166
 167 \section{Compiler}
 168 \subsection{Compiler infrastructure}
 169 The bytecode compiler interpretation for the \gls{MTASK} language is implemented as a monad stack containing a writer monad and a state monad.
 170 The writer monad is used to generate code snippets locally without having to store them in the monadic values.
 171 The state monad accumulates the code, and stores the state the compiler requires.
 172 \Cref{lst:compiler_state} shows the data type for the state, storing:
 173 function the compiler currently is in;
 174 code of the main expression;
 175 context (see \cref{ssec:step});
 176 code for the functions;
 177 next fresh label;
 178 a list of all the used \glspl{SDS}, either local \glspl{SDS} containing the initial value (\cleaninline{Left}) or lifted \glspl{SDS} (see \cref{sec:liftsds}) containing a reference to the associated \gls{ITASK} \gls{SDS};
 179 and finally there is a list of peripherals used.
 180
 181 \begin{lstClean}[label={lst:compiler_state},caption={The type for the \gls{MTASK} byte code compiler.}]
 182 :: BCInterpret a :== StateT BCState (WriterT [BCInstr] Identity) a
 183 :: BCState =
 184         { bcs_infun        :: JumpLabel
 185         , bcs_mainexpr     :: [BCInstr]
 186         , bcs_context      :: [BCInstr]
 187         , bcs_functions    :: Map JumpLabel BCFunction
 188         , bcs_freshlabel   :: JumpLabel
 189         , bcs_sdses        :: [Either String255 MTLens]
 190         , bcs_hardware     :: [BCPeripheral]
 191         }
 192 :: BCFunction =
 193         { bcf_instructions :: [BCInstr]
 194         , bcf_argwidth     :: UInt8
 195         , bcf_returnwidth  :: UInt8
 196         }
 197 \end{lstClean}
 198
 199 Executing the compiler is done by providing an initial state and running the monad.
 200 After compilation, several post-processing steps are applied to make the code suitable for the microprocessor.
 201 First, in all tail call \cleaninline{BCReturn} instructions are replaced by \cleaninline{BCTailCall} instructions to optimise the tail calls.
 202 Furthermore, all byte code is concatenated, resulting in one big program.
 203 Many instructions have commonly used arguments so shorthands are introduced to reduce the program size.
 204 For example, the \cleaninline{BCArg} instruction is often called with argument \numrange{0}{2} and can be replaced by the \numrange[parse-numbers=false]{\cleaninline{BCArg0}}{\cleaninline{BCArg2}} shorthands.
 205 Furthermore, redundant instructions such as pop directly after push are removed as well in order not to burden the code generation with these intricacies.
 206 Finally the labels are resolved to represent actual program addresses instead of the freshly generated identifiers.
 207 After the byte code is ready, the lifted \glspl{SDS} are resolved to provide an initial value for them.
 208 The byte code, \gls{SDS} specification and perpipheral specifications are the result of the process, ready to be sent to the device.
 209
 210 \subsection{Instruction set}
 211 The instruction set is a fairly standard stack machine instruction set extended with special \gls{TOP} instructions for creating task tree nodes.
 212 All instructions are housed in a \gls{CLEAN} \gls{ADT} and serialised to the byte representation using generic functions (see \cref{sec:ccodegen}).
 213 Type synonyms and newtypes are used to provide insight on the arguments of the instructions (\cref{lst:type_synonyms}).
 214 Labels are always two bytes long, all other arguments are one byte long.
 215
 216 \begin{lstClean}[caption={Type synonyms for instructions arguments.},label={lst:type_synonyms}]
 217 :: ArgWidth    :== UInt8         :: ReturnWidth :== UInt8
 218 :: Depth       :== UInt8         :: Num         :== UInt8
 219 :: SdsId       :== UInt8         :: JumpLabel   =: JL UInt16
 220 \end{lstClean}
 221
 222 \Cref{lst:instruction_type} shows an excerpt of the \gls{CLEAN} type that represents the instruction set.
 223 Shorthand instructions such as instructions with inlined arguments are omitted for brevity.
 224 Detailed semantics for the instructions are given in \cref{chp:bytecode_instruction_set}.
 225 One notable instruction is the \cleaninline{MkTask} instruction, it allocates and initialises a task tree node and pushes a pointer to it on the stack.
 226
 227 \begin{lstClean}[caption={The type housing the instruction set in \gls{MTASK}.},label={lst:instruction_type}]
 228 :: BCInstr
 229         //Jumps
 230         = BCJumpF JumpLabel | BCLabel JumpLabel | BCJumpSR ArgWidth JumpLabel
 231         | BCReturn ReturnWidth ArgWidth
 232         | BCTailcall ArgWidth ArgWidth JumpLabel
 233         //Arguments
 234         | BCArgs ArgWidth ArgWidth
 235         //Task node creation and refinement
 236         | BCMkTask BCTaskType | BCTuneRateMs | BCTuneRateSec
 237         //Stack ops
 238         | BCPush String255 | BCPop Num | BCRot Depth Num | BCDup | BCPushPtrs
 239         //Casting
 240         | BCItoR | BCItoL | BCRtoI | ...
 241         // arith
 242         | BCAddI | BCSubI | ...
 243         ...
 244
 245 :: BCTaskType
 246         = BCStableNode ArgWidth | BCUnstableNode ArgWidth
 247         // Pin io
 248         | BCReadD | BCWriteD | BCReadA | BCWriteA | BCPinMode
 249         // Interrupts
 250         | BCInterrupt
 251         // Repeat
 252         | BCRepeat
 253         // Delay
 254         | BCDelay | BCDelayUntil
 255         // Parallel
 256         | BCTAnd | BCTOr
 257         //Step
 258         | BCStep ArgWidth JumpLabel
 259         //Sds ops
 260         | BCSdsGet SdsId | BCSdsSet SdsId | BCSdsUpd SdsId JumpLabel
 261         // Rate limiter
 262         | BCRateLimit
 263         ////Peripherals
 264         //DHT
 265         | BCDHTTemp UInt8 | BCDHTHumid UInt8
 266         ...
 267 \end{lstClean}
 268
 269
 270 \section{Compilation rules}
 271 This section describes the compilation rules, the translation from \gls{AST} to byte code.
 272 The compilation scheme consists of three schemes\slash{}functions.
 273 Double vertical bars, e.g.\ $\stacksize{a_i}$, denote the number of stack cells required to store the argument.
 274
 275 Some schemes have a context $r$ as an argument which contains information about the location of the arguments in scope.
 276 More information is given in the schemes requiring such arguments.
 277
 278 \begin{table}
 279         \centering
 280         \caption{An overview of the compilation schemes.}
 281         \begin{tabularx}{\linewidth}{l X}
 282                 \toprule
 283                 Scheme & Description\\
 284                 \midrule
 285                 $\cschemeE{e}{r}$ & Produces the value of expression $e$ given the context $r$ and pushes it on the stack.
 286                         The result can be a basic value or a pointer to a task.\\
 287                 $\cschemeF{e}$ & Generates the bytecode for functions.\\
 288                 $\cschemeS{e}{r}{w} $ & Generates the function for the step continuation given the context $r$ and the width $w$ of the left-hand side task value.\\
 289                 \bottomrule
 290         \end{tabularx}
 291 \end{table}
 292
 293 \subsection{Expressions}
 294 Almost all expression constructions are compiled using $\mathcal{E}$.
 295 The argument of $\mathcal{E}$ is the context (see \cref{ssec:functions}).
 296 Values are always placed on the stack; tuples and other compound data types are unpacked.
 297 Function calls, function arguments and tasks are also compiled using $\mathcal{E}$ but their compilations is explained later.
 298
 299 \begin{align*}
 300         \cschemeE{\text{\cleaninline{lit}}~e}{r} & = \text{\cleaninline{BCPush (bytecode e)}};\\
 301         \cschemeE{e_1\mathbin{\text{\cleaninline{+.}}}e_2}{r} & = \cschemeE{e_1}{r};
 302                         \cschemeE{e_2}{r};
 303                         \text{\cleaninline{BCAdd}};\\
 304                 {} & \text{\emph{Similar for other binary operators}}\\
 305         \cschemeE{\text{\cleaninline{Not}}~e}{r} & =
 306                         \cschemeE{e}{r};
 307                         \text{\cleaninline{BCNot}};\\
 308                 {} & \text{\emph{Similar for other unary operators}}\\
 309         \cschemeE{\text{\cleaninline{If}}~e_1~e_2~e_3}{r} & =
 310                         \cschemeE{e_1}{r};
 311                         \text{\cleaninline{BCJmpF}}\enskip l_{else}; \mathbin{\phantom{=}} \cschemeE{e_2}{r}; \text{\cleaninline{BCJmp}}\enskip l_{endif};\\
 312                 {} & \mathbin{\phantom{=}} \text{\cleaninline{BCLabel}}\enskip l_{else}; \cschemeE{e_3}{r}; \mathbin{\phantom{=}} \text{\cleaninline{BCLabel}}\enskip l_{endif};\\
 313                 {} & \text{\emph{Where $l_{else}$ and $l_{endif}$ are fresh labels}}\\
 314         \cschemeE{\text{\cleaninline{tupl}}~e_1~e_2}{r} & =
 315                         \cschemeE{e_1}{r};
 316                         \cschemeE{e_2}{r};\\
 317                 {} & \text{\emph{Similar for other unboxed compound data types}}\\
 318         \cschemeE{\text{\cleaninline{first}}~e}{r} & =
 319                         \cschemeE{e}{r};
 320                         \text{\cleaninline{BCPop}}\enskip w;\\
 321                 {} & \text{\emph{Where $w$ is the width of the right value and}}\\
 322                 {} & \text{\emph{similar for other unboxed compound data types}}\\
 323         \cschemeE{\text{\cleaninline{second}}\enskip e}{r} & =
 324                         \cschemeE{e}{r};
 325                         \text{\cleaninline{BCRot}}\enskip (w_l+w_r)\enskip w_r;
 326                         \text{\cleaninline{BCPop}}\enskip w_l;\\
 327                 {} & \text{\emph{Where $w_l$ is the width of the left and, $w_r$ of the right value}}\\
 328                 {} & \text{\emph{similar for other unboxed compound data types}}\\
 329 \end{align*}
 330
 331 Translating $\mathcal{E}$ to \gls{CLEAN} code is very straightforward, it basically means writing the instructions to the writer monad.
 332 Almost always, the type of the interpretation is not used, i.e.\ it is a phantom type.
 333 To still have the functions return the correct type, the \cleaninline{tell`}\footnote{\cleaninline{tell` :: [BCInstr] -> BCInterpret a}} helper is used.
 334 This function is similar to the writer monad's \cleaninline{tell} function but is casted to the correct type.
 335 \Cref{lst:imp_arith} shows the implementation for the arithmetic and conditional expressions.
 336 Note that $r$, the context, is not an explicit argument here but stored in the state.
 337
 338 \begin{lstClean}[caption={Interpretation implementation for the arithmetic and conditional functions.},label={lst:imp_arith}]
 339 instance expr BCInterpret where
 340         lit   t   = tell` [BCPush (toByteCode{|*|} t)]
 341         (+.)  a b = a >>| b >>| tell` [BCAdd]
 342         ...
 343         If c t e = freshlabel >>= \elselabel->freshlabel >>= \endiflabel->
 344                 c >>| tell` [BCJumpF elselabel] >>|
 345                 t >>| tell` [BCJump endiflabel,BCLabel elselabel] >>|
 346                 e >>| tell` [BCLabel endiflabel]
 347 \end{lstClean}
 348
 349 \subsection{Functions}\label{ssec:functions}
 350 Compiling functions and other top-level definitions is done using in $\mathcal{F}$, which generates bytecode for the complete program by iterating over the functions and ending with the main expression.
 351 When compiling the body of the function, the arguments of the function are added to the context so that the addresses can be determined when referencing arguments.
 352 The main expression is a special case of $\mathcal{F}$ since it neither has arguments nor something to continue.
 353 Therefore, it is just compiled using $\mathcal{E}$ with an empty context.
 354
 355 \begin{align*}
 356         \cschemeF{main=m} & =
 357                 \cschemeE{m}{[]};\\
 358         \cschemeF{f~a_0 \ldots a_n = b~\text{\cleaninline{In}}~m} & =
 359                 \text{\cleaninline{BCLabel}}~f; \cschemeE{b}{[\langle f, i\rangle, i\in \{(\Sigma^n_{i=0}\stacksize{a_i})..0\}]};\\
 360                 {} & \mathbin{\phantom{=}} \text{\cleaninline{BCReturn}}~\stacksize{b}~n; \cschemeF{m};\\
 361 \end{align*}
 362
 363 A function call starts by pushing the stack and frame pointer, and making space for the program counter (\cref{lst:funcall_pushptrs}) followed by evaluating the arguments in reverse order (\cref{lst:funcall_args}).
 364 On executing \cleaninline{BCJumpSR}, the program counter is set and the interpreter jumps to the function (\cref{lst:funcall_jumpsr}).
 365 When the function returns, the return value overwrites the old pointers and the arguments.
 366 This occurs right after a \cleaninline{BCReturn} (\cref{lst:funcall_ret}).
 367 Putting the arguments on top of pointers and not reserving space for the return value uses little space and facilitates tail call optimization.
 368
 369 \begin{figure}
 370         \begin{subfigure}{.24\linewidth}
 371                 \centering
 372                 \includestandalone{memory1}
 373                 \caption{\cleaninline{BCPushPtrs}.}\label{lst:funcall_pushptrs}
 374         \end{subfigure}
 375         \begin{subfigure}{.24\linewidth}
 376                 \centering
 377                 \includestandalone{memory2}
 378                 \caption{Arguments.}\label{lst:funcall_args}
 379         \end{subfigure}
 380         \begin{subfigure}{.24\linewidth}
 381                 \centering
 382                 \includestandalone{memory3}
 383                 \caption{\cleaninline{BCJumpSR}.}\label{lst:funcall_jumpsr}
 384         \end{subfigure}
 385         \begin{subfigure}{.24\linewidth}
 386                 \centering
 387                 \includestandalone{memory4}
 388                 \caption{\cleaninline{BCReturn}.}\label{lst:funcall_ret}
 389         \end{subfigure}
 390         \caption{The stack layout during function calls.}%
 391 \end{figure}
 392
 393 Calling a function and referencing function arguments are an extension to $\mathcal{E}$ as shown below.
 394 Arguments may be at different places on the stack at different times (see \cref{ssec:step}) and therefore the exact location always is be determined from the context using \cleaninline{findarg}\footnote{\cleaninline{findarg [l`:r] l = if (l == l`) 0 (1 + findarg r l)}}.
 395 Compiling argument $a_{f^i}$, the $i$th argument in function $f$, consists of traversing all positions in the current context.
 396 Arguments wider than one stack cell are fetched in reverse to reconstruct the original order.
 397
 398 \begin{align*}
 399         \cschemeE{f(a_0, \ldots, a_n)}{r} & =
 400                 \text{\cleaninline{BCPushPtrs}}; \cschemeE{a_i}{r}~\text{for all}~i\in\{n\ldots 0\}; \text{\cleaninline{BCJumpSR}}~n~f;\\
 401         \cschemeE{a_{f^i}}{r} & =
 402                 \text{\cleaninline{BCArg}~findarg}(r, f, i)~\text{for all}~i\in\{w\ldots v\};\\
 403                 {} & v = \Sigma^{i-1}_{j=0}\stacksize{a_{f^j}}~\text{ and }~ w = v + \stacksize{a_{f^i}}\\
 404 \end{align*}
 405
 406 Translating the compilation schemes for functions to \gls{CLEAN} is not as straightforward as other schemes due to the nature of shallow embedding in combination with the use of state.
 407 The \cleaninline{fun} class has a single function with a single argument.
 408 This argument is a \gls{CLEAN} function that---when given a callable \gls{CLEAN} function representing the \gls{MTASK} function---produces the \cleaninline{main} expression and a callable function.
 409 To compile this, the argument must be called with a function representing a function call in \gls{MTASK}.
 410 \Cref{lst:fun_imp} shows the implementation for this as \gls{CLEAN} code.
 411 To uniquely identify the function, a fresh label is generated.
 412 The function is then called with the \cleaninline{callFunction} helper function that generates the instructions that correspond to calling the function.
 413 That is, it pushes the pointers, compiles the arguments, and writes the \cleaninline{JumpSR} instruction.
 414 The resulting structure (\cleaninline{g In m}) contains a function representing the mTask function (\cleaninline{g}) and the \cleaninline{main} structure to continue with.
 415 To get the actual function, \cleaninline{g} must be called with representations for the argument, i.e.\ using \cleaninline{findarg} for all arguments.
 416 The arguments are added to the context using \cleaninline{infun} and \cleaninline{liftFunction} is called with the label, the argument width and the compiler.
 417 This function executes the compiler, decorates the instructions with a label and places them in the function dictionary together with the metadata such as the argument width.
 418 After lifting the function, the context is cleared again and compilation continues with the rest of the program.
 419
 420 \begin{lstClean}[label={lst:fun_imp},caption={The interpretation implementation for functions.}]
 421 instance fun (BCInterpret a) BCInterpret | type a where
 422         fun def = {main=freshlabel >>= \funlabel->
 423                 let (g In m) = def \a->callFunction funlabel (toByteWidth a) [a]
 424                     argwidth = toByteWidth (argOf g)
 425                 in  addToCtx funlabel zero argwidth
 426                 >>| infun funlabel
 427                         (liftFunction funlabel argwidth
 428                                 (g (retrieveArgs funlabel zero argwidth)
 429                                 ) ?None)
 430                 >>| clearCtx >>| m.main
 431                 }
 432
 433 argOf :: ((m a) -> b) a -> UInt8 | toByteWidth a
 434 callFunction :: JumpLabel UInt8 [BCInterpret b] -> BCInterpret c | ...
 435 liftFunction :: JumpLabel UInt8 (BCInterpret a) (?UInt8) -> BCInterpret ()
 436 infun :: JumpLabel (BCInterpret a) -> BCInterpret a
 437 \end{lstClean}
 438
 439 \subsection{Tasks}\label{ssec:scheme_tasks}
 440 Task trees are created with the \cleaninline{BCMkTask} instruction that allocates a node and pushes a pointer to it on the stack.
 441 It pops arguments from the stack according to the given task type.
 442 The following extension of $\mathcal{E}$ shows this compilation scheme (except for the step combinator, explained in \cref{ssec:step}).
 443
 444 \begin{align*}
 445         \cschemeE{\text{\cleaninline{rtrn}}~e}{r} & =
 446                         \cschemeE{e}{r};
 447                         \text{\cleaninline{BCMkTask BCStable}}_{\stacksize{e}};\\
 448         \cschemeE{\text{\cleaninline{unstable}}~e}{r} & =
 449                         \cschemeE{e}{r};
 450                         \text{\cleaninline{BCMkTask BCUnstable}}_{\stacksize{e}};\\
 451         \cschemeE{\text{\cleaninline{readA}}~e}{r} & =
 452                         \cschemeE{e}{r};
 453                         \text{\cleaninline{BCMkTask BCReadA}};\\
 454         \cschemeE{\text{\cleaninline{writeA}}~e_1~e_2}{r} & =
 455                         \cschemeE{e_1}{r};
 456                         \cschemeE{e_2}{r};
 457                         \text{\cleaninline{BCMkTask BCWriteA}};\\
 458         \cschemeE{\text{\cleaninline{readD}}~e}{r} & =
 459                         \cschemeE{e}{r};
 460                         \text{\cleaninline{BCMkTask BCReadD}};\\
 461         \cschemeE{\text{\cleaninline{writeD}}~e_1~e_2}{r} & =
 462                         \cschemeE{e_1}{r};
 463                         \cschemeE{e_2}{r};
 464                         \text{\cleaninline{BCMkTask BCWriteD}};\\
 465         \cschemeE{\text{\cleaninline{delay}}~e}{r} & =
 466                         \cschemeE{e}{r};
 467                         \text{\cleaninline{BCMkTask BCDelay}};\\
 468         \cschemeE{\text{\cleaninline{rpeat}}~e}{r} & =
 469                         \cschemeE{e}{r};
 470                         \text{\cleaninline{BCMkTask BCRepeat}};\\
 471         \cschemeE{e_1\text{\cleaninline{.\|\|.}}e_2}{r} & =
 472                         \cschemeE{e_1}{r};
 473                         \cschemeE{e_2}{r};
 474                         \text{\cleaninline{BCMkTask BCOr}};\\
 475         \cschemeE{e_1\text{\cleaninline{.&&.}}e_2}{r} & =
 476                         \cschemeE{e_1}{r};
 477                         \cschemeE{e_2}{r};
 478                         \text{\cleaninline{BCMkTask BCAnd}};\\
 479 \end{align*}
 480
 481 This translates to Clean code by writing the correct \cleaninline{BCMkTask} instruction as exemplified in \cref{lst:imp_ret}.
 482
 483 \begin{lstClean}[caption={The byte code interpretation implementation for \cleaninline{rtrn}.},label={lst:imp_ret}]
 484 instance rtrn BCInterpret
 485 where
 486         rtrn m = m >>| tell` [BCMkTask (bcstable m)]
 487 \end{lstClean}
 488
 489 \subsection{Sequential combinator}\label{ssec:step}
 490 The \cleaninline{step} construct is a special type of task because the task value of the left-hand side changes over time.
 491 Therefore, the task continuations on the right-hand side are \emph{observing} this task value and acting upon it.
 492 In the compilation scheme, all continuations are first converted to a single function that has two arguments: the stability of the task and its value.
 493 This function either returns a pointer to a task tree or fails (denoted by $\bot$).
 494 It is special because in the generated function, the task value of a task is inspected.
 495 Furthermore, it is a lazy node in the task tree: the right-hand side may yield a new task tree after several rewrite steps, i.e.\ it is allowed to create infinite task trees using step combinators.
 496 The function is generated using the $\mathcal{S}$ scheme that requires two arguments: the context $r$ and the width of the left-hand side so that it can determine the position of the stability which is added as an argument to the function.
 497 The resulting function is basically a list of if-then-else constructions to check all predicates one by one.
 498 Some optimization is possible here but has currently not been implemented.
 499
 500 \begin{align*}
 501         \cschemeE{t_1\text{\cleaninline{>>*.}}t_2}{r} & =
 502                 \cschemeE{a_{f^i}}{r}, \langle f, i\rangle\in r;
 503                 \text{\cleaninline{BCMkTask}}~\text{\cleaninline{BCStable}}_{\stacksize{r}}; \cschemeE{t_1}{r};\\
 504         {} & \mathbin{\phantom{=}} \text{\cleaninline{BCMkTask}}~\text{\cleaninline{BCAnd}}; \text{\cleaninline{BCMkTask}}~(\text{\cleaninline{BCStep}}~(\cschemeS{t_2}{(r + [\langle l_s, i\rangle])}{\stacksize{t_1}}));\\
 505 \end{align*}
 506
 507 \begin{align*}
 508         \cschemeS{[]}{r}{w} & =
 509                 \text{\cleaninline{BCPush}}~\bot;\\
 510         \cschemeS{\text{\cleaninline{IfValue}}~f~t:cs}{r}{w} & =
 511                 \text{\cleaninline{BCArg}} (\stacksize{r} + w);
 512                 \text{\cleaninline{BCIsNoValue}};\\
 513         {} & \mathbin{\phantom{=}} \cschemeE{f}{r};
 514                 \text{\cleaninline{BCAnd}};\\
 515         {} & \mathbin{\phantom{=}} \text{\cleaninline{BCJmpF}}~l_1;\\
 516         {} & \mathbin{\phantom{=}} \cschemeE{t}{r};
 517                 \text{\cleaninline{BCJmp}}~l_2;\\
 518         {} & \mathbin{\phantom{=}} \text{\cleaninline{BCLabel}}~l_1;
 519                 \cschemeS{cs}{r}{w};\\
 520         {} & \mathbin{\phantom{=}} \text{\cleaninline{BCLabel}}~l_2;\\
 521         {} & \text{\emph{Where $l_1$ and $l_2$ are fresh labels}}\\
 522         {} & \text{\emph{Similar for \cleaninline{IfStable} and \cleaninline{IfUnstable}}}\\
 523 \end{align*}
 524
 525 First the context is evaluated.
 526 The context contains arguments from functions and steps that need to be preserved after rewriting.
 527 The evaluated context is combined with the left-hand side task value by means of a \cleaninline{.&&.} combinator to store it in the task tree so that it is available after a rewrite.
 528 This means that the task tree is be transformed as follows as
 529
 530 \begin{figure}
 531         \begin{subfigure}{.5\textwidth}
 532                 \includestandalone{contexttree1}
 533
 534         \end{subfigure}
 535         \begin{subfigure}{.5\textwidth}
 536                 \includestandalone{contexttree2}
 537         \end{subfigure}
 538 \end{figure}
 539
 540 The translation to \gls{CLEAN} is given in \cref{lst:imp_seq}.
 541
 542 \begin{lstClean}[caption={Byte code compilation interpretation implementation for the step class.},label={lst:imp_seq}]
 543 instance step BCInterpret where
 544         (>>*.) lhs cont
 545                 //Fetch a fresh label and fetch the context
 546                 =   freshlabel >>= \funlab->gets (\s->s.bcs_context)
 547                 //Generate code for lhs
 548                 >>= \ctx->lhs
 549                 //Possibly add the context
 550                 >>| tell` (if (ctx =: []) []
 551                                 //The context is just the arguments up till now in reverse
 552                                 (  [BCArg (UInt8 i)\\i<-reverse (indexList ctx)]
 553                                 ++ map BCMkTask (bcstable (UInt8 (length ctx)))
 554                                 ++ [BCMkTask BCTAnd]
 555                                 ))
 556                 //Increase the context
 557                 >>| addToCtx funlab zero lhswidth
 558                 //Lift the step function
 559                 >>| liftFunction funlab
 560                                 //Width of the arguments is the width of the lhs plus the
 561                                 //stability plus the context
 562                                 (one + lhswidth + (UInt8 (length ctx)))
 563                                 //Body     label  ctx width            continuations
 564                                 (contfun funlab (UInt8 (length ctx)))
 565                                 //Return width (always 1, a task pointer)
 566                                 (Just one)
 567                 >>| modify (\s->{s & bcs_context=ctx})
 568                 >>| tell` [BCMkTask (instr rhswidth funlab)]
 569
 570 toContFun :: JumpLabel UInt8 -> BCInterpret a
 571 toContFun steplabel contextwidth
 572         = foldr tcf (tell` [BCPush fail]) cont
 573 where
 574         tcf (IfStable f t)
 575                 = If ((stability >>| tell` [BCIsStable]) &. f val)
 576                         (t val >>| tell` [])
 577         ...
 578         stability = tell` [BCArg (lhswidth + contextwidth)]
 579         val = retrieveArgs steplabel zero lhswidth
 580 \end{lstClean}
 581
 582 \subsection{\texorpdfstring{\Glspl{SDS}}{Shared data sources}}
 583 The compilation scheme for \gls{SDS} definitions is a trivial extension to $\mathcal{F}$ since there is no code generated as seen below.
 584
 585 \begin{align*}
 586         \cschemeF{\text{\cleaninline{sds}}~x=i~\text{\cleaninline{In}}~m} & =
 587                 \cschemeF{m};\\
 588 \end{align*}
 589
 590 The \gls{SDS} access tasks have a compilation scheme similar to other tasks (see \cref{ssec:scheme_tasks}).
 591 The \cleaninline{getSds} task just pushes a task tree node with the \gls{SDS} identifier embedded.
 592 The \cleaninline{setSds} task evaluates the value, lifts that value to a task tree node and creates \pgls{SDS} set node.
 593
 594 \begin{align*}
 595         \cschemeE{\text{\cleaninline{getSds}}~s}{r} & =
 596                 \text{\cleaninline{BCMkTask}} (\text{\cleaninline{BCSdsGet}} s);\\
 597         \cschemeE{\text{\cleaninline{setSds}}~s~e}{r} & =
 598                 \cschemeE{e}{r};
 599                 \text{\cleaninline{BCMkTask BCStable}}_{\stacksize{e}};\\
 600         {} & \mathbin{\phantom{=}} \text{\cleaninline{BCMkTask}} (\text{\cleaninline{BCSdsSet}} s);\\
 601 \end{align*}
 602
 603 While there is no code generated in the definition, the byte code compiler is storing the \gls{SDS} data in the \cleaninline{bcs_sdses} field in the compilation state.
 604 The \glspl{SDS} are typed as functions in the host language so an argument for this function must be created that represents the \gls{SDS} on evaluation.
 605 For this, an \cleaninline{BCInterpret} is created that emits this identifier.
 606 When passing it to the function, the initial value of the \gls{SDS} is returned.
 607 This initial value is stored as a byte code encoded value in the state and the compiler continues with the rest of the program.
 608
 609 First, the initial \gls{SDS} value is extracted from the expression by bootstrapping the fixed point with a dummy value.
 610 This is safe because the expression on the right-hand side of the \cleaninline{In} is never evaluated.
 611 Then, using \cleaninline{addSdsIfNotExist}, the identifier for this particular \gls{SDS} is either retrieved from the compiler state or generated freshly.
 612 This identifier is then used to provide a reference to the \cleaninline{def} definition to evaluate the main expression.
 613 Compiling \cleaninline{getSds} is a matter of executing the \cleaninline{BCInterpret} representing the \gls{SDS}, which yields the identifier that can be embedded in the instruction.
 614 Lifted SDSs are compiled in a very similar way \cref{sec:liftsds}.
 615 Setting the \gls{SDS} is similar: the identifier is retrieved and the value is written to put in a task tree so that the resulting task can remember the value it has written.
 616
 617 % VimTeX: SynIgnore on
 618 \begin{lstClean}[caption={Backend implementation for the SDS classes.},label={lst:comp_sds}]
 619 :: Sds a = Sds Int
 620 instance sds BCInterpret where
 621         sds def = {main =
 622                         let (t In e) = def (abort "sds: expression too strict")
 623                         in addSdsIfNotExist (Left $ String255 (toByteCode{|*|} t))
 624                                 >>= \sdsi-> let (t In e) = def (pure (Sds sdsi))
 625                                         in e.main
 626                 }
 627         getSds f   = f >>= \(Sds i)-> tell` [BCMkTask (BCSdsGet (fromInt i))]
 628         setSds f v = f >>= \(Sds i)->v >>| tell`
 629                 (  map BCMkTask (bcstable (byteWidth v))
 630                 ++ [BCMkTask (BCSdsSet (fromInt i))])
 631 \end{lstClean}
 632 % VimTeX: SynIgnore off
 633
 634 \section{\texorpdfstring{\Gls{C}}{C} code generation}\label{sec:ccodegen}
 635 All communication between the \gls{ITASK} server and the \gls{MTASK} server is type-parametrised.
 636 From the structural representation of the type, a \gls{CLEAN} parser and printer is constructed using generic programming.
 637 Furthermore, a \ccpp{} parser and printer is generated for use on the \gls{MTASK} device.
 638 The technique for generating the \ccpp{} parser and printer is very similar to template metaprogramming and requires a rich generic programming library or compiler support that includes a lot of metadata in the record and constructor nodes.
 639 Using generic programming in the \gls{MTASK} system, both serialisation and deserialisation on the microcontroller and and the server is automatically generated.
 640
 641 \subsection{Server}
 642 On the server, off-the-shelve generic programming techniques are used to make the serialisation and deserialisation functions (see \cref{lst:ser_deser_server}).
 643 Serialisation is a simple conversion from a value of the type to a string.
 644 Deserialisation is a little bit different in order to support streaming\footnotemark.
 645 \footnotetext{%
 646         Here the \cleaninline{*!} variant of the generic interface is chosen that has less uniqueness constraints for the compiler-generated adaptors \citep{alimarine_generic_2005,hinze_derivable_2001}.%
 647 }
 648 Given a list of available characters, a tuple is always returned.
 649 The right-hand side of the tuple contains the remaining characters, the unparsed input.
 650 The left-hand side contains either an error or a maybe value.
 651 If the value is a \cleaninline{?None}, there was no full value to parse.
 652 If the value is a \cleaninline{?Just}, the data field contains a value of the requested type.
 653
 654 \begin{lstClean}[caption={Serialisation and deserialisation functions in \gls{CLEAN}.},label={lst:ser_deser_server}]
 655 generic toByteCode   a    :: a -> String
 656 generic fromByteCode a *! :: [Char] -> (Either String (? a), [Char])
 657 \end{lstClean}
 658
 659 \subsection{Client}
 660 The \gls{RTS} of the \gls{MTASK} system runs on resource-constrained microcontrollers and is implemented in portable \ccpp{}.
 661 In order to achieve more interoperation safety, the communication between the server and the client is automated, i.e.\ the serialisation and deserialisation code in the \gls{RTS} is generated.
 662 The technique used for this is very similar to the technique shown in \cref{chp:first-class_datatypes}.
 663 However, instead of using template metaprogramming, a feature \gls{CLEAN} lacks, generic programming is used also as a two-stage rocket.
 664 In contrast to many other generic programming systems, \gls{CLEAN} allows for access to much of the metadata of the compiler.
 665 For example, \cleaninline{Cons}, \cleaninline{Object}, \cleaninline{Field}, and \cleaninline{Record} generic constructors are enriched with their arity, names, types, \etc.
 666 Furthermore, constructors can access the metadata of the objects and fields of their parent records.
 667 Using this metadata, generic functions are created that generate \ccpp{} type definitions, parsers and printers for any first-order \gls{CLEAN} type.
 668 The exact details of this technique can be found in the future in a paper that is in preparation.
 669
 670 \Glspl{ADT} are converted to tagged unions, newtypes to typedefs, records to structs, and arrays to dynamic size-parametrised allocated arrays.
 671 For example, the \gls{CLEAN} types in \cref{lst:ser_clean} are translated to the \ccpp{} types seen in \cref{lst:ser_c}
 672
 673 \begin{lstClean}[caption={Simple \glspl{ADT} in \gls{CLEAN}.},label={lst:ser_clean}]
 674 :: T a = A a | B NT {#Char}
 675 :: NT =: NT Real
 676 \end{lstClean}
 677
 678 \begin{lstArduino}[caption={Generated \ccpp{} type definitions for the simple \glspl{ADT}.},label={lst:ser_c}]
 679 typedef double Real;
 680 typedef char Char;
 681
 682 typedef Real NT;
 683 enum T_c {A_c, B_c};
 684
 685 struct Char_HshArray { uint32_t size; Char *elements; };
 686 struct T {
 687         enum T_c cons;
 688         struct { void *A;
 689                  struct { NT f0; struct Char_HshArray f1; } B;
 690         } data;
 691 };
 692 \end{lstArduino}
 693
 694 For each of these generated types, two functions are created, a typed printer, and a typed parser (see \cref{lst:ser_pp}).
 695 The parser functions are parametrised by a read function, an allocation function and parse functions for all type variables.
 696 This allows for the use of these functions in environments where the communication is parametrised and the memory management is self-managed such as in the \gls{MTASK} \gls{RTS}.
 697
 698 \begin{lstArduino}[caption={Printer and parser for the \glspl{ADT} in \ccpp{}.},label={lst:ser_pp}]
 699 struct T parse_T(uint8_t (*get)(), void *(*alloc)(size_t),
 700                  void *(*parse_0)(uint8_t (*)(), void *(*)(size_t)));
 701
 702 void print_T(void (*put)(uint8_t), struct T r,
 703              void (*print_0)(void (*)(uint8_t), void *));
 704 \end{lstArduino}
 705
 706 \section{Conclusion}
 707 It is not straightforward to execute \gls{MTASK} tasks on resources-constrained \gls{IOT} edge devices.
 708 To achieve this, the terms in the \gls{DSL} are compiled to domain-specific byte code.
 709 This byte code is sent for interpretation to the light-weight \gls{RTS} of the edge device.
 710 The \gls{RTS} first evaluates the main expression.
 711 The result of this evaluation, a run time representation of the task, is a task tree.
 712 This task tree is rewritten according to rewrite rules until a stable value is observed.
 713 Rewriting multiple tasks at the same time is achieved by interleaving the rewrite steps, resulting in seamingly parallel execution of the tasks.
 714 Furthermore, the \gls{RTS} automates communication and coordinates multi tasking.
 715
 716 \input{subfilepostamble}
 717 \end{document}