\input{subfilepreamble}
+\setcounter{chapter}{5}
+
\begin{document}
\input{subfileprefix}
\chapter{Implementation}%
\begin{chapterabstract}
\noindent This chapter shows the implementation of the \gls{MTASK} system by:
\begin{itemize}
- \item shows the implementation of the byte code compiler for \gls{MTASK}'s \gls{TOP} language;
\item gives details of the implementation of \gls{MTASK}'s \gls{TOP} engine that executes the \gls{MTASK} tasks on the microcontroller;
- \item and explains how the integration of \gls{MTASK} tasks and \glspl{SDS} is implemented both on the server and on the device.
+ \item shows the implementation of the byte code compiler for \gls{MTASK}'s \gls{TOP} language;
+ \item explains the machinery used to automatically serialise and deserialise data to-and-fro the device.
\end{itemize}
\end{chapterabstract}
Microcontrollers usually have flash-based program memory which wears out fairly quick.
For example, the atmega328p in the \gls{ARDUINO} UNO is rated for 10000 write cycles.
-While this sounds like a lot, if new tasks are sent to the device every minute or so, a lifetime of not even seven days is guaranteed.
-Hence, for dynamic applications, generating code at run-time for interpretation on the device is necessary.
-This byte code is then interpreted on microcontrollers with very little memory and processing power and thus save precious write cycles of the program memory.
-precious write cycles of the program memory.
+While this sounds like a lot, if new tasks are sent to the device every minute or so, a lifetime of only seven days is guaranteed.
+Hence, for dynamic applications, interpretation on the device is necessary, saving precious write cycles of the program memory.
In order to provide the device with the tools to interpret the byte code, it is programmed with a \gls{RTS}, a customisable domain-specific \gls{OS} that takes care of the execution of tasks but also low-level mechanisms such as the communication, multi tasking, and memory management.
-Once the device is programmed with the \gls{MTASK} \gls{RTS}, it can continuously receive new tasks.
+Once the device is programmed with the \gls{MTASK} \gls{RTS}, it can continuously receive new tasks without the need for reprogramming.
+The \gls{OS} is written in portable \ccpp{} and only contains a small device-specific portion.
+
+\section{\texorpdfstring{\Glsxtrlong{RTS}}{Run time system}}
+The event loop of the \gls{RTS} is executed repeatedly and consists of three distinct phases.
+After doing the three phases, the devices goes to sleep for as long as possible (see \cref{chp:green_computing_mtask}).
+
+\subsection{Communication}
+In the first phase, the communication channels are processed.
+The exact communication method is a customisable device-specific option baked into the \gls{RTS}.
+The interface is deliberately kept simple and consists of a two layer interface: a link interface and a communication interface.
+Besides opening, closing and cleaning up, the link interface has only three functions that are shown in \cref{lst:link_interface}.
+Consequently, implementing this link interface is very simple but allows for many more advanced link settings such as buffering.
+There are implementations for this interface for serial or Wi-Fi connections using \gls{ARDUINO} and \gls{TCP} connections for Linux.
+
+\begin{lstArduino}[caption={Link interface of the \gls{MTASK} \gls{RTS}.},label={lst:link_interface}]
+bool link_input_available(void);
+uint8_t link_read_byte(void);
+void link_write_byte(uint8_t b);
+\end{lstArduino}
+
+The communication interface abstracts away from this link interface and is typed instead.
+It contains only two functions as seen in \cref{lst:comm_interface}.
+There are implementations for direct communication, or communication using an \gls{MQTT} broker.
+Both use the automatic serialisation and deserialisation shown in \cref{sec:ccodegen}.
+
+\begin{lstArduino}[caption={Communication interface of the \gls{MTASK} \gls{RTS}.},label={lst:comm_interface}]
+struct MTMessageTo receive_message(void);
+void send_message(struct MTMessageFro msg);
+\end{lstArduino}
+
+Processing the received messages from the communication channels happens synchronously and the channels are exhausted completely before moving on to the next phase.
+There are several possible messages that can be received from the server:
+
+\begin{description}
+ \item[SpecRequest]
+ is a message instructing the device to send its specification and is sent usually immediately after connecting.
+ The \gls{RTS} responds with a \texttt{Spec} answer containing the specification.
+ \item[TaskPrep]
+ tells the device a (big) task is on its way.
+ Especially on faster connections, it may be the case that the communication buffers overflow because a big message is sent while the \gls{RTS} is busy executing tasks.
+ This allows the \gls{RTS} to postpone execution for a while, until the big task has been received.
+ The server sends the big task when the device acknowledges (by sending a \texttt{TaskPrepAck} message) the preparation.
+ \item[Task]
+ contains a new task, its peripheral configuration, the \glspl{SDS}, and the bytecode.
+ The new task is immediately copied to the task storage but is only initialised during the next phase after which a \texttt{TaskAck} is sent.
+ Tasks are stored below the stack, but since the stack is only used in the middle phase, execution, it is no problem that it moves.
+ \item[SdsUpdate]
+ notifies the device of the new value for a lowered \gls{SDS}.
+ The old value of the lowered \gls{SDS} is immediately replaced with the new one.
+ There is no acknowledgement required.
+ \item[TaskDel]
+ instructs the device to delete a running task.
+ Tasks are automatically deleted when they become stable.
+ However, a task may also be deleted when the surrounding task on the server is deleted, for example when the task is on the left-hand side of a step combinator and the condition to step holds.
+ The device acknowledges by sending a \texttt{TaskDelAck}.
+ \item[Shutdown]
+ tells the device to reset.
+\end{description}
+
+\subsection{Execution}
+The second phase consists of performing one execution step for all tasks that wish for it.
+Tasks are ordered in a priority queue ordered by the time a task needs to be executed, the \gls{RTS} selects all tasks that can be scheduled, see \cref{sec:scheduling} for more details.
+A task is stored below the stack and its complete state is a \gls{CLEAN} record contain most importantly the task id, a pointer to the task tree in the heap (null if not initialised yet), the current task value, the configuration of \glspl{SDS}, the configuration of peripherals, the byte code and some scheduling information.
+
+If a task is not initialized yet, i.e.\ the pointer to the current task tree is still null, the byte code of the main function is interpreted.
+The main expression always produces a task tree.
+The rewriting engine uses the interpreter when needed, e.g.\ to calculate the step continuations.
+The rewriter and the interpreter use the same stack to store intermediate values.
+Rewriting steps are small so that interleaving results in seemingly parallel execution.
+In this phase new task tree nodes may be allocated.
+Both rewriting and initialization are atomic operations in the sense that no processing on SDSs is done other than SDS operations from the task itself.
+The host is notified if a task value is changed after a rewrite step.
+\subsection{Memory management}
+The third and final phase is memory management.
+The \gls{MTASK} \gls{RTS} is designed to run on systems with as little as \qty{2}{\kibi\byte} of \gls{RAM}.
+Aggressive memory management is therefore vital.
+Not all firmwares for microprocessors support heaps and---when they do---allocation often leaves holes when not used in a \emph{last in first out} strategy.
+The \gls{RTS} uses a chunk of memory in the global data segment with its own memory manager tailored to the needs of \gls{MTASK}.
+The size of this block can be changed in the configuration of the \gls{RTS} if necessary.
+On an \gls{ARDUINO} UNO---equipped with \qty{2}{\kibi\byte} of \gls{RAM}---the maximum viable size is about \qty{1500}{\byte}.
+The self-managed memory uses a similar layout as the memory layout for \gls{C} programs only the heap and the stack are switched (see \cref{fig:memory_layout}).
+
+\begin{figure}
+ \centering
+ \includestandalone{memorylayout}
+ \caption{Memory layout}\label{fig:memory_layout}
+\end{figure}
+
+In memory, task data grows from the bottom up and the interpreter stack is located directly on top of it growing in the same direction.
+As a consequence, the stack moves when a new task is received.
+This never happens within execution because communication is always processed before execution.
+Values in the interpreter are always stored on the stack, even tuples.
+Task trees grow from the top down as in a heap.
+This approach allows for flexible ratios, i.e.\ many tasks and small trees or few tasks and big trees.
+
+Stable tasks, and unreachable task tree nodes are removed.
+If a task is to be removed, tasks with higher memory addresses are moved down.
+For task trees---stored in the heap---the \gls{RTS} already marks tasks and task trees as trash during rewriting so the heap can be compacted in a single pass.
+This is possible because there is no sharing or cycles in task trees and nodes contain pointers pointers to their parent.
+
+\todo{plaa\-tje van me\-mo\-ry hier}
+
+\section{Compiler}
\subsection{Instruction set}
The instruction set is a fairly standard stack machine instruction set extended with special \gls{TOP} instructions.
\Cref{lst:instruction_type} shows the \gls{CLEAN} type representing the instruction set of which \cref{tbl:instr_task} gives detailed semantics.
\end{lstClean}
% VimTeX: SynIgnore off
-\section{\texorpdfstring{\Glsxtrlong{RTS}}{Run time system}}
-
-The \gls{RTS} is designed to run on systems with as little as \qty{2}{\kibi\byte} of \gls{RAM}.
-Aggressive memory management is therefore vital.
-Not all firmwares for microprocessors support heaps and---when they do---allocation often leaves holes when not used in a \emph{last in first} out strategy.
-Therefore the \gls{RTS} uses a chunk of memory in the global data segment with its own memory manager tailored to the needs of \gls{MTASK}.
-The size of this block can be changed in the configuration of the \gls{RTS} if necessary.
-On an \gls{ARDUINO} UNO ---equipped with \qty{2}{\kibi\byte} of \gls{RAM}--- this size is about \qty{1500}{\byte}.
-
-In memory, task data grows from the bottom up and an interpreter stack is located directly on top of it growing in the same direction.
-As a consequence, the stack moves when a new task is received.
-This never happens within execution because communication is always processed before execution.
-Values in the interpreter are always stored on the stack, even tuples.
-Task trees grow from the top down as in a heap.
-This approach allows for flexible ratios, i.e.\ many tasks and small trees or few tasks and big trees.
-
-The event loop of the \gls{RTS} is executed repeatedly and consists of three distinct phases.
-\todo{plaa\-tje van me\-mo\-ry hier}
-\todo{pseu\-do\-code hier van de ex\-e\-cu\-tie}
-
-\subsection{C code generation}
+\section{C code generation}\label{sec:ccodegen}
+All communication between the \gls{ITASK} server and the \gls{MTASK} server is type-parametrised.
+From the structural representation of the type, a \gls{CLEAN} parser and printer is constructed using generic programming.
+Furthermore, a \ccpp{} parser and printer is generated for use on the \gls{MTASK} device.
+The technique for generating the \ccpp{} parser and printer is very similar to template metaprogramming and requires a generic programming library or compiler support that includes a lot of metadata in the record and constructor nodes.
-%TODO evt subsubsections verwijderen
-\subsection{Communication}
-In the first phase, the communication channels are processed.
-The messages announcing \gls{SDS} updates are applied immediately, the initialization of new tasks is delayed until the next phase.
-
-\subsection{Execution}
-The second phase consists of executing tasks.
-The \gls{RTS} executes all tasks in a round robin fashion.
-If a task is not initialized yet, the bytecode of the main function is interpreted to produce the initial task tree.
-The rewriting engine uses the interpreter when needed, e.g.\ to calculate the step continuations.
-The rewriter and the interpreter use the same stack to store intermediate values.
-Rewriting steps are small so that interleaving results in seemingly parallel execution.
-In this phase new task tree nodes may be allocated.
-Both rewriting and initialization are atomic operations in the sense that no processing on SDSs is done other than SDS operations from the task itself.
-The host is notified if a task value is changed after a rewrite step.
-
-\subsection{Memory management}
-The third and final phase is memory management.
-Stable tasks, and unreachable task tree nodes are removed.
-If a task is to be removed, tasks with higher memory addresses are moved down.
-For task trees---stored in the heap---the \gls{RTS} already marks tasks and task trees as trash during rewriting so the heap can be compacted in a single pass.
-This is possible because there is no sharing or cycles in task trees and nodes contain pointers pointers to their parent.
\section{Conclusion}
\todo[inline]{conclusion}