1 \documentclass[../thesis.tex]{subfiles}
2
3 \input{subfilepreamble}
4
5 \begin{document}
6 \chapter{Prelude}%
7 \label{chp:introduction}
8 In 2022, there were an estimated 13.4 billion connected computers that sense, act or otherwise interact with people, other computers and the physical world surrounding us\footnotemark{}.
9 \footnotetext{\url{https://transformainsights.com/research/tam/market}, accessed on: \formatdate{2022}{10}{13}}
10 The variety among these devices is considerable, but they have one thing in common: they are all controlled by software.
11 Concretely, this means that programmers write code for each specific device to make sure that the brain of the device, the processor, does what we want it to do.
12
13 An increasing number of these connected devices are so-called \emph{edge devices}.
14 Typically, these edge devices are small microprocessors equipped with various sensors and actuators to interact with the physical world.
15 They are often part of, and coordinated by, a bigger system called an \gls{IOT} system.
16
17
18 %These ed
19 %These edge devices differ very much from other devices we see around us.
20 %Compared to servers, laptops, tablets, or mobile phones they boast tiny amounts of memory, are powered by a slow but energy efficient microprocessor, only support low-level programming languages, and are not so easily reprogrammed.
21 %Moreover, these edge devices differ among each other as well by using various microprocessor architectures, different communication protocols and a variety of device-specific toolchains.
22 %As a result, there are many points of failure and programming these systems is difficult and error-prone.
23 %
24 %\Gls{TOP} is a novel programming paradigm that offers a solution to this problem.
25 %In a \gls{TOP} language, from a single declarative specification of the work that needs to be done, ready-for-work applications are generated for all layers of the system.
26 %However, the hardware requirements for traditional \gls{TOP} frameworks make it infeasible to run these generated applications on resource-constrained edge devices.
27 %
28 %\Glspl{DSL} can overcome this limitation because domain-specific knowledge is built into the programming language, allowing for lower hardware requirements.
29 %This thesis presents \gls{MTASK}, a \gls{TOP} \gls{DSL} for edge devices that can be fully integrated with \gls{ITASK}, a \gls{TOP} \gls{DSL} for distributed multi-user workflow systems.
30 %With \gls{MTASK}, all layers of an \gls{IOT} system can be programmed from a single programming language in a single programming paradigm.
31
32 \section{Internet of things}\label{sec:back_iot}
33 The \gls{IOT} is growing rapidly and it is changing the way people and machines interact with the world.
34 While the term \gls{IOT} briefly gained interest around 1999 to describe the communication of \gls{RFID} devices \citep{ashton_internet_1999,ashton_that_2009}, it probably first appeared halfway through the eighties in a speech by \citet{peter_t_lewis_speech_1985}:
35
36 \begin{quote}
37 \emph{The \glsxtrlong{IOT}, or \glsxtrshort{IOT}, is the integration of people, processes and technology with connectable devices and sensors to enable remote monitoring, status, manipulation and evaluation of trends of such devices.}
38 \end{quote}
39
40 CISCO states that the \gls{IOT} only started when there were as many connected devices as there were people on the globe, i.e.\ around 2008 \citep{evans_internet_2011}.
41 Today, the \gls{IOT} is the term for a system of devices that sense the environment, act upon it and communicate with each other and the world.
42 These connected devices are already in households all around us in the form of smart electricity meters, fridges, phones, watches, home automation, \etc.
43
44 When describing \gls{IOT} systems, a tiered---or layered---architecture is often used to compartmentalize the technology.
45 The number of tiers depends heavily on the required complexity of the model, but for the purposes of this thesis, the four-layer architecture shown in \cref{fig:iot-layers} is used.
46
47 \begin{figure}[ht]
48 \centering
49 \includestandalone{iot-layers}
50 \caption{A four-layer \gls{IOT} architecture.}%
51 \label{fig:iot-layers}
52 \end{figure}
53
54 Closest to the end user is the presentation layer, which provides the interface between the user and the \gls{IOT} application.
55 In home automation, this may be a web interface or an app used on a phone or mounted tablet to interact with the edge devices and view the sensor data.
56
57 The application layer provides the \glspl{API}, interfaces and data storage.
58 A cloud service or local server provides this layer in a typical home automation application.
59
60 All layers are connected using the network layer.
61 In many applications this is implemented using conventional networking techniques such as WiFi or Ethernet.
62 However, networks, or layers on top of them, that are tailored to the needs of \gls{IOT} applications, such as \gls{BLE}, LoRa, ZigBee, LTE-M, or \gls{MQTT}, have become increasingly popular.
63
64 The perception layer---also called edge layer---collects the data and interacts with the environment.
65 It consists of edge devices such as microprocessors equipped with various sensors and actuators.
66 In home automation, this layer consists of all the devices hosting the sensors and actuators, such as a smart lightbulb, an actuator to open a door, or a temperature and humidity sensor.
67
68 Across the layers, the devices are a large heterogeneous collection of different platforms, protocols, paradigms, and programming languages often resulting in impedance problems or semantic friction between layers when programming \citep{ireland_classification_2009}.
69 Moreover, the perception layer itself is often a heterogeneous collection of microprocessors, each having its own peculiarities, language of choice and hardware interfaces.
70 As the edge hardware needs to be cheap, small-scale, and energy efficient, the microprocessors used to power these devices do not have a lot of computational power, only a soup\c{c}on of memory, and little communication bandwidth.
71 Typically, the devices do not run a full-fledged \gls{OS} but a compiled firmware.
72 This firmware is often written in an imperative language and needs to be flashed to the program memory.
73 Program memory is typically flash-based and only lasts a couple of thousand writes before it wears out.
74 While devices are getting a bit faster, smaller, and cheaper, they keep these properties to an extent, greatly reducing the flexibility for dynamic systems where tasks are created on the fly, executed on demand, or require parallel execution.
75 These problems can be mitigated by dynamically sending code to be interpreted to the microprocessor.
76 With interpretation, a specialized interpreter is flashed to the program memory once; it then receives the program code to execute at runtime.
77 Interpretation always comes with an overhead, making it challenging to create interpreters for small edge devices.
78 However, the hardware requirements can be reduced by embedding domain-specific knowledge into the programming language to be interpreted, resulting in so-called \glspl{DSL}.
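
To make the idea of interpretation concrete, the following is a minimal sketch in \gls{HASKELL} of a stack-based interpreter.
The instruction set (\texttt{Push}, \texttt{Add}, \texttt{Mul}) and the function \texttt{run} are hypothetical and chosen for illustration only; they are not the byte code or interpreter of any existing system.
The point is merely that the interpreter is fixed, while the program is plain data that can be received at runtime.

\begin{verbatim}
-- Hypothetical instruction set, for illustration only.
data Instr
  = Push Int   -- push a constant onto the stack
  | Add        -- pop two values, push their sum
  | Mul        -- pop two values, push their product
  deriving (Show, Eq)

-- The interpreter is fixed; programs arrive as data.
-- Malformed programs result in Nothing.
run :: [Instr] -> Maybe Int
run = go []
  where
    go [x]          []            = Just x
    go _            []            = Nothing  -- wrong stack size
    go st           (Push n : is) = go (n : st) is
    go (y : x : st) (Add : is)    = go (x + y : st) is
    go (y : x : st) (Mul : is)    = go (x * y : st) is
    go _            _             = Nothing  -- stack underflow
\end{verbatim}

For example, \verb|run [Push 2, Push 3, Add, Push 4, Mul]| evaluates to \verb|Just 20|.
A real interpreter for an edge device would be written in a low-level language and would have to deal with memory limits and communication, which this sketch ignores.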
79
80 \section{\texorpdfstring{\Glsxtrlongpl{DSL}}{Domain-specific languages}}\label{sec:back_dsl}
81 % General
82 Programming languages can be divided up into two categories: \glspl{DSL}\footnote{Historically this has been called DSEL as well.} and \glspl{GPL} \citep{fowler_domain_2010}.
83 Where \glspl{GPL} are not made with a demarcated area in mind, \glspl{DSL} are tailor-made for a specific domain.
84 Writing idiomatic domain-specific code in a \gls{DSL} is easy, but this may come at the cost of the \gls{DSL} being less expressive, to the extent that it may not even be Turing complete.
85 \Glspl{DSL} come in two main flavours: standalone and embedded\footnote{Also called external and internal respectively.} of which \glspl{EDSL} can again be classified into heterogeneous and homogeneous languages (see \cref{fig:hyponymy_of_dsls} for this hyponymy).
86
87 \begin{figure}[ht]
88 \centering
89 \includestandalone{hyponymy_of_dsls}
90 \caption{Hyponymy of \glspl{DSL} (adapted from \citet[\citepage{2}]{mernik_extensible_2013})}%
91 \label{fig:hyponymy_of_dsls}
92 \end{figure}
93
94 \subsection{Standalone and embedded}
95 \Glspl{DSL} were historically created as standalone languages, meaning all the machinery is developed solely for the language.
96 The advantage of this approach is that the language designer is free to define the syntax and type system of the language as they wish, not being restricted by any constraint.
97 Unfortunately, it also means that they need to develop a compiler or interpreter for the language to be usable, making standalone \glspl{DSL} costly to create.
98 Examples of standalone \glspl{DSL} are regular expressions, make, yacc, XML, SQL, \etc.
99
100 The alternative approach is embedding the \gls{DSL} in a host language, i.e.\ creating an \gls{EDSL} \citep{hudak_modular_1998}.
101 By defining the language as constructs in the host language, much of the machinery is inherited and the cost of creating embedded languages is very low.
102 In other words, there is more linguistic reuse~\cite{krishnamurthi_linguistic_2001}.
103 There are, however, two sides to this coin.
104 If the syntax of the host language is not very flexible, the syntax of the \gls{DSL} may become clumsy.
105 Furthermore, errors shown to the programmer may be larded with host language errors, making it difficult for a non-expert of the host language to work with the \gls{DSL}.
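
As an illustration of such linguistic reuse, consider the following minimal sketch of an expression language embedded in \gls{HASKELL}.
All names (\texttt{Expr}, \texttt{eval}, \texttt{pretty}) are hypothetical and only serve the example.
The sketch assumes a deep embedding, which is just one of several possible embedding techniques: the terms of the language are a data type of the host and every interpretation is an ordinary host function.

\begin{verbatim}
-- A deeply embedded expression language (hypothetical).
data Expr
  = Lit Int
  | Plus Expr Expr
  | Times Expr Expr

-- Reuse the numeric literals and operators of the host as
-- syntax for the embedded language.
instance Num Expr where
  fromInteger = Lit . fromInteger
  (+)         = Plus
  (*)         = Times
  negate e    = Times (Lit (-1)) e
  abs         = error "abs: not part of this sketch"
  signum      = error "signum: not part of this sketch"

-- One interpretation: evaluate the expression.
eval :: Expr -> Int
eval (Lit n)     = n
eval (Plus a b)  = eval a + eval b
eval (Times a b) = eval a * eval b

-- Another interpretation: print the expression.
pretty :: Expr -> String
pretty (Lit n)     = show n
pretty (Plus a b)  = "(" ++ pretty a ++ " + " ++ pretty b ++ ")"
pretty (Times a b) = "(" ++ pretty a ++ " * " ++ pretty b ++ ")"
\end{verbatim}

With these definitions, \verb|eval (1 + 2 * 3)| evaluates to \verb|7| and \verb|pretty (1 + 2 * 3)| to \verb|"(1 + (2 * 3))"|.
The parser, type checker and operator precedences are all inherited from the host language.
The flip side is visible as well: a type error in a term of the embedded language is reported by the host compiler in terms of \texttt{Expr} and \texttt{Num}.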
106
107 \subsection{Heterogeneity and homogeneity}
108 \Citet{tratt_domain_2008} applied a notion from metaprogramming \citep{sheard_accomplishments_2001} to \glspl{EDSL} to define homogeneity and heterogeneity of \glspl{EDSL} as follows:
109
110 \begin{quote}
111 \emph{
112 A homogeneous system is one where all the components are specifically designed to work with each other, whereas in heterogeneous systems at least one of the components is largely, or completely, ignorant of the existence of the other parts of the system.
113 }
114 \end{quote}
115
116 Homogeneous \glspl{EDSL} are therefore languages that are solely defined as an extension to their host language.
117 They often restrict features of the host language to provide a safer interface or capture an idiomatic pattern in the host language for reuse.
118 The difference between a library and a homogeneous \gls{EDSL} is not always clear.
119 Examples of homogeneous \glspl{EDSL} are libraries for sets or \gls{GUI} creation, LISP's macro system, \etc.
120
121 On the other hand, heterogeneous \glspl{EDSL} are languages that are not executed in the host language.
122 For example, \citet{elliott_compiling_2003} describe the language Pan, for which the final representation in the host language is a compiler that will, when executed, generate code for a completely different target platform.
123 In fact, \gls{ITASK} and \gls{MTASK} are both heterogeneous \glspl{EDSL} and \gls{MTASK} specifically is a compiling \gls{DSL}.
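
The notion of a compiling heterogeneous \gls{EDSL} can be sketched as follows in \gls{HASKELL}.
The language and the functions \texttt{toC} and \texttt{compile} are hypothetical; this is neither the implementation of Pan nor that of \gls{MTASK}.
The sketch further assumes that the target platform offers an Arduino-style \texttt{analogRead} function.
The host program never evaluates the expression itself; it only produces C source text to be compiled and run elsewhere.

\begin{verbatim}
-- Hypothetical sensor expression language.
data Expr
  = Lit Int
  | ReadPin Int    -- read an analog pin on the target device
  | Add Expr Expr
  | Gt Expr Expr

-- Generate C source text for an expression.
-- Assumption: the target provides analogRead.
toC :: Expr -> String
toC (Lit n)     = show n
toC (ReadPin p) = "analogRead(" ++ show p ++ ")"
toC (Add a b)   = "(" ++ toC a ++ " + " ++ toC b ++ ")"
toC (Gt a b)    = "(" ++ toC a ++ " > " ++ toC b ++ ")"

-- Wrap the generated expression in a C function definition.
compile :: String -> Expr -> String
compile name e =
  "int " ++ name ++ "(void) { return " ++ toC e ++ "; }"
\end{verbatim}

For example, \verb|compile "tooHot" (Gt (ReadPin 0) (Lit 512))| results in the text \verb|int tooHot(void) { return (analogRead(0) > 512); }|.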
124
125 \section{\texorpdfstring{\Glsxtrlong{TOP}}{Task-oriented programming}}\label{sec:back_top}
126 \Gls{TOP} is a declarative programming paradigm designed to model interactive systems \citep{plasmeijer_task-oriented_2012}.
127 Instead of dividing problems into layers or tiers, as is done in \gls{IOT} architectures, it deals with separation of concerns in a novel way.
128 From the data types, utilising various \emph{type-parametrised} concepts, all other aspects are handled automatically (see \cref{fig:tosd}).
129 This approach to software development is called \gls{TOSD} \citep{wang_maintaining_2018}.
130
131 \begin{figure}[ht]
132 \centering
133 \begin{subfigure}[t]{.5\textwidth}
134 \centering
135 \includestandalone{traditional}
136 \caption{Traditional layered approach.}
137 \end{subfigure}%
138 \begin{subfigure}[t]{.5\textwidth}
139 \centering
140 \includestandalone{tosd}
141 \caption{\Gls{TOSD} approach.}
142 \end{subfigure}
143 \caption{Separation of concerns in a traditional setting and in \gls{TOSD} (adapted from~\cite[\citepage{20}]{wang_maintaining_2018}).}%
144 \label{fig:tosd}
145 \end{figure}
146
147 \begin{description}
148 \item[\Glsxtrshort{UI} (presentation layer):]
149 The \gls{UI} of the system is automatically generated from the representation of the type.
150 Even though the \gls{UI} is generated from the structure of the datatypes, in practical \gls{TOP} systems it can be tweaked afterwards to suit the specific needs of the application.
151 \item[Tasks (business layer):]
152 A task is an abstract representation of a piece of work that needs to be done.
153 It provides an intuitive abstraction over work in the real world.
154 Just as with real-life tasks and workflow, tasks can be combined in various ways such as in parallel or in sequence.
155 Furthermore, a task is observable which means it is possible to observe a---partial---result during execution and act upon it by for example starting new tasks.
156 Examples of tasks are filling in a form, sending an email, reading a sensor or even doing a physical task.
157 \item[\Glsxtrshortpl{SDS} (resource access):]
158 Tasks can communicate using task values, but this poses a problem in many collaboration patterns where tasks that are not necessarily related need to share data.
159 Tasks can also share data using \glspl{SDS}, an abstraction over any data.
160 An \gls{SDS} can represent typed data stored in a file, a chunk of memory, a database, \etc.
161 \Glspl{SDS} can also represent external impure data such as the time, random numbers or sensory data.
162 Similar to tasks, transformation and combination of \glspl{SDS} is possible.
163 \item[Programming language (\glsxtrshort{UOD}):]
164 The \gls{UOD} from the business layer is explicitly and separately modelled by the relations that exist in the functions of the host language.
165 \end{description}
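
The notions of observable task values and of sequential and parallel task combination can be illustrated with a strongly simplified model in \gls{HASKELL}.
This is a sketch under several assumptions: all names (\texttt{TaskValue}, \texttt{Task}, \texttt{andThen}, \texttt{both}, \texttt{observe}) are hypothetical, and user interfaces, events and \glspl{SDS} are ignored entirely.
It is not the interface of any actual \gls{TOP} implementation.

\begin{verbatim}
-- The observable value of a task (hypothetical model).
data TaskValue a = NoValue | Unstable a | Stable a
  deriving (Show, Eq)

-- Stepping a task yields its current value and its continuation.
newtype Task a = Task { step :: (TaskValue a, Task a) }

-- A task that is finished and keeps its stable value.
done :: a -> Task a
done x = Task (Stable x, done x)

-- A task that exposes a partial result at every step.
countdown :: Int -> Task Int
countdown n
  | n <= 0    = done 0
  | otherwise = Task (Unstable n, countdown (n - 1))

-- Sequence: continue with the next task once the value is stable.
andThen :: Task a -> (a -> Task b) -> Task b
andThen t f = case step t of
  (Stable x, _) -> f x
  (_, t')       -> Task (NoValue, andThen t' f)

-- Parallel: step both tasks and combine their values.
both :: Task a -> Task b -> Task (a, b)
both l r = Task (combine vl vr, both l' r')
  where
    (vl, l') = step l
    (vr, r') = step r

combine :: TaskValue a -> TaskValue b -> TaskValue (a, b)
combine (Stable a)   (Stable b)   = Stable (a, b)
combine (Stable a)   (Unstable b) = Unstable (a, b)
combine (Unstable a) (Stable b)   = Unstable (a, b)
combine (Unstable a) (Unstable b) = Unstable (a, b)
combine _            _            = NoValue

-- Observe the values of a task during its first n steps.
observe :: Int -> Task a -> [TaskValue a]
observe n t
  | n <= 0    = []
  | otherwise = let (v, t') = step t in v : observe (n - 1) t'
\end{verbatim}

For example, \verb|observe 3 (countdown 2)| yields the values \verb|Unstable 2|, \verb|Unstable 1| and \verb|Stable 0|, in that order, showing that a partial result can be observed before the task is finished.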
166
167 The concept of \gls{TOP} originated from the \gls{ITASK} framework, a declarative workflow language for defining multi-user distributed web applications implemented as an \gls{EDSL} in the lazy pure \gls{FP} language \gls{CLEAN} \citep{plasmeijer_itasks:_2007,plasmeijer_task-oriented_2012}.
168 While \gls{ITASK} conceived \gls{TOP}, it is not the only \gls{TOP} language.
169 Some \gls{TOP} languages arose from Master's and Bachelor's thesis projects (e.g.\ \textmu{}Task \citep{piers_task-oriented_2016} and LTasks \citep{van_gemert_task_2022}) or were created to solve a practical problem (e.g.\ Toppyt \citep{lijnse_toppyt_2022} and hTask \citep{lubbers_htask_2022}).
170
171 Furthermore, \gls{TOPHAT} is a fully formally specified \gls{TOP} language designed to capture the essence of \gls{TOP} formally \citep{steenvoorden_tophat_2019}.
172 \Citet{piers_task-oriented_2016} created \textmu{}Task, a \gls{TOP} language for specifying non-interruptible embedded systems implemented as an \gls{EDSL} in \gls{HASKELL}.
173 \citet{van_gemert_task_2022} created LTasks, a \gls{TOP} language for interactive terminal applications implemented in LUA, a dynamically typed imperative language.
174 \citet{lijnse_toppyt_2022} created Toppyt, a \gls{TOP} language based on \gls{ITASK}, implemented in \gls{PYTHON}, but designed to be simpler and smaller.
175 Finally, there is \gls{MTASK}, a \gls{TOP} language designed for defining workflows for \gls{IOT} devices~\cite{koopman_task-based_2018}.
176 It is written in \gls{CLEAN} as an \gls{EDSL} fully integrated with \gls{ITASK} and allows the programmer to define all layers of an \gls{IOT} system from a single source.
177
178 \section{Reading guide}\label{sec:outline}
179 This thesis presents a novel view on programming these \gls{IOT} systems as a purely functional rhapsody in three episodes.
180 On Wikipedia, a rhapsody is defined as follows \citep{wikipedia_contributors_rhapsody_2022}:
181 \begin{quote}
182 \emph{A \emph{rhapsody} in music is a one-movement work that is episodic yet integrated, free-flowing in structure, featuring a range of highly contrasted moods, colour, and tonality.
183 An air of spontaneous inspiration and a sense of improvisation make it freer in form than a set of variations.}
184 \end{quote}
185
186 \subsection*{\nameref{chp:introduction}}
187 \Cref{chp:introduction} introduces the contents of the thesis, provides background material on \gls{IOT}, \glspl{DSL} and \gls{TOP} (\cref{sec:back_iot}, \cref{sec:back_dsl}, and \cref{sec:back_top} respectively) and an overview of the contributions including a more technical outline in \cref{sec:contributions}.
188
189 \subsection*{\Fullref{prt:dsl}}
190
191 \subsection*{\Fullref{prt:top}}
192
193 \subsection*{\Fullref{prt:tvt}}
194
195 \subsection*{\nameref{chp:conclusion}}
196 \Cref{chp:conclusion} wraps up with the coda, which provides a discussion and an outlook on future work.
197
198 \section{Contributions}\label{sec:contributions}
199 \subsection*{\nameref{prt:dsl}}
200 The \gls{MTASK} system is a heterogeneous \gls{EDSL} and, during its development, several novel basal techniques for embedding \glspl{DSL} in \gls{FP} languages were found.
201 This first episode is a cumulative---otherwise known as paper-based---episode consisting of two papers published on novel embedding techniques.
202 Both papers are readable independently.
203
204 \subsubsection*{\Fullref{chp:classy_deep_embedding}}
205 This chapter is based on the paper: \citeentry{lubbers_deep_2022}\todo{change in-press when published}.
206
207 While supervising \citeauthor{amazonas_cabral_de_andrade_developing_2018}'s \citeyear{amazonas_cabral_de_andrade_developing_2018} Master's thesis, focussing on an early version of \gls{MTASK}, a seed was planted for a novel deep embedding technique for \glspl{DSL} where the resulting language is extensible both in constructs and in interpretation using type classes and existential data types.
208 Slowly the ideas organically grew to form the technique shown in the paper.
209 \Cref{sec:classy_reprise} was added after publication and contains an (as yet) unpublished extension of the embedding technique.
210 The research from this paper and writing the paper was solely performed by me.
211
212 \subsubsection*{\Fullref{chp:first-class_datatypes}}
213 This chapter is based on the paper: \citeentry{lubbers_first-class_2022}\todo{change when accepted}.
214
215 It shows how to inherit data types from the host language in \glspl{EDSL} using metaprogramming.
216 It does so by providing a proof-of-concept implementation using \gls{HASKELL}'s metaprogramming system: \gls{TH}.
217 Besides showing the result, the paper also serves as a gentle introduction to using \gls{TH} and contains a thorough literature study on research that uses \gls{TH}.
218 The research in this paper and writing the paper was performed by me, though there were weekly meetings with Pieter Koopman and Rinus Plasmeijer in which we discussed and refined the ideas.
219
220 \subsection*{\nameref{prt:top}}
221 This is a monograph compiled from several papers and revised lecture notes on \gls{MTASK}, the \gls{TOP} system used to orchestrate the \gls{IOT}.
222 It provides a gentle introduction to the \gls{MTASK} system and elaborates on \gls{TOP} for the \gls{IOT}.
223 \todo[inline]{outline the chapters}
224
225 \begin{itemize}
226 \item \citeentry{koopman_task-based_2018}
227
228 This was the initial \gls{TOP}/\gls{MTASK} paper.
229 Pieter Koopman wrote it; I helped with the software and research.
230 \item \citeentry{lubbers_task_2018}
231
232 This paper was an extension of my Master's thesis \citep{lubbers_task_2017}.
233 It shows how a simple imperative variant of \gls{MTASK} was integrated with \gls{ITASK}.
234 While the language differed considerably from later versions, the integration mechanism is still used in \gls{MTASK} today.
235 The research in this paper and writing the paper was performed by me, though there were weekly meetings with Pieter Koopman and Rinus Plasmeijer in which we discussed and refined the ideas.
236 \item \citeentry{lubbers_multitasking_2019}\footnote{%
237 This work acknowledges the support of the ERASMUS+ project ``Focusing Education on Composability, Comprehensibility and Correctness of Working Software'', no. 2017--1--SK01--KA203--035402
238 }
239
240 This was a short paper on the multitasking capabilities of \gls{MTASK} in contrast to traditional multitasking methods for \gls{ARDUINO}.
241 The research in this paper and writing the paper was performed by me, though there were weekly meetings with Pieter Koopman and Rinus Plasmeijer.
242 \item \citeentry{koopman_simulation_2018}\footnotemark[\value{footnote}]\todo{change when published}
243
244 These revised lecture notes are from a course on the \gls{MTASK} simulator provided at the 2018 \gls{CEFP}/\gls{3COWS} winter school in Ko\v{s}ice, Slovakia.
245 Pieter Koopman wrote and taught it; I helped with the software and research.
246 \item \citeentry{lubbers_writing_2019}\footnotemark[\value{footnote}]\todo{change when published}
247
248 These revised lecture notes are from a course on programming in \gls{MTASK} provided at the 2019 \gls{CEFP}/\gls{3COWS} summer school in Budapest, Hungary.
249 Pieter Koopman prepared and taught half of the lecture and supervised the practical session.
250 I taught the other half of the lecture, wrote the lecture notes, made the assignments and supervised the practical session.
251 \item \citeentry{lubbers_interpreting_2019}
252
253 This paper shows an implementation for \gls{MTASK} for microcontrollers in the form of a compilation scheme and informal semantics description.
254 The research in this paper and writing the paper was performed by me, though there were weekly meetings with Pieter Koopman and Rinus Plasmeijer.
255 \item \citeentry{crooijmans_reducing_2022}\todo{change when published}
256
257 This paper shows how to create a scheduler so that devices running \gls{MTASK} tasks can go to sleep more automatically.
258 The research was carried out by \citet{crooijmans_reducing_2021} during his Master's thesis.
259 I did the daily supervision and helped with the research; Pieter Koopman was the formal supervisor and wrote most of the paper.
260 \item \emph{Green Computing for the Internet of Things}\footnote{
261 This work acknowledges the support of the Erasmus+ project ``SusTrainable---Promoting Sustainability as a Fundamental Driver in Software Development Training and Education'', no. 2020--1--PT01--KA203--078646}\todo{change when published}
262
263 These revised lecture notes are from a course on sustainable programming using \gls{MTASK} provided at the 2022 SusTrainable summer school in Rijeka, Croatia.
264 Pieter Koopman prepared and taught a quarter of the lecture and supervised the practical session.
265 I prepared and taught the other three quarters of the lecture, made the assignments and supervised the practical session\todo{writing contribution}.
266 \end{itemize}
267
268 \subsection*{\nameref{prt:tvt}}
269 \Cref{prt:tvt} contains a single chapter that quantitatively and qualitatively compares traditional \gls{IOT} architectures with \gls{IOT} systems using \gls{TOP}.
270 This chapter is based on the journal paper: \citeentry{lubbers_could_2022}\todo{change when published}\footnote{This work is an extension of the conference article: \citeentry{lubbers_tiered_2020}\footnotemark{}}.
271 \footnotetext{This paper was partly funded by the Radboud-Glasgow Collaboration Fund.}
272
273 It compares programming traditional tiered architectures to tierless architectures by showing a qualitative and a quantitative four-way comparison of a smart-campus application.
274 The paper was written by all authors.
275 I created the server application, the \gls{CLEAN}/\gls{ITASK}/\gls{MTASK} implementation (\glsxtrshort{CWS}) and the \gls{CLEAN}/\gls{ITASK} implementation (\glsxtrshort{CRS}).
276 Adrian Ramsingh created the \gls{MICROPYTHON} implementation (\glsxtrshort{PWS}); the original \gls{PYTHON} implementation (\glsxtrshort{PRS}) and the server application were created by \citet{hentschel_supersensors:_2016}.
277
278 \input{subfilepostamble}
279 \end{document}