From 8e40178e9c5dae52e0006f480fa4d385f745ce1d Mon Sep 17 00:00:00 2001 From: Mart Lubbers Date: Thu, 11 May 2023 15:06:45 +0200 Subject: [PATCH] process comments --- back/acknowledgements.tex | 2 +- bib/other.bib | 10 +++++++++- coda/coda.tex | 2 +- dsl/class.tex | 27 ++++++++++++++------------- dsl/first.tex | 19 ++++++++++--------- front/titlepage.tex | 2 +- intro/intro.tex | 34 +++++++++++++++++----------------- preamble/layout.tex | 1 - thesis.tex | 12 +++++++----- top/4iot.tex | 4 ++-- top/finale.tex | 2 +- top/green.tex | 2 +- top/imp.tex | 3 ++- top/int.tex | 4 ++-- top/lang.tex | 8 ++++---- tvt/tvt.tex | 12 +++++++----- 16 files changed, 79 insertions(+), 65 deletions(-) diff --git a/back/acknowledgements.tex b/back/acknowledgements.tex index 50bed14..3e9e8f0 100644 --- a/back/acknowledgements.tex +++ b/back/acknowledgements.tex @@ -12,7 +12,7 @@ First of all I would like to thank Rinus Plasmeijer, Pieter Koopman, and Jan Mar The BEST people, Adrian Ramsingh, Jeremy Singer, and Phil Trinder for the fruitful collaboration and memorable trip to Glasgow. The entire \gls{3COWS}/SusTrainable group, for offering a platform for the various summer schools I had the opportunity to teach; and not to mention the countless meetings, dinners, and drinks we had. The Royal Dutch Navy, in particular Teun de Groot and Ton van Heusden, for trusting me by funding the project. -The manuscript committee, Sven-Bodo Scholz, Gabrielle Keller, Mary Sheeran, for reading this work carefully. +The manuscript committee, Sven-Bodo Scholz, Gabriele Keller, Mary Sheeran, for reading this work carefully. 
All colleagues and others that I had the privilege of sharing an office with, meeting in conferences and summer schools, interact with in the department, or work with in some other way: Anett Fekete, diff --git a/bib/other.bib b/bib/other.bib index bdf11a6..bedbf9d 100644 --- a/bib/other.bib +++ b/bib/other.bib @@ -2183,7 +2183,7 @@ Publisher: {ACM}}, @software{top_software_viia_2023, title = {{VIIA} (Vessel Information Integrating Application)}, url = {https://www.top-software.nl/VIIA.html}, - author = {{TOP} Software}, + author = {{TOP Software}}, urldate = {2023-02-06}, date = {2023}, } @@ -2201,3 +2201,11 @@ Publisher: {ACM}}, date = {2001}, file = {Hinze and Jones - 2001 - Derivable Type Classes.pdf:/home/mrl/.local/share/zotero/storage/33IF2HMZ/Hinze and Jones - 2001 - Derivable Type Classes.pdf:application/pdf}, } + +@article{bender_benderrule_2019, + title = {The \#{BenderRule}: {On} {Naming} the {Languages} {We} {Study} and {Why} {It} {Matters}}, + url = {https://thegradient.pub/the-benderrule-on-naming-the-languages-we-study-and-why-it-matters/}, + journal = {The Gradient}, + author = {Bender, Emily}, + year = {2019}, +} diff --git a/coda/coda.tex b/coda/coda.tex index 4edce6b..a9f5b86 100644 --- a/coda/coda.tex +++ b/coda/coda.tex @@ -70,7 +70,7 @@ Using tierless programming, many issues that arise with tiered programming are m This has already been observed in web applications. The \gls{MTASK} system show that it is possible to program edge devices of a \gls{IOT} systems using \gls{TOP}. Furthermore, when used together with \gls{ITASK}, entire \gls{IOT} systems can be programmed tierlessly. -Whether this novel approach to programming tiered systems also reduces the \gls{IOT} develop grief is answered in \cref{prt:tvt}. +Whether this novel approach to programming tiered systems also reduces the \gls{IOT} development grief is answered in \cref{prt:tvt}. 
This episode presents a four-way qualitative and quantitative comparison of the following systems: \gls{PRS}, a tiered system based on resource-rich edge devices powered by \gls{PYTHON}; \gls{PWS}, a tiered system based on resource-constrained edge devices by \gls{MICROPYTHON}; diff --git a/dsl/class.tex b/dsl/class.tex index 30df865..88b58bc 100644 --- a/dsl/class.tex +++ b/dsl/class.tex @@ -53,10 +53,11 @@ However, it is suitable for type system extensions such as \glspl{GADT}. While this chapter is written as a literate \Gls{HASKELL} \citep{peyton_jones_haskell_2003} program using some minor extensions provided by \gls{GHC} \citep{ghc_team_ghc_2021}, the idea is applicable to other languages as well\footnotemark. \footnotetext{Lubbers, M. (2021): Literate Haskell/lhs2\TeX{} source code of the paper ``Deep Embedding -with Class'': TFP 2022.\ DANS.\ \url{https://doi.org/10.5281/zenodo.5081386}.} +with Class'': TFP 2022.\ Zenodo.\ \url{https://doi.org/10.5281/zenodo.6650880}.} \section{Deep embedding} -Pick a \gls{DSL}, any \gls{DSL}, pick the language of literal integers and addition. +Pick the language of literal integers and addition \citep{bender_benderrule_2019}\todo{nodig? grappig?}. +%Pick a \gls{DSL}, any \gls{DSL}, pick the language of literal integers and addition. In deep embedding, terms in the language are represented by data in the host language. Hence, defining the constructs is as simple as creating the following algebraic data type\footnote{All data types and functions are subscripted to indicate the evolution. When definitions are omitted for version $n$, version $n-1$ is assumed.}. @@ -165,7 +166,7 @@ instance Sub_t Eval_t where sub_t (E_t e1) (E_t e2) = E_t (e1 - e2) \end{lstHaskellLhstex} -Finally, adding semantics such as a printer over the language is achieved by providing a data type representing the semantics accompanied by instances for the language constructs. 
+Finally, adding semantics such as a printer for the language is achieved by providing a data type representing the semantics accompanied by instances for the language constructs. \begin{lstHaskellLhstex} newtype Printer_t = P_t String @@ -280,7 +281,7 @@ data Expr_2 The class alias removes the need for the programmer to visit the main data type when adding additional semantics. Unfortunately, the compiler does need to visit the main data type again. -Some may argue that adding semantics happens less frequently than adding language constructs but in reality it means that we have to concede that the language is not as easily extensible in semantics as in language constructs. +Some may argue that adding semantics happens less frequently than adding language constructs, but in reality it means that we have to concede that the language is not as easily extensible in semantics as in language constructs. More exotic type system extensions such as constraint kinds \citep{bolingbroke_constraint_2011,yorgey_giving_2012} can untangle the semantics from the data types by making the data types parametrised by the particular semantics. However, by adding some boilerplate, even without this extension, the language constructs can be parametrised by the semantics by putting the semantics functions in a data type. First the data types for the language constructs are parametrised by the type variable \haskelllhstexinline{d} as follows. @@ -295,7 +296,7 @@ data Sub_3 d = Sub_3 (Expr_3 d) (Expr_3 d) \end{lstHaskellLhstex} The \haskelllhstexinline{d} type variable is inhabited by an explicit dictionary for the semantics, i.e.\ a witness to the class instance. -Therefore, for all semantics type classes, a data type is made that contains the semantics function for the given semantics. +Therefore, for all semantics type classes, a data type is defined which contains the semantics function for the given semantics. 
This means that for \haskelllhstexinline{Eval_3}, a dictionary with the function \haskelllhstexinline{EvalDict_3} is defined, a type class \haskelllhstexinline{HasEval_3} for retrieving the function from the dictionary and an instance for \haskelllhstexinline{HasEval_3} for \haskelllhstexinline{EvalDict_3}. \begin{lstHaskellLhstex} @@ -344,7 +345,7 @@ sub_3 :: GDict (d (Sub_3 d)) => Expr_3 d -> Expr_3 d -> Expr_3 d sub_3 e1 e2 = Ext_3 gdict (Sub_3 e1 e2) \end{lstHaskellLhstex} -Finally, we reached the end goal, orthogonal extension of both language constructs as shown by adding subtraction to the language and in language semantics. +Finally, we reached the end goal, orthogonal extension of both language constructs as shown by adding subtraction to the language, and in language semantics. Adding the printer can now be done without touching the original code as follows. First the printer type class, dictionaries and instances for \haskelllhstexinline{GDict} are defined. @@ -515,7 +516,7 @@ Luckily, one does not need to resort to these arguably blunt matters often. Dependent language functionality often does not need to span extensions, i.e.\ it is possible to group them in the same data type. \subsection{Chaining semantics} -Now that the data types are parametrised by the semantics a final problem needs to be overcome. +Now that the data types are parametrised by the semantics, a final problem needs to be overcome. The data type is parametrised by the semantics, thus, using multiple semantics, such as evaluation after optimising is not straightforwardly possible. Luckily, a solution is readily at hand: introduce an ad-hoc combination semantics. @@ -551,7 +552,7 @@ e3 = neg_4 (Lit_4 42 `sub_4` Lit_4 38) `Add_4` Lit_4 1 \section{Generalised algebraic data types}% \Glspl{GADT} are enriched data types that allow the type instantiation of the constructor to be explicitly defined \citep{cheney_first-class_2003,hinze_fun_2003}. 
Leveraging \glspl{GADT}, deeply embedded \glspl{DSL} can be made statically type safe even when different value types are supported. -Even when \glspl{GADT} are not supported natively in the language, they can be simulated using embedding-projection pairs or equivalence types \citep[\citesection{2.2}]{cheney_lightweight_2002}. +Still when \glspl{GADT} are not supported natively in the language, they can be simulated using embedding-projection pairs or equivalence types \citep[\citesection{2.2}]{cheney_lightweight_2002}. Where some solutions to the expression problem do not easily generalise to \glspl{GADT} (see \cref{sec:cde:related}), classy deep embedding does. Generalising the data structure of our \gls{DSL} is fairly straightforward and to spice things up a bit, we add an equality and boolean negation language construct. To make the existing \gls{DSL} constructs more general, we relax the types of those constructors. @@ -669,7 +670,7 @@ For example, \citet{svenningsson_combining_2013} show that by expressing the dee Classy deep embedding differs from the hybrid approaches in the sense that it does not require the language extensions to be expressible in the core language. \subsection{Comparison} -No \gls{DSL} embedding technique is the silver bullet, there is no way of perfectly satisfying all requirements programmers have. +No single \gls{DSL} embedding technique is the silver bullet, there is no way of perfectly satisfying all requirements programmers have. \Citet{sun_compositional_2022} provided a thorough comparison of embedding techniques including more axes than just the two stated in the expression problem. \Cref{tbl:dsl_comparison_brief} shows a variant of their comparison table. @@ -682,7 +683,7 @@ In shallow embedding, intensional analysis is more complex and requires stateful Simple type system describes the whether it is possible to encode this embedding technique without many type system extensions. 
In classy deep embedding, there is either a bit more scaffolding and boilerplate required or advanced type system extensions need to be used. -Little boilerplate denotes the amount of scaffolding and boilerplate required. +Minimal boilerplate denotes the amount of scaffolding and boilerplate required. For example, hybrid embedding requires a transcoding step between the deep syntax and the shallow core language. \begin{table} @@ -711,7 +712,7 @@ For example, hybrid embedding requires a transcoding step between the deep synta Simple type system & \CIRCLE{} & \CIRCLE{} & \Circle{} & \CIRCLE{} & \CIRCLE{} & \Circle{} & \textcolor{gray}{\CIRCLE{}}\tnote{4}\\ - Little boilerplate & \CIRCLE{} & \CIRCLE{} & \Circle{} + Minimal boilerplate & \CIRCLE{} & \CIRCLE{} & \Circle{} & \CIRCLE{} & \CIRCLE{} & \Circle{} & \textcolor{gray}{\CIRCLE{}}\tnote{4}\\ \bottomrule @@ -747,7 +748,7 @@ Furthermore, I would like to thank Pieter and Rinus for the fruitful discussions \begin{subappendices} \section{Reprise: reducing boilerplate}% \label{sec:classy_reprise} -One of the unique selling points of this novel \gls{DSL} embedding technique is that it, in its basic form, does not require advanced type system extensions nor a lot of boilerplate. +One of the unique selling points of classy deep embedding is that it, in its basic form, does not require advanced type system extensions nor a lot of boilerplate. However, generalising the technique to \glspl{GADT} arguably unleashes a cesspool of \emph{unsafe} compiler extensions. If we are willing to work with extensions, almost all the boilerplate can be inferred or generated. @@ -881,7 +882,7 @@ It contains examples for expressions, expressions using \glspl{GADT}, detection \section{Data types and definitions}% \label{sec:cde:appendix} -This appendix collects all definitions omitted for brevity. +This appendix contains all definitions omitted for brevity. 
\lstset{basicstyle=\tt\footnotesize} \begin{lstHaskellLhstex}[caption={Data type definitions.}] diff --git a/dsl/first.tex b/dsl/first.tex index 4170a6f..313b14f 100644 --- a/dsl/first.tex +++ b/dsl/first.tex @@ -209,7 +209,8 @@ It is not possible to construct new values from expressions in the \gls{DSL}, to Furthermore, while in our language the only constraint is the automatically derivable \haskellinline{Show}, in real-world languages the class constraints may be very difficult to satisfy for complex types, for example serialisation to a single stack cell in the case of a compiler. As a consequence, for user-defined data types---such as a pro\-gram\-mer-defined list type\footnotemark---to become first-class citizens in the \gls{DSL}, language constructs for constructors, deconstructors and constructor predicates must be defined. -Field selectors are also useful functions for working with user-defined data types, they are not considered for the sake of brevity but can be implemented using the deconstructor functions. +Field selectors are also useful functions for working with user-defined data types. +They are not considered for the sake of brevity but can be implemented using the deconstructor functions. \footnotetext{ For example: \haskellinline{data List a = Nil \| Cons \{hd :: a, tl :: List a\}} } @@ -271,7 +272,7 @@ For example, it can subvert module boundaries, thus accessing constructors that To achieve the goal of embedding data types in a \gls{DSL} we refrain from using these \emph{unsafe} features. \subsection{Data types} -Firstly, for all of \gls{HASKELL}'s \gls{AST} elements, data types are provided that are mostly isomorphic to the actual data types used in the compiler. +For all of \gls{HASKELL}'s \gls{AST} elements, data types are provided that are mostly isomorphic to the actual data types used in the compiler. With these data types, the entire syntax of a \gls{HASKELL} program can be specified. 
Often, a data type is suffixed with the context, e.g.\ there is a \haskellinline{VarE} and a \haskellinline{VarP} for a variable in an expression or in a pattern respectively. To give an impression of these data types, a selection of data types available in \gls{TH} is given below: @@ -298,9 +299,9 @@ lamE ps es = LamE <$> sequence ps <*> es \subsection{Splicing} Special splicing syntax (\haskellinline{\$(...)}) marks functions for compile-time execution. -Other than that they always produce a value of an \gls{AST} data type, they are regular functions. +Apart from the fact that they always produce a value of an \gls{AST} data type, they are regular functions. Depending on the context and location of the splice, the result type is either a list of declarations, a type, an expression or a pattern. -The result of this function, when successful, is then spliced into the code and treated as regular code by the compiler. +The result of this function, when executed successfully, is then spliced into the code and treated as regular code by the compiler. Consequently, the code that is generated may not be type safe, in which case the compiler provides a type error on the generated code. The following listing shows an example of a \gls{TH} function generating on-the-fly functions for arbitrary field selection in a tuple. When called as \haskellinline{\$(tsel 2 4)} it expands at compile time to \haskellinline{\\(_, _, f, _)->f}: @@ -316,7 +317,7 @@ tsel field total = do \subsection{Quasiquotation} Another key concept of \gls{TH} is Quasiquotation, the dual of splicing \citep{bawden_quasiquotation_1999}. While it is possible to construct entire programs using the provided data types, it is a little cumbersome. -Using \emph{Oxford brackets} (\verb#[|# \ldots\verb#|]#) or single or double apostrophes, verbatim \gls{HASKELL} code can be entered that is converted automatically to the corresponding \gls{AST} nodes easing the creation of language constructs. 
+Using \emph{Oxford brackets} (\verb#[|# \ldots\verb#|]#) or single or double apostrophes, verbatim \gls{HASKELL} code can be entered which is converted automatically to the corresponding \gls{AST} nodes easing the creation of language constructs. Depending on the context, different quasiquotes are used: \begin{itemize*} \item \haskellinline{[\|...\|]} or \haskellinline{[e\|...\|]} for expressions @@ -451,7 +452,7 @@ mkDeconstructor n fs = sigD (deconstructorName n) \end{lstHaskell} \subsubsection{Constructor predicates} -The last part of the class definition are the constructor predicates, a function that checks whether the provided value of type $T$ contains a value with constructor $C_k$. +The last part of the class definition consists of the constructor predicates, a function that checks whether the provided value of type $T$ contains a value with constructor $C_k$. A constructor predicate for constructor $C_k$ of type $T$ is defined as a \gls{DSL} function $\mathit{isC_k} \dcolon v~(T~v_0~\ldots~v_n) \shortrightarrow v~\mathit{Bool}$. A constructor predicate---name prefixed by \haskellinline{is}---is generated for all constructors. They all have the same type: @@ -697,7 +698,7 @@ Pattern matching in general is not suitable for a custom quasiquoter because it However, a concrete use of pattern matching, interesting enough to be beneficial, but simple enough for a demonstration is the \emph{simple case expression}, a case expression that does not contain nested patterns and is always exhaustive. They correspond to multi-way conditional expressions and can thus be converted to \gls{DSL} constructs straightforwardly \citep[\citesection{4.4}]{peyton_jones_implementation_1987}. -In contrast to the binary literal quasiquoter example, we do not create the parser by hand. +In contrast to the binary literal quasiquoter example, we do not hand craft the parser. 
The parser combinator library \emph{parsec} is used instead to ease the creation of the parser \citep{leijen_parsec_2001}. First the location of the quasiquoted code is retrieved using the \haskellinline{location} function that operates in the \haskellinline{Q} monad. This location is inserted in the parsec parser so that errors are localised in the source code. @@ -855,7 +856,7 @@ Using quasiquotation, they make a complicated embedding of non-linear pattern ma \subsubsection{Typed Template Haskell}\label{ssec_fcd:typed_template_haskell} \Gls{TTH} is a very recent extension/alternative to normal \gls{TH} \citep{pickering_multi-stage_2019,xie_staging_2022}. -Where in \gls{TH} you can manipulate arbitrary parts of the syntax tree, add top-level splices of data types, definitions and functions, in \gls{TTH} the programmer can only splice expressions but the \gls{AST} fragments representing the expressions are well-typed by construction instead of untyped. +Whereas in \gls{TH} you can manipulate arbitrary parts of the syntax tree, add top-level splices of data types, definitions and functions, in \gls{TTH} the programmer can only splice expressions but the \gls{AST} fragments representing the expressions are well-typed by construction instead of untyped. \Citet{pickering_staged_2020} implemented staged compilation for the \emph{generics-sop} \citep{de_vries_true_2014} generics library to improve the efficiency of the code using \gls{TTH}. \Citet{willis_staged_2020} used \gls{TTH} to remove the overhead of parsing combinators. @@ -880,7 +881,7 @@ Adding new constructs, e.g.\ constructors, deconstructors, and constructor tests Techniques such as data types \`a la carte \citep{swierstra_data_2008} and open data types \citep{loh_open_2006} show that it is possible to extend data types orthogonally but whether metaprogramming can still readily be used is something that needs to be researched. 
It may also be possible to implemented (parts) of the boilerplate generation using \gls{TTH} (see \cref{ssec_fcd:typed_template_haskell}) to achieve more confidence in the type correctness of the implementation. -Another venue of research is to try to find the limits of this technique regarding richer data type definitions. +Another direction of research is to try to find the limits of this technique regarding richer data type definitions. It would be interesting to see whether it is possible to apply the technique on data types with existentially quantified type variables or full-fledged generalised \glspl{ADT} \citep{hinze_fun_2003}. It is not possible to straightforwardly lift the deconstructors to type classes because existentially quantified type variables will escape. Rank-2 polymorphism offers tools to define the types in such a way that this is not the case anymore. diff --git a/front/titlepage.tex b/front/titlepage.tex index a009276..33acd5e 100644 --- a/front/titlepage.tex +++ b/front/titlepage.tex @@ -68,7 +68,7 @@ \item Manuscriptcommissie: \begin{itemize}[label={}] \item prof.\ dr.\ S.-B.\ (Sven-Bodo) Scholz - \item prof.\ dr.\ G.K.\ (Gabrielle) Keller (Universiteit Utrecht) + \item prof.\ dr.\ G.K.\ (Gabriele) Keller (Universiteit Utrecht) \item prof.\ dr.\ M.\ (Mary) Sheeran (Chalmers Tekniska H\"ogskola) \end{itemize} \end{itemize} diff --git a/intro/intro.tex b/intro/intro.tex index 04ae56e..c0a43b7 100644 --- a/intro/intro.tex +++ b/intro/intro.tex @@ -31,7 +31,7 @@ The majority of edge devices are powered by microcontrollers. Microcontrollers are equipped with a lot of connectivity for integrating peripherals such as sensors and actuators. The connectivity makes them very suitable to interact with their surroundings. These miniature computers contain integrated circuits that accommodate a microprocessor designed for use in embedded applications. 
-As a consequence, microcontrollers are cheap; tiny; have little memory; and contain a slow, but energy-efficient processor. +As a consequence, microcontrollers are cheap, tiny, have little memory, and contain a slow, but energy-efficient processor. When coordinating an orchestra of edge devices, there is room for little error. Edge devices come and go, perform their own pieces, or are sometimes instructed to perform a certain piece, they might even operate without a central authority. @@ -39,16 +39,16 @@ In a traditional setting, an \gls{IOT} engineer has to program each device and t This results in semantic friction, which makes programming and maintaining \gls{IOT} systems a complex and error-prone process. This dissertation describes the research carried out around orchestrating these complex \gls{IOT} systems using \gls{TOP}. -\Gls{TOP} is an innovative tierless programming paradigm for interactive multi-layered systems. +\Gls{TOP} is an innovative tierless programming paradigm for interactive multi-layered systems\todo{tierless introductie mist een beetje}. By utilising advanced compiler technologies, much of the internals, communication, and interoperation between the tiers or layers of the applications are automatically generated. -The compiler makes an application controlling all interconnected components from a single declarative specification of the required work. +The compiler generates an application controlling all interconnected components from a single declarative specification of the required work. For example, the \gls{TOP} system \gls{ITASK} is used to program all layers of multi-user distributed web applications from a single source specification. It is implemented in the general-purpose lazy functional programming language \gls{CLEAN}, and therefore requires relatively powerful hardware. The inflated hardware requirements are no problem for regular computers but impractical for the average edge device. 
This is where an additional \glspl{DSL} must play its part. \Glspl{DSL} are programming languages tailored to a specific domain. -Consequently, jargon is not expressed in terms of the language itself, but are built-in language features. +Consequently, jargon is not expressed in terms of the language itself, but is built into the language. Furthermore, the \gls{DSL} can eschew language or system features that are irrelevant for the domain. Using \glspl{DSL}, hardware requirements can be drastically lowered, even while maintaining a high abstraction level for the specified domain. @@ -60,7 +60,7 @@ As it is integrated with \gls{ITASK}, it allows for all layers of an \gls{IOT} a \section{Reading guide}% \label{lst:reading_guide} This work is structured as a purely functional rhapsody. -The \citet{wikipedia_contributors_rhapsody_2022} define a musical rhapsody is defined as follows: +The \citet{wikipedia_contributors_rhapsody_2022} define a musical rhapsody as follows: \begin{quote}\emph{% A \emph{rhapsody} in music is a one-movement work that is episodic yet integrated, free-flowing in structure, featuring a range of highly contrasted moods, colour, and tonality.} \end{quote} @@ -85,7 +85,7 @@ While the term \gls{IOT} briefly gained interest around 1999 to describe the com \emph{The \glsxtrlong{IOT}, or \glsxtrshort{IOT}, is the integration of people, processes and technology with connectable devices and sensors to enable remote monitoring, status, manipulation and evaluation of trends of such devices.} \end{quote} -Much later, CISCO states that the \gls{IOT} started when there were as many connected devices as there were people on the globe, i.e.\ around 2008 \citep{evans_internet_2011}. +Much later, CISCO stated that the \gls{IOT} started when there were as many connected devices as there were people on the globe, i.e.\ around 2008 \citep{evans_internet_2011}. 
Today, \gls{IOT} is the term for a system of devices that sense the environment, act upon it, and communicate with each other and the world they operate in. These connected devices are already in households all around us in the form of smart electricity meters, fridges, phones, watches, home automation, \etc. @@ -117,11 +117,11 @@ All layers are connected using the network layer. In some applications this is implemented using conventional networking techniques such as \gls{WIFI} or Ethernet. However, network technology that is tailored to the needs of the specific interconnection between two layers is increasingly popular. Examples of this are BLE, LoRa, ZigBee, and LTE-M as a communication protocol for connecting the perception layer to the application layer using \gls{IOT} transport protocols such as \gls{MQTT}. -Protocols such as HTTP, AJAX, and WebSocket connecting the presentation layer to the application layer that are designed for the use in web applications. +Furthermore, protocols such as HTTP, AJAX, and WebSocket are designed for the use in web applications and connect the presentation layer to the application layer. Across the layers, the devices are a large heterogeneous collection of different platforms, protocols, paradigms, and programming languages. As a result, impedance problems or semantic friction occurs between layers and the maintainability is severely hampered \citep{ireland_classification_2009}. -Even more so, the perception layer itself is often a heterogeneous collection of microcontrollers in itself, each having their own peculiarities, programming language of choice, and hardware interfaces. +Even more so, the perception layer often is a heterogeneous collection of microcontrollers in itself, each having their own peculiarities, programming language of choice, and hardware interfaces. 
As edge hardware needs to be cheap, small scale, and energy efficient, the microcontrollers used to power them do not have a lot of computational power, only a smidge of memory, and little communication bandwidth. Typically, these devices are unable to run a full-fledged general-purpose \gls{OS}. Rather they employ compiled firmware written in imperative languages that combines all tasks on the device in a single program. @@ -140,7 +140,7 @@ It does so by compiling the \gls{DSL} to byte code that is executed in a feather \section{Domain-specific languages}% \label{sec:back_dsl} % General -Programming languages can be divided up into two categories: \glspl{DSL} and \glspl{GPL} \citep{fowler_domain_2010}. +Programming languages can be divided into two categories: \glspl{DSL} and \glspl{GPL} \citep{fowler_domain_2010}. Where \glspl{GPL} are not made with a demarcated area in mind, \glspl{DSL} are tailor-made for a specific domain. Writing idiomatic domain-specific code in a \gls{DSL} is easier and requires less \gls{GPL} knowledge for a domain expert. This does come at the cost of the \gls{DSL} being sometimes less expressive to an extent that it may not even be Turing complete. @@ -191,7 +191,7 @@ On the other hand, heterogeneous \glspl{EDSL} are languages that are not execute For example, \citet{elliott_compiling_2003} describe the language Pan, for which the final representation in the host language is a compiler that will, when executed, generate code for a completely different target platform. Both \gls{ITASK} and \gls{MTASK} are \glspl{EDSL}. -Programs written in \gls{ITASK} run in the host language, and it is a homogeneous \gls{DSL}. +Programs written in \gls{ITASK} run in the host language, and hence \gls{ITASK} is a homogeneous \gls{DSL}. Tasks written using \gls{MTASK} are dynamically compiled to byte code for an edge device, making it a heterogeneous \gls{DSL}. 
The interpreter running on the edge device has no knowledge of the higher level task specification. It just interprets the byte code it was sent and takes care of the communication. @@ -249,7 +249,7 @@ The individual components in the miniature systems, the tasks, the \glspl{SDS}, \subsection{The iTask system} The concept of \gls{TOP} originated from the \gls{ITASK} framework, a declarative \gls{TOP} language for defining interactive distributed web applications. The \gls{ITASK} system is implemented as an \gls{EDSL} in the programming language \gls{CLEAN} \citep{plasmeijer_itasks:_2007,plasmeijer_task-oriented_2012}\footnote{\Cref{chp:clean_for_haskell_programmers} contains a guide for \gls{CLEAN} tailored to \gls{HASKELL} programmers.}. -It is under development for over fifteen years and has proven itself through use in industry as well. +It has been under development for over fifteen years and has proven itself through use in industry. For example, it is the main language of VIIA, an advanced application for monitoring coasts \citep{top_software_viia_2023}. Browsers are powering \gls{ITASK}'s presentation layer. The browser runs the actual \gls{ITASK} code using an interpreter that operates on \gls{CLEAN}'s intermediate language \gls{ABC} \citep{staps_lazy_2019}. @@ -295,7 +295,7 @@ Special combinators (e.g.\ \cleaninline{@>>} at \cref{lst:todo_ui}) are used to \subsection{The mTask system} The work for \gls{IOT} edge devices can often be succinctly described by \gls{TOP} programs. -Software on microcontrollers is usually composed of smaller basic tasks, are interactive, and share data with other components or the server. +Software on microcontrollers is usually composed of smaller basic tasks, are interactive, and shares data with other components or the server. The \gls{ITASK} system seems an obvious candidate for bringing \gls{TOP} to \gls{IOT} edge devices. 
However, an \gls{ITASK} application contains many features that are not needed on \emph{edge devices} such as higher-order tasks, support for a distributed architecture, a multi-user web server, and facilities to generate \glspl{GUI} for any user-defined type. Furthermore, \gls{IOT} edge devices are in general not powerful enough to run or interpret \gls{CLEAN}\slash\gls{ABC} code, they just lack the processor speed and memory. @@ -310,12 +310,12 @@ Using \gls{MTASK}, the programmer can define all layers of an \gls{IOT} system a The \gls{MTASK} language is written in \gls{CLEAN} as a multi-view \gls{EDSL} and hence there are multiple interpretations possible. This thesis mostly discusses the byte code compiler. From an \gls{MTASK} task constructed at run time, a compact binary representation of the work that needs to be done is compiled. -And while the byte code for \gls{MTASK} is generated at run time, the type system of the host language \gls{CLEAN} prevents type errors in the generated code. +While the byte code for \gls{MTASK} is generated at run time, the type system of the host language \gls{CLEAN} prevents type errors in the generated code. This byte code is then sent to a device that running the \gls{MTASK} \gls{RTS}. This feather-light domain-specific \gls{OS} is written in portable \gls{C} with a minimal device specific interface and it executes the tasks using interpretation and rewriting. To illustrate \imtask{}, an example application is shown. -The application is an interactive application for blinking \pgls{LED} on the microcontroller at a certain frequency that can be set and updated at run time. +It is an interactive program for blinking \pgls{LED} on the microcontroller at a certain frequency that can be set and updated at run time. \Cref{lst:intro_blink,fig:intro_blink} show the \gls{ITASK} part of the code and a screenshot. 
Using \cleaninline{enterInformation}, the connection specification of the \gls{TCP} device is queried through a web editor (\cref{lst:intro_enterDevice,fig:intro_blink_dev}). \Cref{lst:intro_withshared} defines \pgls{SDS} to communicate the blinking interval between the server and the edge device. @@ -354,7 +354,7 @@ The \cleaninline{>>\|.} operator denotes the sequencing of tasks in \gls{MTASK}. \subsection{Other TOP languages} While \gls{ITASK} conceived \gls{TOP}, it is no longer the only \gls{TOP} system. -Some \gls{TOP} languages were created to fill a gap encountered in practise. +Some \gls{TOP} languages were created to fill a gap encountered in practice. Toppyt \citep{lijnse_toppyt_2022} is a general purpose \gls{TOP} language written in \gls{PYTHON} used to host frameworks for modelling command \& control systems. The hTask system is a \gls{TOP} system written in \gls{HASKELL} used as a vessel for experimenting with asynchronous \glspl{SDS} \citep{lubbers_htask_2022}. Furthermore, some \gls{TOP} systems arose from Master's and Bachelor's thesis projects. @@ -372,8 +372,8 @@ Such a formal specification allows for symbolic execution, hint generation, but This section provides a thorough overview of the relation between the scientific publications and the contents of this thesis. \subsection{\Fullref{prt:dsl}} -The \gls{MTASK} system is an \gls{EDSL} and during the development of it, several novel basal techniques for embedding \glspl{DSL} in \gls{FP} languages have been found. -This paper-based episode contains the following papers: +The \gls{MTASK} system is an \gls{EDSL} and during the development of it, several novel basal techniques for embedding \glspl{DSL} in \gls{FP} languages were found. +This paper-based episode is based on the following papers: \begin{enumerate} \item \emph{Deep Embedding with Class} \citep*{lubbers_deep_2022} is the basis for \cref{chp:classy_deep_embedding}. 
It shows a novel deep embedding technique for \glspl{DSL} where the resulting language is extendible both in constructs and in interpretation just using type classes and existential data types. diff --git a/preamble/layout.tex b/preamble/layout.tex index f1fe9bd..30210bc 100644 --- a/preamble/layout.tex +++ b/preamble/layout.tex @@ -90,7 +90,6 @@ \newenvironment{chapterabstract}{\begin{shaded}\begin{quotation}}{\end{quotation}\end{shaded}} %chktex 6 % Increase the depth for the table of contents -\setcounter{secnumdepth}{3} \renewcommand{\contentsname}{Table of Contents} % change the name of the TOC \AtBeginDocument{\addtocontents{toc}{\protect\thispagestyle{empty}}} % to remove page numbering from the TOC diff --git a/thesis.tex b/thesis.tex index 105cb75..6b388c5 100644 --- a/thesis.tex +++ b/thesis.tex @@ -11,11 +11,11 @@ %\setlength{\overfullrule}{20pt} % Just for the todonotes, can go when it's finished -%\usepackage{todonotes} -%\setuptodonotes{ -% backgroundcolor=white, -% linecolor=black, -%} +\usepackage{todonotes} +\setuptodonotes{ + backgroundcolor=white, + linecolor=black, +} % Document info \title{\mytitle\texorpdfstring{\\[2ex]}{---}\smaller\mysubtitle} @@ -140,4 +140,6 @@ %\label{chp:index} %\printindex +\listoftodos% + \end{document} diff --git a/top/4iot.tex b/top/4iot.tex index efeef52..3baf7cb 100644 --- a/top/4iot.tex +++ b/top/4iot.tex @@ -159,7 +159,7 @@ void loop() { Unfortunately, this does not work because the \arduinoinline{delay} function blocks all other execution. The resulting program blinks the \glspl{LED} after each other instead of at the same time. -To overcome this, it is necessary to slice up the blinking behaviour in small fragments and interleave it manually \citep{feijs_multi-tasking_2013}. +To overcome this, it is necessary to slice up the blinking behaviour in small fragments and interleave them manually \citep{feijs_multi-tasking_2013}. 
\begin{lstArduino}[float=,label={lst:blinkthread},caption={Threading three blinking patterns.}] long led1 = 0, led2 = 0, led3 = 0; @@ -233,7 +233,7 @@ First, the language setup and interface are shown in \cref{chp:mtask_dsl}. \Cref{chp:integration_with_itask} shows the integration of \gls{MTASK} and \gls{ITASK}. Then, \cref{chp:implementation} provides the implementation of the \gls{DSL}, the compilation schemes, instruction set, and details on the interpreter. \Cref{chp:green_computing_mtask} explains all green computing aspects of \gls{MTASK}, i.e.\ task scheduling and processor interrupts. -Finally, \cref{chp:finale} concludes, shows related work, and provides a short history of \gls{MTASK}. +Finally, \cref{chp:finale} concludes, discusses related work, and provides a short history of \gls{MTASK}. \input{subfilepostamble} \end{document} diff --git a/top/finale.tex b/top/finale.tex index 1b65cfc..b50d05b 100644 --- a/top/finale.tex +++ b/top/finale.tex @@ -30,7 +30,7 @@ However, it is not straightforward to run \gls{TOP} systems on resource-constrai The \gls{MTASK} system bridges this gap by providing a \gls{TOP} programming language for edge devices. It is a full-fledged \gls{TOP} language hosted in a tiny \gls{FP} language. Besides the usual \gls{FP} constructs, it contains basic tasks, task combinators, support for sensors and actuators, and interrupts. -It integrates seamlessly in \gls{ITASK}, a \gls{TOP} system for interactive web applications. +It integrates seamlessly into \gls{ITASK}, a \gls{TOP} system for interactive web applications. In \gls{ITASK}, abstractions are available for the gritty details of interactive web applications such as program distribution, web applications, data storage, and user management. The \gls{MTASK} system abstracts away of all technicalities specific to edge devices such as communication, abstractions for sensors and actuators, interrupts and (multi) task scheduling. 
When \gls{MTASK} is used together with \gls{MTASK}, all layers of the \gls{IOT} application are programmed from a single declarative specification. diff --git a/top/green.tex b/top/green.tex index 132d9e9..8e8fe8c 100644 --- a/top/green.tex +++ b/top/green.tex @@ -596,7 +596,7 @@ The task emits the status of the pin as a stable value if the information in the Otherwise, no value is emitted. \section{Conclusion} -This chapter show how we can automatically associate execution intervals to tasks. +This chapter shows how we can automatically associate execution intervals to tasks. Based on these intervals, we can delay the executions of those tasks. When all task executions can be delayed, the microprocessor executing those tasks can go to sleep mode to reduce its energy consumption. This is a rather difficult problem that must be solved dynamically, since we make no assumptions on the number and nature of the tasks that will be allocated to an \gls{IOT} device. diff --git a/top/imp.tex b/top/imp.tex index 39a6347..1169187 100644 --- a/top/imp.tex +++ b/top/imp.tex @@ -634,7 +634,8 @@ There are several possible messages that can be received from the server: \subsection{Execution phase} The second phase performs one execution step for all tasks that wish for it. -Tasks are ordered in a priority queue ordered by the time a task needs to execute, the \gls{RTS} selects all tasks that can be scheduled, see \cref{sec:scheduling} for more details. +Tasks are placed in a priority queue ordered by the time a task needs to execute. +The \gls{RTS} selects all tasks that can be scheduled, see \cref{sec:scheduling} for more details. Execution of a task is always an interplay between the interpreter and the rewriter. The rewriter scans the current task tree and tries to rewrite it using small-step reduction. Expressions in the tree are always strictly evaluated by the interpreter.
diff --git a/top/int.tex b/top/int.tex index 20df4f8..209c48e 100644 --- a/top/int.tex +++ b/top/int.tex @@ -10,7 +10,7 @@ \chapter{The integration of mTask and iTask}% \label{chp:integration_with_itask} \begin{chapterabstract} - This chapter shows the integration of \gls{MTASK} and \gls{ITASK} by showing: + This chapter shows the integration of \gls{MTASK} and \gls{ITASK} by discussing: \begin{itemize} \item an architectural overview of \gls{MTASK} applications; \item the interface for connecting devices; @@ -29,7 +29,7 @@ Devices in the \gls{MTASK} system are set up with a domain-specific \gls{OS} and \Cref{fig:mtask_integration} shows the architectural layout of a typical \gls{IOT} system created with \gls{ITASK} and \gls{MTASK}. The entire system is written as a single \gls{CLEAN} specification where multiple tasks are executed at the same time. -Tasks can access \glspl{SDS} according to many-to-many communication and multiple clients can work on the same task. +Tasks can access \glspl{SDS} following the many-to-many communication pattern and multiple clients can work on the same task. The diagram contains three labelled arrows that denote the integration functions between \gls{ITASK} and \gls{MTASK}. Devices are connected to the system using the \cleaninline{withDevice} function (see \cref{sec:withdevice}). There can be multiple devices connected to a single \gls{ITASK} host at the same time. diff --git a/top/lang.tex b/top/lang.tex index d16b887..7b901c8 100644 --- a/top/lang.tex +++ b/top/lang.tex @@ -29,13 +29,14 @@ Furthermore, this particular type of embedding has the property that it is exten Adding a language construct is as simple as adding a type class. Adding an interpretation is done by creating a new data type and providing implementations for the various type classes. -In order to reduce the hardware requirements for devices running \gls{MTASK} programs, several measures have been taken. 
+In order to reduce the hardware requirements for devices running \gls{MTASK} programs, several measures have been taken.\todo{by whom? would be good here to be more explicit to make it clear at this point whether that's part of the contribution} Programs in \gls{MTASK} are written in the \gls{MTASK} \gls{DSL}, separating them from the host \gls{ITASK} program. This allows the tasks to be constructed at compile time in order to tailor-make them for the specific work requirements. Furthermore, the \gls{MTASK} language is restricted: there are no recursive data structures, no higher-order functions, strict evaluation, and functions and objects can only be declared at the top level. \section{Class-based shallow embedding} Let us illustrate this technique by taking the very simple language of literal values. +\todo{show how this relates to the general embedding discussion previously} This language interface can be described using a single type constructor class with a single function \cleaninline{lit}. This function is for lifting values, when it has a \cleaninline{toString} instance, from the host language to our new \gls{DSL}. The type variable \cleaninline{v} of the type class represents the view on the language, the interpretation. @@ -691,9 +692,8 @@ It uses expressions based a simply-typed $\lambda$-calculus with support for som \section{Conclusion} This chapter gave an overview of the complete \gls{MTASK} \gls{DSL}. The \gls{MTASK} language is a rich \gls{TOP} language tailored for \gls{IOT} edge devices. -The language is implemented as a class-based shallowly \gls{EDSL} in the pure functional host language \gls{CLEAN}. -The language is an enriched lambda calculus as a host language. -It provides language constructs for arithmetic expressions, conditionals, functions, but also non-interactive basic tasks, task combinators, peripheral support, and integration with \gls{ITASK}. 
+The language is implemented as a class-based shallowly embedded \gls{DSL} in the pure functional host language \gls{CLEAN}. +The language uses an enriched lambda calculus as a host language, providing additional language constructs for arithmetic expressions, conditionals, functions, but also non-interactive basic tasks, task combinators, peripheral support, and integration with \gls{ITASK}. Terms in the language are just interfaces and can be interpreted by one or more interpretations. When using the byte code compiler, terms in the \gls{MTASK} language are type checked at compile time but are constructed and compiled at run time. This facilitates tailor-making tasks for the current work requirements. diff --git a/tvt/tvt.tex b/tvt/tvt.tex index e541a06..085acc6 100644 --- a/tvt/tvt.tex +++ b/tvt/tvt.tex @@ -6,7 +6,7 @@ \begin{document} \input{subfileprefix} -\chapter{Could tierless languages reduce IoT development grief?}% +\chapter{Tiered versus tierless programming}% \label{chp:smart_campus} \begin{chapterabstract} @@ -32,7 +32,7 @@ Conventional \gls{IOT} software architectures require the development of separat \begin{enumerate*} \item Interoperating components in multiple languages and paradigms increases the developer's cognitive load who must simultaneously think in multiple languages and paradigms, i.e.\ manage significant semantic friction. \item The developer must correctly interoperate the components, e.g.\ adhere to the \gls{API} or communication protocols between components. - \item To ensure correctness the developer must maintain type safety across a range of very different languages and diverse type systems. + \item To ensure correctness the developer must maintain type safety across a range of very different languages and diverse type systems.\todo[inline]{what do you mean by type safety here precisely? 
how can that happen if non type safe languages are in the mix} \item The developer must deal with the potentially diverse failure modes of each component, and of component interoperation. \end{enumerate*} @@ -81,7 +81,7 @@ As a prototyping exercise, we use modest commodity sensor nodes (i.e.\ Raspberry Pis) and low-cost, low-precision sensors for indoor environmental monitoring. -We have deployed sensor nodes into 12 rooms in two buildings. The \gls{IOT} system has an online data store, providing live +Sensor nodes have been deployed into 12 rooms in two buildings. The \gls{IOT} system has an online data store, providing live access to sensor data through a RESTful \gls{API}. This allows campus stakeholders to add functionality at a business layer above the layers that we consider here. To date, simple apps have been developed including room temperature @@ -202,6 +202,7 @@ Only a small fraction of these systems are described in the academic literature, \label{sec_t4t:characteristics} This study compares a pair of tierless \gls{IOT} languages with conventional tiered \gls{PYTHON} \gls{IOT} software. \Citask{} and \cimtask{} represent a specific set of tierless language design decisions, however many alternative designs are available. Crucially the limitations of the tierless \gls{CLEAN} languages, e.g.\ that they currently provide limited security, should not be seen as limitations of tierless technologies in general. This section briefly outlines key design decisions for tierless \gls{IOT} languages, discusses alternative designs, and describes the \gls{CLEAN} designs. The \gls{CLEAN} designs are illustrated in the examples in the following section.
+\todo{again, here it would be good to have a general definition of tierless, and then describe what the possible design choices are, and where in the space iTask/mTask fits} \subsubsection{Tier splitting and placement} @@ -244,7 +245,8 @@ Tierless languages may adopt a range of communication paradigms for communicatin Security is a major issue and a considerable challenge for many \gls{IOT} systems \citep{alhirabi_security_2021}. There are potentially security issues at each layer in an \gls{IOT} application (\cref{fig_t4t:iot_arch}). The security issues and defence mechanisms at the application and presentation layers are relatively standard, e.g.\ defending against SQL injection attacks. The security issues at the network and perception layers are more challenging. Resource-rich sensor nodes can adopt some standard security measures like encrypting messages, and regularly applying software patches to the operating system. However, microcontrollers often lack the computational resources for encryption, and it is hard to patch their system software because the program is often stored in flash memory. In consequence there are infamous examples of \gls{IOT} systems being hijacked to create botnets \citep{203628,herwig_measurement_2019}. -Securing the entire stack in a conventional tiered \gls{IOT} application is particularly challenging as the stack is implemented in a collection of programming languages with low level programming and communication abstractions. In such polyglot distributed systems it is hard to determine, and hence secure, the flow of data between components. In consequence a small mistake may have severe security implications. +Securing the entire stack in a conventional tiered \gls{IOT} application is particularly challenging as the stack is implemented in a collection of programming languages with low level programming and communication abstractions. 
In such polyglot distributed systems it is hard to determine, and hence secure, the flow of data between components. +As a consequence, a small mistake may have severe security implications. A number of characteristics of tierless languages help to improve security. Communication and placement vulnerabilities are minimised as communication and placement are automatically generated and checked by the compiler. So injection attacks and the exploitation of communication\slash{}placement protocol bugs are less likely. Vulnerabilities introduced by mismatched types are avoided as the entire system is type checked. Moreover, tierless languages can exploit language level security techniques. For example languages like Jif\slash{}split \citep{zdancewic2002secure} and Swift \citep{chong2007secure} place components to protect the security of data. Another example are programming language technologies for controlling information flow, and these can be used to improve security. For example Haski uses them to improve the security of \gls{IOT} systems \citep{valliappan_towards_2020}. @@ -731,7 +733,7 @@ This section investigates whether tierless languages make \gls{IOT} programming \paragraph{Code size} is widely recognised as an approximate measure of the development and maintenance effort required for a software system \citep{rosenberg1997some}. \Gls{SLOC} is a common code size metric, and is especially useful for multi-paradigm systems like \gls{IOT} systems. It is based on the simple principle that the more \gls{SLOC}, the more developer effort and the increased likelihood of bugs \citep{rosenberg1997some}. It is a simple measure, not dependent on some formula, and can be automatically computed \citep{sheetz2009understanding}. -Of course \gls{SLOC} must be used carefully as it is easily influenced by programming style, language paradigm, and counting method \citep{alpernaswonderful}. 
Here we are counting code to compare development effort, use the same idiomatic programming style in each component, and only count lines of code, omitting comments and blank lines. +Of course \gls{SLOC} must be used carefully as it is easily influenced by programming style, language paradigm, and counting method \citep{alpernaswonderful}. Here we are counting lines of code to compare development effort, use the same idiomatic programming style in each component, and only count lines of code, omitting comments and blank lines. \Cref{table_t4t:multi} enumerates the \gls{SLOC} required to implement the \gls{UOG} smart campus functionalities in \gls{PWS}, \gls{PRS}, \gls{CWS} and \gls{CRS}. Both \gls{PYTHON} and \gls{CLEAN} implementations use the same server and communication code for Raspberry Pi and for \gls{WEMOS} sensor nodes (rows 5--7 of the table). The Sensor Interface (SI) refers to code facilitating the communication between the peripherals and the sensor node software. % formerly hardware interface -- 2.20.1