change citations to citep

diff --git a/tiered_vs._tierless_programming/smart_campus.tex b/tiered_vs._tierless_programming/smart_campus.tex

index 0db7732..e1fbe34 100644
--- a/tiered_vs._tierless_programming/smart_campus.tex
+++ b/tiered_vs._tierless_programming/smart_campus.tex
@@ -26,7 +26,7 @@
  \section{Introduction}%
  \label{sec_t4t:Intro}
  
-Conventional \gls{IOT} software stacks are notoriously complex and pose very significant software development, reliability, and maintenance challenges. \Gls{IOT} software architectures typically comprise multiple components organised in four or more tiers or layers~\cite{sethi2017internet,Ravulavaru18,Alphonsa20}. This is due to the highly distributed nature of typical \gls{IOT} applications that must read sensor data from end points (the \emph{perception} layer), aggregate and select the data and communicate over a network (the \emph{network} layer), store the data in a database and analyse it (the \emph{application} layer) and display views of the data, commonly on web pages (the \emph{presentation} layer).
+Conventional \gls{IOT} software stacks are notoriously complex and pose very significant software development, reliability, and maintenance challenges. \Gls{IOT} software architectures typically comprise multiple components organised in four or more tiers or layers~\citep{sethi2017internet,Ravulavaru18,Alphonsa20}. This is due to the highly distributed nature of typical \gls{IOT} applications that must read sensor data from end points (the \emph{perception} layer), aggregate and select the data and communicate over a network (the \emph{network} layer), store the data in a database and analyse it (the \emph{application} layer) and display views of the data, commonly on web pages (the \emph{presentation} layer).
  
  Conventional \gls{IOT} software architectures require the development of separate programs in various programming languages for each of the components/tiers in the stack. This is modular, but a significant burden for developers, and some key challenges are as follows.
  \begin{enumerate*}
@@ -36,20 +36,20 @@ Conventional \gls{IOT} software architectures require the development of separat
      \item The developer must deal with the potentially diverse failure modes of each component, and of component interoperations.
  \end{enumerate*}
  
-A radical alternative development paradigm uses a single  \emph{tierless} language that synthesizes all components/tiers in the software stack. There are established \emph{tierless} languages for web stacks, e.g.\ Links~\cite{cooper2006links} or Hop~\cite{serrano2006hop}.
+A radical alternative development paradigm uses a single  \emph{tierless} language that synthesizes all components/tiers in the software stack. There are established \emph{tierless} languages for web stacks, e.g.\ Links~\citep{cooper2006links} or Hop~\citep{serrano2006hop}.
  In a tierless language the developer writes the application as a single program. The code for different tiers is simultaneously checked by the compiler, and compiled to the required component languages. For example, Links compiles to HTML and JavaScript for the web client and to SQL on the server to interact with the database system. Tierless languages for \gls{IOT} stacks are more recent and less common, examples include
-Potato~\cite{troyer_building_2018} and \gls{CLEAN} with \gls{ITASK}\slash\gls{MTASK}~\cite{lubbers_interpreting_2019}.
+Potato~\citep{troyer_building_2018} and \gls{CLEAN} with \gls{ITASK}\slash\gls{MTASK}~\citep{lubbers_interpreting_2019}.
  
  \Gls{IOT} sensor nodes may be microcontrollers with very limited compute resources, or supersensors: resource-rich single board computers like a Raspberry Pi. A tierless language may target either class of sensor node, and microcontrollers  are the more demanding target due to the limited resources, e.g.\ small memory, executing on bare metal etc.
  
  Potentially a tierless language both reduces the development effort and improves correctness as correct interoperation and communication is automatically generated by the compiler. A tierless language may, however, introduce other problems. How expressive is the language? That is, can it readily express the required functionality? How maintainable is the software? Is the generated code efficient in terms of time, space, and power?
  
  
-This paper reports a systematic comparative evaluation of two tierless language technologies for \gls{IOT} stacks: one targeting resource-constrained microcontrollers, and the other resource-rich supersensors. The basis of the comparison is four implementations of a typical smart campus \gls{IOT} stack~\cite{hentschel_supersensors:_2016}. Two implementations are conventional tiered  \gls{PYTHON}-based stacks: \gls{PRS} and \gls{PWS}. The other two implementations are tierless: \gls{CRS} and \gls{CWS}. Our work makes the following research contributions, and the key results are summarised, discussed, and quantified in \cref{sec_t4t:Conclusion}.
+This paper reports a systematic comparative evaluation of two tierless language technologies for \gls{IOT} stacks: one targeting resource-constrained microcontrollers, and the other resource-rich supersensors. The basis of the comparison is four implementations of a typical smart campus \gls{IOT} stack~\citep{hentschel_supersensors:_2016}. Two implementations are conventional tiered  \gls{PYTHON}-based stacks: \gls{PRS} and \gls{PWS}. The other two implementations are tierless: \gls{CRS} and \gls{CWS}. Our work makes the following research contributions, and the key results are summarised, discussed, and quantified in \cref{sec_t4t:Conclusion}.
  
  \begin{description}
         \item[C1] We show that \emph{tierless languages have the potential to significantly reduce the development effort for  \gls{IOT} systems}.
-               We  systematically compare code size (\gls{SLOC}) of the four smart campus implementations as a  measure of development effort and maintainability~\cite{alpernaswonderful,rosenberg1997some}.
+               We  systematically compare code size (\gls{SLOC}) of the four smart campus implementations as a  measure of development effort and maintainability~\citep{alpernaswonderful,rosenberg1997some}.
          The tierless implementations require 70\% less code than the tiered implementations. We analyse the codebases to attribute the code reduction to three factors.
          \begin{enumerate*}
              \item Tierless languages benefit from reduced interoperation, requiring far fewer languages, paradigms and source code files e.g.\ \gls{CWS} uses two languages, one paradigm and three source code files where \gls{PWS} uses seven languages, two paradigms and 35 source code files
@@ -67,7 +67,7 @@ This paper reports a systematic comparative evaluation of two tierless language
      We show that the bare metal execution environment enforces some restrictions on \gls{MTASK} although they remain high level. Moreover, the environment conveys some advantages, e.g.\ better control over timing (\cref{sec_t4t:ComparingTierless}).
  \end{description}
  
-The current work extends~\cite{lubbers_tiered_2020} as follows. Contributions C3 and C4 are entirely new, and C1 is enhanced by being based on the analysis of four rather than two languages and implementations.
+The current work extends~\citep{lubbers_tiered_2020} as follows. Contributions C3 and C4 are entirely new, and C1 is enhanced by being based on the analysis of four rather than two languages and implementations.
  
  \section{Background and related work}%
  \label{sec_t4t:Background}
@@ -86,9 +86,9 @@ We have deployed sensor nodes into 12 rooms in two buildings. The \gls{IOT} syst
  access to sensor data through a RESTful \gls{API}.
  This allows campus stakeholders to add functionality at a business layer above the layers that we consider here. To date,
  simple apps have been developed including room temperature
-monitors and campus utilization maps~\cite{hentschel_supersensors:_2016}.
+monitors and campus utilization maps~\citep{hentschel_supersensors:_2016}.
  A longitudinal study of sensor accuracy has also been
-conducted~\cite{harth_predictive_2018}.
+conducted~\citep{harth_predictive_2018}.
  
  \subsection{\texorpdfstring{\Gls{IOT}}{IoT} applications}%
  \label{sec_t4t:Stacks}
@@ -97,7 +97,7 @@ Web applications are necessarily complex distributed systems, with client browse
  are even more complex as they combine a web application with a second distributed system of sensor and actuator nodes that collect and aggregate data, operate on it, and communicate with the server.
  
  Both web and \gls{IOT} applications are commonly structured into tiers, e.g.\ the classical four-tier Linux, Apache, MySQL and PHP (LAMP) stack.
-\Gls{IOT} stacks typically have more tiers than webapps, with the number  depending on the  complexity of the application~\cite{sethi2017internet}. While other tiers, like the business layer~\cite{muccini2018iot} may be added above them, the focus of our study is on programming the lower four tiers of the \gls{PRS}, \gls{CRS}, \gls{PWS} and \gls{CWS} stacks, as illustrated in \cref{fig_t4t:iot_arch}.
+\Gls{IOT} stacks typically have more tiers than webapps, with the number  depending on the  complexity of the application~\citep{sethi2017internet}. While other tiers, like the business layer~\citep{muccini2018iot} may be added above them, the focus of our study is on programming the lower four tiers of the \gls{PRS}, \gls{CRS}, \gls{PWS} and \gls{CWS} stacks, as illustrated in \cref{fig_t4t:iot_arch}.
  
  \begin{landscape}
         \begin{figure}[ht]
@@ -130,11 +130,11 @@ Using multiple tiers to
  structure complex software is a common software engineering practice that provides significant architectural benefits for \gls{IOT} and other software. The tiered \gls{PYTHON} \gls{PRS} and \gls{PWS} stacks exhibit these benefits.
  \begin{enumerate}
  
-       \item Modularity: tiers allow a system to be structured as a set of components with clearly defined functionality. They can be implemented independently, and may be interchanged with other components that have similar functionality~\cite{maccormack2007impact}. In \gls{PRS} and \gls{PWS}, for example, a different NoSQL DBMS could relatively easily be substituted for {MongoDB}
+       \item Modularity: tiers allow a system to be structured as a set of components with clearly defined functionality. They can be implemented independently, and may be interchanged with other components that have similar functionality~\citep{maccormack2007impact}. In \gls{PRS} and \gls{PWS}, for example, a different NoSQL DBMS could relatively easily be substituted for {MongoDB}
  
-       \item Abstraction: the hierarchical composition of components in the stack abstracts the view of the system as a whole. Enough detail is provided to understand the roles of each layer and how the components relate to one another~\cite{belle2013layered}. \Cref{fig_t4t:iot_arch} illustrates the abstraction of \gls{PRS} and \gls{PWS} into four tiers.
+       \item Abstraction: the hierarchical composition of components in the stack abstracts the view of the system as a whole. Enough detail is provided to understand the roles of each layer and how the components relate to one another~\citep{belle2013layered}. \Cref{fig_t4t:iot_arch} illustrates the abstraction of \gls{PRS} and \gls{PWS} into four tiers.
  
-       \item Cohesion: well-defined boundaries ensure each tier contains functionality directly related to the task of the component~\cite{lee2001component}. The tiers in \gls{PRS} and \gls{PWS} contain all the functionality associated with perception, networking, application and presentation respectively.
+       \item Cohesion: well-defined boundaries ensure each tier contains functionality directly related to the task of the component~\citep{lee2001component}. The tiers in \gls{PRS} and \gls{PWS} contain all the functionality associated with perception, networking, application and presentation respectively.
  
  \end{enumerate}
  
@@ -142,7 +142,7 @@ However, a tiered architecture poses significant challenges for developers of \g
  
  \begin{enumerate}
  
-       \item Polyglot Development {--} the developer must be fluent in all the languages and components in the stack, known as being a full stack developer for webapps~\cite{mazzei2018full}. That is, the developer must correctly use multiple languages that have different paradigms, i.e.\ manage significant \emph{semantic friction}~\cite{ireland2009classification}. For example the \gls{PWS} developer must integrate components written in seven languages with two paradigms (\cref{sec_t4t:interoperation}).
+       \item Polyglot Development {--} the developer must be fluent in all the languages and components in the stack, known as being a full stack developer for webapps~\citep{mazzei2018full}. That is, the developer must correctly use multiple languages that have different paradigms, i.e.\ manage significant \emph{semantic friction}~\citep{ireland2009classification}. For example the \gls{PWS} developer must integrate components written in seven languages with two paradigms (\cref{sec_t4t:interoperation}).
  
         \item Correct Interoperation {--} the developer must adhere to the \gls{API} or communication protocols between components. \Cref{sec_t4t:codesize,sec_t4t:resourcerich} show that communication requires some 17\% of \gls{PRS} and \gls{PWS} code, so around 100 \gls{SLOC}. \Cref{sec_t4t:Communication} discusses the complexity of writing this distributed communication code.
         
@@ -153,32 +153,32 @@ However, a tiered architecture poses significant challenges for developers of \g
  \end{enumerate}
  
  
-Beyond \gls{PRS}  and \gls{PWS} the challenges of tiered polyglot software development are evidenced in real world studies. As recent examples, a study of GitHub open source projects found an average of five different languages in each project, with many using tiered architectures~\cite{mayer2017multi}.
-An earlier empirical study of GitHub shows that using more languages to implement a project has a significant effect on project quality, since it increases defects~\cite{kochhar2016large}.
-A study of \gls{IOT} stack developers found that interoperation poses a real challenge, that microservices blur the abstraction between tiers, and that both testing and scaling \gls{IOT} applications to more devices are hard~\cite{motta2018challenges}.
+Beyond \gls{PRS}  and \gls{PWS} the challenges of tiered polyglot software development are evidenced in real world studies. As recent examples, a study of GitHub open source projects found an average of five different languages in each project, with many using tiered architectures~\citep{mayer2017multi}.
+An earlier empirical study of GitHub shows that using more languages to implement a project has a significant effect on project quality, since it increases defects~\citep{kochhar2016large}.
+A study of \gls{IOT} stack developers found that interoperation poses a real challenge, that microservices blur the abstraction between tiers, and that both testing and scaling \gls{IOT} applications to more devices are hard~\citep{motta2018challenges}.
  
  One way of minimising the challenges of developing tiered polyglot \gls{IOT} software is to standardise and reuse components. This approach has been hugely successful for web stacks, e.g.\ browser standards. The W3C
-Web of Things aims to facilitate re-use by providing standardised metadata and other re-usable technological \gls{IOT} building blocks~\cite{guinard_building_2016}.  However, the Web of Things has yet to gain widespread adoption. Moreover, as it is based on web technology, it requires the \emph{thing} to run a web server, significantly increasing the hardware requirements.
+Web of Things aims to facilitate re-use by providing standardised metadata and other re-usable technological \gls{IOT} building blocks~\citep{guinard_building_2016}.  However, the Web of Things has yet to gain widespread adoption. Moreover, as it is based on web technology, it requires the \emph{thing} to run a web server, significantly increasing the hardware requirements.
  
  \section{Tierless languages}%
  \label{sec_t4t:TiredvsTierless}
  
  A radical approach to overcoming the challenges raised by tiered distributed software is to use a tierless programming language that eliminates the semantic friction between tiers by generating code for all tiers, and all communication between  tiers, from a single program. 
-%\adriancomment{Also referred to as multi-tier programming, tierless language applications usually utilise a single language, paradigm and type system, and the entire system is simultaneously checked by the compiler~\cite{weisenburger2020survey}.}
+%\adriancomment{Also referred to as multi-tier programming, tierless language applications usually utilise a single language, paradigm and type system, and the entire system is simultaneously checked by the compiler~\citep{weisenburger2020survey}.}
  Typically a tierless program uses a single language, paradigm and type system, and the entire distributed system is simultaneously checked by the compiler.
  
-There is intense interest in developing tierless, or multitier, language technologies with a number of research languages developed over the last fifteen years, e.g.\ \cite{cooper2006links, serrano2006hop, troyer_building_2018, 10.1145/2775050.2633367}. These languages demonstrate the
-advantages of the paradigm, including less development effort, better maintainability, and sound semantics of distributed execution. At the same time a number of industrial technologies incorporate tierless concepts, e.g.\ \cite{balat2006ocsigen, bjornson2010composing, strack2015getting}. These languages demonstrate the benefits of the paradigm in practice. Some tierless languages use (embedded) \glspl{DSL} to specify parts of the multi-tier software.
+There is intense interest in developing tierless, or multitier, language technologies with a number of research languages developed over the last fifteen years, e.g.\ \citep{cooper2006links, serrano2006hop, troyer_building_2018, 10.1145/2775050.2633367}. These languages demonstrate the
+advantages of the paradigm, including less development effort, better maintainability, and sound semantics of distributed execution. At the same time a number of industrial technologies incorporate tierless concepts, e.g.\ \citep{balat2006ocsigen, bjornson2010composing, strack2015getting}. These languages demonstrate the benefits of the paradigm in practice. Some tierless languages use (embedded) \glspl{DSL} to specify parts of the multi-tier software.
  
-Tierless languages have been developed for a range of distributed paradigms, including web applications, client-server applications, mobile applications, and generic distributed systems. A recent and substantial survey of these tierless technologies is available~\cite{weisenburger2020survey}. Here we provide a brief introduction to tierless languages with a focus on \gls{IOT} software.
+Tierless languages have been developed for a range of distributed paradigms, including web applications, client-server applications, mobile applications, and generic distributed systems. A recent and substantial survey of these tierless technologies is available~\citep{weisenburger2020survey}. Here we provide a brief introduction to tierless languages with a focus on \gls{IOT} software.
  
  \subsection{Tierless web languages}
  % Standalone DSLs
  There are established tierless languages for web development, both standalone languages and \glspl{DSL} embedded in a host language.
-Example standalone tierless web languages are Links~\cite{cooper2006links} and Hop~\cite{serrano2006hop}.
+Example standalone tierless web languages are Links~\citep{cooper2006links} and Hop~\citep{serrano2006hop}.
  From a single declarative program the client, server and database code is simultaneously checked by the compiler, and compiled to the required component languages. For example, Links compiles to HTML and JavaScript for the client side and to SQL on the server-side to interact with the database system.
  
-An example tierless web framework that uses a \gls{DSL} is  Haste~\cite{10.1145/2775050.2633367},  that embeds the \gls{DSL} in \gls{HASKELL}. Haste programs are compiled multiple times: the server code is generated by the standard \gls{GHC} \gls{HASKELL} compiler~\cite{hall1993glasgow}; Javascript for the client is generated by a custom \gls{GHC} compiler backend. The design leverages \gls{HASKELL}'s high-level programming abstractions and strong typing, and benefits from \gls{GHC}: a mature and sophisticated compiler.
+An example tierless web framework that uses a \gls{DSL} is  Haste~\citep{10.1145/2775050.2633367},  that embeds the \gls{DSL} in \gls{HASKELL}. Haste programs are compiled multiple times: the server code is generated by the standard \gls{GHC} \gls{HASKELL} compiler~\citep{hall1993glasgow}; Javascript for the client is generated by a custom \gls{GHC} compiler backend. The design leverages \gls{HASKELL}'s high-level programming abstractions and strong typing, and benefits from \gls{GHC}: a mature and sophisticated compiler.
  
  
  \subsection{Tierless \texorpdfstring{\gls{IOT}}{IoT} languages}
@@ -188,24 +188,24 @@ The presentation layer of a tierless \gls{IOT} language, like tierless web langu
  
  \subsubsection{\texorpdfstring{\Glspl{DSL}}{DSLs} for microcontrollers}
  Many \glspl{DSL} provide high-level programming for microcontrollers, for example providing strong typing and memory safety.
-For example Copilot~\cite{hess_arduino-copilot_2020}
-and Ivory~\cite{elliott_guilt_2015} are imperative \glspl{DSL} embedded in a functional language that compile to \gls{C}\slash\gls{CPP}.  In contrast to  \gls{CLEAN}/\gls{ITASK}/\gls{MTASK} such \glspl{DSL} are not tierless \gls{IOT} languages as they have no automatic integration with the server, i.e.\ with the application and presentation layers.
+For example Copilot~\citep{hess_arduino-copilot_2020}
+and Ivory~\citep{elliott_guilt_2015} are imperative \glspl{DSL} embedded in a functional language that compile to \gls{C}\slash\gls{CPP}.  In contrast to  \gls{CLEAN}/\gls{ITASK}/\gls{MTASK} such \glspl{DSL} are not tierless \gls{IOT} languages as they have no automatic integration with the server, i.e.\ with the application and presentation layers.
  
  
  \subsubsection{\texorpdfstring{\Gls{FRP}}{Functional reactive programming}}
  \Gls{FRP} is a declarative paradigm often used for implementing the perception layer of an \gls{IOT} stack.
-Examples include Emfrp~\cite{sawada_emfrp:_2016}, CFRP~\cite{suzuki_cfrp_2017}, XFRP~\cite{10.1145/3281366.3281370}, Juniper~\cite{helbling_juniper:_2016}, Hailstorm~\cite{sarkar_hailstorm_2020}, and Haski~\cite{valliappan_towards_2020}.
+Examples include Emfrp~\citep{sawada_emfrp:_2016}, CFRP~\citep{suzuki_cfrp_2017}, XFRP~\citep{10.1145/3281366.3281370}, Juniper~\citep{helbling_juniper:_2016}, Hailstorm~\citep{sarkar_hailstorm_2020}, and Haski~\citep{valliappan_towards_2020}.
  None of these languages are tierless \gls{IOT} languages as they have no automatic integration with the server.
  
-Potato goes beyond other \gls{FRP} languages to provide a tierless \gls{FRP} \gls{IOT} language for resource rich sensor nodes~\cite{troyer_building_2018}. It does so using the Erlang programming language and sophisticated virtual machine. 
+Potato goes beyond other \gls{FRP} languages to provide a tierless \gls{FRP} \gls{IOT} language for resource rich sensor nodes~\citep{troyer_building_2018}. It does so using the Erlang programming language and sophisticated virtual machine. 
  
-TOP allows for more complex collaboration patterns than \gls{FRP}~\cite{wang_maintaining_2018}, and in consequence is unable to provide the strong guarantees on memory usage available in a restricted variant of \gls{FRP} such as arrowized \gls{FRP}~\cite{nilsson_functional_2002}.
+TOP allows for more complex collaboration patterns than \gls{FRP}~\citep{wang_maintaining_2018}, and in consequence is unable to provide the strong guarantees on memory usage available in a restricted variant of \gls{FRP} such as arrowized \gls{FRP}~\citep{nilsson_functional_2002}.
  
  \subsubsection{Erlang/Elixir \texorpdfstring{\gls{IOT}}{IoT} systems}
  A number of production \gls{IOT} systems are engineered in Erlang or Elixir, and many are mostly tierless.
  That is the perception, network and application layers are sets of distributed Erlang processes, although the presentation layer typically uses some conventional web technology.
  A resource-rich sensor node may support many Erlang processes on an Erlang VM, or low level code (typically \gls{C}\slash\gls{CPP}) on a resource-constrained microcontroller can emulate an Erlang process.
-Only a small fraction of these systems are described in the academic literature, example exceptions are~\cite{sivieri2012drop,shibanai_distributed_2018}, with many described only in grey literature or not at all.
+Only a small fraction of these systems are described in the academic literature, example exceptions are~\citep{sivieri2012drop,shibanai_distributed_2018}, with many described only in grey literature or not at all.
  
  \subsection{Characteristics of tierless \texorpdfstring{\gls{IOT}}{IoT} languages}%
  \label{sec_t4t:characteristics}
@@ -215,7 +215,7 @@ This study compares a pair of tierless \gls{IOT} languages with conventional tie
  \subsubsection{Program splitting}
  
  A key challenge for an automatically segmented tierless language is to determine which parts of the program correspond to a particular tier and hence should be executed by a specific component on a specific host, so-called tier splitting.
-For example a tierless web language must identify client code to ship to browsers, database code to execute in the DBMS, and application code to run on the server. Some tierless languages split programs using types, others use syntactic markers, e.g.\ pragmas  like \cleaninline{server} or \cleaninline{client}, to split the program~\cite{cooper2006links,10.1145/2775050.2633367}. It may be possible to infer the splitting between tiers, relieving the developers from the need to specify it, as illustrated for JavaScript as a tierless web language~\cite{10.1145/2661136.2661146}.
+For example a tierless web language must identify client code to ship to browsers, database code to execute in the DBMS, and application code to run on the server. Some tierless languages split programs using types, others use syntactic markers, e.g.\ pragmas  like \cleaninline{server} or \cleaninline{client}, to split the program~\citep{cooper2006links,10.1145/2775050.2633367}. It may be possible to infer the splitting between tiers, relieving the developers from the need to specify it, as illustrated for JavaScript as a tierless web language~\citep{10.1145/2661136.2661146}.
  
  In \gls{CLEAN}\slash\gls{ITASK}/\gls{MTASK} and \gls{CLEAN}/\gls{ITASK} tier splitting is specified by functions, and hence is a first-class language construct.
  For example in \gls{CLEAN}\slash\gls{ITASK} the \cleaninline{asyncTask}  function identifies a task for execution on a remote device and \cleaninline{liftmTask} executes the given task on an \gls{IOT} device. The tier splitting functions are illustrated in examples in the next section, e.g.\ on \cref{lst_t4t:itaskTempFull:startdevtask} in \cref{lst_t4t:itaskTempFull} and \cref{lst_t4t:mtaskTemp:liftmtask} in \cref{lst_t4t:mtaskTemp}.
@@ -223,7 +223,7 @@ Specifying splitting as functions means that new splitting functions can be comp
  
  \subsubsection{Communication}\label{ssec_t4t:communication}
  
-Tierless languages may adopt a range of communication paradigms for communicating between components. Different tierless languages specify communication in different ways~\cite{weisenburger2020survey}. Remote procedures are the most common communication mechanism: a procedure/function executing on a remote host/machine is called as if it was local. The communication of the arguments to, and the results from, the remote procedure is automatically provided by the language implementation.  Other mechanisms include explicit message passing between components; publish/subscribe where components subscribe to topics of interest from other components; reactive programming defines event streams between remote components; finally shared state makes changes in a shared and potentially remote data structure visible to components.
+Tierless languages may adopt a range of communication paradigms for communicating between components. Different tierless languages specify communication in different ways~\citep{weisenburger2020survey}. Remote procedures are the most common communication mechanism: a procedure/function executing on a remote host/machine is called as if it was local. The communication of the arguments to, and the results from, the remote procedure is automatically provided by the language implementation.  Other mechanisms include explicit message passing between components; publish/subscribe where components subscribe to topics of interest from other components; reactive programming defines event streams between remote components; finally shared state makes changes in a shared and potentially remote data structure visible to components.
  
  \Gls{CLEAN}\slash\gls{ITASK}/\gls{MTASK} and \gls{CLEAN}/\gls{ITASK} communicate using a combination of remote task invocation, similar to remote procedures, and shared state through \glspl{SDS}.
  \Cref{lst_t4t:itaskTempFull} illustrates: \cref{lst_t4t:itaskTempFull:startdevtask} shows a server task launching  a remote task, \cleaninline{devTask}, on to a sensor node; and \cref{lst_t4t:itaskTempFull:remoteShare} shows the sharing of the remote \cleaninline{latestTemp} \gls{SDS}.
@@ -235,57 +235,57 @@ In many \gls{IOT} systems the sensor nodes are microcontrollers that are program
  Hence, most \gls{IOT} systems compile sensor node code directly for the target architecture or via an existing language such as \gls{C}\slash\gls{CPP}.
  
  Techniques such as over-the-air programming and interpreters allow microcontrollers to be dynamically provisioned, increasing their maintainability and resilience.
-For example Baccelli et al.\ provide a single language \gls{IOT} system based on the RIOT \gls{OS} that allows runtime deployment of code snippets called containers~\cite{baccelli_reprogramming_2018}.
+For example Baccelli et al.\ provide a single language \gls{IOT} system based on the RIOT \gls{OS} that allows runtime deployment of code snippets called containers~\citep{baccelli_reprogramming_2018}.
  Both client and server are written in JavaScript. However, there is no integration between the client and the server other than that they are programmed from a single source.
-Mat\'e is an example of an early tierless sensor network framework where devices are provided with a virtual machine using TinyOS for dynamic provisioning~\cite{levis_mate_2002}.
+Mat\'e is an example of an early tierless sensor network framework where devices are provided with a virtual machine using TinyOS for dynamic provisioning~\citep{levis_mate_2002}.
  
  Placement specifies how data and computations in a tierless program are assigned to the
-devices/hosts in the distributed system. Different tierless languages specify placement in different ways, e.g.\ code annotations or configuration files, and at different granularities, e.g.\ per function or per class~\cite{weisenburger2020survey}.
+devices/hosts in the distributed system. Different tierless languages specify placement in different ways, e.g.\ code annotations or configuration files, and at different granularities, e.g.\ per function or per class~\citep{weisenburger2020survey}.
  
  \Gls{CLEAN}\slash\gls{ITASK}/\gls{MTASK} and \gls{CLEAN}/\gls{ITASK} both use dynamic task placement, similar to dynamic function placement.
  In \gls{CLEAN}\slash\gls{ITASK}/\gls{MTASK} sensor nodes are programmed once with the \gls{MTASK} \gls{RTS}, and possibly some precompiled tasks.
  Thereafter a sensor node can dynamically receive \gls{MTASK} programs, compiled at runtime by the server.
-In \gls{CLEAN}\slash\gls{ITASK} the sensor node runs an \gls{ITASK} server that recieves and executes code from the (\gls{IOT}) server~\cite{oortgiese_distributed_2017}.
-%The \gls{ITASK} server decides what code to execute depending on the serialised execution graph that the server sends~\cite{oortgiese_distributed_2017}.
+In \gls{CLEAN}\slash\gls{ITASK} the sensor node runs an \gls{ITASK} server that receives and executes code from the (\gls{IOT}) server~\citep{oortgiese_distributed_2017}.
+%The \gls{ITASK} server decides what code to execute depending on the serialised execution graph that the server sends~\citep{oortgiese_distributed_2017}.
  Placement happens automatically as part of the first-class splitting constructs outlined in \cref{ssec_t4t:communication}, so \cref{lst_t4t:mtaskTemp:liftmtask} in \cref{lst_t4t:mtaskTemp} places \cleaninline{devTask} onto the \cleaninline{dev} sensor node. 
  
  \subsubsection{Security}
  
-Security is a major issue and a considerable challenge for many \gls{IOT} systems~\cite{10.1145/3437537}. There are potentially security issues at each layer in an \gls{IOT} application (\cref{fig_t4t:iot_arch}). The security issues and defence mechanisms at the application and presentation layers are relatively standard, e.g.\ defending against SQL injection attacks. The security issues at the network and perception layers are more challenging. Resource-rich sensor nodes can adopt some standard security measures like encrypting messages, and regularly applying software patches to the operating system. However microcontrollers often lack the computational resources for encryption, and it is hard to patch their system software because the program is often stored in flash memory. In consequence there are infamous examples of \gls{IOT} systems being hijacked to create botnets~\cite{203628,herwig_measurement_2019}.
+Security is a major issue and a considerable challenge for many \gls{IOT} systems~\citep{10.1145/3437537}. There are potentially security issues at each layer in an \gls{IOT} application (\cref{fig_t4t:iot_arch}). The security issues and defence mechanisms at the application and presentation layers are relatively standard, e.g.\ defending against SQL injection attacks. The security issues at the network and perception layers are more challenging. Resource-rich sensor nodes can adopt some standard security measures like encrypting messages, and regularly applying software patches to the operating system. However microcontrollers often lack the computational resources for encryption, and it is hard to patch their system software because the program is often stored in flash memory. In consequence there are infamous examples of \gls{IOT} systems being hijacked to create botnets~\citep{203628,herwig_measurement_2019}.
  
  Securing the entire stack in a conventional tiered \gls{IOT} application is particularly challenging as the stack is implemented in a collection of programming languages with low level programming and communication abstractions. In such polyglot distributed systems it is hard to determine, and hence secure, the flow of data between components. In consequence a small mistake may have severe security implications. 
  
-A number of characteristics of tierless languages help to improve security. Communication and placement vulnerabilities are minimised as communication and placement are automatically generated and checked by the compiler.  So injection attacks and the exploitation of communication/placement protocol bugs are less likely. Vulnerabilities introduced by mismatched types are avoided as the entire system is type checked. Moreover, tierless languages can exploit language level security techniques. For example  languages like Jif/split~\cite{zdancewic2002secure} and Swift~\cite{chong2007secure} place components to protect the security of data. Another example are programming language technologies for controlling information flow, and these can be used to improve security. For example Haski uses them to improve the security of \gls{IOT} systems~\cite{valliappan_towards_2020}. 
+A number of characteristics of tierless languages help to improve security. Communication and placement vulnerabilities are minimised as communication and placement are automatically generated and checked by the compiler.  So injection attacks and the exploitation of communication/placement protocol bugs are less likely. Vulnerabilities introduced by mismatched types are avoided as the entire system is type checked. Moreover, tierless languages can exploit language level security techniques. For example  languages like Jif/split~\citep{zdancewic2002secure} and Swift~\citep{chong2007secure} place components to protect the security of data. Another example are programming language technologies for controlling information flow, and these can be used to improve security. For example Haski uses them to improve the security of \gls{IOT} systems~\citep{valliappan_towards_2020}. 
  
-However many tierless languages have yet to provide a comprehensive set of security technologies, despite its importance in domains like web and \gls{IOT} applications. For example Erlang and many Erlang-based systems~\cite{shibanai_distributed_2018,sivieri2012drop}, lack important security measures. Indeed security is not covered in a recent, otherwise comprehensive, survey of tierless technologies~\cite{weisenburger2020survey}.
+However many tierless languages have yet to provide a comprehensive set of security technologies, despite its importance in domains like web and \gls{IOT} applications. For example Erlang and many Erlang-based systems~\citep{shibanai_distributed_2018,sivieri2012drop}, lack important security measures. Indeed security is not covered in a recent, otherwise comprehensive, survey of tierless technologies~\citep{weisenburger2020survey}.
  
-\Gls{CLEAN}\slash\gls{ITASK} and \gls{CLEAN}/\gls{ITASK}/\gls{MTASK} are typical in this respect: little effort has yet been expended on improving their security. Of course as tierless languages they  benefit from static type safety and automatically generated communication and placement. Some preliminary work shows that, as the communication between layers is protocol agnostic, more secure alternatives can be used. One example is to run the \gls{ITASK} server behind a reverse proxy implementing TLS/SSL encryption~\cite{wijkhuizen_security_2018}. A second is to add  integrity checks or even encryption to the communication protocol for resource-rich sensor nodes~\cite{boer_de_secure_2020}.
+\Gls{CLEAN}\slash\gls{ITASK} and \gls{CLEAN}/\gls{ITASK}/\gls{MTASK} are typical in this respect: little effort has yet been expended on improving their security. Of course as tierless languages they  benefit from static type safety and automatically generated communication and placement. Some preliminary work shows that, as the communication between layers is protocol agnostic, more secure alternatives can be used. One example is to run the \gls{ITASK} server behind a reverse proxy implementing TLS/SSL encryption~\citep{wijkhuizen_security_2018}. A second is to add  integrity checks or even encryption to the communication protocol for resource-rich sensor nodes~\citep{boer_de_secure_2020}.
  
  \section{Task-oriented and \texorpdfstring{\gls{IOT}}{IoT} programming in \texorpdfstring{\gls{CLEAN}}{Clean}}
  
  To make this paper self-contained we provide a concise overview of \gls{CLEAN}, \gls{TOP}, and \gls{IOT} programming in \gls{ITASK} and \gls{MTASK}. The minor innovations reported here are the interface to the \gls{IOT} sensors, and the \gls{CLEAN} port for the Raspberry Pi.
  
-\Gls{CLEAN} is a statically typed functional programming language similar to \gls{HASKELL}: both languages are pure and non-strict~\cite{achten_clean_2007}.
-A key difference is how state is handled: \gls{HASKELL} typically embeds stateful actions in the \haskellinline{IO} Monad~\cite{peyton_jones_imperative_1993,wiki:IO}.
-In contrast, \gls{CLEAN} has a uniqueness type system to ensure the single-threaded use of stateful objects like files and windows~\cite{barendsen_smetsers_1996}.
-Both \gls{CLEAN} and \gls{HASKELL} support fairly similar models of generic programming~\cite{ComparingGenericProgramming}, enabling functions to work on many types. As we shall see generic programming is heavily used in task-oriented programming~\cite{GenericProgrammingExtensionForClean,HinzeGenericFunctionalProgramming},  for example to construct web editors and communication protocols that work for any user-defined datatype.
+\Gls{CLEAN} is a statically typed functional programming language similar to \gls{HASKELL}: both languages are pure and non-strict~\citep{achten_clean_2007}.
+A key difference is how state is handled: \gls{HASKELL} typically embeds stateful actions in the \haskellinline{IO} Monad~\citep{peyton_jones_imperative_1993,wiki:IO}.
+In contrast, \gls{CLEAN} has a uniqueness type system to ensure the single-threaded use of stateful objects like files and windows~\citep{barendsen_smetsers_1996}.
+Both \gls{CLEAN} and \gls{HASKELL} support fairly similar models of generic programming~\citep{ComparingGenericProgramming}, enabling functions to work on many types. As we shall see generic programming is heavily used in task-oriented programming~\citep{GenericProgrammingExtensionForClean,HinzeGenericFunctionalProgramming},  for example to construct web editors and communication protocols that work for any user-defined datatype.
  
  \subsection{\texorpdfstring{\Acrlong{TOP}}{Task-oriented programming}}
  
-\Gls{TOP} is a declarative programming paradigm for  constructing interactive distributed systems~\cite{plasmeijer_task-oriented_2012}.
+\Gls{TOP} is a declarative programming paradigm for  constructing interactive distributed systems~\citep{plasmeijer_task-oriented_2012}.
  Tasks are the basic blocks of \gls{TOP} and represent work that needs to be done in the broadest sense.
  Examples of typical tasks range from allowing a user to complete a form, controlling peripherals, moderating other tasks, or monitoring a database.
  From a single declarative description of tasks all of the required software components are generated.
  This may include web servers, client code for browsers or \gls{IOT} devices, and for their interoperation.
  That is, from a single \gls{TOP} program the language implementation automatically generates an \emph{integrated distributed system}.
-Application areas range from simple web forms or blinking \glspl{LED} to multi-user distributed collaboration between people and machines~\cite{oortgiese_distributed_2017}.
+Application areas range from simple web forms or blinking \glspl{LED} to multi-user distributed collaboration between people and machines~\citep{oortgiese_distributed_2017}.
  
  
  \Gls{TOP} adds three concepts: tasks, task combinators, and \glspl{SDS}. Example basic tasks are web editors for user-defined datatypes, reading some \gls{IOT} sensor,  or controlling peripherals like a servo motor.
  Task combinators compose tasks into more advanced tasks, either in parallel or in sequence, and allow task values to be observed by other tasks.
  As tasks can be returned as the result of a function, recursion can be freely used, e.g.\ to express the repetition of tasks.
  There are also standard combinators for common patterns.
-Tasks can exchange information via \glspl{SDS}~\cite{ParametricLenses}.
+Tasks can exchange information via \glspl{SDS}~\citep{ParametricLenses}.
  All tasks involved can atomically observe and change the value of a typed \gls{SDS}, allowing more flexible communication than with task combinators.
  \glspl{SDS} offer a general abstraction of data shared by different tasks, analogous to variables, persistent values, files, databases and peripherals like sensors. Combinators  compose \glspl{SDS} into a larger \gls{SDS}, and
  parametric lenses define a specific view on an \gls{SDS}.
@@ -296,7 +296,7 @@ parametric lenses define a specific view on an \gls{SDS}.
  \label{sec_t4t:itasks}
  
  
-The \gls{ITASK} \gls{EDSL} is designed for constructing multi-user distributed  applications, including web~\cite{TOP-ICFP07} or \gls{IOT} applications.
+The \gls{ITASK} \gls{EDSL} is designed for constructing multi-user distributed  applications, including web~\citep{TOP-ICFP07} or \gls{IOT} applications.
  Here we present \gls{ITASK} by example, and the first is a complete program to repeatedly read the room temperature from a digital humidity and temperature (DHT) sensor attached to the machine and display it on a web page (\cref{lst_t4t:itaskTemp}).
  The first line is the module name, the third imports the \cleaninline{iTask} module, and the main function (\cref{lst_t4t:itaskTemp:systemfro,lst_t4t:itaskTemp:systemto}) launches \cleaninline{readTempTask} and the \gls{ITASK} system to generate the web interface in \cref{fig_t4t:itaskTempSimple}.
  
@@ -628,7 +628,7 @@ Communication between a sensor node and the server is always initiated by the no
  relatively powerful Raspberry Pi 3 Model Bs. There is a simple object-oriented \gls{PYTHON} collector for configuring the sensors and reading their values. The collector daemon service marshals the sensor data and transmits using \gls{MQTT} to the central monitoring server at a preset frequency.
  The collector caches sensor data locally when the server is unreachable.
  
-In contrast to \gls{PRS}, \gls{PWS}'s sensor nodes are microcontrollers running \gls{MICROPYTHON}, a dialect of \gls{PYTHON} specifically designed to run on small, low powered embedded devices~\cite{kodali2016low}.
+In contrast to \gls{PRS}, \gls{PWS}'s sensor nodes are microcontrollers running \gls{MICROPYTHON}, a dialect of \gls{PYTHON} specifically designed to run on small, low powered embedded devices~\citep{kodali2016low}.
  To enable a fair comparison between the software stacks we are careful to use the same object-oriented software architecture, e.g.\ using the same classes in  \gls{PWS} and \gls{PRS}.
  
  \Gls{PYTHON} and \gls{MICROPYTHON} are appropriate tiered comparison languages. Tiered \gls{IOT} systems are implemented in a whole range of programming languages, with \gls{PYTHON}, \gls{MICROPYTHON}, \gls{C} and \gls{CPP} being popular for some tiers in many implementations. \gls{C}\slash\gls{CPP} implementations would probably result in more verbose programs and even less type safety.
@@ -644,7 +644,7 @@ SQLite as a database backend.
  Communication between a sensor node and the server is initiated by the server.
  
  \Gls{CRS}'s sensor nodes are Raspberry Pi 4s, and execute \gls{CLEAN}\slash\gls{ITASK} programs.
-Communication from the sensor node to the server is implicit and happens via \glspl{SDS} over \gls{TCP} using platform independent execution graph serialisation~\cite{oortgiese_distributed_2017}.
+Communication from the sensor node to the server is implicit and happens via \glspl{SDS} over \gls{TCP} using platform independent execution graph serialisation~\citep{oortgiese_distributed_2017}.
  
  \Gls{CWS}'s sensor nodes are Wemos microcontrollers running \gls{MTASK} tasks. Communication and serialisation are, by design,  very similar to \gls{ITASK}, i.e.\ via \glspl{SDS} over either a serial port connection, raw \gls{TCP}, or \gls{MQTT} over \gls{TCP}.
  
@@ -726,12 +726,12 @@ As the tierless languages synthesize the code to be executed on the sensor nodes
  \end{table}
  
  \Cref{tbl_t4t:mem} shows the maximum memory residency after garbage collection of the sensor node for all four smart campus implementations. The smart campus sensor node programs executing on the Wemos microcontrollers have low maximum residencies: \qty{20270}{\byte} for \gls{PWS} and \qty{880}{\byte} for \gls{CWS}. In \gls{CWS} the \gls{MTASK} system generates very high level \gls{TOP} byte code that is interpreted by the \gls{MTASK} virtual machine and uses a small and predictable amount of heap memory.
-In \gls{PWS}, the hand-written \gls{MICROPYTHON} is compiled to byte code for execution on the virtual machine. Low residency is achieved with a fixed size heap and efficient memory management. For example both \gls{MICROPYTHON} and \gls{MTASK} use fixed size allocation units and mark\&sweep garbage collection to minimise memory usage at the cost of some execution time~\cite{plamauer2017evaluation}.
+In \gls{PWS}, the hand-written \gls{MICROPYTHON} is compiled to byte code for execution on the virtual machine. Low residency is achieved with a fixed size heap and efficient memory management. For example both \gls{MICROPYTHON} and \gls{MTASK} use fixed size allocation units and mark\&sweep garbage collection to minimise memory usage at the cost of some execution time~\citep{plamauer2017evaluation}.
  
  The smart campus sensor node programs executing on the Raspberry Pis have far higher maximum residencies than those executing on the microcontrollers: \qty{3.5}{\mebi\byte} for \gls{PRS} and \qty{2.7}{\mebi\byte} for \gls{CRS}. In \gls{CRS} the sensor node code is a set of \gls{ITASK} tasks executing on a full-fledged \gls{ITASK} server running in distributed child mode and this consumes far more memory.
  %The memory used is actually very similar to the memory usage of the server with a single client connected.
  In \gls{PRS} the sensor node program is written in \gls{PYTHON}, a language far less focused on minimising memory usage than \gls{MICROPYTHON}. For example an object like a string is larger in \gls{PYTHON} than in \gls{MICROPYTHON}; the smaller \gls{MICROPYTHON} representation consequently does not support all features, such as \emph{f-strings}.
-Furthermore, not all advanced \gls{PYTHON} feature regarding classes are available in \gls{MICROPYTHON}, i.e.\ only a subset of the \gls{PYTHON} specification is supported~\cite{diffmicro}.%\mlcomment{reference \url{https://docs.micropython.org/en/latest/genrst/index.html} ? It contains an overview of supported features}
+Furthermore, not all advanced \gls{PYTHON} features regarding classes are available in \gls{MICROPYTHON}, i.e.\ only a subset of the \gls{PYTHON} specification is supported~\citep{diffmicro}.%\mlcomment{reference \url{https://docs.micropython.org/en/latest/genrst/index.html} ? It contains an overview of supported features}
  
  In summary the sensor node code generated by both tierless languages, \gls{ITASK} and \gls{MTASK}, is sufficiently memory efficient for the target sensor node hardware. Indeed, the maximum residencies of the \gls{CLEAN} sensor node code are lower than those of the corresponding hand-written (Micro)\gls{PYTHON} code. Of course in a tiered stack the hand-written code can be more easily optimised to minimise residency, and this could even entail using a memory efficient language like \gls{C}\slash\gls{CPP}. However, such optimisation requires additional developer effort, and a new language would introduce additional semantic friction.
  
@@ -757,9 +757,9 @@ This section investigates whether tierless languages make \gls{IOT} programming
  \label{sec_t4t:codesize}
  %A comparison of the Temperature sensor in \gls{PYTHON} Micropyton, Itask \& \gls{MTASK}.
  
-\paragraph{Code Size} is widely recognised as an approximate measure of the development and maintenance effort required for a software system~\cite{rosenberg1997some}. \gls{SLOC} is a common code size metric, and is especially useful for multi-paradigm systems like \gls{IOT} systems. It is based on the simple principle that the more \gls{SLOC}, the more developer effort and the increased likelihood of  bugs~\cite{rosenberg1997some}. It is a simple measure, not dependent on some formula, and can be automatically computed~\cite{sheetz2009understanding}.
+\paragraph{Code Size} is widely recognised as an approximate measure of the development and maintenance effort required for a software system~\citep{rosenberg1997some}. \gls{SLOC} is a common code size metric, and is especially useful for multi-paradigm systems like \gls{IOT} systems. It is based on the simple principle that the more \gls{SLOC}, the more developer effort and the increased likelihood of  bugs~\citep{rosenberg1997some}. It is a simple measure, not dependent on some formula, and can be automatically computed~\citep{sheetz2009understanding}.
  
-Of course \gls{SLOC} must be used carefully as it is easily influenced by programming style, language paradigm, and counting method~\cite{alpernaswonderful}. Here we are counting code to compare development effort, use the same idiomatic programming style in each component, and only count lines of code, omitting comments and blank lines.
+Of course \gls{SLOC} must be used carefully as it is easily influenced by programming style, language paradigm, and counting method~\citep{alpernaswonderful}. Here we are counting code to compare development effort, use the same idiomatic programming style in each component, and only count lines of code, omitting comments and blank lines.
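The counting rule described above (count lines of code, omit comments and blank lines) can be sketched as follows. This is a minimal illustration only, not the tool used to produce the figures in this chapter: the function name \texttt{count\_sloc} and the sample snippet are hypothetical, and the sketch only recognises whole-line comments with a single prefix, so block comments and docstrings would still be counted.

```python
def count_sloc(source: str, comment_prefix: str = "#") -> int:
    """Count lines that are neither blank nor whole-line comments.

    Assumption: comments are recognised only by a single-line prefix
    (e.g. "#" for Python, "//" for Clean); block comments are not handled.
    """
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_prefix):
            count += 1
    return count


# Hypothetical sensor-node snippet, used only to exercise the counter.
example = '''
# read the sensor
import dht

def read():
    return dht.temperature()  # a trailing comment still counts as code
'''

if __name__ == "__main__":
    print(count_sloc(example))
```

Passing a different prefix, e.g. \texttt{count\_sloc(text, "//")}, applies the same rule to languages with other single-line comment markers.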
  
  \Cref{table_t4t:multi} enumerates the \gls{SLOC} required to implement the \gls{UOG} smart campus functionalities in  \gls{PWS}, \gls{PRS}, \gls{CWS} and \gls{CRS}. Both \gls{PYTHON} and \gls{CLEAN} implementations use the same server and communication code for Raspberry Pi and for Wemos sensor nodes (rows 5--7 of the table).
  The Sensor Interface (SI) refers to code facilitating the communication between the peripherals and the sensor node software. % formerly hardware interface
@@ -813,7 +813,7 @@ The total size of \gls{CWS} and \gls{CRS} would be reduced by a factor of two an
  Before exploring the reasons for the smaller tierless codebase we compare the implementations for resource-rich and resource-constrained sensor nodes, again using \gls{SLOC} and code proportions. \Cref{table_t4t:multi} shows that the two tiered implementations are very similar in size: with \gls{PWS} for microcontrollers requiring 562 \gls{SLOC} and \gls{PRS} for supersensors requiring 576 \gls{SLOC}.
  The two tierless implementations are also similar in size: \gls{CWS} requiring 166 and  \gls{CRS} 155 \gls{SLOC}.
  
-There are several main reasons for the similarity. One is that the server-side code, i.e.\ for the presentation and application layers, is identical for both resource rich/constrained implementations. The identical server code accounts for approximately 40\% of the  \gls{PWS} and  \gls{PRS} codebases, and approximately 85\% of the  \gls{CWS} and  \gls{CRS} codebases (\cref{fig_t4t:multipercentage}). For the perception and network layers on the sensor nodes, the \gls{PYTHON} and \gls{MICROPYTHON} implementations have the same structure, e.g.\ a class for each type of sensor, and use analogous libraries. Indeed, approaches like CircuitPython~\cite{CircuitPython} allow the same code to execute on both resource-rich and resource-constrained sensor nodes.
+There are several main reasons for the similarity. One is that the server-side code, i.e.\ for the presentation and application layers, is identical for both resource rich/constrained implementations. The identical server code accounts for approximately 40\% of the  \gls{PWS} and  \gls{PRS} codebases, and approximately 85\% of the  \gls{CWS} and  \gls{CRS} codebases (\cref{fig_t4t:multipercentage}). For the perception and network layers on the sensor nodes, the \gls{PYTHON} and \gls{MICROPYTHON} implementations have the same structure, e.g.\ a class for each type of sensor, and use analogous libraries. Indeed, approaches like CircuitPython~\citep{CircuitPython} allow the same code to execute on both resource-rich and resource-constrained sensor nodes.
  
  
  Like \gls{PYTHON} and \gls{MICROPYTHON}, \gls{ITASK} and \gls{MTASK} are designed to be similar, as elaborated in \cref{sec_t4t:ComparingTierless}. The similarity is apparent when comparing the \gls{ITASK} \gls{CRTS} and \gls{ITASK}\slash\gls{MTASK} \gls{CWTS} room temperature systems in \cref{lst_t4t:itaskTempFull,lst_t4t:mtaskTemp}. That is, both implementations use similar \glspl{SDS} and lenses; they have similar \cleaninline{devTask}s that execute on the sensor node, and the server-side \cleaninline{mainTask}s are almost identical: they deploy the remote \cleaninline{devTask} before generating the web page to report the readings.
@@ -876,9 +876,9 @@ A caveat is that the smart campus system is relatively simple, and  developing m
  
  The vast majority of \gls{IOT} systems are implemented using a number of different programming languages and paradigms, and these must be effectively used and interoperated. A major reason that the tierless \gls{IOT} implementations are simpler and shorter than the tiered implementations is that they use far fewer programming languages and paradigms. Here we use language to distinguish \glspl{EDSL} from their host language: so \gls{ITASK} and \gls{MTASK} are considered distinct from \gls{CLEAN}; and to distinguish dialects: so \gls{MICROPYTHON} is considered distinct from \gls{PYTHON}.
  
-The tierless implementations use just two conceptually-similar \glspl{DSL} embedded in the same host language, and a single paradigm (\cref{table_t4t:languages,table_t4t:paradigms}). In contrast, the tiers in \gls{PRS} and \gls{PWS} use six or more very different languages, and both imperative and declarative paradigms. Multiple languages are commonly used in other typical software systems like web stacks, e.g.\ a recent survey of open source projects reveals that on average at least five different languages are used~\cite{mayer2015empirical}. Interoperating components in multiple languages and paradigms raises a plethora of issues.
+The tierless implementations use just two conceptually-similar \glspl{DSL} embedded in the same host language, and a single paradigm (\cref{table_t4t:languages,table_t4t:paradigms}). In contrast, the tiers in \gls{PRS} and \gls{PWS} use six or more very different languages, and both imperative and declarative paradigms. Multiple languages are commonly used in other typical software systems like web stacks, e.g.\ a recent survey of open source projects reveals that on average at least five different languages are used~\citep{mayer2015empirical}. Interoperating components in multiple languages and paradigms raises a plethora of issues.
  
-Interoperation \emph{increases the cognitive load on the developer} who must simultaneously think in multiple languages and paradigms. This is commonly known as semantic friction or impedance mismatch~\cite{ireland2009classification}. A simple illustration of this is that the tiered \gls{PRS} source code comprises some 38 source and configuration files, whereas the tierless \gls{CRS} requires just 3 files (\cref{table_t4t:multi}). The source could be structured as a single file, but to separate concerns is structured into three modules, one each for \glspl{SDS}, types, and control logic~\cite{wang_maintaining_2018}.
+Interoperation \emph{increases the cognitive load on the developer} who must simultaneously think in multiple languages and paradigms. This is commonly known as semantic friction or impedance mismatch~\citep{ireland2009classification}. A simple illustration of this is that the tiered \gls{PRS} source code comprises some 38 source and configuration files, whereas the tierless \gls{CRS} requires just 3 files (\cref{table_t4t:multi}). The source could be structured as a single file, but to separate concerns is structured into three modules, one each for \glspl{SDS}, types, and control logic~\citep{wang_maintaining_2018}.
  
  The developer must \emph{correctly interoperate the components}, e.g.\ adhere to the \gls{API} or communication protocols between components. The interoperation often entails additional programming tasks like marshalling or demarshalling data between components. For example, in the tiered \gls{PRS} and \gls{PWS} architectures, \gls{JSON} is used to serialise and deserialise data strings  from the \gls{PYTHON} collector component before storing the data in the Redis database (\cref{lst_t4t:json}).
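To illustrate why such marshalling is a burden, the following sketch round-trips a reading through \gls{JSON} using only the \gls{PYTHON} standard library. The dictionary layout and field names are hypothetical and are not taken from the \gls{PRS} collector; the point is only that the value that arrives need not have the types of the value that was sent.

```python
import json

# Hypothetical sensor reading; the field names are illustrative only.
reading = {
    "sensor": "dht",
    "temperature": 21.5,
    "window": (0, 60),   # a tuple on the sending side
    "counts": {1: 4},    # an integer key on the sending side
}


def round_trip(value):
    """Serialise to a JSON string and deserialise again, as a receiver would."""
    return json.loads(json.dumps(value))


received = round_trip(reading)

if __name__ == "__main__":
    print(type(reading["window"]).__name__, "->", type(received["window"]).__name__)
    print(list(reading["counts"]), "->", list(received["counts"]))
```

The tuple comes back as a list and the integer key comes back as a string, so the receiving component must either tolerate or explicitly repair these differences.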
  %e.g.\ to marshall and demarshall data between components.
@@ -905,7 +905,7 @@ To ensure correctness the developer \emph{must maintain type safety} across a ra
  \subsection{Automatic Communication}%
  \label{sec_t4t:Communication}
  
-In conventional tiered \gls{IOT} implementations the developer must write and maintain code to communicate between tiers. For example \gls{PRS} and \gls{PWS} create, send and read \gls{MQTT}~\cite{light2017mosquitto} messages
+In conventional tiered \gls{IOT} implementations the developer must write and maintain code to communicate between tiers. For example \gls{PRS} and \gls{PWS} create, send and read \gls{MQTT}~\citep{light2017mosquitto} messages
  between the perception and application layers. \Cref{table_t4t:multi} shows that communication between these layers requires some 94 \gls{SLOC} in \gls{PWS} and 98 in \gls{PRS}, accounting for 17\% of the codebase (bottom bars in \cref{fig_t4t:multipercentage}). To illustrate, \cref{lst_t4t:mwssmqtt} shows part of the code to communicate sensor readings from the \gls{PWS} sensor node to the Redis store on the server.
  
  Not only must the tiered developer write additional code, but \gls{IOT} communication code is often intricate. In such a distributed system the sender and receiver must be correctly configured, correctly follow the communication protocol through all execution states, and deal with potential failures. For example line 3 of \cref{lst_t4t:mwssmqtt}: \pythoninline{redis host = config.get('Redis', 'Host')} will fail if either the host or IP is incorrect.
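As a sketch of the extra defensive code this implies, the fragment below validates the configuration entry before it is used. It assumes the configuration is read with the standard \texttt{configparser} module, which the listing does not confirm; the helper name \texttt{redis\_host} and the sample configuration text are hypothetical. It only catches a missing or empty entry: a well-formed but wrong address would still fail later, when the connection is attempted.

```python
import configparser


def redis_host(config_text: str) -> str:
    """Return the Redis host from an INI-style configuration string.

    Assumption: the configuration has a [Redis] section with a Host option.
    Raises ValueError with a readable message if the entry is missing or empty.
    """
    config = configparser.ConfigParser()
    config.read_string(config_text)
    try:
        host = config.get("Redis", "Host")
    except (configparser.NoSectionError, configparser.NoOptionError) as err:
        raise ValueError(f"missing Redis host in configuration: {err}") from err
    host = host.strip()
    if not host:
        raise ValueError("Redis host is empty")
    return host


if __name__ == "__main__":
    print(redis_host("[Redis]\nHost = 10.0.0.5\n"))
```

In the tierless implementations no such code is written by hand, since the communication between sensor node and server is generated.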
@@ -977,7 +977,7 @@ However, there are various ways that high-level abstractions make the \gls{CWS}
  Firstly, functional programming languages are  generally more concise than most other programming languages because their powerful abstractions like higher-order and/or polymorphic functions require less code to describe a computation.
  Secondly, the \gls{TOP} paradigm used in \gls{ITASK} and \gls{MTASK} reduces the code size further by making it easy to specify \gls{IOT} functionality concisely.
  As examples, the step combinator \cleaninline{>>*.} allows the task value on the left-hand side to be observed until one of the steps is enabled;
-and the \cleaninline{viewSharedInformation} (line 31 of \cref{lst_t4t:mtaskTemp}) part of the UI will be automatically updated when the value of the \gls{SDS} changes. Moreover, each \gls{SDS} provides automatic updates to all coupled \glspl{SDS} and associated tasks. Thirdly, the amount of explicit type information is minimised in comparison to other languages, as much is automatically inferred~\cite{hughes1989functional}.
+and the \cleaninline{viewSharedInformation} (line 31 of \cref{lst_t4t:mtaskTemp}) part of the UI will be automatically updated when the value of the \gls{SDS} changes. Moreover, each \gls{SDS} provides automatic updates to all coupled \glspl{SDS} and associated tasks. Thirdly, the amount of explicit type information is minimised in comparison to other languages, as much is automatically inferred~\citep{hughes1989functional}.
  
  \section{Could Tierless \texorpdfstring{\gls{IOT}}{IoT} Programming Be More Reliable than Tiered?}%
  \label{sec_t4t:Discussion}
@@ -988,14 +988,14 @@ This section investigates whether tierless languages make \gls{IOT} programming
  \subsection{Type Safety}%
  \label{sec_t4t:typesafety}
  Strong typing identifies errors early in the development cycle, and hence plays a crucial role in improving software quality. In consequence almost all modern languages provide strong typing, and encourage static typing to minimise runtime errors.
-% Phil: so widely known that a citation is unnecessary~\cite{madsen1990strong}.
-That said, many distributed system components written in languages that primarily use static typing, like \gls{HASKELL} and Scala, use some dynamic typing, e.g.\ to ensure that the data arriving in a message has the anticipated type~\cite{epstein2011towards,gupta2012akka}.
+% Phil: so widely known that a citation is unnecessary~\citep{madsen1990strong}.
+That said, many distributed system components written in languages that primarily use static typing, like \gls{HASKELL} and Scala, use some dynamic typing, e.g.\ to ensure that the data arriving in a message has the anticipated type~\citep{epstein2011towards,gupta2012akka}.
  
-In a typical tiered multi-language \gls{IOT} system the developer must integrate software in different languages with very different type systems, and potentially executing on different hardware. The challenges of maintaining type safety have long been recognised as a major component of the semantic friction in multi-language systems, e.g.\ \cite{ireland2009classification}.
+In a typical tiered multi-language \gls{IOT} system the developer must integrate software in different languages with very different type systems, and potentially executing on different hardware. The challenges of maintaining type safety have long been recognised as a major component of the semantic friction in multi-language systems, e.g.\ \citep{ireland2009classification}.
  
-Even if the different languages used in two components are both strongly typed,  they may attribute, often quite subtly, different types to a value. Such type errors can lead to runtime errors, or the application silently reporting erroneous data. Such errors can be hard to find. Automatic detection of such errors is sometimes possible, but requires an additional tool like Jinn~\cite{Jinn,Furr2005}.
+Even if the different languages used in two components are both strongly typed,  they may attribute, often quite subtly, different types to a value. Such type errors can lead to runtime errors, or the application silently reporting erroneous data. Such errors can be hard to find. Automatic detection of such errors is sometimes possible, but requires an additional tool like Jinn~\citep{Jinn,Furr2005}.
  %Such errors can be hard to debug, partly because there is very limited tool support for detecting them
-%Phil: another possible source to discuss ~\cite{egyed1999automatically}
+%Phil: another possible source to discuss ~\citep{egyed1999automatically}
  
  \begin{lstPython}[caption={\Gls{PRS} loses type safety as a sensor node sends a {\tt\footnotesize double}, and the server stores a {\tt\footnotesize string}.},label={lst_t4t:float},morekeywords={message,enum,uint64,double}]
  message SensorData {
@@ -1012,7 +1012,7 @@ channel = 'sensor_status.%s.%s' % (hostname,
  
  Analysis of the \gls{PRS} codebase reveals an instance where it, fairly innocuously, loses type safety. The fragment in \cref{lst_t4t:float} first shows a \pythoninline{double} sensor value being sent from the sensor node, and then shows the value being stored in Redis as a \pythoninline{string} on the server. As \gls{PWS} preserves the same server components it also suffers from the same loss of type safety.
  
-\emph{A tierless language makes it possible to guarantee type safety across an entire \gls{IOT} stack}.  For example the \gls{CLEAN} compiler guarantees static type safety as the entire \gls{CWS} software stack is type checked, and  generated, from a single source. Tierless web stack languages like Links~\cite{cooper2006links} and Hop~\cite{serrano2006hop} provide the same guarantee for web stacks.
+\emph{A tierless language makes it possible to guarantee type safety across an entire \gls{IOT} stack}.  For example the \gls{CLEAN} compiler guarantees static type safety as the entire \gls{CWS} software stack is type checked, and  generated, from a single source. Tierless web stack languages like Links~\citep{cooper2006links} and Hop~\citep{serrano2006hop} provide the same guarantee for web stacks.
  
  
  \subsection{Failure Management}%
@@ -1063,7 +1063,7 @@ In summary, while a tiered approach makes replacing components easy, refactoring
  \subsection{Support}%
  \label{sec_t4t:support}
  %\mlcomment{I've shortened this quite a bit}
-Community and tool support are essential for engineering reliable production software. \gls{PRS} and \gls{PWS} are both \gls{PYTHON} based, and \gls{PYTHON}\slash\gls{MICROPYTHON} are among the most popular programming languages~\cite{cass2020top}. \gls{PYTHON} is also a common choice for some tiers of \gls{IOT} applications~\cite{tanganelli2015coapthon}.
+Community and tool support are essential for engineering reliable production software. \gls{PRS} and \gls{PWS} are both \gls{PYTHON} based, and \gls{PYTHON}\slash\gls{MICROPYTHON} are among the most popular programming languages~\citep{cass2020top}. \gls{PYTHON} is also a common choice for some tiers of \gls{IOT} applications~\citep{tanganelli2015coapthon}.
  Hence, there is a wide range of development tools like \glspl{IDE} and debuggers, a thriving community and a wealth of training material. There are even specialised \gls{IOT} boards like PyBoard \& WiPy that are specifically programmed using \gls{PYTHON} variations like \gls{MICROPYTHON}.
  
  In contrast, tierless languages are far less mature than the languages used in tiered stacks, and far less widely adopted.
@@ -1108,7 +1108,7 @@ This section compares the \gls{ITASK} and \gls{MTASK} \glspl{EDSL}, with referen
  
  \subsection{Language Restrictions for Resource-Constrained Execution}
  
-Executing components on a resource-constrained sensor node imposes restrictions on programming abstractions available in a tierless \gls{IOT} language or \gls{DSL}. The small and fixed-size memory are key limitations. The limitations are shared by any high-level language that targets microcontrollers such as BIT, PICBIT, PICOBIT,  Microscheme and uLisp~\cite{dube_bit:_2000,feeley_picbit:_2003,st-amour_picobit:_2009,suchocki_microscheme:_2015, johnson-davies_lisp_2020}.
+Executing components on a resource-constrained sensor node imposes restrictions on programming abstractions available in a tierless \gls{IOT} language or \gls{DSL}. The small and fixed-size memory are key limitations. The limitations are shared by any high-level language that targets microcontrollers such as BIT, PICBIT, PICOBIT,  Microscheme and uLisp~\citep{dube_bit:_2000,feeley_picbit:_2003,st-amour_picobit:_2009,suchocki_microscheme:_2015, johnson-davies_lisp_2020}.
  Even in low level languages some language features are disabled by default when targeting microcontrollers, such as runtime type information (RTTI) in \gls{CPP}.
  
  Here we investigate the restrictions imposed by resource-constrained sensor nodes on  \gls{MTASK}, in comparison with \gls{ITASK}. While \gls{ITASK} and \gls{MTASK} are by design superficially similar languages, to execute on resource-constrained sensor nodes \gls{MTASK} tasks are more restricted, and have a different semantics.
@@ -1142,7 +1142,7 @@ Such competing tasks, or indeed other \gls{OS} threads and processes, consume pr
  However, even when using multiple \gls{MTASK} tasks, it is easier to control the number of tasks on a device than controlling the number of processes and threads executing under an \gls{OS}.
  
  An \gls{MTASK} program has more control over energy consumption.
-The \gls{MTASK} \gls{EDSL} and the \gls{MTASK} \gls{RTS} are designed to minimise energy usage~\cite{crooijmans_reducing_2021}.
+The \gls{MTASK} \gls{EDSL} and the \gls{MTASK} \gls{RTS} are designed to minimise energy usage~\citep{crooijmans_reducing_2021}.
  Intensional analysis of the declarative task description and current progress at run time allow the \gls{RTS} to schedule tasks and maximise idle time.
  As the \gls{RTS} is the only program running on the device, it can enforce deep sleep and wake up without having to worry about influencing other processes.
  
@@ -1155,7 +1155,7 @@ The downside of this direct control is that \gls{CWS} has to handle some excepti
  \Cref{table_t4t:languagecomparison} summarises the differences between the \gls{CLEAN} \gls{IOT} \gls{EDSL} and their host language.
  The restrictions imposed by a resource-constrained execution environment on the tierless \gls{IOT} language are relatively minor. Moreover the \gls{MTASK} programming abstraction is broadly compatible with \gls{ITASK}. As a simple example compare the \gls{ITASK} and \gls{MTASK} temperature sensors in \cref{lst_t4t:itaskTempFull,lst_t4t:mtaskTemp}. As a more realistic example, the \gls{MTASK} based \gls{CWS} smart campus implementation is similar to the \gls{ITASK} based \gls{CRS}, and requires less than 10\% additional code: 166 \gls{SLOC} compared with  155 \gls{SLOC} (\cref{table_t4t:multi}).
  
-Even with these restrictions, \gls{MTASK} programming is at a far higher level of abstraction than almost all bare metal languages, e.g.\ BIT, PICBIT, PICOBIT and Microscheme. That is \gls{MTASK} provides a set of higher order task combinators, shared distributed data stores, etc. (\cref{sec_t4t:mtasks}).  Moreover, it seems that common sensor node programs are readily expressed using  \gls{MTASK}. In addition to the \gls{CWTS} and \gls{CWS} systems outlined here, other case studies include Arduino examples as well as some bigger tasks~\cite{koopman_task-based_2018,lubbers_writing_2019,LubbersMIPRO}. We conclude that the programming of sensor tasks is well-supported by both \glspl{DSL}.
+Even with these restrictions, \gls{MTASK} programming is at a far higher level of abstraction than almost all bare metal languages, e.g.\ BIT, PICBIT, PICOBIT and Microscheme. That is \gls{MTASK} provides a set of higher order task combinators, shared distributed data stores, etc. (\cref{sec_t4t:mtasks}).  Moreover, it seems that common sensor node programs are readily expressed using  \gls{MTASK}. In addition to the \gls{CWTS} and \gls{CWS} systems outlined here, other case studies include Arduino examples as well as some bigger tasks~\citep{koopman_task-based_2018,lubbers_writing_2019,LubbersMIPRO}. We conclude that the programming of sensor tasks is well-supported by both \glspl{DSL}.
  
  \section{Conclusion}%
  \label{sec_t4t:Conclusion}
@@ -1171,14 +1171,14 @@ We show that \emph{tierless languages have the potential to significantly reduce
         \item Tierless developers benefit from automatically generated, and hence correct, communication (\cref{lst_t4t:mtaskTemp}), and write 6$\times$ less communication code (\cref{fig_t4t:multipercentage}).
  %and TODO).%~\ref{lst_t4t:mqtt}).
         \item Tierless developers can exploit powerful high-level declarative and task-oriented \gls{IOT} programming abstractions (\cref{table_t4t:temp}), specifically the composable, higher-order task combinators outlined in \cref{sec_t4t:itasks}. 
-Our empirical results for \gls{IOT} systems are consistent with the benefits claimed for tierless languages in other application domains. Namely that a tierless language provides a \textit{Higher Abstraction Level}, \textit{Improved Software Design}, and improved \textit{Program Comprehension}~\cite{weisenburger2020survey}.
+Our empirical results for \gls{IOT} systems are consistent with the benefits claimed for tierless languages in other application domains. Namely that a tierless language provides a \textit{Higher Abstraction Level}, \textit{Improved Software Design}, and improved \textit{Program Comprehension}~\citep{weisenburger2020survey}.
  \end{enumerate*}
  
  We show that \emph{tierless languages have the potential to significantly improve the reliability of \gls{IOT} systems}.  We illustrate how \gls{CLEAN} maintains type safety, contrasting this with a loss of type safety in \gls{PRS}.
-We illustrate higher order failure management in \gls{CLEAN}\slash\gls{ITASK}/\gls{MTASK} in contrast to the \gls{PYTHON}-based failure management in  \gls{PRS}. For maintainability a tiered approach makes replacing components easy, but refactoring within the components is far harder than in a tierless \gls{IOT} language. Again our findings are consistent with the simplified \textit{Code Maintenance}  benefits claimed for tierless languages~\cite{weisenburger2020survey}.
+We illustrate higher order failure management in \gls{CLEAN}\slash\gls{ITASK}/\gls{MTASK} in contrast to the \gls{PYTHON}-based failure management in  \gls{PRS}. For maintainability a tiered approach makes replacing components easy, but refactoring within the components is far harder than in a tierless \gls{IOT} language. Again our findings are consistent with the simplified \textit{Code Maintenance}  benefits claimed for tierless languages~\citep{weisenburger2020survey}.
  Finally, we contrast community support for the technologies (\cref{sec_t4t:Discussion}).
  
-%\pwtcomment{Pieter: please check discussion of ~\cite{weisenburger2020survey} in preceding 2 paragraphs}
+%\pwtcomment{Pieter: please check discussion of ~\citep{weisenburger2020survey} in preceding 2 paragraphs}
  
  We report \emph{the first comparison of a tierless \gls{IOT} codebase for resource-rich sensor nodes with one for resource-constrained sensor nodes}.
  \begin{enumerate*}
@@ -1188,13 +1188,13 @@ as it is in the tiered \gls{PYTHON} implementations (\cref{fig_t4t:multipercenta
  \end{enumerate*}
  
  We present \emph{the first comparison of two tierless \gls{IOT} languages: one designed for resource-constrained sensor nodes (\gls{CLEAN} with \gls{ITASK} and \gls{MTASK}), and the other for resource-rich sensor nodes (\gls{CLEAN} with \gls{ITASK}).}  \gls{CLEAN}\slash\gls{ITASK} can implement all layers of the \gls{IOT} stack if the sensor nodes have the computational resources, as the Raspberry Pis do in \gls{CRS}. On resource-constrained sensor nodes, \gls{MTASK} is required to implement the perception and network layers, as on the Wemos minis in \gls{CWS}. We show that a bare metal execution environment allows \gls{MTASK} to have better control of peripherals, timing and energy consumption. The limited memory available on a microcontroller restricts \gls{MTASK} to a fixed set of combinators, strict evaluation, and no user-defined or recursive data types, and makes it harder to add new abstractions. Even with these restrictions, \gls{MTASK} provides a higher level of abstraction than most bare metal languages, and can readily express many \gls{IOT} applications including the \gls{CWS} \gls{UOG} smart campus application (\cref{sec_t4t:ComparingTierless}).
-Our empirical results are consistent with the  benefits of tierless languages listed in Section 2.1 of~\cite{weisenburger2020survey}.
+Our empirical results are consistent with the  benefits of tierless languages listed in Section 2.1 of~\citep{weisenburger2020survey}.
  
  \subsection{Reflections} 
  
  This study is based on a specific pair of tierless \gls{IOT} languages, and the \gls{CLEAN} language frameworks represent a specific set of tierless language design decisions. Many alternative tierless \gls{IOT} language designs are possible, and some are outlined in \cref{sec_t4t:characteristics}. Crucially the limitations of the tierless \gls{CLEAN} languages, e.g.\ that they currently provide limited security, should not be seen as limitations of tierless technologies in general. 
  
-This study has explored some, but not all, of the potential benefits of tierless languages for \gls{IOT} systems. An \gls{IOT} system specified as a single tierless program is amenable to a host of programming language technologies. For example, if the language has a formal semantics, as Links, Hop and \gls{CLEAN} tasks do~\cite{cooper2006links,serrano2006hop,plasmeijer_task-oriented_2012}, it is possible to prove properties of the system, e.g.\ \cite{Steenvoorden2019tophat}. As another example, program analyses can be applied, and \cref{sec_t4t:characteristics} and~\cite{weisenburger2020survey} outline some of the analyses that could be, and in some cases have been, used to improve \gls{IOT} systems. Examples include automatic tier splitting~\cite{10.1145/2661136.2661146}, and controlling information flow to enhance security~\cite{valliappan_towards_2020}. 
+This study has explored some, but not all, of the potential benefits of tierless languages for \gls{IOT} systems. An \gls{IOT} system specified as a single tierless program is amenable to a host of programming language technologies. For example, if the language has a formal semantics, as Links, Hop and \gls{CLEAN} tasks do~\citep{cooper2006links,serrano2006hop,plasmeijer_task-oriented_2012}, it is possible to prove properties of the system, e.g.\ \citep{Steenvoorden2019tophat}. As another example, program analyses can be applied, and \cref{sec_t4t:characteristics} and~\citep{weisenburger2020survey} outline some of the analyses that could be, and in some cases have been, used to improve \gls{IOT} systems. Examples include automatic tier splitting~\citep{10.1145/2661136.2661146}, and controlling information flow to enhance security~\citep{valliappan_towards_2020}. 
  
  While offering real benefits for \gls{IOT} systems development, tierless languages also raise some challenges. Programmers must master new tierless programming abstractions, and the semantics of these automatic multi-tier behaviours are necessarily relatively complex. In the \gls{CLEAN} context this entails becoming proficient with the  \gls{ITASK} and \gls{MTASK} \glspl{DSL}. Moreover, specifying a behaviour that is not already provided by the tierless language requires either a workaround, or extending a \gls{DSL}. However, implementing the relatively simple smart campus application required no such adaption. Finally, tierless \gls{IOT} technology is very new, and both tool and community support have yet to mature.