\section{Requirements}
\subsection{Introduction}
-As almost every other computer starts with a set of requirements so will this
-application. Requirements are a set of goals within different categories that
-will define what the application has to be able to do and they are
-traditionally defined at the start of the project and not expected to change
-much. In the case of this project the requirements were a lot more flexible
-because there was only one person doing the programming and there was a weekly
-meeting to discuss the matters and most importantly discuss the required
-changes. Because of this a lot of initial requirements are removed and a some
-requirements were added in the process. The list below shows the definitive
-requirements and also the suspended requirements.
-
-The two types of requirements that are formed are functional and non-functional
-requirements. Respectively they are requirements that describe a certain
-function and the latter are requirements that describe a certain property such
-as efficiency or compatibility. To make us able to refer to them we give the
-requirements unique codes. We also specify in the list with active requirements
-the reason for the choice.
+Just as almost every plan for an application starts with a set of
+requirements, so does this one. Requirements are a set of goals within
+different categories that define what the application has to be able to do.
+They are traditionally defined at the start of the project and are not
+expected to change much. In the case of this project the requirements were a
+lot more flexible because there was only one person doing the programming and
+there was a weekly meeting to discuss progress and, most importantly, the
+required changes. Because of this, a lot of the initial requirements were
+removed and some requirements were added in the process. The list below shows
+the definitive requirements and also the suspended requirements.
+
+There are two types of requirements: functional and non-functional
+requirements. Functional requirements describe a certain function in the
+technical sense. Non-functional requirements describe a property, for
+example efficiency, portability or compatibility. To be able to refer to
+them later we give the requirements unique codes. For the definitive
+requirements a more detailed explanation is also provided.
\subsection{Functional requirements}
\subsubsection{Original functional requirements}
\end{itemize}
\item[F2:] Apply low level matching techniques on isolated data.
\item[F3:] Insert the data in the database.
- \item[F4:] User interface to train crawlers must be usable by non computer
- science people.
- \item[F5:] There must be a control center for the crawlers.
+ \item[F4:] User interface to train crawlers that is usable by someone
+ without a particular computer science background.
+ \item[F5:] Control center for the crawlers.
+ \item[F6:] Report to the user or maintainer when a source has been
+ changed too much for successful crawling.
\end{itemize}
\subsubsection{Definitive functional requirements}
Requirement F2 is the sole requirement that is dropped completely, this is
-because this seemed to lie out of the scope of the project. This is mainly
-because we chose to build an interactive intuitive user interface around the
-core of the pattern extraction program. All other requirements changed or kept
-the same. Below, all definitive requirements with on the first line the title
-and with a description underneath.
+because it does not fit within the time available for the project. Time was
+limited partly because we chose to implement certain other requirements, such
+as an interactive, intuitive user interface around the core of the pattern
+extraction program. All other requirements were either changed or kept the
+same. Below are all definitive requirements, with the title on the first line
+and a description underneath.
\begin{itemize}
- \item[F6:] Be able to crawl RSS feeds only.
-
- This requirement is an adapted version of the compound requirements
- F1a-F1d. We stripped down from crawling four different sources to only one
- source because of the scope of the project. Most sources require an
- entirely different strategy. The full reason why we chose RSS feeds can be
- found in Section~\ref{sec:whyrss}.
-
- \item[F7:] Export the data to a strict XML feed.
-
- This requirement is an adapted version of requirement F3, this to done to
- make the scope smaller. We chose to no interact with the database or the
- \textit{Temporum}. The application will have to be able to output XML data
- that is formatted following a string XSD scheme so that it is easy to
- import the data in the database or \textit{Temporum}.
- \item[F8:] A control center interface that is usable by non computer
+ \item[F7:] Be able to crawl RSS feeds.
+
+ This requirement is an adapted version of the compound
+ requirements F1a-F1d. We went from crawling four different
+ sources to only one source because of the scope of the
+ project. Most sources require an entirely different strategy,
+ and therefore we could not easily combine them. The full
+ reason why we chose RSS feeds can be found in
+ Section~\ref{sec:whyrss}.
+
+ \item[F8:] Export the data to a strict XML feed.
+
+ This requirement is an adapted version of requirement F3; this
+ is also done to make the scope smaller. We chose not to
+ interact with the database or the \textit{Temporum}. The
+ application, however, is able to output XML data that is
+ formatted following a strict XSD schema so that it is easy to
+ import the data into the database or \textit{Temporum}.
+ \item[F9:] User interface to train crawlers that is usable by someone
+ without a particular computer science background.
science people.
- This requirement is a combination of F4 and F5. At first the user interface
- for adding and training crawlers was done via a webinterface that was user
- friendly and usable for non computer science people as the requirement
- stated. However in the first prototypes the control center that could test,
- edit and remove crawlers was a command line application and thus not very
- usable for the general audience. This combined requirements asks for a
- single control center that can do all previously described task with an
- interface that is usable by almost everyone.
- \item[F9:] Report to the user or maintainer when a source has been changed
- too much for successful crawling.
-
- This requirement was also present in the original requirements and has not
- changed. When the crawler fails to crawl a source, this can be due to any
- reason, a message is sent to the people using the program so that they can
- edit or remove the faulty crawler. This is a crucial component because the
- program, a non computer science person can do this task and is essential in
- shortening the feedback loop explained in Figure~\ref{fig:1.1.2}.
+ This requirement is a combination of F4 and F5. At first the
+ user interface for adding and training crawlers was a web
+ interface that was user friendly and usable by someone
+ without a particular computer science background, as the
+ requirement stated. However, in the first prototypes the
+ control center that could test, edit and remove crawlers was a
+ command line application and thus not very usable for the
+ general audience. This combined requirement asks for a single
+ control center that can do all previously described tasks with
+ an interface that is usable without prior knowledge of
+ computer science.
+ \item[F10:] Report to the user or maintainer when a source has been
+ changed too much for successful crawling.
+
+ This requirement was also present in the original requirements
+ and has not changed. When the crawler fails to crawl a source,
+ for whatever reason, a message is sent to the people using the
+ program so that they can edit or remove the faulty crawler.
+ Updating without the need of a programmer is essential in
+ shortening the feedback loop explained in
+ Figure~\ref{feedbackloop}.
\end{itemize}
\subsection{Non-functional requirements}
\subsubsection{Original functional requirements}
\begin{itemize}
\item[N1:] Integrate in the original system.
- \item[N2:] Work in a modular fashion, thus be able to, in the future, extend
- the program.
+ \item[N2:] Work in a modular fashion, thus be able to, in the future,
+ extend the program.
\end{itemize}
\subsubsection{Active functional requirements}
\begin{itemize}
- \item[N2:] Work in a modular fashion, thus be able to, in the future, extend
- the program.
+ \item[N2:] Work in a modular fashion, thus be able to, in the future,
+ extend the program.
- The modularity is very important so that the components can be easily
- extended and components can be added. Possible extensions are discussed in
- Section~\ref{sec:discuss}.
+ The modularity is very important so that the components can be
+ easily extended and components can be added. Possible
+ extensions are discussed in Section~\ref{sec:discuss}.
\item[N3:] Operate standalone on a server.
- Non-functional requirement N1 is dropped because we want to keep the
- program as modular as possible and via an XML interface we still have a
- very stable connection with the database but we avoided getting entangled
- in the software managing the database.
+ Non-functional requirement N1 is dropped because we want to
+ keep the program as modular as possible. Via an XML interface
+ we still have a close connection with the database without
+ having to maintain a direct connection.
\end{itemize}
\section{Application overview}
\subsection{Frontend}
\subsubsection{General description}
-The frontend is a web interface to the backend applications that allow the user
-to interact with the backend by for example adding crawlers. The frontend
-consists of a basic graphical user interface that is shown in
-Figure~\ref{frontendfront}. As the interface shows, there are three main
-components that the user can use. There is also an button for downloading the
-XML. The XML output is a quick shortcut to make the backend to generate XML.
-However the XML button is only for diagnostic purposes located there. In the
-standard workflow the XML button is not used. In the standard workflow the
-server periodically calls the XML output from the backend to process it.
+The frontend is a web interface connected to the backend applications that
+allows the user to interact with the backend. The frontend consists of a
+basic graphical user interface, which is shown in
+Figure~\ref{frontendfront}. As the interface shows, there are three main
+components that the user can use. There is also a button for downloading the
+XML. The \textit{Get xml} button is a quick shortcut to make the backend
+generate XML. It is located there only for diagnostic purposes and is not
+used in the standard workflow. In the standard workflow the server
+periodically calls the XML output option from the command line interface of
+the backend and processes the result.
\begin{figure}[H]
\label{frontendfront}