From 8db5bb2224e817264f1677acf4178381c8d26465 Mon Sep 17 00:00:00 2001
From: Mart Lubbers
Date: Mon, 9 Mar 2015 21:24:09 +0100
Subject: [PATCH] update with eps etc

---
 thesis2/.gitignore                  |   1 +
 thesis2/2.requirementsanddesign.tex | 169 +++++++++++++++-------------
 thesis2/thesis.tex                  |   2 +-
 3 files changed, 93 insertions(+), 79 deletions(-)

diff --git a/thesis2/.gitignore b/thesis2/.gitignore
index 1fbd0f0..4b8e74e 100644
--- a/thesis2/.gitignore
+++ b/thesis2/.gitignore
@@ -8,6 +8,7 @@
 *.blg
 *.pdf
 *.ps
+*.eps
 *.pyg
 scheme[12].xsd
 log.txt
diff --git a/thesis2/2.requirementsanddesign.tex b/thesis2/2.requirementsanddesign.tex
index 31e56f0..6bbb29c 100644
--- a/thesis2/2.requirementsanddesign.tex
+++ b/thesis2/2.requirementsanddesign.tex
@@ -1,22 +1,23 @@
 \section{Requirements}
 \subsection{Introduction}
-As almost every other computer starts with a set of requirements so will this
-application. Requirements are a set of goals within different categories that
-will define what the application has to be able to do and they are
-traditionally defined at the start of the project and not expected to change
-much. In the case of this project the requirements were a lot more flexible
-because there was only one person doing the programming and there was a weekly
-meeting to discuss the matters and most importantly discuss the required
-changes. Because of this a lot of initial requirements are removed and a some
-requirements were added in the process. The list below shows the definitive
-requirements and also the suspended requirements.
-
-The two types of requirements that are formed are functional and non-functional
-requirements. Respectively they are requirements that describe a certain
-function and the latter are requirements that describe a certain property such
-as efficiency or compatibility. To make us able to refer to them we give the
-requirements unique codes. We also specify in the list with active requirements
-the reason for the choice.
+Like almost every plan for an application, this one starts with a set of
+requirements. Requirements are a set of goals within different categories
+that define what the application has to be able to do. They are
+traditionally defined at the start of the project and are not expected
+to change much. In the case of this project the requirements were a lot
+more flexible because there was only one person doing the programming and
+there was a weekly meeting to discuss the state of the project and, most
+importantly, the required changes. Because of this a lot of the initial
+requirements were removed and some requirements were added in the
+process. The list below shows the definitive requirements and also the
+suspended requirements.
+
+There are two types of requirements: functional and non-functional
+requirements. Functional requirements describe a certain function in the
+technical sense. Non-functional requirements describe a property. Properties
+can be, for example, efficiency, portability or compatibility. To be able to
+refer to them later we give the requirements unique codes. For the definitive
+requirements a verbose explanation is also provided.
 
 \subsection{Functional requirements}
 \subsubsection{Original functional requirements}
@@ -30,78 +31,88 @@ the reason for the choice.
 	\end{itemize}
 	\item[F2:] Apply low level matching techniques on isolated data.
 	\item[F3:] Insert the data in the database.
-	\item[F4:] User interface to train crawlers must be usable by non computer
-		science people.
-	\item[F5:] There must be a control center for the crawlers.
+	\item[F4:] User interface to train crawlers that is usable by someone
+		without a particular computer science background.
+	\item[F5:] Control center for the crawlers.
+	\item[F6:] Report to the user or maintainer when a source has been
+		changed too much for successful crawling.
 \end{itemize}
 
 \subsubsection{Definitive functional requirements}
 Requirement F2 is the sole requirement that is dropped completely, this is
-because this seemed to lie out of the scope of the project. This is mainly
-because we chose to build an interactive intuitive user interface around the
-core of the pattern extraction program. All other requirements changed or kept
-the same. Below, all definitive requirements with on the first line the title
-and with a description underneath.
+because it lies outside of the time available for the project. The reduced
+time available is partly due to our choice to implement certain other
+requirements, such as an interactive and intuitive user interface around the
+core of the pattern extraction program. All other requirements were changed
+or kept the same. Below are all definitive requirements, each with the title
+on the first line and a description underneath.
 
 \begin{itemize}
-	\item[F6:] Be able to crawl RSS feeds only.
-
-	This requirement is an adapted version of the compound requirements
-	F1a-F1d. We stripped down from crawling four different sources to only one
-	source because of the scope of the project. Most sources require an
-	entirely different strategy. The full reason why we chose RSS feeds can be
-	found in Section~\ref{sec:whyrss}.
-
-	\item[F7:] Export the data to a strict XML feed.
-
-	This requirement is an adapted version of requirement F3, this to done to
-	make the scope smaller. We chose to no interact with the database or the
-	\textit{Temporum}. The application will have to be able to output XML data
-	that is formatted following a string XSD scheme so that it is easy to
-	import the data in the database or \textit{Temporum}.
-	\item[F8:] A control center interface that is usable by non computer
-		science people.
+	\item[F7:] Be able to crawl RSS feeds.
+
+		This requirement is an adapted version of the compound
+		requirements F1a-F1d. We stripped down from crawling four
+		different sources to only one source because of the scope of
+		the project. Most sources require an entirely different
+		strategy and therefore we could not easily combine them. The
+		full reason why we chose RSS feeds can be found in
+		Section~\ref{sec:whyrss}.
+
+	\item[F8:] Export the data to a strict XML feed.
+
+		This requirement is an adapted version of requirement F3; this
+		was also done to make the scope smaller. We chose not to
+		interact directly with the database or the \textit{Temporum}.
+		The application is however able to output XML data that is
+		formatted following a strict XSD scheme so that it is easy to
+		import the data into the database or the \textit{Temporum}.
+	\item[F9:] User interface to train crawlers that is usable by someone
+		without a particular computer science background.
 
-	This requirement is a combination of F4 and F5. At first the user interface
-	for adding and training crawlers was done via a webinterface that was user
-	friendly and usable for non computer science people as the requirement
-	stated. However in the first prototypes the control center that could test,
-	edit and remove crawlers was a command line application and thus not very
-	usable for the general audience. This combined requirements asks for a
-	single control center that can do all previously described task with an
-	interface that is usable by almost everyone.
-	\item[F9:] Report to the user or maintainer when a source has been changed
-	too much for successful crawling.
-
-	This requirement was also present in the original requirements and has not
-	changed. When the crawler fails to crawl a source, this can be due to any
-	reason, a message is sent to the people using the program so that they can
-	edit or remove the faulty crawler. This is a crucial component because the
-	program, a non computer science person can do this task and is essential in
-	shortening the feedback loop explained in Figure~\ref{fig:1.1.2}.
+		This requirement is a combination of F4 and F5. At first the
+		user interface for adding and training crawlers was done via a
+		web interface that was user friendly and usable by someone
+		without a particular computer science background, as the
+		requirement stated. However, in the first prototypes the
+		control center that could test, edit and remove crawlers was a
+		command line application and thus not very usable for the
+		general audience. This combined requirement asks for a single
+		control center that can do all previously described tasks with
+		an interface that is usable without prior knowledge of
+		computer science.
+	\item[F10:] Report to the user or maintainer when a source has been
+		changed too much for successful crawling.
+
+		This requirement was also present in the original requirements
+		and has not changed. When the crawler fails to crawl a source,
+		for whatever reason, a message is sent to the people using the
+		program so that they can edit or remove the faulty crawler.
+		Updating a crawler without the need for a programmer is
+		essential in shortening the feedback loop explained in
+		Figure~\ref{feedbackloop}.
 \end{itemize}
 
 \subsection{Non-functional requirements}
 \subsubsection{Original functional requirements}
 \begin{itemize}
 	\item[N1:] Integrate in the original system.
-	\item[N2:] Work in a modular fashion, thus be able to, in the future, extend
-	the program.
+	\item[N2:] Work in a modular fashion, thus be able to, in the future,
+		extend the program.
 \end{itemize}
 \subsubsection{Active functional requirements}
 \begin{itemize}
-	\item[N2:] Work in a modular fashion, thus be able to, in the future, extend
-	the program.
+	\item[N2:] Work in a modular fashion, thus be able to, in the future,
+		extend the program.
 
-	The modularity is very important so that the components can be easily
-	extended and components can be added. Possible extensions are discussed in
-	Section~\ref{sec:discuss}.
+		The modularity is very important so that components can be
+		easily extended and new components can be added. Possible
+		extensions are discussed in Section~\ref{sec:discuss}.
 	\item[N3:] Operate standalone on a server.
 
-	Non-functional requirement N1 is dropped because we want to keep the
-	program as modular as possible and via an XML interface we still have a
-	very stable connection with the database but we avoided getting entangled
-	in the software managing the database.
+		Non-functional requirement N1 is dropped because we want to
+		keep the program as modular as possible; via an XML interface
+		we still have a very close connection with the database
+		without having to maintain a direct connection.
 \end{itemize}
 
 \section{Application overview}
@@ -115,15 +127,16 @@ and with a description underneath.
 
 \subsection{Frontend}
 \subsubsection{General description}
-The frontend is a web interface to the backend applications that allow the user
-to interact with the backend by for example adding crawlers. The frontend
-consists of a basic graphical user interface that is shown in
-Figure~\ref{frontendfront}. As the interface shows, there are three main
-components that the user can use. There is also an button for downloading the
-XML. The XML output is a quick shortcut to make the backend to generate XML.
-However the XML button is only for diagnostic purposes located there. In the
-standard workflow the XML button is not used. In the standard workflow the
-server periodically calls the XML output from the backend to process it.
+The frontend is a web interface that is connected to the backend applications
+and allows the user to interact with the backend. The frontend consists of a
+basic graphical user interface which is shown in Figure~\ref{frontendfront}.
+As the interface shows, there are three main components that the user can
+use. There is also a button for downloading the XML. The \textit{Get xml}
+button is a quick shortcut to make the backend generate XML. The button is
+located there only for diagnostic purposes and is not used in the standard
+workflow; in the standard workflow the server periodically calls the XML
+output option from the command line interface of the backend to
+process it.
 
 \begin{figure}[H]
 	\label{frontendfront}
diff --git a/thesis2/thesis.tex b/thesis2/thesis.tex
index ad07889..8f426da 100644
--- a/thesis2/thesis.tex
+++ b/thesis2/thesis.tex
@@ -19,7 +19,7 @@
 	tabsize=2,
 }
 
-\newcommand{\cvartitle}{Adaptable crawler specification generation system for%
+\newcommand{\cvartitle}{Adaptable crawler specification generation system for %
 	leisure activity RSS feeds}
 
 % Setup hyperlink formatting
-- 
2.20.1
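As an illustration of requirements F7 and F8 above (crawl RSS feeds, export the
result as a strict XML feed), here is a minimal sketch in Python using only the
standard library. The feed URL, element names and output layout are assumptions
made for the example; the real backend applies its trained extraction patterns
to each item and exports according to its own XSD scheme.

# Minimal sketch of F7/F8: fetch an RSS feed and re-emit the entries as a
# simple XML export document. All names here are illustrative only.
from urllib.request import urlopen
from xml.etree import ElementTree as ET

FEED_URL = "http://example.org/events.rss"  # placeholder feed

def crawl_rss(url):
    """Download the feed and yield (title, link, description) per item."""
    with urlopen(url) as response:
        tree = ET.parse(response)
    for item in tree.iter("item"):
        yield (
            item.findtext("title", default=""),
            item.findtext("link", default=""),
            item.findtext("description", default=""),
        )

def export_xml(entries):
    """Wrap the crawled entries in an export document (F8)."""
    root = ET.Element("crawloutput")
    for title, link, description in entries:
        node = ET.SubElement(root, "entry")
        ET.SubElement(node, "title").text = title
        ET.SubElement(node, "link").text = link
        ET.SubElement(node, "summary").text = description
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(export_xml(crawl_rss(FEED_URL)))

The resulting XML document is what the database or the \textit{Temporum} would
import, as described for requirement F8.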
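Requirement F10 asks for a report to the user or maintainer when a source has
changed too much for successful crawling. A possible shape for that check,
again only a hedged sketch: the failure threshold, addresses and SMTP host are
placeholders, and the real application may deliver the message through the
frontend rather than by e-mail.

# Sketch of F10: when too many items in a source no longer match the trained
# pattern, notify the maintainer so the crawler can be retrained or removed.
import smtplib
from email.message import EmailMessage

FAILURE_THRESHOLD = 0.5  # assumed: warn when half of the items fail to match

def check_source(name, items, matches):
    """items: entries seen in the feed; matches: entries the pattern matched."""
    if not items:
        return
    failure_rate = 1.0 - (len(matches) / len(items))
    if failure_rate >= FAILURE_THRESHOLD:
        report_failure(name, failure_rate)

def report_failure(name, failure_rate):
    msg = EmailMessage()
    msg["Subject"] = f"Crawler '{name}' failed on {failure_rate:.0%} of items"
    msg["From"] = "crawler@example.org"      # placeholder addresses
    msg["To"] = "maintainer@example.org"
    msg.set_content(
        "The source seems to have changed; please retrain or remove the crawler."
    )
    with smtplib.SMTP("localhost") as server:
        server.send_message(msg)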