\section{Requirements}
\subsection{Introduction}
Like almost every other software project, this application starts with a set
of requirements. Requirements are goals, divided into categories, that define
what the application has to be able to do. Traditionally they are defined at
the start of the project and are not expected to change much. In this project
the requirements were a lot more flexible because only one person did the
programming and there was a weekly meeting to discuss progress and, most
importantly, the required changes. Because of this, many of the initial
requirements were removed and some requirements were added along the way. The
lists below show both the definitive requirements and the suspended
requirements.

The requirements fall into two types: functional and non-functional.
Functional requirements describe a certain function of the application,
whereas non-functional requirements describe a certain property such as
efficiency or compatibility. To be able to refer to them, we give the
requirements unique codes. For the active requirements we also give the reason
for the choice.

\subsection{Functional requirements}
\subsubsection{Original functional requirements}
\begin{itemize}
    \item[F1:] Be able to crawl several source types.
        \begin{itemize}
            \item[F1a:] Fax/email.
            \item[F1b:] XML feeds.
            \item[F1c:] RSS feeds.
            \item[F1d:] Websites.
        \end{itemize}
    \item[F2:] Apply low-level matching techniques to isolated data.
    \item[F3:] Insert the data into the database.
    \item[F4:] The user interface to train crawlers must be usable by people
        without a computer science background.
    \item[F5:] There must be a control center for the crawlers.
\end{itemize}

\subsubsection{Definitive functional requirements}
Requirement F2 is the only requirement that was dropped completely. All other
original functional requirements were carried over in adapted form. Together
they make up the following definitive requirements:
\begin{itemize}
    \item[F6:] Be able to crawl RSS feeds only.

        This requirement is an adapted version of the compound requirements
        F1a--F1d. We narrowed crawling down from four different source types
        to only one because of the scope of the project: most source types
        require an entirely different strategy. The full reason why we chose
        RSS feeds can be found in Section~\ref{sec:whyrss}.

    \item[F7:] Export the data to a strict XML feed.

        This requirement is an adapted version of requirement F3; this was
        also done to reduce the scope. We chose not to interact with the
        database or the \textit{Temporum}. Instead, the application has to
        output XML data that is formatted according to a strict XSD schema, so
        that it is easy to import the data into the database or the
        \textit{Temporum}.
    \item[F8:] A control center interface that is usable by people without a
        computer science background.

        This requirement is a combination of F4 and F5. At first the user
        interface for adding and training crawlers was a web interface that
        was user friendly and usable by people without a computer science
        background, as the requirement stated. However, in the first
        prototypes the control center that could test, edit and remove
        crawlers was a command line application and thus not very usable for a
        general audience. This combined requirement asks for a single control
        center that can perform all of the previously described tasks with an
        interface that is usable by almost everyone.
    \item[F9:] Report to the user or maintainer when a source has changed too
        much for successful crawling.

        This requirement was also present in the original requirements and has
        not changed. When the crawler fails to crawl a source, for whatever
        reason, a message is sent to the people using the program so that they
        can edit or remove the faulty crawler. This is a crucial component
        because, with the program, a person without a computer science
        background can do this task, and it is essential in shortening the
        feedback loop explained in Figure~\ref{fig:1.1.2}.
\end{itemize}
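
To make requirements F6, F7 and F9 more concrete, the following minimal Python
sketch shows how they could fit together. It is an illustration under stated
assumptions, not the implementation described in this thesis: the names
\texttt{crawl\_rss}, \texttt{export\_xml} and \texttt{CrawlError}, the output
element names \texttt{events} and \texttt{event}, and the sample feed are all
hypothetical. Validation against the XSD schema is left out because the Python
standard library does not provide it.

```python
import xml.etree.ElementTree as ET


class CrawlError(Exception):
    """Signals that a source can no longer be crawled (cf. requirement F9)."""


def crawl_rss(rss_text):
    """Return one {'title', 'description'} dict per RSS <item> (cf. F6)."""
    try:
        root = ET.fromstring(rss_text)
    except ET.ParseError as exc:
        raise CrawlError(f"feed is not well-formed XML: {exc}") from exc
    items = root.findall("./channel/item")
    if not items:
        # The feed layout changed too much: report instead of exporting nothing.
        raise CrawlError("feed contains no <item> elements")
    return [
        {
            "title": item.findtext("title", default=""),
            "description": item.findtext("description", default=""),
        }
        for item in items
    ]


def export_xml(entries):
    """Serialise crawled entries to an XML string (cf. F7).

    The element names are hypothetical; a real export would follow the
    project's XSD schema.
    """
    root = ET.Element("events")
    for entry in entries:
        event = ET.SubElement(root, "event")
        ET.SubElement(event, "title").text = entry["title"]
        ET.SubElement(event, "description").text = entry["description"]
    return ET.tostring(root, encoding="unicode")


# Hypothetical feed used only for illustration.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Agenda</title>
<item><title>Concert</title><description>Friday 20:00</description></item>
</channel></rss>"""
```

In this sketch a caller would pass the result of
\texttt{crawl\_rss(SAMPLE\_FEED)} to \texttt{export\_xml} and would catch
\texttt{CrawlError} to notify the user or maintainer that the crawler needs to
be edited or removed.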

\subsection{Non-functional requirements}
\subsubsection{Original non-functional requirements}
\begin{itemize}
    \item[N1:] Integrate into the original system.
    \item[N2:] Work in a modular fashion, so that the program can be extended
        in the future.
\end{itemize}

\subsubsection{Active non-functional requirements}
\begin{itemize}
    \item[N2:] Work in a modular fashion, so that the program can be extended
        in the future.

        Modularity is very important so that existing components can easily be
        extended and new components can be added. Possible extensions are
        discussed in Section~\ref{sec:discuss}.
    \item[N3:] Operate standalone on a server.

        Non-functional requirement N1 was dropped because we want to keep the
        program as modular as possible. Via an XML interface we still have a
        very stable connection with the database, while we avoid getting
        entangled in the software managing the database.
\end{itemize}

\section{Design}