log/2014-04-08.txt

TIME: 3

Meeting with Alessandro; discussed the project scope with Jan.

Worst case: an RSS feed crawler that non-IT users can train. Best case: websites
are parseable as well.

PLANS
=====
Literature research; compare programming languages: Python, PHP/JavaScript.
The HL server has Python, so the crawler is going to be Python for sure.

So basically there are three components:
- Frontend
        The frontend is the user interface for the non-IT user and is probably a
        plugin for Chrome or Firefox. It generates a scheme which is parseable by
        the crawler.
- Crawler
        The crawler periodically crawls the sites/feeds using the generated schemes
        and notifies the admins if there is a change in layout. The crawler
        generates XML that is later parsed by the backend.
- Backend
        The backend is not within the scope of this project, but it will parse the
        XML given by the crawler.
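
Rough sketch of how the crawler step could look, just to make the three
components concrete. Nothing here is decided yet: the scheme format (a plain
mapping from output field to tag name), the output element names `entries` and
`entry`, and the function name `crawl_feed` are all assumptions for
illustration. Fetching, scheduling and the actual admin notification are left
out; a field the scheme expects but the feed no longer provides is simply
reported as a possible layout change.

```python
import xml.etree.ElementTree as ET

# Hypothetical scheme as the frontend might generate it:
# output field name -> tag name expected inside each RSS <item>.
SCHEME = {"title": "title", "link": "link", "date": "pubDate"}

# Made-up sample feed; the second item lacks <pubDate>.
SAMPLE_FEED = """<rss version="2.0"><channel>
<item><title>First</title><link>http://example.com/1</link><pubDate>Tue, 08 Apr 2014 20:00:00 +0200</pubDate></item>
<item><title>Second</title><link>http://example.com/2</link></item>
</channel></rss>"""


def crawl_feed(feed_xml, scheme):
    """Apply a scheme to RSS text.

    Returns (xml_string, problems). A non-empty problems list means the
    feed no longer matches the scheme, so the admins should be notified.
    """
    root = ET.fromstring(feed_xml)
    out = ET.Element("entries")
    problems = []
    items = list(root.iter("item"))
    if not items:
        problems.append("no <item> elements found")
    for index, item in enumerate(items):
        entry = ET.SubElement(out, "entry")
        for field, tag in scheme.items():
            text = item.findtext(tag)
            if text is None:
                # Expected tag is gone: possible change in layout.
                problems.append(f"item {index}: missing <{tag}> for field '{field}'")
                continue
            ET.SubElement(entry, field).text = text.strip()
    return ET.tostring(out, encoding="unicode"), problems


if __name__ == "__main__":
    xml_out, problems = crawl_feed(SAMPLE_FEED, SCHEME)
    print(xml_out)          # XML handed on to the backend
    for problem in problems:
        print("NOTIFY ADMIN:", problem)
```

The point of keeping the scheme as plain data is that the frontend can produce
it without the non-IT user ever touching the crawler code.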