Intro:
  Hyperleap + infotainment
  Relieve programmers of fixing crawlers
  System to generate crawler specifications
  Frontend usable by non-programmers

Frontend:
  Runs in the browser
  Served from Apache and Python

Backend:
  Converts the user patterns from the frontend to nodelists.
  Nodelists are merged and minimized into a DAWG to generate patterns (graphs).
  The crawler reads the patterns and crawls the site.
  Crawler results are sent via an XML/XSD stream to the original backend.

Results:
  Few RSS feeds
  Much RSS misuse

Future:
  Extend to HTML (program to convert HTML to RSS)
  Reuse the interface
  Low-level matching can increase

Questions:
- Why is the user interface easy to use?
  Direct feedback
  Familiar interface with buttons and text boxes
- Why did you choose RSS?
  We had to limit the scope
  RSS is very consistent in its underlying structure
  But RSS doesn't have any structure in itself, only an underlying one, because the feeds are generated
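
Appendix (illustration of the backend step): the merge of nodelists into a minimized DAWG could be sketched roughly as below. This is a minimal sketch, not the system's actual code. It assumes a nodelist is simply a list of tag names from the feed root down to a selected element; the names `Node`, `build_trie`, `minimize`, `count_nodes`, `accepts` and the example tag paths are all hypothetical.

```python
from typing import Dict, List, Tuple


class Node:
    """One state of the pattern graph (hypothetical structure)."""

    def __init__(self) -> None:
        self.edges: Dict[str, "Node"] = {}
        self.final = False  # True if a nodelist ends at this state


def build_trie(nodelists: List[List[str]]) -> Node:
    """Merge all nodelists into one trie that shares common prefixes."""
    root = Node()
    for nodelist in nodelists:
        node = root
        for label in nodelist:
            node = node.edges.setdefault(label, Node())
        node.final = True
    return root


def minimize(node: Node, registry: Dict[Tuple, Node]) -> Node:
    """Bottom-up minimization: states with identical outgoing behaviour
    are replaced by one canonical state, turning the trie into a DAWG."""
    for label, child in list(node.edges.items()):
        node.edges[label] = minimize(child, registry)
    # Children are already canonical, so their identity is a valid signature.
    signature = (
        node.final,
        tuple(sorted((label, id(child)) for label, child in node.edges.items())),
    )
    return registry.setdefault(signature, node)


def count_nodes(root: Node) -> int:
    """Count distinct states reachable from root."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        stack.extend(node.edges.values())
    return len(seen)


def accepts(root: Node, nodelist: List[str]) -> bool:
    """Check whether a tag path matches one of the merged patterns."""
    node = root
    for label in nodelist:
        if label not in node.edges:
            return False
        node = node.edges[label]
    return node.final


# Hypothetical nodelists, as a user might select them in the frontend.
EXAMPLE_NODELISTS = [
    ["rss", "channel", "item", "title"],
    ["rss", "channel", "item", "link"],
    ["feed", "entry", "title"],
    ["feed", "entry", "link"],
]

trie = build_trie(EXAMPLE_NODELISTS)
nodes_before = count_nodes(trie)
dawg = minimize(trie, {})
nodes_after = count_nodes(dawg)
```

For these four example nodelists the trie has 10 states and the minimized graph has 6, because the `item` and `entry` states share the same `title`/`link` suffixes. Minimization only merges equivalent states, so the set of accepted paths stays the same.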