Efficient extraction of news articles based on RSS crawling

Title	Efficient extraction of news articles based on RSS crawling
Publication Type	Conference Paper
Year of Publication	2010
Authors	Bouras, C, Poulopoulos, V, Adam, G
Conference Name	International Conference on Machine and Web Intelligence, Algiers, Algeria (Invited Paper)
Date Published	3 - 5 October
Abstract	The expansion of the World Wide Web has left a vast number of Internet users facing the major problem of discovering desired information. Hundreds of web pages and weblogs are inevitably generated or changed on a daily basis. The main problem that arises from this continuous generation and alteration of web pages is the discovery of useful information, a task that becomes difficult even for experienced Internet users. Many mechanisms have been constructed and presented to address the problem of information discovery on the Internet; they are mostly based on crawlers that browse the WWW, download pages, and collect the information that might be of interest to users. In this manuscript we describe a mechanism that fetches web pages that include news articles from major news portals and blogs. This mechanism is constructed to support tools that acquire news articles from all over the world, process them, and present them back to end users in a personalized manner.
