Assigning Web News to Clusters

Title: Assigning Web News to Clusters
Publication Type: Conference Paper
Year of Publication: 2010
Authors: Bouras, C, Tsogkas, V
Conference Name: The Fifth International Conference on Internet and Web Applications and Services (ICIW 2010), Barcelona, Spain
Date Published: May 9 - 15
Abstract

The Web is overcrowded with news articles, an information
source that is overwhelming in both its volume and its
diversity. Assigning news articles to groups of similar items
provides a powerful data mining and manipulation technique
for topic discovery from text documents. In this paper, we
investigate the application of a wide range of clustering
algorithms and similarity measures to news articles that
originate from the Web, and we compare their efficiency for
use in an online Web news service application. We also examine
the effect of preprocessing on clustering. Our experiments
showed that k-means, despite its simplicity, gives better
aggregate results in terms of efficiency when it is
accompanied by preliminary steps for data cleaning and
normalization.
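
The combination the abstract describes, preprocessing followed by k-means, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the tiny stopword list, the term-frequency weighting, the use of cosine similarity (which makes this the spherical variant of k-means), the deterministic seeding from the first k articles, and the four sample headlines.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use larger lists).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to",
             "is", "for", "with", "as", "after"}


def preprocess(text):
    """Data cleaning: lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def to_unit_vector(tokens):
    """Normalizing: sparse term-frequency vector scaled to unit length."""
    counts = Counter(tokens)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {term: c / norm for term, c in counts.items()} if norm else {}


def cosine(u, v):
    """Dot product of two sparse unit vectors, which equals their cosine."""
    return sum(w * v.get(term, 0.0) for term, w in u.items())


def centroid(vectors):
    """Sum of the member vectors, rescaled to unit length."""
    total = Counter()
    for vec in vectors:
        for term, w in vec.items():
            total[term] += w
    norm = math.sqrt(sum(w * w for w in total.values()))
    return {term: w / norm for term, w in total.items()} if norm else {}


def kmeans(vectors, k, max_iter=20):
    """Spherical k-means; the first k vectors seed the centroids."""
    centroids = [dict(vec) for vec in vectors[:k]]
    labels = [None] * len(vectors)
    for _ in range(max_iter):
        # Assignment step: each article goes to its most similar centroid.
        new_labels = [max(range(k), key=lambda j: cosine(vec, centroids[j]))
                      for vec in vectors]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: recompute each centroid from its current members.
        for j in range(k):
            members = [vec for vec, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = centroid(members)
    return labels


# Hypothetical sample headlines, two about sports and two about politics.
articles = [
    "The striker scored a goal in the football match",
    "Parliament passed the budget vote after a long debate",
    "A late goal decided the football cup match",
    "The budget debate in parliament ended with a vote",
]
labels = kmeans([to_unit_vector(preprocess(a)) for a in articles], k=2)
print(labels)
```

On this toy input the two football headlines land in one cluster and the two parliament headlines in the other. Seeding from the first k articles keeps the run reproducible; a production system would more likely use random or k-means++ style initialization and a richer term weighting such as TF-IDF.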