Vol.70, No.1, 2022, pp.1263-1280, doi:10.32604/cmc.2022.019451
OPEN ACCESS
ARTICLE
Hierarchical Stream Clustering Based NEWS Summarization System
M. Arun Manicka Raja1,*, S. Swamynathan2
1 Department of Computer Science and Engineering, RMK College of Engineering and Technology, Chennai, 602106, India
2 Department of Information Science and Technology, College of Engineering Guindy, Anna University, Chennai, 600025, India
* Corresponding Author: M. Arun Manicka Raja. Email:
Received 14 April 2021; Accepted 16 May 2021; Issue published 07 September 2021
Abstract
News feeds are a rich source of information, providing updates on topics across many domains. Users interested in a specific domain need the important updates in that domain collected from multiple sources and presented in an organized form. In this paper, we propose a news summarization system for news data streams from RSS feeds and Google News. Because news stream analysis requires live content, the news data are collected continuously for our experiments. The major contributions of this work are domain-corpus-based news collection, news content extraction, hierarchical clustering of the news, and news summarization. Many existing news summarization systems do not provide dynamic content with a domain-wise representation. The proposed system addresses this by tagging the news feed with domain corpora and organizing the news streams in a hierarchical structure with a topic-wise representation. The news streams are then summarized for users with a novel summarization algorithm. The proposed system generates topic-wise summaries effectively for the user; no system in the literature has handled news summarization by collecting the data dynamically and organizing the content hierarchically. Compared with existing systems, the proposed system achieves better results in generating news summaries. Online news content editors benefit from this system by instantly obtaining news summaries for their domains of interest.
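The pipeline described in the abstract (domain tagging, hierarchical grouping, summarization) can be illustrated with a minimal sketch. This is not the authors' algorithm: the keyword sets in `DOMAIN_CORPUS`, the Jaccard similarity measure, the single-linkage merge rule, the `threshold` value, and the frequency-based sentence scoring are all assumptions chosen to keep the example small and dependency-free. The sample headlines are invented for illustration.

```python
import re
from collections import Counter

# Hypothetical domain corpora: tiny keyword sets standing in for real ones.
DOMAIN_CORPUS = {
    "sports": {"match", "team", "goal", "league", "player"},
    "finance": {"market", "stock", "bank", "shares", "inflation"},
}
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "for", "as", "after", "by"}


def tokens(text):
    """Lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def tag_domain(text):
    """Pick the domain whose keyword set overlaps the text most (assumed rule)."""
    words = set(tokens(text))
    best = max(DOMAIN_CORPUS, key=lambda d: len(words & DOMAIN_CORPUS[d]))
    return best if words & DOMAIN_CORPUS[best] else "other"


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    return len(a & b) / len(a | b) if a | b else 0.0


def cluster(items, threshold=0.2):
    """Single-linkage agglomerative clustering: merge the most similar pair
    of clusters until no pair reaches the (assumed) similarity threshold."""
    sets = [set(tokens(t)) for t in items]
    clusters = [[i] for i in range(len(items))]
    while len(clusters) > 1:
        best, pair = 0.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sim = max(jaccard(sets[i], sets[j])
                          for i in clusters[x] for j in clusters[y])
                if sim > best:
                    best, pair = sim, (x, y)
        if pair is None or best < threshold:
            break
        x, y = pair
        clusters[x] += clusters.pop(y)
    return [[items[i] for i in c] for c in clusters]


def summarize(texts, n=1):
    """Extractive summary: keep the n texts with the highest average
    in-cluster word frequency (a simple stand-in scoring rule)."""
    freq = Counter(w for t in texts for w in tokens(t))

    def score(text):
        ws = tokens(text)
        return sum(freq[w] for w in ws) / len(ws) if ws else 0.0

    return sorted(texts, key=score, reverse=True)[:n]


def build_hierarchy(headlines):
    """Two-level hierarchy: domain -> topic clusters, each with a summary."""
    by_domain = {}
    for h in headlines:
        by_domain.setdefault(tag_domain(h), []).append(h)
    return {d: [{"items": c, "summary": summarize(c)} for c in cluster(hs)]
            for d, hs in by_domain.items()}


# Invented sample headlines, for illustration only.
headlines = [
    "City team wins league match with late goal",
    "Late goal gives city team league title",
    "Central bank raises rates as inflation climbs",
    "Stock market falls after bank rate decision",
]
hierarchy = build_hierarchy(headlines)
for domain, topics in hierarchy.items():
    print(domain, [t["summary"] for t in topics])
```

In a live setting the `headlines` list would be replaced by items collected continuously from feeds, and the hierarchy would be rebuilt or updated as new items arrive. The sketch only shows how the three stages fit together on a static batch.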
Keywords
News feed; content similarity; parallel crawler; collaborative filtering; hierarchical clustering; news summarization
Cite This Article
Arun, M., Swamynathan, S. (2022). Hierarchical Stream Clustering Based NEWS Summarization System. CMC-Computers, Materials & Continua, 70(1), 1263–1280.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.