
A Survey on Title and Keyword based Extraction of News Contents using Machine Learning

William Cook, J Waycott

Abstract


Newspapers are a valuable source of information. Many models for content extraction have been proposed recently. These models are highly scalable and inexpensive in time, but most struggle to extract content accurately and completely, and they are prone to noise. In the past decade, most major newspapers and magazines have built websites that provide news or other material, and online-only newspapers have also appeared. The quality and quantity of content displayed on these websites have greatly improved, making them valuable information resources. In this article, we investigate various data mining methods used to process extracted content and summarize the results. The survey examines popular and effective machine learning techniques, along with their advantages and disadvantages.


Keywords


Web News, Data Mining, Information Extraction, Title-Based Extraction, Machine Learning.
