Improving Text Summarization Using Latent Semantic Analysis

N. Magesh, T. E. Ramya

Abstract


Text summarization uses natural language processing to generate a shorter version of a given document, enabling users to quickly identify its major points. It aims to capture the most representative content of the document in compact form while retaining the semantic information of the text to a large extent. It is considered an effective way of accessing information and addresses the problem of presenting information in a more condensed form. There are different approaches to producing well-formed summaries, and one of the modern methods is Latent Semantic Analysis (LSA). Because the information available on any topic is vast, readers need a rapid view of articles to determine whether an article is relevant to their needs.

This paper summarizes a text document through a sequence of techniques applied in succession and evaluates the result using ROUGE scores. Singular value decomposition (SVD) plays an important role in separating the important sentences from the input document. Every sentence is assigned a rank based on its importance in the original document. Sentences are selected according to their ranks, and the summary is generated from them. ROUGE produces three distinct scores: recall, precision, and F-score. The F-score is used to evaluate the correctness of the summary. Three distinct summaries were produced by reducing the input document by one-half, one-third, and one-fourth; based on their ROUGE scores, and the F-score in particular, the approach is found to give effective results in summarizing the text document.
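The pipeline described above (SVD-based sentence ranking, rank-based selection, and scoring by recall, precision, and F-score) can be illustrated with a minimal sketch. The abstract does not state the term weighting, the number of retained singular values, or the exact ranking formula, so everything below is an assumption: raw term counts, a hypothetical `n_topics` parameter, a sentence score taken as the length of the sentence vector in the singular-value-weighted topic space, and a simplified unigram-overlap stand-in for ROUGE-1 rather than the official ROUGE package. The function names are illustrative only.

```python
import re
from collections import Counter

import numpy as np


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def rank_sentences(sentences, n_topics=2):
    """Score each sentence from the SVD of a term-by-sentence count matrix.

    Assumption: raw counts and a sigma-weighted vector length are used here;
    the paper's actual weighting and ranking rule are not given in the abstract.
    """
    vocab = sorted({w for s in sentences for w in tokenize(s)})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(sentences)))
    for j, s in enumerate(sentences):
        for w in tokenize(s):
            A[index[w], j] += 1.0
    # A = U @ diag(sigma) @ Vt; column j of Vt describes sentence j per topic.
    _, sigma, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(n_topics, len(sigma))
    # Length of each sentence vector over the k strongest topics.
    return np.sqrt(((sigma[:k, None] * Vt[:k, :]) ** 2).sum(axis=0))


def summarize(sentences, fraction=0.5, n_topics=2):
    """Keep the top-ranked fraction of sentences, in their original order."""
    scores = rank_sentences(sentences, n_topics)
    n_keep = max(1, round(len(sentences) * fraction))
    top = sorted(np.argsort(-scores)[:n_keep])
    return [sentences[i] for i in top]


def rouge_1(candidate, reference):
    """Return unigram-overlap (recall, precision, F-score)."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    overlap = sum((cand & ref).values())  # clipped unigram matches
    recall = overlap / max(1, sum(ref.values()))
    precision = overlap / max(1, sum(cand.values()))
    f = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return recall, precision, f
```

Calling `summarize(sentences, fraction=1/3)` on a list of sentence strings returns roughly one third of them, and `rouge_1(" ".join(summary), reference_summary)` then yields the three scores against a human-written reference. Here `fraction` is the share of sentences kept, which is one possible reading of the reduction ratios mentioned in the abstract.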


Keywords


Information Retrieval (IR), Latent Semantic Analysis (LSA), Text Summarization Component.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.