
Hindi Language Document Summarization using Context Based Indexing Model

Swati Sargule, Ramesh M Kagalkar

Abstract


Hindi Document Summarization (DS) is an Information Retrieval (IR) process in which a summary of a document is extracted to provide an overview of that document. Existing document summarization models generally use the similarity among sentences in the original document to extract the most significant sentences. The documents and their sentences are generally indexed using standard term-indexing techniques, which do not take the context of the document into account. Thus, the sentence similarity values are independent of context. In this paper, a context-sensitive document indexing model is proposed for Hindi text documents, based on the Bernoulli model of randomness. The Bernoulli model is used to compute the probability of the co-occurrence of two terms in a large set of documents.
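
As a rough illustration of the co-occurrence idea, the following Python sketch estimates how surprising it is that two terms appear together in a document collection. It is not the authors' implementation: the function names, the whitespace tokenization, and the specific formulation (a binomial probability over the documents containing one term, turned into a score with a negative base-2 logarithm) are assumptions made for this example.

```python
from math import comb, inf, log2


def document_counts(docs, term_i, term_j):
    """Count documents containing term_i, term_j, and both.

    Assumption: documents are plain strings tokenized on whitespace.
    """
    token_sets = [set(doc.split()) for doc in docs]
    n_i = sum(term_i in tokens for tokens in token_sets)
    n_j = sum(term_j in tokens for tokens in token_sets)
    n_ij = sum(term_i in tokens and term_j in tokens for tokens in token_sets)
    return n_i, n_j, n_ij, len(token_sets)


def cooccurrence_probability(n_i, n_j, n_ij, n_docs):
    """Chance probability of exactly n_ij co-occurrences.

    Assumed model: each of the n_j documents containing term j is an
    independent Bernoulli trial that contains term i with p = n_i / n_docs.
    """
    p = n_i / n_docs
    return comb(n_j, n_ij) * p ** n_ij * (1 - p) ** (n_j - n_ij)


def association(n_i, n_j, n_ij, n_docs):
    """Illustrative score: a lower chance probability gives a higher score."""
    prob = cooccurrence_probability(n_i, n_j, n_ij, n_docs)
    return inf if prob == 0 else -log2(prob)


if __name__ == "__main__":
    # Toy corpus, invented for this example only.
    corpus = ["नदी पानी", "नदी पानी", "पेड़ फल", "घर लोग"]
    counts = document_counts(corpus, "नदी", "पानी")
    print(counts, association(*counts))
```

In a sketch like this, term pairs that co-occur far more often than chance would predict receive a high score, which is the kind of context signal a context-based index could use when weighting terms.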

Keywords


Document Summarization, Lexical Association, Context Indexing.




Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.