An Enhanced Method for Efficient Information Retrieval from Resume Documents using SPARQL

P. Sheba Alice; A.M. Abirami; Dr. A. Askarunisa

Abstract

Documents in formats such as DOC and HTML often contain vital information that must be preserved for future use, so retrieving information from them is important. Retrieval from these documents is still largely a manual effort. Search algorithms can perform this retrieval, but their results may not be as accurate as the user expects. Moreover, some documents, such as candidates' resumes, cannot be stored directly in a relational database because they contain a large number of fields. Considerable manual effort goes into analyzing resumes to select the candidates who satisfy specific criteria. To reduce this manual effort and obtain results faster, this paper proposes using Semantic Web technologies such as OWL, RDF and SPARQL to retrieve information from documents efficiently. As a first step, an ontology is created for the required domain. Based on the fields or tags in the OWL file, the user is given a form in which to provide personal and academic details. These data are converted into an RDF/XML document. The RDF files are then retrieved and grouped by category. When query text is entered, the relevant records are retrieved from the RDF documents using SPARQL. SPARQL is an RDF query language that enables fast and efficient search of data when compared with XML query languages such as XPath and XQuery. The paper also compares SPARQL and XPath in terms of the time taken to retrieve records.
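The pipeline described in the abstract (form data, then RDF/XML, then a field-based lookup) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation. The `res:` vocabulary, the `http://example.org/resume#` namespace, the function names and the sample candidates are all assumptions made for the example. The lookup uses the limited XPath subset of `xml.etree.ElementTree`, which corresponds to the XPath baseline the paper compares against; the SPARQL query is shown only as text, because running it would need a SPARQL engine that the standard library does not provide.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RES_NS = "http://example.org/resume#"  # hypothetical resume vocabulary

# Use readable prefixes when the tree is serialized.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("res", RES_NS)


def form_to_rdf(candidates):
    """Turn {candidate_id: {field: value}} form data into an RDF/XML tree."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    for cid, fields in candidates.items():
        # One rdf:Description per candidate, identified by rdf:about.
        desc = ET.SubElement(
            root,
            f"{{{RDF_NS}}}Description",
            {f"{{{RDF_NS}}}about": RES_NS + cid},
        )
        for name, value in fields.items():
            # Each form field becomes a property element with a literal value.
            ET.SubElement(desc, f"{{{RES_NS}}}{name}").text = str(value)
    return root


def find_by_field(root, field, value):
    """XPath-style lookup: URIs of candidates whose field equals value."""
    ns = {"rdf": RDF_NS, "res": RES_NS}
    path = f"rdf:Description[res:{field}='{value}']"
    return [d.get(f"{{{RDF_NS}}}about") for d in root.findall(path, ns)]


# The same lookup written as a SPARQL query (illustrative text only,
# not executed in this sketch).
SPARQL_EQUIVALENT = """
PREFIX res: <http://example.org/resume#>
SELECT ?candidate WHERE { ?candidate res:degree "B.E" . }
"""

if __name__ == "__main__":
    # Made-up sample form submissions.
    forms = {
        "c1": {"name": "Asha", "degree": "B.E"},
        "c2": {"name": "Ravi", "degree": "M.Sc"},
    }
    tree = form_to_rdf(forms)
    print(ET.tostring(tree, encoding="unicode"))
    print(find_by_field(tree, "degree", "B.E"))
```

The XPath lookup depends on how the RDF/XML happens to be serialized, whereas the SPARQL query is phrased against the triples themselves (subject, property, value), which is one reason an RDF query language can be the more natural fit for this kind of data.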

Keywords

RDF, OWL, SPARQL, Document Filter, Information Retrieval.

This work is licensed under a Creative Commons Attribution 3.0 License.