A Study on Enhanced Path Sequence Algorithm Using Web Directories

S. Mohanapriya

Abstract


A web directory is not a search engine and does not display lists of web pages based on keywords; instead, it lists web sites by category and subcategory. The categorization is usually based on the whole web site rather than one page or a set of keywords, and sites are often limited to inclusion in only a few categories. Web directories often allow site owners to submit their site for inclusion, and have editors review submissions for fitness.

In contrast to most work on Web usage mining, the usage data analyzed here correspond to user navigation throughout the Web rather than on a particular Web site, and as a result exhibit a high degree of thematic diversity. Because of proxy servers, and because pages revisited with the browser's 'Back' button are served from the client's cache, the identified sessions have many missing pages. The Enhanced Path Sequence Algorithm is proposed because pages may still be missing after transactions are constructed, owing to these proxy server and caching problems.

Three approaches are used to construct transactions. 1. Time Window: A time window transaction is formed from triplets of IP address, user identification, and the time spent on each web page, accumulated up to a limit called the time window. 2. Reference Length: This approach is based on the assumption that the amount of time a user spends on a page indicates whether the page is an auxiliary page or a content page for that user. 3. Maximal Forward Reference: A transaction is the set of pages from the first visited page up to the point where a backward reference occurs.
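The three approaches can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the `Hit` record, the function names, and the `window` and `cutoff` parameters are hypothetical, the hits are assumed to belong to one user and to be sorted by time, and the time window is measured from the first page of each transaction.

```python
from typing import List, NamedTuple, Tuple


class Hit(NamedTuple):
    """One page view by a single user (hypothetical record for this sketch)."""
    page: str
    timestamp: float  # seconds since the start of the log


def time_window_transactions(hits: List[Hit], window: float) -> List[List[str]]:
    """Group hits into transactions whose time span does not exceed `window`."""
    transactions: List[List[str]] = []
    current: List[str] = []
    start = 0.0
    for hit in hits:
        if current and hit.timestamp - start > window:
            transactions.append(current)  # window exceeded: close transaction
            current = []
        if not current:
            start = hit.timestamp  # a new transaction starts at this hit
        current.append(hit.page)
    if current:
        transactions.append(current)
    return transactions


def reference_length_labels(hits: List[Hit], cutoff: float) -> List[Tuple[str, str]]:
    """Label pages by viewing time: long views are content, short ones auxiliary."""
    labels: List[Tuple[str, str]] = []
    for current, following in zip(hits, hits[1:]):
        spent = following.timestamp - current.timestamp
        labels.append((current.page, "content" if spent >= cutoff else "auxiliary"))
    if hits:
        # The time spent on the last page cannot be measured from the log.
        labels.append((hits[-1].page, "unknown"))
    return labels


def maximal_forward_references(pages: List[str]) -> List[List[str]]:
    """Emit the forward path each time the user turns back to a visited page."""
    path: List[str] = []
    result: List[List[str]] = []
    moving_forward = False
    for page in pages:
        if page in path:
            # Backward reference: the path so far is maximal if we were advancing.
            if moving_forward:
                result.append(list(path))
            del path[path.index(page) + 1:]  # rewind to the revisited page
            moving_forward = False
        else:
            path.append(page)
            moving_forward = True
    if moving_forward and path:
        result.append(list(path))
    return result
```

For example, the click sequence A, B, C, B, D contains one backward reference (the return to B), so `maximal_forward_references` splits it into the paths A-B-C and A-B-D.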

The pages reached by forward references are treated as content pages, and the pages on the path leading to them are treated as index pages. A primary table stores the sessions together with pointers to a secondary table, which holds the complete navigation path.
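One possible reading of this two-table layout is sketched below. The abstract does not specify the storage format, so the `SessionStore` class, its field names, the integer path identifiers, and the `split_path` helper are all assumptions made for illustration. The sketch also assumes that the last page of a forward path is the content page and the pages before it are index pages.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple


@dataclass
class SessionStore:
    """Hypothetical primary/secondary table pair for navigation paths."""
    # Primary table: session id -> pointers (path ids) into the secondary table.
    primary: Dict[str, List[int]] = field(default_factory=dict)
    # Secondary table: path id -> complete navigation path.
    secondary: Dict[int, Tuple[str, ...]] = field(default_factory=dict)

    def add_path(self, session_id: str, path: Sequence[str]) -> int:
        """Store a complete path and record a pointer to it for the session."""
        path_id = len(self.secondary)
        self.secondary[path_id] = tuple(path)
        self.primary.setdefault(session_id, []).append(path_id)
        return path_id

    def paths_for(self, session_id: str) -> List[Tuple[str, ...]]:
        """Follow the session's pointers to recover its complete paths."""
        return [self.secondary[i] for i in self.primary.get(session_id, [])]


def split_path(path: Sequence[str]) -> Tuple[List[str], str]:
    """Split a non-empty forward path into (index pages, content page)."""
    return list(path[:-1]), path[-1]


store = SessionStore()
store.add_path("session-1", ["A", "B", "C"])
store.add_path("session-1", ["A", "B", "D"])
index_pages, content_page = split_path(store.paths_for("session-1")[0])
```

Keeping only pointers in the primary table means a session row stays small while the full paths remain available through a lookup in the secondary table.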


Keywords


Web Directories, Personalization, Path Sequence, Log File.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.