CiiT International Journal of Data Mining Knowledge Engineering
Print: ISSN 0974-9683 & Online: ISSN 0974-9578



Issue : June 2011
DOI: DMKE062011001
Title: Intellectual Question Categorization for Assessing the Learner Performance in E-Learning
Authors: R. Kavitha
Keywords: Intellectual Question Classification, E-learning, ANFIS
Abstract:
     E-learning plays a critical role in education. Each learner has an individual learning style that cannot be assessed in a single, uniform way, and learners should not be evaluated solely on the number of right and wrong answers. Testing must be intelligent enough to pose questions suited to the learner's performance during the session, so questions have to be classified by item difficulty using item responses. This study categorizes questions using ANN and ANFIS techniques and investigates the effectiveness of these methods for question classification based on item responses, item difficulty, and question levels. The methods were evaluated by comparing their performance and classification correctness. A comparative analysis based on error rate revealed that ANFIS yields better performance. This focus matters because the difficulty of each item affects a student's overall success throughout the test.

Full PDF
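
As a rough illustration of the classification task this abstract describes, the sketch below trains a small neural network (plain scikit-learn MLP, not the paper's ANFIS model) to assign questions to difficulty classes from item-response features; the features, labelling rule, and data are assumptions, not taken from the paper.

```python
# Illustrative sketch only (not the paper's ANFIS model): classify questions
# into difficulty levels from item-response features with a small neural net.
# Feature names, the labelling rule, and the data are assumptions.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic item-response features: proportion of correct answers (p-value)
# and mean response time in seconds for each question.
n_items = 300
p_correct = rng.uniform(0.1, 0.95, n_items)
mean_time = rng.uniform(10, 120, n_items)
X = np.column_stack([p_correct, mean_time])

# Assumed labelling rule: harder items have a lower proportion correct.
y = np.digitize(p_correct, [0.4, 0.7])        # 0 = hard, 1 = medium, 2 = easy

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
clf.fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```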


Issue : June 2011
DOI: DMKE062011002
Title: Measuring and Evaluating Library Infrastructure and Information Services in the Constituent Colleges of Vinayaka Missions University in Tamilnadu: A Study
Authors: K. Ayyanar and Dr.M. Kanakaraj
Keywords: Library Collection Resources, Library Services
Abstract:
      This study measures and evaluates library infrastructure and information services in relation to user satisfaction in the constituent colleges of Vinayaka Missions University, Tamil Nadu State, India. It examines user satisfaction with library resources, the various services offered, the infrastructure, and the role of library professionals. The research measures and evaluates the infrastructure and information services provided to users and suggests practical ways to make the library user-friendly, a rich information provider, and a supporter of career development. The study highlights possible solutions for improving services and meeting user needs in the constituent colleges of Vinayaka Missions University (VMU). Its objectives are to obtain feedback on current services as required by faculty, research scholars, and students, and to find ways of optimizing the use of library resources through a clear and well-articulated approach. The study covers resource collection, user usage, infrastructure, information services, and staff attitude toward helping users, along with the various methods used to measure and evaluate these tasks at Vinayaka Missions University.

Full PDF


Issue : June 2011
DOI: DMKE062011003
Title: Pattern Mining in E-Learning
Authors: Subrata Sahana, Saurabh Mittal and Sarthak Jauhari
Keywords: Pattern Mining, Clustering, e-learning, Tracker
Abstract:
      Group study is an important part of learning. Our aim is to partition the large volume of available data into different categories based on distinctive patterns. These patterns are generated according to students' expertise. We extract patterns that distinguish the better groups from the weaker ones and gain insight into the success factors. The results point to the importance of leadership and group interaction and give promising indications of when they occur. Patterns indicating good individual practices were also identified. Clustering is used to separate the data into different groups. This approach leads to a better methodology for e-learning: individuals can take part in online discussions such as chat rooms, performance can be compared between stronger and weaker groups, and, using the sequence tool, the administrator can identify the subjects that require more detailed study material. For this purpose a sequence-by-sequence (SBS) algorithm is defined and a sequential tool is designed based on the different patterns.

Full PDF
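
The SBS algorithm and sequence tool are the authors' own and are not reproduced here; the sketch below only illustrates the clustering step the abstract mentions, grouping students into stronger and weaker groups from activity features with standard K-means. The feature names and data are hypothetical.

```python
# Generic sketch of the clustering step only; the paper's SBS algorithm and
# sequence tool are not reproduced. Feature names and data are hypothetical.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Hypothetical per-student activity features:
# [quiz score, chat-room messages posted, materials downloaded]
students = rng.normal(loc=[60, 20, 15], scale=[15, 10, 5], size=(200, 3))

X = StandardScaler().fit_transform(students)
groups = KMeans(n_clusters=2, n_init=10, random_state=1).fit_predict(X)

# Label the cluster with the higher mean quiz score as the "stronger" group.
stronger = int(students[groups == 1, 0].mean() > students[groups == 0, 0].mean())
print("students in stronger group:", int((groups == stronger).sum()))
```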


Issue : June 2011
DOI: DMKE062011004
Title: Mining Temporal Reservoir Data Using Sliding Window Technique
Authors: Wan Hussain Wan Ishak, Ku-Ruhana Ku-Mahamud and Norita Md Norwawi
Keywords: Neural Network, Sliding Window, Reservoir Management, Reservoir Water Level, Temporal Data Mining
Abstract:
    Decisions on reservoir water release are crucial during both intense and less intense rainfall seasons. Even though reservoir water release is guided by standard procedures, decisions are usually made based on past experience. Past experience is recorded hourly, daily, or weekly in the reservoir operation log book. Within a few years this log book becomes a knowledge-rich repository, but one that is difficult and time-consuming to consult, and the temporal relationships between the data cannot be easily identified. In this study the sliding window technique is applied to extract information from the reservoir operational database, a digital version of the reservoir operation log book. Several data sets were constructed based on different sliding window sizes, and an artificial neural network was used as the modelling tool. The findings indicate that eight days is the significant time lag between upstream rainfall and reservoir water level, and that the best artificial neural network model is 24-15-3.

Full PDF
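
To illustrate the sliding-window step described in this abstract, the sketch below turns a daily series of rainfall and water level into lagged input/target pairs suitable for a neural network; the data is synthetic and the window size is only an example, not the paper's experimental setup.

```python
# Sketch of the sliding-window dataset construction described in the abstract:
# turn a daily series of upstream rainfall and reservoir level into lagged
# input/target pairs for a neural network. Window size and data are illustrative.
import numpy as np

def make_windows(rainfall, level, window):
    """Build (inputs, targets): the previous `window` days of rainfall and
    level predict the next day's level."""
    X, y = [], []
    for t in range(window, len(level)):
        X.append(np.concatenate([rainfall[t - window:t], level[t - window:t]]))
        y.append(level[t])
    return np.array(X), np.array(y)

rng = np.random.default_rng(2)
rain = rng.gamma(2.0, 5.0, 365)                         # synthetic daily rainfall
level = 30 + np.convolve(rain, np.ones(8) / 8, "same")  # level lags rainfall

X, y = make_windows(rain, level, window=8)   # an 8-day lag, as the study found
print(X.shape, y.shape)                      # (357, 16) (357,)
```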


Issue : June 2011
DOI: DMKE062011005
Title: Research Community Mining by Using Brush Structure Model
Authors: M. Geetha and Dr.G.M. Kadhar Nawaz
Keywords: Information Retrieval Engine, E-mail ID Harvester, Research Community, Brush Datastructure, Web Structure Mining, Bi-Partite Graph
Abstract:
      Since research trends change dynamically, researchers have to keep up with new trends and take on new research topics, so research communities for new research domains are important. Domain-specific search engines (or vertical search engines) alleviate the problem to some extent by allowing researchers to search within a particular domain and by pointing them to relevant community researchers. In this paper, we propose a method for discovering research communities. The key features of our method are a network model of papers and a word-assignment technique for the communities obtained. This work focuses on evaluating a bibliometric search engine called Domain Expert, which produces a list of e-mail addresses of community researchers and cyber-domain experts freely available on the Web. It exploits techniques such as bibliometrics and community mining using the Brush structure, a new data-structure model for indexing by community formation.

Full PDF
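
The Brush structure and the Domain Expert engine are the authors' contributions and are not reproduced here; the sketch below only illustrates the underlying bipartite paper-researcher graph idea, using connected components as a crude stand-in for communities. The graph data and networkx usage are illustrative assumptions.

```python
# Sketch of the paper-researcher bipartite graph idea only; the paper's Brush
# structure and Domain Expert engine are not reproduced. Data is made up.
import networkx as nx

# Bipartite graph: one node set for papers, one for researchers.
G = nx.Graph()
papers = {"p1": ["alice", "bob"], "p2": ["bob", "carol"], "p3": ["dave"]}
for paper, authors in papers.items():
    G.add_node(paper, bipartite=0)
    for author in authors:
        G.add_node(author, bipartite=1)
        G.add_edge(paper, author)

# Connected components as a crude stand-in for research communities.
for i, component in enumerate(nx.connected_components(G)):
    researchers = sorted(n for n in component if G.nodes[n]["bipartite"] == 1)
    print(f"community {i}: {researchers}")
```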


Issue : June 2011
DOI: DMKE062011006
Title: Document Clustering Using Hybrid Ant Algorithm
Authors: R. Priya Vaijayanthi and A.M. Natarajan
Keywords: Ant Colony, Document Clustering, Meta- Heuristic, Optimization
Abstract:
      In recent years, the widespread use of computers across the globe has made it necessary to store and retrieve a huge volume of documents over the World Wide Web (WWW). This has posed many challenges to Information Retrieval (IR) systems, such as fetching documents relevant to a user's query and classifying electronic documents. Clustering is an unsupervised learning technique that partitions the available documents into several clusters based on the similarity between them. Owing to the exponential growth of information on the WWW, clustering has become a combinatorial optimization problem in IR systems. In this paper, a novel Hybrid Ant Algorithm, a blend of Tabu Search and Ant Colony Optimization, is proposed to form better-quality clusters of documents with similar features. The viability of the proposed algorithm is tested on a few standard benchmark datasets, and the experimental results show that it yields more promising clusters than those produced by the K-means algorithm.

Full PDF
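
The hybrid Tabu Search + ACO algorithm itself is not reproduced here; the sketch below only shows the K-means baseline the abstract compares against, applied to TF-IDF document vectors. The documents and cluster count are made up for illustration.

```python
# The paper's hybrid Tabu Search + ACO algorithm is not reproduced here; this
# only sketches the K-means baseline it is compared against, on TF-IDF vectors.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = [
    "ant colony optimization for clustering",
    "tabu search metaheuristic for optimization",
    "information retrieval of web documents",
    "query matching and document ranking on the web",
]

X = TfidfVectorizer(stop_words="english").fit_transform(docs)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
for doc, label in zip(docs, labels):
    print(label, doc)
```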


Issue : June 2011
DOI: DMKE062011007
Title: Efficient Mining of Active and Valuable Clustered Sequential Patterns
Authors: Sahista Machchhar, Madhuri Vaghasia and Chintan Bhatt
Keywords: Data Clustering, Projected Database, Sequential Patterns, K-Means
Abstract:
    Clustering data sets with an inherently sequential nature is useful for many purposes. Over the years, many methods have been developed for clustering sequential objects according to their similarity. However, these methods tend to have a computational complexity that is at least quadratic in the number of sequences, and clustering algorithms often require that the entire dataset be kept in main memory. In this paper, we present a novel constraint-based clustered sequential pattern (CBCSP) mining algorithm that clusters only the sequential data the user is interested in, using recency, monetary, and compactness constraints. The algorithm generates a compact set of clusters of sequential patterns according to user interest by applying the constraints during the mining process, which minimizes the I/O cost involved. The proposed algorithm applies the well-known K-means clustering algorithm together with prefix-projected database construction to the set of sequential patterns. The method first performs clustering based on a novel similarity function and then captures, within each cluster, only the sequential patterns of user interest, using a sequential pattern mining algorithm that employs the pattern-growth method. The proposed approach reduces the search space, since only the sequential patterns the user intends to find tend to appear in the resulting list. Experimental evaluation under various simulated conditions shows that the proposed method delivers excellent performance and produces reasonably good clusters.

Full PDF
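
CBCSP and its recency/monetary/compactness constraints are the authors' algorithm and are not reproduced here; the sketch below only illustrates the prefix-projected (pattern-growth) mining of frequent sequences that the approach builds on, for simple single-item event sequences with made-up data.

```python
# Sketch of prefix-projected (pattern-growth) sequential pattern mining only;
# the paper's CBCSP constraints (recency, monetary, compactness) and K-means
# step are not reproduced. Sequences are single-item events, data is made up.
def prefix_span(sequences, min_support, prefix=()):
    """Recursively grow frequent sequential patterns from projected databases."""
    # Count, for each item, how many projected sequences contain it.
    counts = {}
    for seq in sequences:
        for item in set(seq):
            counts[item] = counts.get(item, 0) + 1

    patterns = []
    for item, support in counts.items():
        if support < min_support:
            continue
        new_prefix = prefix + (item,)
        patterns.append((new_prefix, support))
        # Project: keep the suffix after the first occurrence of `item`.
        projected = [seq[seq.index(item) + 1:] for seq in sequences if item in seq]
        patterns.extend(prefix_span(projected, min_support, new_prefix))
    return patterns

db = [["a", "b", "c"], ["a", "c"], ["a", "b", "c"], ["b", "c"]]
for pattern, support in sorted(prefix_span(db, min_support=2)):
    print(pattern, support)
```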


Issue : June 2011
DOI: DMKE062011008
Title: Analysis of Predictive Models for Cardiovascular Heart Disease Diagnosis
Authors: Sunila Godara and Dr. Prabhat Panday
Keywords: Heart Disease, Data Mining Techniques, Random Decision Tree, Decision Tree Forest, Artificial Neural Networks, and Support Vector Machine
Abstract:
      The medical industry holds a huge amount of data, but most of it is not mined to uncover the hidden information it contains. Data mining techniques can be used to discover such hidden patterns. Diagnosing heart disease is an important problem in developing medical decision support systems that help physicians make effective decisions. In this paper, the data mining classification techniques Random Decision Tree, Decision Tree Forest, Artificial Neural Networks (ANNs), and Support Vector Machine (SVM) are analyzed on a cardiovascular disease dataset. The performance of these techniques is compared in terms of sensitivity, specificity, accuracy, F-measure, true positive rate, false positive rate, and ROC. In our study, 10-fold cross-validation was used to obtain an unbiased estimate of these prediction models' performance.

Full PDF
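
As a sketch of the evaluation methodology only, the snippet below compares several classifiers with 10-fold cross-validation in scikit-learn. The data is synthetic rather than the paper's cardiovascular dataset, and RandomForest, MLP, and SVC are generic stand-ins for the models named in the abstract.

```python
# Sketch of the comparison methodology only: 10-fold cross-validation over
# several classifiers. The data here is synthetic, not the paper's dataset,
# and RandomForest/MLP/SVC are stand-ins for the models named in the abstract.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=13, random_state=0)

models = {
    "random forest": RandomForestClassifier(random_state=0),
    "neural network": MLPClassifier(max_iter=2000, random_state=0),
    "SVM": SVC(),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10, scoring="accuracy")
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```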


Issue : June 2011
DOI: DMKE062011009
Title: An Image Mining Technique for Identifying Tuberculosis Meningitis of the Brain
Authors: Dr.A.R. Mohamed Shanavas and M. Arul Kothai Priya
Keywords: Image Mining, Data Mining, Tuberculosis Meningitis (TBM), Hu Moment Invariant, Modified K Mean Clustering.
Abstract:
      The main focus of image mining in the proposed system is the identification of meningitis in the membranes of the brain using CT brain images. Identifying the type of tuberculosis affecting the meninges is a crucial step in computer-assisted detection of meningitis TB. The system proposes a method based on modified K-means clustering to enhance the diagnosis of medical images such as CT brain images. It analyzes the images and automatically generates diagnostic suggestions using modified K-means clustering and the Hu moment invariant method. The proposed method uses two key image mining algorithms: the first extracts the features present in the CT brain image, and the second clusters the type of meningitis present in the image. Existing systems classify the presence of bacteria through sputum analysis and identify TB affecting the lungs; the proposed system identifies meningitis TB affecting the membranes of the brain. The method has been applied to several real datasets, and the results show sufficiently high accuracy to claim that modified K-means clustering is a powerful means of assisting the diagnostic task.

Full PDF
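
The sketch below only illustrates the two steps named in the abstract: extracting Hu moment invariants (via OpenCV) and clustering the resulting feature vectors. The images are synthetic shapes rather than CT scans, and standard K-means stands in for the paper's modified K-means.

```python
# Sketch of the two steps named in the abstract: Hu-moment feature extraction
# and clustering. Images here are synthetic, and standard K-means stands in
# for the paper's modified K-means.
import cv2
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)

def hu_features(image):
    """Seven Hu moment invariants of a grayscale image, log-scaled."""
    hu = cv2.HuMoments(cv2.moments(image)).flatten()
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-12)

# Two synthetic "lesion" shapes: filled circles and filled squares.
images = []
for _ in range(20):
    img = np.zeros((64, 64), dtype=np.uint8)
    if rng.random() < 0.5:
        cv2.circle(img, (32, 32), int(rng.integers(8, 20)), 255, -1)
    else:
        s = int(rng.integers(8, 20))
        cv2.rectangle(img, (32 - s, 32 - s), (32 + s, 32 + s), 255, -1)
    images.append(img)

X = np.array([hu_features(img) for img in images])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)
```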


Issue : June 2011
DOI: DMKE062011010
Title: A Partition Model for Multilevel Association Rule Mining
Authors: Pratima Gautam and K.R. Pardasani
Keywords: Association Rule, Frequent Itemset, Transaction Database, Tree Map, Multilevel Association Rule, Level Wise Filtered Tables.
Abstract:
    We extend the study of mining association rules from a single concept level to multiple concept levels and investigate methods for mining multiple-level association rules from large transaction databases. Mining multiple-level association rules can lead to the progressive mining of refined knowledge from data and has interesting applications for knowledge discovery in transaction databases as well as other business or engineering databases. Mining frequent patterns in huge transactional databases is a heavily researched area of data mining, and mining frequent itemsets is the basic problem underlying association rule mining. Extracting association rules at multiple levels helps to discover more specific and applicable knowledge. Counting the occurrences of an item normally requires scanning the database many times, so we use partition and Boolean methods for finding frequent itemsets at each concept level, which reduces the number of scans, the I/O cost, and the CPU overhead. In this paper a new approach is introduced to address these issues, making the algorithm particularly suitable for very large databases. We also develop a top-down progressive deepening method, based on the Apriori principle, for efficient mining of multiple-level association rules from large transaction databases: it first finds the frequent items at the topmost level and then progressively deepens the mining process into their descendants at lower concept levels.

Full PDF
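
The paper's partition and Boolean methods are not reproduced here; the sketch below only illustrates the top-down progressive-deepening idea on a toy taxonomy: count frequent top-level categories first, then descend only into categories that met the support threshold, per the Apriori principle. The taxonomy, transactions, and threshold are made up.

```python
# Sketch of the top-down progressive-deepening idea only (the paper's partition
# and Boolean methods are not reproduced): count frequent top-level categories
# first, then descend only into categories that met the support threshold.
from collections import Counter

# Toy taxonomy: item -> top-level category.
taxonomy = {"milk": "dairy", "cheese": "dairy", "bread": "bakery", "cake": "bakery",
            "apple": "fruit"}
transactions = [
    {"milk", "bread"}, {"milk", "cheese"}, {"bread", "cake"},
    {"milk", "bread", "apple"}, {"cheese", "bread"},
]
min_support = 3

# Level 1: frequent top-level categories.
cat_counts = Counter()
for t in transactions:
    cat_counts.update({taxonomy[item] for item in t})
frequent_cats = {c for c, n in cat_counts.items() if n >= min_support}
print("frequent categories:", frequent_cats)

# Level 2: count items only if their parent category is frequent (Apriori principle).
item_counts = Counter()
for t in transactions:
    item_counts.update(i for i in t if taxonomy[i] in frequent_cats)
frequent_items = {i for i, n in item_counts.items() if n >= min_support}
print("frequent items:", frequent_items)
```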


Issue : June 2011
DOI: DMKE062011011
Title: Identifying the Rice Diseases Using Classification Techniques
Authors: A. Nithya and Dr.V. Sundaram
Keywords: Decision Trees, Classification, Data Mining, Neural Network
Abstract:
      Rice disease identification is one of the country's main agricultural problems, and the essence of this paper is identifying rice diseases at an early stage. The paper aims to serve as a ready reckoner for farmers; its main advantage is that it makes diseases easy to identify and offers better solutions to farmers. The work follows the research process of defining and redefining problems, formulating hypotheses, collecting, organizing and evaluating data, making deductions and reaching conclusions, and finally presenting the results in a detailed and accurate manner. The paper mainly focuses on data mining concepts such as classification, decision trees, and neural networks. A disease is an abnormal condition that injures the plant or causes it to function improperly; diseases are readily recognized by their symptoms, the associated visible changes in the plant. The organisms that cause diseases are known as pathogens, and many species of bacteria, fungi, nematodes, viruses and mycoplasma-like organisms cause diseases in rice. Disorders or abnormalities may also be caused by abiotic factors such as temperatures beyond the limits for normal rice growth, deficiency or excess of nutrients in the soil and water, pH and other soil conditions that affect the availability and uptake of nutrients, toxic substances such as H2S produced in the soil, water stress, and reduced light. However, this paper covers only the common rice diseases caused by pathogens.

Full PDF
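
As a minimal illustration of the classification concept named in this abstract, the sketch below fits a decision tree to a handful of hypothetical symptom vectors; the symptom encoding, labels, and data are invented for illustration and are not taken from the paper.

```python
# Minimal classification sketch only: the symptom features, disease labels,
# and data below are made up for illustration, not taken from the paper.
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical symptom encoding: [leaf_spots, lesion_on_stem, yellowing]
X = [[1, 0, 0], [1, 0, 1], [0, 1, 0], [0, 1, 1], [0, 0, 1], [1, 1, 0]]
y = ["blast", "blast", "sheath blight", "sheath blight", "tungro", "blast"]

tree = DecisionTreeClassifier(random_state=0).fit(X, y)
print(export_text(tree, feature_names=["leaf_spots", "lesion_on_stem", "yellowing"]))
print(tree.predict([[1, 0, 1]]))   # classify a new symptom pattern
```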


Issue : June 2011
DOI: DMKE062011012
Title: Privacy-Preserving Online Feedback System
Authors: Reena S. Kharat, B.N. Jagdale and Swati Tonge
Keywords: Privacy, Security, Data Mining, Online Feedback
Abstract:
      In online feedback evaluation, data is immediately available for analysis and reporting, but the response rate to online feedback is low due to a lack of privacy. One solution for achieving an adequate response rate is to improve privacy. This paper introduces a novel framework for the privacy problem in online feedback systems. Our focus is to evaluate and analyse feedback without disclosing the actual data supplied by users. Here, data is randomized to preserve the privacy of each individual user, and we study how to analyse private feedback without disclosing it to other users or any other party. To tackle this challenging problem, we develop a secure protocol to conduct the desired computation, defining a protocol that uses homomorphic encryption techniques to send the feedback while keeping it private. Finally, we present a privacy and correctness analysis that validates the algorithm.

Full PDF
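
The paper's protocol is not reproduced here; the sketch below only demonstrates the basic property such a system relies on: with an additively homomorphic scheme (Paillier), a collector can sum encrypted ratings without seeing any individual one. It assumes the third-party `phe` (python-paillier) package, and the ratings are made up.

```python
# Not the paper's protocol: a minimal sketch of the property it relies on,
# summing encrypted feedback scores without decrypting any individual one.
# Assumes the third-party `phe` (python-paillier) package is installed.
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair()

# Each user encrypts a feedback rating (1-5) with the evaluator's public key.
ratings = [4, 5, 3, 4, 2]
ciphertexts = [public_key.encrypt(r) for r in ratings]

# The collector adds ciphertexts directly; Paillier is additively homomorphic,
# so the result encrypts the sum while no single rating is revealed to it.
encrypted_total = sum(ciphertexts[1:], ciphertexts[0])

total = private_key.decrypt(encrypted_total)
print("average rating:", total / len(ratings))
```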


Issue : June 2011
DOI: DMKE062011013
Title: Data-Deduplication in Linux Kernel File-System
Authors: Amit Savyanavar, Sachin Katarnaware, Pritam Bankar, Prashant Jadhav and Nikhil Bagde
Abstract:
    Data deduplication is essentially a compression technique that eliminates redundant data from a hard disk or other storage so that the space is used efficiently. In every operating system, storage space is managed by the file system; that is, data is written to secondary storage through the file system. We therefore modify the file system so that it eliminates redundant blocks of data before they are written to secondary storage, an approach known as inline data deduplication. Ext4, the latest Linux file system, already offers many new features, and we extend it with one more: data deduplication [5]. In our inline deduplication method we create a table that stores a hash key together with the number of the block that contains the data for that key. The hash key is generated using the SHA-1 algorithm: whenever new data arrives, it is passed to SHA-1 before any blocks are allocated and a key is generated. This key is compared with the keys already stored in the table. If the key is already present, only the corresponding reference counter is incremented; this counter keeps track of how many pointers refer to that block on the physical device. If the key is not present, the key is stored and control passes to the superblock, which allocates free blocks from its free list and returns the allocated block numbers to the table, where they are stored against the key and the counter is incremented. Using this method we can eliminate redundant allocation of data blocks and thereby save space and increase the efficiency of the storage. This is how enterprises and large organizations, whose data is growing exponentially, can save storage space. Because the method works at the block level, the elimination ratio is good and the storage savings are substantial.


Full PDF
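
As a user-space illustration of the bookkeeping this abstract describes (a SHA-1 key mapped to a block number plus a reference counter), the sketch below simulates inline deduplication in Python; it is not the actual ext4 kernel modification, and the block size and data are illustrative.

```python
# User-space sketch of the bookkeeping the abstract describes (hash key ->
# block number plus a reference counter); the actual ext4 kernel changes are
# not reproduced here. Block size and data are illustrative.
import hashlib

BLOCK_SIZE = 4096
hash_table = {}    # sha1 hex digest -> {"block": int, "refcount": int}
next_free_block = 0

def write_block(data: bytes) -> int:
    """Return the block number storing `data`, reusing an existing block
    when an identical block was already written (inline deduplication)."""
    global next_free_block
    key = hashlib.sha1(data).hexdigest()
    entry = hash_table.get(key)
    if entry is not None:                 # duplicate: bump the reference count
        entry["refcount"] += 1
        return entry["block"]
    block = next_free_block               # new data: allocate a fresh block
    next_free_block += 1
    hash_table[key] = {"block": block, "refcount": 1}
    return block

blocks = [write_block(b"A" * BLOCK_SIZE),
          write_block(b"B" * BLOCK_SIZE),
          write_block(b"A" * BLOCK_SIZE)]   # identical to the first block
print(blocks)                               # [0, 1, 0]
print("blocks allocated:", next_free_block) # 2 instead of 3
```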