Issue: April 2009
DOI: DMKE042009001
Title: High Dimensional Data Mining Using Clustering
Authors: A. Bharathi, Dr. A.M. Natarajan
Keywords: Data mining, High Dimensional Clustering, Distance Measure
Abstract:
Clustering is one of the major tasks in data mining. Clustering algorithms are based on a criterion that maximizes inter-cluster distance and minimizes intra-cluster distance. In higher dimensional feature spaces, their performance and efficiency deteriorate considerably. A large number of dimensions confuses clustering algorithms: the distances between data points become almost the same, so it is difficult to group similar points. This is usually called the “curse of dimensionality” problem. One class of algorithms finds a subset of dimensions on which clustering is performed by removing irrelevant and redundant dimensions. Dimensionality reduction techniques such as Principal Component Analysis (PCA) are also used for feature reduction. If different subsets of the points cluster well on different subspaces of the feature space, a global dimensionality reduction will fail. To overcome these problems, recent research has proposed computing subspace clusters. The existing algorithms have two common limitations. First, they usually have problems with subspace clusters of different dimensionality. Second, they often fail to discover clusters of different shapes and dimensionalities. The goal of this project is to develop new, efficient and effective methods for high dimensional clustering.
Issue: April 2009
DOI: DMKE042009002
Title: Using Context Transformations as a Pre-Processing Step in Mining Large Datasets
Authors: Dr. B. Kalpana
Keywords: Data mining, Context, Formal Concepts
Abstract:
Data mining is being applied in several diverse areas such as market basket analysis, analysis of dependencies in biological sequences, search and extraction of information on the web, prediction of stock market trends, and many others. In such applications, one method of data analysis that is gaining recognition is Formal Concept Analysis (FCA). The use of FCA in representing large datasets is particularly promising for reducing time and storage requirements. The characteristic that distinguishes FCA from other analysis methods is that no information is lost during the analysis of data. This paper discusses two context transformations that do not change the structure of the concept lattice, namely context clarification and reduction. The objective is to explore the possibility of using such transformations as a preprocessing step so that the dataset can be represented as a reduced context.
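Context clarification can be illustrated with a minimal Python sketch. It assumes the usual FCA definition, in which a context is clarified when no two objects share the same attribute set and no two attributes hold for exactly the same objects. The function name `clarify_context` and the dictionary representation are illustrative choices and are not taken from the paper.

```python
def clarify_context(context):
    """Clarify a formal context given as {object: set_of_attributes}.

    Returns (clarified, object_classes, attribute_classes). One
    representative is kept for each class of duplicates.
    """
    # Merge objects that have identical attribute sets (identical intents).
    object_classes = {}
    for obj in sorted(context):
        object_classes.setdefault(frozenset(context[obj]), []).append(obj)

    # Merge attributes that hold for exactly the same objects (identical extents).
    attributes = sorted(set().union(*context.values()))
    attribute_classes = {}
    for attr in attributes:
        extent = frozenset(o for o in context if attr in context[o])
        attribute_classes.setdefault(extent, []).append(attr)

    # Keep the first member of every class as its representative.
    kept_attrs = {members[0] for members in attribute_classes.values()}
    clarified = {
        members[0]: set(intent) & kept_attrs
        for intent, members in object_classes.items()
    }
    return clarified, list(object_classes.values()), list(attribute_classes.values())


# Hypothetical toy context: o1 and o2 are duplicates, and so are a and d.
toy = {
    "o1": {"a", "b", "d"},
    "o2": {"a", "b", "d"},
    "o3": {"b", "c"},
    "o4": {"c"},
}
clarified, obj_classes, attr_classes = clarify_context(toy)
```

Because only duplicates are merged, the concept lattice of the clarified context has the same structure as that of the original one. Reduction, the second transformation named in the abstract, is not covered by this sketch.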
Issue: April 2009
DOI: DMKE042009003
Title: SD-tree Based Indexing for Nested Object Query Processing
Authors: Dr. I. Elizabeth Shanthi
Keywords: Signature files, Indexing, OODB, Query Evaluation
Abstract:
Aiming at the fast retrieval of nested objects, we introduce a variation of signature file based top-down hierarchy retrieval using an index structure called the SD (Signature Declustering) tree. Signature files, which were initially used on text data for their filtering capability, have now been applied in Object Oriented Data Base Systems (OODBSs). Most of the proposed methods for object oriented query handling suffer from either long retrieval times or complex comparison procedures. This is mainly due to the poor filtering capability of the index structures used to support complex query styles in OODBSs. In this paper we focus on object oriented handling of nested queries in the class hierarchy using an intermediate indexing structure called the SD-tree, which represents object signatures in a compact manner. Further, it helps to retrieve all matching objects in a single access. We compare the performance of SD-tree based query processing with the signature tree based query processing reported recently. Our experimental analysis on large data sets shows that, combined with query signature hierarchies, the SD-tree retrieves the matching objects quickly and therefore substantially improves the time complexity of query evaluation.
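The filtering role of signatures mentioned above can be sketched in Python. This is a generic superimposed-coding signature filter, not the SD-tree itself, whose layout the abstract does not describe. The signature width, the number of bits per value, the hash function and all names below are assumptions made for illustration.

```python
import hashlib

SIG_BITS = 64        # signature width in bits (illustrative choice)
BITS_PER_VALUE = 3   # bits set per attribute value (illustrative choice)


def value_signature(value):
    """Hash one attribute value to a bit pattern with up to BITS_PER_VALUE bits set."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    sig = 0
    for i in range(BITS_PER_VALUE):
        sig |= 1 << (digest[i] % SIG_BITS)
    return sig


def object_signature(values):
    """Superimpose (bitwise OR) the signatures of all attribute values."""
    sig = 0
    for value in values:
        sig |= value_signature(value)
    return sig


def may_match(obj_sig, query_sig):
    """An object can match only if it has every bit of the query signature set."""
    return (obj_sig & query_sig) == query_sig


def candidates(objects, query_values):
    """Filter by signature first, then verify exactly to discard false drops."""
    query_sig = object_signature(query_values)
    hits = []
    for name, values in objects.items():
        if may_match(object_signature(values), query_sig):  # cheap bitwise filter
            if set(query_values) <= set(values):            # exact verification
                hits.append(name)
    return hits


# Hypothetical objects described by their attribute values.
objects = {"o1": ["red", "car"], "o2": ["blue", "car"], "o3": ["red", "bike"]}
```

The bitwise test never rejects a true match, but it can let through objects that do not actually satisfy the query, which is why the exact check follows it. An index over the signatures aims to limit how many signatures have to be examined at all.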
Issue: April 2009
DOI: DMKE042009004
Title: An Implementation of FP-Growth Algorithm for Software Specification Mining
Authors: R. Jeevarathinam, Dr. Antony Selvadoss Thanamani
Keywords: Mining Specifications, Program Execution Traces, Apriori, FP_growth, Frequent Itemsets, Frequent Pattern
Abstract:
Specification mining is a machine learning approach for discovering formal specifications of the protocols that code must obey when interacting with an application program interface or abstract data type. Two major concerns in engineering software systems are high maintenance costs and system reliability. To reduce maintenance effort, automated tools are needed to help software developers understand their existing code base, and specifications need to be extracted to aid program comprehension. In this paper, a novel technique called FP_TraceMiner is proposed, which efficiently mines software specifications from program execution traces. The FP-growth algorithm is currently one of the fastest approaches to frequent pattern mining. To address the limitations of Apriori-like methods, a mining paradigm has been proposed that uses the FP-growth algorithm, which transforms a database into an FP-tree stored in main memory and then performs mining on that optimized FP-tree structure.
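The first half of FP-growth, building the in-memory FP-tree, can be sketched as follows. This is a generic illustration and not the FP_TraceMiner implementation. It assumes each execution trace is reduced to the set of method names it contains, which discards call order and repetition, and the names `FPNode` and `build_fp_tree` are hypothetical.

```python
from collections import Counter


class FPNode:
    """One node of an FP-tree: an item, its count, and links to parent and children."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP-tree in two passes over the transactions."""
    # Pass 1: count the support of each item and drop infrequent items.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {item: c for item, c in support.items() if c >= min_support}

    root = FPNode(None)
    header = {item: [] for item in frequent}  # item -> nodes carrying that item

    # Pass 2: insert each transaction with its items in descending support order,
    # so that transactions sharing frequent items share a prefix path.
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1
            node = child
    return root, header


# Hypothetical execution traces, each a sequence of method names.
traces = [
    ["open", "read", "close"],
    ["open", "write", "close"],
    ["open", "read", "read", "close"],
    ["open", "close"],
]
root, header = build_fp_tree(traces, min_support=2)
```

The mining step, which recursively builds conditional pattern bases and conditional FP-trees from the header table, is omitted here. The point of the structure is that the database is read only twice, in contrast to the repeated scans of Apriori-like methods.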
Issue: April 2009
DOI: DMKE042009005
Title: Discovery of Semantic Web Services Using Intelligent Predictions for Business Applications
Authors: M.R. Sumalatha, P. Gowrishankar (Member, IEEE), B. Balamurali, R. Jayakandan
Keywords: Ontology, Personalization, Web Services, RSS, Semantic Description, Mapping Ontologies, Event Prediction and Service Filtration
Abstract:
On the internet, web services are frequently used to perform a variety of tasks across several domains. The real problem with web services is finding the service that suits the user's needs and expectations. In traditional methods of deploying web services using WSDL, contextual information is not given much importance. In the centralised web service, the context information of the user is used to list the appropriate asset management services for business solutions. The web services have registered their service descriptions, and these descriptions are represented in OWL format. RSS feeds are used to analyze the current share market scenario, and with the help of the available past RSS information, profitable asset management services are predicted, structured and listed for the user. The user's contextual information helps in analyzing user behaviour, and hence a service is provided based on the user profile.
Issue: April 2009
DOI: DMKE042009006
Title: Quality Depth-First Closed Itemsets (DCI_Closure) Associator
Authors: Mr. Sakthi Ganesh.M, Dr. C. Kalairasan, R. Shalini, Dr. V.D. Mytri
Keywords: Depth-First, DCI Closure Associator Algorithm, Lattice Structure
Abstract:
The objective of this thesis work is to design an efficient data mining algorithm to extract data from a transactional database. Different algorithms are available to mine data from databases. We propose a new data mining algorithm named DCI_CLOSURE, which uses association rules to discover closed frequent itemsets. DCI_CLOSURE is an extension of the DCI_CLOSED algorithm with association rules, efficient lattices and a hash map. The algorithm adopts several optimization techniques to save storage space as well as extraction time in computing itemset closures and their support values. Unlike previous proposals, the proposed algorithm does not scan the whole data set. Since only itemsets containing at least a pair of items are needed, we eliminate single itemsets and calculate the number of remaining itemsets through the formula 2^n - (n+1).
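The itemset count in the abstract can be checked with a short Python sketch. It assumes the formula is meant as 2^n - (n+1), where n is the number of distinct items: of the 2^n subsets, the empty set and the n single-item sets are discarded. The function names are illustrative.

```python
from itertools import combinations


def count_multi_item_sets(n):
    """Closed-form count of itemsets with at least two items drawn from n items."""
    return 2 ** n - (n + 1)


def enumerate_multi_item_sets(items):
    """Explicitly list every itemset that contains at least two items."""
    items = list(items)
    return [
        combo
        for size in range(2, len(items) + 1)
        for combo in combinations(items, size)
    ]


# For three items a, b, c the itemsets are ab, ac, bc and abc.
example = enumerate_multi_item_sets(["a", "b", "c"])
```

For n = 3 the formula gives 2^3 - (3+1) = 4, which matches the four itemsets listed by the enumeration.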
Issue: April 2009
DOI: DMKE042009007
Title: Software Tool for Agent Based Distributed Data Mining
Authors: K. Anandakumar, Dr. M. Punithavalli
Keywords: Data Mining, Frequent Item set, Distributed Data Mining
Abstract:
The main objective of this project is to illustrate the maximum utilization of available resources for data mining activities. Mining information and knowledge from huge data sources such as weather databases, financial data portals or emerging disease information systems has been recognized by industrial companies as an important area, with an opportunity for major revenues from applications such as business data warehousing, process control, and personalized online customer services over the Internet and web. Distributed data mining (DDM) is expected to perform partial analysis of data at the clients and then send the outcome as results to the server, where it sometimes needs to be aggregated into the global result. The primary issues to be considered for DDM are scalability, privacy of data and autonomy of data. These issues can be handled easily by using intelligent software agents for distributed data mining, because of their inherent features of being autonomous and capable of adaptive and deliberative reasoning.
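The pattern of partial analysis at the clients followed by aggregation at the server can be sketched in Python. The example counts single items only and runs in one process; the agent framework, the communication layer and the function names are not described in the abstract and are assumptions made for illustration.

```python
from collections import Counter


def local_item_counts(transactions):
    """Run at a client: count in how many local transactions each item occurs."""
    counts = Counter()
    for t in transactions:
        counts.update(set(t))
    return counts, len(transactions)


def aggregate(partials, min_support_ratio):
    """Run at the server: sum the partial counts and keep globally frequent items."""
    total_counts = Counter()
    total_transactions = 0
    for counts, n in partials:
        total_counts += counts
        total_transactions += n
    threshold = min_support_ratio * total_transactions
    return {item: c for item, c in total_counts.items() if c >= threshold}


# Hypothetical data held at two client sites.
site_a = [["bread", "milk"], ["bread"]]
site_b = [["milk"], ["bread", "eggs"]]

partials = [local_item_counts(site_a), local_item_counts(site_b)]
global_frequent = aggregate(partials, min_support_ratio=0.5)
```

Only the per-site counts travel to the server, so the raw transactions stay where they are held. This is one way in which such a design can address the scalability and data privacy issues named in the abstract.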
Issue: April 2009
DOI: DMKE042009008
Title: Mining Frequent Itemsets using Temporal Association Rule
Authors: M. Krishnamurthy, A. Kannan, R. Baskaran and S. Kanmanirajan
Keywords: Frequent Item set, Calendar Schema, Temporal Association Rule Mining, Temporal Data Mining and Temporal Database
Abstract:
Association rule mining finds association relationships in large data sets. Mining frequent patterns is an important aspect of association rule mining. Most popular association rule mining methods have performance bottlenecks on databases with different data characteristics, such as dense vs. sparse. In this paper, an efficient algorithm named the Temporal FP-Tree (Frequent Pattern Tree) algorithm and the FP-tree structure are presented to mine frequent patterns, conditional pattern bases and sub-conditional pattern trees recursively. The algorithm is used to mine frequent patterns from a temporal database and needs limited memory space. When the dataset becomes dense, the algorithm can be scaled up to large databases by partitioning, and conditional temporal FP-trees can be constructed dynamically as part of mining.
Issue: April 2009
DOI: DMKE042009009
Title: Image Clustering Techniques for the Exploration of Video Sequences
Authors: Rekha B Venkatapur, Dr. V.D. Mytri, Dr. A. Damodaram
Keywords: Information Retrieval, Image Retrieval, Clustering of Video Sequences, Video Segmentation
Abstract:
Digital video libraries are generating tremendous interest in the pattern recognition, computer vision, and multimedia research communities. The amount of information currently available on the internet and in proprietary databases is increasing every day. The present work makes a systematic study of the exploration of video sequences. The system, GAMBAL-EVS, segments video sequences, extracting an image for each shot, then clusters these images and presents them in a visualization system. The system makes it possible to find similarities between images and to traverse the video sequences to find the relevant ones.