
A Comparative Study on Similarity Analysis in Time Series Data Mining

Kaushik Raj Palanichamy, Selvakumar Subbiah

Abstract


A time series is a sequence of values observed over time. Such data
can exhibit several kinds of patterns, including periodic, similarity,
and seasonal patterns. This paper deals with similarity analysis,
which is concerned with efficiently locating subsequences in large
archives of sequences. It also discusses the appropriate use of
Piecewise Constant Approximation (PCA) and the coefficient of
variation method as data reduction techniques. Finally, fuzzy c-means
and k-medoid cluster analysis are applied to the reduced data to
measure the similarity between two sequences, and the results are
compared numerically and graphically.
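
As a rough illustration of the data reduction step named in the abstract, the sketch below replaces a series with the mean of each of a fixed number of frames and computes a coefficient of variation. It is a minimal sketch and not the authors' procedure: the function names, the equal-length framing, the use of the population standard deviation, and the Euclidean distance on the reduced data are all assumptions made for illustration. The clustering step is not shown.

```python
from math import sqrt
from statistics import mean, pstdev


def piecewise_constant(series, n_segments):
    """Reduce a series to the mean of each of n_segments nearly equal frames.

    Hypothetical helper: the equal-length framing is an assumption.
    """
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    # Integer frame boundaries; each frame holds at least one value.
    bounds = [(i * n) // n_segments for i in range(n_segments + 1)]
    return [mean(series[a:b]) for a, b in zip(bounds, bounds[1:])]


def coefficient_of_variation(series):
    """Ratio of the population standard deviation to the mean (assumed form)."""
    m = mean(series)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for zero mean")
    return pstdev(series) / m


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length reduced sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


reduced = piecewise_constant([1, 2, 3, 4, 5, 6, 7, 8], 4)
print(reduced)  # [1.5, 3.5, 5.5, 7.5]
print(coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9]))  # 0.4
```

Reduced sequences of equal length can then be compared with `euclidean_distance` or passed to a clustering routine of the reader's choice.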


Keywords


Similarity Search, Data Reduction, Coefficient of Variation, Distance Measures, Clustering, Fuzzy C-Means, K-Medoids.





Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.