Genes Analysis of Data by Using Hierarchical Quality Threshold Clustering

Shiv Kumar; Vijay K. Chaudhari; Md. Ilyas Khan; Neetesh Gupta; Bupendra Verma

Abstract

This paper, “Genes Analysis of Data by Using Hierarchical Quality Threshold Clustering”, proposes an approach that uses a Dynamically Growing Hierarchical Self-Organizing Map (DGHSOM) with Nano array data to identify co-expressed genes. The DGHSOM overcomes the problem of having to specify the number of clusters and the total number of iterations before processing. We now also use QT (quality threshold) clustering, a method of partitioning data that was invented for gene clustering. It requires more computing power than k-means but does not require specifying the number of clusters. DNA Nano array technology is a challenging area of bioinformatics research because millions of genes must be monitored simultaneously. The expression profile of a gene can be useful in cancer analysis and diagnosis. Gene expression data is voluminous and difficult to analyze. Several clustering algorithms have been proposed to identify co-expressed genes. The self-organizing map (SOM) is a powerful tool for recognizing and classifying features in complex microarray data. However, interpreting the co-expression of genes depends heavily on domain knowledge, and the SOM falls short because the number of clusters must be determined before training.
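As an illustration only, the quality threshold idea mentioned above can be sketched in a few lines of Python. This is the commonly described flat QT procedure, not the hierarchical variant or the DGHSOM proposed in the paper. The function name `qt_cluster`, the parameter `max_diameter`, the use of Euclidean distance, and the toy expression profiles are all assumptions made for the example.

```python
import math


def qt_cluster(profiles, max_diameter):
    """Group profiles so that no cluster diameter exceeds max_diameter.

    Returns a list of clusters, each a sorted list of indices into profiles.
    The number of clusters is not specified in advance.
    """
    remaining = set(range(len(profiles)))
    clusters = []
    while remaining:
        best = []
        # Try every remaining profile as a seed (this is what makes QT
        # more expensive than k-means).
        for seed in sorted(remaining):
            candidate = [seed]
            # reach[j] = largest distance from j to any member of candidate
            reach = {
                j: math.dist(profiles[seed], profiles[j])
                for j in remaining
                if j != seed
            }
            while reach:
                # Add the profile that keeps the cluster diameter smallest.
                j = min(reach, key=lambda k: (reach[k], k))
                if reach[j] > max_diameter:
                    break
                candidate.append(j)
                del reach[j]
                for k in reach:
                    reach[k] = max(reach[k], math.dist(profiles[j], profiles[k]))
            if len(candidate) > len(best):
                best = candidate
        # Keep the largest candidate cluster and repeat on what is left.
        clusters.append(sorted(best))
        remaining -= set(best)
    return clusters


# Hypothetical two-condition expression profiles, for illustration only.
toy_profiles = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)]
print(qt_cluster(toy_profiles, max_diameter=2.0))
# prints [[0, 1, 2], [3, 4], [5]]
```

The only tuning parameter is the diameter threshold, so the number of clusters falls out of the data. The nested loops over seeds and candidates show why the method needs more computing power than k-means.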

Keywords

Gene Expression Profile, Image Processing, Dynamically Growing Self Organizing Map, Nano Array, QT Clustering

This work is licensed under a Creative Commons Attribution 3.0 License.
