An Efficient T-Score Ranking for Microarray Gene Selection
Gene selection is an important problem in microarray data processing. This work proposes an effective method for selecting relevant genes, with the aim of finding the smallest set of genes that ensures highly accurate classification of cancers from microarray data using supervised machine learning algorithms. Spectral biclustering is first used to obtain the best two eigenvectors for class partition, and gene combinations are then chosen based on the similarity between the genes and these eigenvectors. The proposed method is simple yet very effective and involves two steps. In the first step, a set of important genes is chosen using a feature importance ranking scheme. In the second step, the classification capability of all simple combinations of those important genes is tested using a good classifier. The semi-unsupervised and T-Score gene selection methods are demonstrated on two microarray cancer data sets, the lymphoma and leukemia data sets. Experimental results show that the proposed method is able to identify a single gene that leads to predictions with very high accuracy.
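The two-step procedure described in the abstract (rank genes, then test small combinations of the top-ranked genes with a classifier) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not define the T-Score or name the classifier, so the sketch assumes a two-class absolute Welch t-statistic for ranking and swaps in a nearest-centroid classifier with leave-one-out evaluation. The function names, the parameters `top_k` and `max_size`, and the toy data are all hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def t_scores(expr, labels):
    """Absolute Welch t-statistic per gene (assumed two-class form of the T-Score).

    expr: list of samples, each a list of gene expression values.
    labels: list of 0/1 class labels, one per sample.
    """
    scores = []
    for g in range(len(expr[0])):
        a = [row[g] for row, y in zip(expr, labels) if y == 0]
        b = [row[g] for row, y in zip(expr, labels) if y == 1]
        se = sqrt(variance(a) / len(a) + variance(b) / len(b))
        scores.append(abs(mean(a) - mean(b)) / se if se > 0 else 0.0)
    return scores


def rank_genes(expr, labels, top_k):
    """Step 1: indices of the top_k genes, highest score first."""
    scores = t_scores(expr, labels)
    order = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return order[:top_k]


def loo_accuracy(expr, labels, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier on `genes`."""
    correct = 0
    for i in range(len(expr)):
        centroids = {}
        for c in set(labels):
            rows = [expr[j] for j in range(len(expr)) if j != i and labels[j] == c]
            centroids[c] = [mean(r[g] for r in rows) for g in genes]
        x = [expr[i][g] for g in genes]
        pred = min(
            centroids,
            key=lambda c: sum((u - v) ** 2 for u, v in zip(x, centroids[c])),
        )
        correct += pred == labels[i]
    return correct / len(expr)


def smallest_accurate_subset(expr, labels, top_k=3, max_size=2):
    """Step 2: try combinations of the top-ranked genes, smallest sizes first."""
    candidates = rank_genes(expr, labels, top_k)
    best = (0.0, ())
    for size in range(1, max_size + 1):
        for combo in combinations(candidates, size):
            acc = loo_accuracy(expr, labels, combo)
            if acc > best[0]:
                best = (acc, combo)
        if best[0] == 1.0:  # stop once a perfect subset of this size exists
            break
    return best


# Hypothetical toy data: 6 samples x 3 genes; only gene 0 separates the classes.
toy_expr = [
    [1.0, 5.0, 0.2],
    [1.2, 4.0, 0.9],
    [0.9, 6.0, 0.4],
    [3.0, 5.5, 0.8],
    [3.2, 4.5, 0.3],
    [2.9, 5.0, 0.6],
]
toy_labels = [0, 0, 0, 1, 1, 1]

if __name__ == "__main__":
    print(smallest_accurate_subset(toy_expr, toy_labels))
```

For brevity the sketch ranks genes once on all samples; a more careful evaluation would repeat the ranking inside each leave-one-out fold so that the held-out sample never influences gene selection.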
This work is licensed under a Creative Commons Attribution 3.0 License.