Vol: 52(66) No: 4 / December 2007

Visualizing a Genetic Algorithm - Support Vector Machine Approach to Gene Microarrays Supervised Learning

Nicolae Teodor Melita
Computer Science and Engineering Department, "Politehnica" University of Timisoara, Faculty of Automation and Computers, Bd. V. Parvan 2, RO-300223 Timisoara, Romania, e-mail: nt_melita@yahoo.com

Stefan Holban
Computer Science and Engineering Department, "Politehnica" University of Timisoara, Faculty of Automation and Computers, Bd. V. Parvan 2, RO-300223 Timisoara, Romania, phone: +40256404060, e-mail: stefan@cs.utt.ro, web: http://www.cs.utt.ro/~stefan/

Keywords: Support Vector Machines, Genetic Algorithm, Feature Selection, DNA microarrays, Visualization Tools

Abstract
We address the problem of collecting and analyzing the vast amount of information in medicine and biology, in light of the rapid technological progress of recent decades. Currently, the methods of acquiring information exceed our capacity to sort and process that information. However, we use machine learning methods to sort and analyze it. In this comprehensive review we describe an experiment in analyzing DNA microarrays using Support Vector Machines (SVM). We study how the SVM performs in classifying three instances of the same dataset: the raw dataset, a dataset filtered with a t-test, and a dataset with features selected by a Genetic Algorithm (GA). We emphasize the methods for visualizing and presenting the results, since such a study usually provides tools for multidisciplinary research teams.

References
[1] Robert Gentleman, Vince Carey, Wolfgang Huber, Rafael A. Irizarry, Sandrine Dudoit (2005), Bioinformatics and Computational Biology Solutions using R and Bioconductor.
[2] W. N. Venables, D. M. Smith and the R Development Core Team (2006), An Introduction to R.
[3] William N. Venables and Brian D. Ripley (2002), Modern Applied Statistics with S, Fourth Edition, Springer, New York.
[4] William N. Venables and Brian D. Ripley (2000), S Programming, Springer, New York.
[5] Helen Causton, John Quackenbush, Alvis Brazma (2003), Microarray Gene Expression Data Analysis: A Beginner's Guide, Blackwell Publishing Professional.
[6] Dov Stekel (2003), Microarray Bioinformatics, Cambridge University Press.
[7] H. Ressom (2007), Georgetown University, unpublished.
[8] R. O. Duda, P. E. Hart and D. G. Stork (2001), Pattern Classification, Second Edition, Wiley.
[9] D. G. Stork and E. Yom-Tov (2004), Computer Manual in MATLAB to Accompany Pattern Classification, Second Edition, Wiley.
[10] Sam Roberts (2005), Using Genetic Algorithms to Select a Subset of Predictive Variables from a High-Dimensional Microarray Dataset, MATLAB Digest.
[11] I. Witten and E. Frank (2005), Data Mining, Second Edition, Morgan Kaufmann.