Gene Selection using Fuzzy Discretization and Rough Set Theory

Ramasamy Prema1,*, Kandasamy Premalatha2

1Research Scholar, Department of Computer Science and Engineering, Bannari Amman Institute of Technology, Sathyamangalam
2Professor, Department of Computer Science and Engineering, Bannari Amman Institute of Technology, Sathyamangalam
*Corresponding Author E-mail: premabit@gmail.com, kpl_barath@yahoo.co.in
Published online on 20 December, 2018.

Abstract

Contemporary biological technologies such as gene expression microarrays produce extremely high-dimensional datasets with a limited number of samples. The complicated relations among the genes make analysis more difficult, and removing irrelevant genes improves the quality of the results. In this regard, a rough-set-based gene selection algorithm is developed to select genes from microarray data. In this paper, a novel fuzzy discretization technique, Gaussian Fuzzy Discretization (GFD), is proposed for use with the rough-set-based gene selection algorithm. Using three publicly available gene expression datasets, GFD is compared with two standard discretization techniques (Equal-Width and Equal-Frequency). Three classifiers, Support Vector Machine (SVM), k-Nearest-Neighbor (kNN) and Random Forest (RF), are used to evaluate the classification accuracy obtained with the selected genes. The experimental results show that the genes selected using Gaussian fuzzy-discretized datasets give high classification accuracy.

Keywords

Classification, feature selection, fuzzy discretization, high-dimensional data.
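The two baseline techniques named in the abstract, Equal-Width and Equal-Frequency discretization, are standard and can be sketched in a few lines of Python with NumPy. The third function is only an illustrative assumption of what a Gaussian fuzzy discretization might look like: the abstract does not define GFD, so the evenly spaced centers, the choice of sigma, the number of fuzzy sets, and the rule of labeling each value by its highest membership are all hypothetical and should not be read as the authors' method. The function names and the sample expression values are likewise invented for illustration.

```python
import numpy as np


def equal_width(values, n_bins=3):
    """Label each value with one of n_bins intervals of equal width."""
    values = np.asarray(values, dtype=float)
    edges = np.linspace(values.min(), values.max(), n_bins + 1)
    # Use interior edges only, so labels fall in 0 .. n_bins - 1.
    return np.digitize(values, edges[1:-1])


def equal_frequency(values, n_bins=3):
    """Label each value with one of n_bins intervals holding roughly equal counts."""
    values = np.asarray(values, dtype=float)
    edges = np.quantile(values, np.linspace(0.0, 1.0, n_bins + 1))
    return np.digitize(values, edges[1:-1])


def gaussian_fuzzy(values, n_sets=3):
    """Hypothetical Gaussian fuzzy discretization (NOT the paper's GFD).

    Assumptions: centers are evenly spaced over the value range, sigma is
    half the spacing between centers, and each value takes the label of the
    fuzzy set in which its membership is highest.
    """
    values = np.asarray(values, dtype=float)
    centers = np.linspace(values.min(), values.max(), n_sets)
    sigma = (centers[1] - centers[0]) / 2.0  # assumed spread
    # memberships[i, j] is the degree to which values[i] belongs to set j.
    memberships = np.exp(-0.5 * ((values[:, None] - centers[None, :]) / sigma) ** 2)
    return memberships.argmax(axis=1), memberships


# Made-up expression values for a single gene across six samples.
expression = [0.1, 0.2, 0.3, 0.4, 5.0, 10.0]
width_labels = equal_width(expression)
frequency_labels = equal_frequency(expression)
fuzzy_labels, fuzzy_memberships = gaussian_fuzzy(expression)
```

On skewed data such as the sample above, equal-width bins tend to crowd most values into one interval, while equal-frequency bins split them by rank. A fuzzy scheme additionally keeps a graded membership for each value, which is the general motivation for fuzzy discretization.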