Introduction

Selecting informative genes is an important problem in gene expression studies. The small sample size and the large number of genes in gene expression data make the selection process complex. The selected informative genes can also serve as a vital input for gene co-expression network analysis, where the identification of hub genes and module interactions is yet to be fully explored. This paper presents a statistically sound gene selection technique, based on the support vector machine algorithm, for selecting informative genes from high-dimensional gene expression data. It also develops a statistical approach for identifying hub genes in a gene co-expression network, and a differential hub gene analysis approach that groups the identified hub genes according to their gene connectivity in a case vs. control study. Based on the proposed approach, an R package, dhga (https://cran.r-project.org/web/packages/dhga), has been developed. The comparative performance of the proposed gene selection technique and the hub gene identification approach was evaluated on three different crop microarray datasets. The proposed gene selection technique outperformed most of the existing techniques in selecting a robust set of informative genes. The proposed hub gene identification approach identified fewer hub genes than the existing approach, which is in accordance with the scale-free property of real networks. The study reports some key genes, along with their Arabidopsis orthologs, which can be used for aluminum toxic stress response engineering in soybean. Functional analysis of the selected key genes revealed the underlying molecular mechanisms of the aluminum toxic stress response in soybean.
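
The idea of grouping hub genes by their connectivity in a case vs. control study can be illustrated with a small sketch. This is not the dhga package's API or the paper's statistical test: it is a simplified stand-in, written in Python rather than R, that ranks genes by weighted connectivity (the sum of |Pearson r|^beta over all other genes) and takes the top k as hubs. The function names, the soft-thresholding power beta = 6, the cut-off k and the toy expression values are all assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return sxy / math.sqrt(sxx * syy)


def connectivity(expr, beta=6):
    """Weighted connectivity per gene: sum of |r|^beta over all other genes.

    expr maps a gene name to its expression values (one per sample).
    beta is an assumed soft-thresholding power, not a value from the paper.
    """
    genes = list(expr)
    conn = {g: 0.0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            weight = abs(pearson(expr[g], expr[h])) ** beta
            conn[g] += weight
            conn[h] += weight
    return conn


def top_hubs(conn, k):
    """Simplified hub call: the k genes with the highest connectivity."""
    return set(sorted(conn, key=conn.get, reverse=True)[:k])


def group_hubs(case_expr, control_expr, k, beta=6):
    """Split hubs into shared, case-only and control-only groups."""
    case_hubs = top_hubs(connectivity(case_expr, beta), k)
    control_hubs = top_hubs(connectivity(control_expr, beta), k)
    return {
        "shared": case_hubs & control_hubs,
        "case_only": case_hubs - control_hubs,
        "control_only": control_hubs - case_hubs,
    }


# Toy data (hypothetical): five samples per condition, four genes.
case = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 6, 8, 10],
    "geneC": [5, 4, 3, 2, 1],
    "geneD": [1, -1, 1, -1, 1],   # uncorrelated with the others
}
control = {
    "geneA": [1, -1, 1, -1, 1],   # uncorrelated with the others
    "geneB": [1, 2, 3, 4, 5],
    "geneC": [5, 4, 3, 2, 1],
    "geneD": [2, 4, 6, 8, 10],
}

groups = group_hubs(case, control, k=3)
for name in ("shared", "case_only", "control_only"):
    print(name, sorted(groups[name]))
```

In this toy example geneB and geneC are hubs under both conditions, geneA is a hub only in the case network, and geneD only in the control network. A real analysis would replace the top-k rule with a statistical test of connectivity, as the paper describes, and would use the dhga package on actual expression data.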

Publications

  1. Statistical Approaches for Gene Selection, Hub Gene Identification and Module Interaction in Gene Co-Expression Network Analysis: An Application to Aluminum Stress in Soybean (Glycine max L.).
    Das S, Meher PK, Rai A, Bhar LM, Mandal BN, 2017-01-01 - PLoS ONE

Credits

  1. Samarendra Das
    Developer

    Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, India

  2. Prabina Kumar Meher
    Developer

    Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, India

  3. Anil Rai
    Developer

    Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, India

  4. Lal Mohan Bhar
    Developer

    Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, India

  5. Baidya Nath Mandal
    Investigator

    Division of Design of Experiments, ICAR-Indian Agricultural Statistics Research Institute, India

Summary
Accession: BT001975
Tool Type: Application
Category:
Platforms: Linux/Unix
Technologies: R
User Interface: Terminal Command Line
Download Count: 0
Country/Region: India
Submitted By: Baidya Nath Mandal