- Brent R Perry: Department of Biology, Pennsylvania State University, University Park, PA, 16802, USA.
- Raquel Assis: Department of Biology, Pennsylvania State University, University Park, PA, 16802, USA. rassis@psu.edu.
BACKGROUND: Gene duplication is a major source of new genes that is thought to play an important role in phenotypic innovation. Though several mechanisms have been hypothesized to drive the functional evolution and long-term retention of duplicate genes, there are currently no software tools for assessing their genome-wide contributions. Thus, the evolutionary mechanisms by which duplicate genes acquire novel functions remain unclear in a number of taxa.
RESULTS: In a recent study, researchers developed a phylogenetic approach that uses gene expression data from two species to classify the mechanisms underlying the retention of duplicate genes (Proc Natl Acad Sci USA 110:17409-17414, 2013). We have implemented their classification method, as well as a more generalized method, in the R package CDROM, enabling users to apply these methods to their own data and gain insight into the origin of novel biological functions after gene duplication. The CDROM R package, its source code, and its user manual are available for download from CRAN at https://cran.rstudio.com/web/packages/CDROM/ . Additionally, the CDROM R source code, a user manual for running CDROM from the source code, and the sample dataset used in this manuscript can be accessed at www.personal.psu.edu/rua15/software.html .
CONCLUSIONS: CDROM is the first software package that enables genome-wide classification of the mechanisms driving the long-term retention of duplicate genes. It is user-friendly and flexible, providing researchers with a tool for studying the functional evolution of duplicate genes in a variety of taxa.
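ILLUSTRATION: The expression-based classification summarized under RESULTS can be sketched in a few lines. The sketch below is not CDROM's R interface. It is a minimal Python illustration that assumes the decision rules as they are usually described for this approach: expression profiles are converted to relative expression across tissues, Euclidean distances are computed between each duplicate copy (and the combined parent-plus-child profile) and the single-copy ancestral ortholog in the second species, and each distance is compared with a divergence cutoff. The function names, the example profiles, and the cutoff `e_div` are hypothetical. In practice the cutoff would be chosen by the user, for example from distances between single-copy orthologs.

```python
import math


def relative(profile):
    """Scale an expression profile so its values sum to 1 across tissues."""
    total = sum(profile)
    if total == 0:
        raise ValueError("profile has no expression in any tissue")
    return [x / total for x in profile]


def distance(a, b):
    """Euclidean distance between two relative expression profiles."""
    return math.dist(relative(a), relative(b))


def classify(parent, child, ancestral, e_div):
    """Classify one duplicate pair against its ancestral single-copy ortholog.

    parent, child, ancestral: expression values over the same ordered tissues.
    e_div: hypothetical divergence cutoff (an assumption, supplied by the caller).
    """
    # Combined profile: summed expression of both copies in each tissue.
    combined = [p + c for p, c in zip(parent, child)]
    far_parent = distance(parent, ancestral) > e_div
    far_child = distance(child, ancestral) > e_div
    far_combined = distance(combined, ancestral) > e_div

    if not far_parent and not far_child:
        return "conservation"
    if far_parent and not far_child:
        return "neofunctionalization (parent)"
    if far_child and not far_parent:
        return "neofunctionalization (child)"
    # Both copies diverged: the combined profile decides between the two cases.
    return "specialization" if far_combined else "subfunctionalization"


if __name__ == "__main__":
    ancestral = [10, 10, 10, 10]  # made-up profile over four tissues
    print(classify([10, 10, 0, 0], [0, 0, 10, 10], ancestral, e_div=0.1))
```

In this made-up example each copy keeps half of the ancestral expression pattern while their sum matches it, so the pair is labeled as subfunctionalization. Applying the same rules to every duplicate pair in a genome gives the kind of genome-wide tally of retention mechanisms that the abstract describes.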