SR4R: An Integrative SNP Resource for Genomic Breeding and Population Research in Rice.

Jun Yan, Dong Zou, Chen Li, Zhang Zhang, Shuhui Song, Xiangfeng Wang
Author Information
  1. Jun Yan: Department of Crop Genomics and Bioinformatics, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100094, China.
  2. Dong Zou: China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100101, China.
  3. Chen Li: Rice Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China.
  4. Zhang Zhang: China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100101, China.
  5. Shuhui Song: China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100101, China. Electronic address: songshh@big.ac.cn.
  6. Xiangfeng Wang: Department of Crop Genomics and Bioinformatics, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100094, China. Electronic address: xwang@cau.edu.cn.

Abstract

The Information Commons for Rice (IC4R) database is a collection of 18 million single nucleotide polymorphisms (SNPs) identified by resequencing 5152 rice accessions. Although IC4R offers an ultra-high-density rice variation map, these raw SNPs are not readily usable by the public. To support the different research uses of SNPs in population genetics, evolutionary analysis, association studies, and genomic breeding in rice, the raw genotypic data of these 18 million SNPs were processed with unified bioinformatics pipelines. The outcomes were used to develop a daughter database of IC4R, SnpReady for Rice (SR4R). SR4R presents four reference SNP panels: 2,097,405 hapmapSNPs obtained after data filtration and genotype imputation, 156,502 tagSNPs selected by linkage disequilibrium-based redundancy removal, 1180 fixedSNPs selected from genes exhibiting selective sweep signatures, and 38 barcodeSNPs selected by DNA fingerprinting simulation. SR4R thus offers a highly efficient rice variation map that combines reduced SNP redundancy with extensive data describing the genetic diversity of rice populations. In addition, SR4R provides rice researchers with a web interface that enables them to browse all four SNP panels, use online toolkits, and retrieve the original data and scripts for a variety of population genetics analyses on local computers. SR4R is freely available to academic users at http://sr4r.ic4r.org/.
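
To make the idea of linkage disequilibrium-based redundancy removal concrete, the following is a minimal sketch of greedy LD pruning on genotype dosages (0, 1, or 2 copies of the alternate allele). It is not the SR4R pipeline: the function names `r_squared` and `select_tag_snps`, the r² threshold of 0.8, the window of 50 SNPs, and the toy genotype matrix are all assumptions chosen for illustration.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Monomorphic site: correlation is undefined, so treat it as unlinked.
        return 0.0
    return cov * cov / (var_x * var_y)


def select_tag_snps(genotypes, r2_threshold=0.8, window=50):
    """Greedy pruning: keep a SNP unless it is in high LD with a SNP
    already kept within `window` positions (counted in SNP indices).

    `genotypes` is a list of SNPs ordered along a chromosome; each SNP is
    a list of dosages, one per accession. Threshold and window values are
    illustrative assumptions, not values taken from SR4R.
    """
    kept = []
    for i, snp in enumerate(genotypes):
        redundant = any(
            i - j <= window and r_squared(snp, genotypes[j]) >= r2_threshold
            for j in kept
        )
        if not redundant:
            kept.append(i)
    return kept


# Toy data (hypothetical): 4 SNPs scored in 6 accessions.
genotypes = [
    [0, 0, 1, 2, 2, 1],  # SNP 0
    [0, 0, 1, 2, 2, 1],  # SNP 1: identical to SNP 0, so r^2 = 1
    [2, 0, 2, 0, 1, 1],  # SNP 2: weakly correlated with SNP 0
    [2, 2, 1, 0, 0, 1],  # SNP 3: mirror image of SNP 0, so r^2 = 1
]

print(select_tag_snps(genotypes))  # → [0, 2]
```

In this toy case SNPs 1 and 3 carry the same information as SNP 0 and are dropped, which is the sense in which a tag panel reduces redundancy while retaining the variation pattern. Production tools typically work on physical or genetic distance windows and phased haplotypes rather than the index-based window used here.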

MeSH Terms

DNA Barcoding, Taxonomic
Databases, Genetic
Ecotype
Genetic Variation
Genetics, Population
Genome, Plant
Genomics
Haplotypes
Linkage Disequilibrium
Machine Learning
Oryza
Phenotype
Plant Breeding
Polymorphism, Single Nucleotide
Reproducibility of Results
Research
User-Computer Interface