SR4R: An Integrative SNP Resource for Genomic Breeding and Population Research in Rice.

Jun Yan, Dong Zou, Chen Li, Zhang Zhang, Shuhui Song, Xiangfeng Wang
Author Information
  1. Jun Yan: Department of Crop Genomics and Bioinformatics, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100094, China.
  2. Dong Zou: China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100101, China.
  3. Chen Li: Rice Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China.
  4. Zhang Zhang: China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100101, China.
  5. Shuhui Song: China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100101, China. Electronic address: songshh@big.ac.cn.
  6. Xiangfeng Wang: Department of Crop Genomics and Bioinformatics, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100094, China. Electronic address: xwang@cau.edu.cn.

Abstract

The Information Commons for Rice (IC4R) database is a collection of 18 million single nucleotide polymorphisms (SNPs) identified by resequencing 5152 rice accessions. Although IC4R offers an ultra-high-density rice variation map, these raw SNPs are not readily usable by the public. To support the different uses of SNPs in population genetics, evolutionary analysis, association studies, and genomic breeding in rice, the raw genotypic data of these 18 million SNPs were processed by unified bioinformatics pipelines. The outcomes were used to develop a daughter database of IC4R, SnpReady for Rice (SR4R). SR4R presents four reference SNP panels: 2,097,405 hapmapSNPs obtained after data filtration and genotype imputation, 156,502 tagSNPs selected by linkage disequilibrium-based redundancy removal, 1180 fixedSNPs selected from genes exhibiting selective sweep signatures, and 38 barcodeSNPs selected by DNA fingerprinting simulation. SR4R thus offers a highly efficient rice variation map that combines reduced SNP redundancy with extensive data describing the genetic diversity of rice populations. In addition, SR4R provides rice researchers with a web interface that enables them to browse all four SNP panels, use online toolkits, and retrieve the original data and scripts for a variety of population genetics analyses on local computers. SR4R is freely available to academic users at http://sr4r.ic4r.org/.
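
The linkage disequilibrium-based redundancy removal mentioned in the abstract can be illustrated with a minimal sketch of greedy LD pruning. This is not the SR4R pipeline: the abstract does not name the tool, the r² threshold, or the window size used, so the function names, the 0.8 threshold, and the 50-SNP window below are all illustrative assumptions, and the genotypes are toy data.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 allele counts)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a monomorphic site carries no LD information
    return cov * cov / (vx * vy)


def greedy_ld_prune(genotypes, threshold=0.8, window=50):
    """Return indices of SNPs kept after greedy LD pruning.

    genotypes: list of per-SNP genotype lists, ordered by genomic position.
    threshold, window: hypothetical values, not taken from the SR4R paper.
    A SNP is dropped if its r^2 with any of the last `window` kept SNPs
    reaches the threshold.
    """
    kept = []
    for i, g in enumerate(genotypes):
        recent = kept[-window:]
        if all(r_squared(g, genotypes[j]) < threshold for j in recent):
            kept.append(i)
    return kept


# Toy data: 4 SNPs genotyped in 6 accessions.
snps = [
    [0, 0, 1, 2, 2, 1],
    [0, 0, 1, 2, 2, 1],  # identical to SNP 0, so redundant
    [2, 2, 1, 0, 0, 1],  # perfectly anti-correlated with SNP 0, r^2 = 1
    [0, 2, 0, 2, 1, 1],  # weakly correlated with SNP 0
]
tag_indices = greedy_ld_prune(snps)
print(tag_indices)
```

On the toy data only SNPs 0 and 3 survive, since SNPs 1 and 2 are in complete LD with SNP 0. Real analyses would normally run an established tool on VCF or PLINK files rather than plain Python lists.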

MeSH Term

DNA Barcoding, Taxonomic
Databases, Genetic
Ecotype
Genetic Variation
Genetics, Population
Genome, Plant
Genomics
Haplotypes
Linkage Disequilibrium
Machine Learning
Oryza
Phenotype
Plant Breeding
Polymorphism, Single Nucleotide
Reproducibility of Results
Research
User-Computer Interface

Links to CNCB-NGDC Resources

Database Commons: DBC006921 (SR4R)
