- The Genome Variation Map (GVM) is a data repository and retrieval system for genome variations, including single nucleotide polymorphisms (SNPs) and small insertions and deletions, with a particular focus on human, domesticated animals, cultivated plants, and other species. GVM is one of the core database resources in BIG Data Center. Users can retrieve variants by genomic coordinate, variant effect, gene name, or gene function, and the search results can be downloaded directly. Users can also obtain whole-genome variation data in VCF and FASTA file formats via the FTP service. Both online and offline data submission systems are available.
1. What is the GVM (Genome Variation Map)?
- GVM adopts the data submission system deployed consistently across all database resources in BIG Data Center to accept, archive, and manage VCF files. Users should register first, enter the data submission system, and create a BioProject (an overall description of a single research initiative) and a BioSample (a description of biological source material) if needed.
Submissions should consist of VCF file(s) and metadata that describe the sample(s), the experiment(s), and the analysis procedure that led to the variant and/or genotype call(s). There are two ways to submit data to GVM; see Data Submission.
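Before uploading, it may help to run a quick sanity check on the VCF file. The following is a minimal sketch, not GVM's own validation: it only checks the two header features required by the VCF format itself (a leading `##fileformat=VCF...` line and the eight fixed columns in the `#CHROM` line). The function name `check_vcf_header` is hypothetical.

```python
# The eight fixed columns that the VCF format requires, in order.
REQUIRED_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def check_vcf_header(lines):
    """Return (problems, sample_names) for an iterable of VCF text lines.

    Hypothetical helper: it inspects only the header and does not
    reproduce any check performed by the GVM submission system.
    """
    problems = []
    samples = []
    first = None
    column_line = None

    for line in lines:
        line = line.rstrip("\n")
        if first is None:
            first = line
        if line.startswith("##"):
            continue  # meta-information line, keep scanning
        if line.startswith("#CHROM"):
            column_line = line
        break  # stop at the first line that is not meta-information

    if first is None or not first.startswith("##fileformat=VCF"):
        problems.append("first line is not a ##fileformat=VCF declaration")

    if column_line is None:
        problems.append("missing #CHROM column header line")
    else:
        cols = column_line.split("\t")
        if cols[:8] != REQUIRED_COLUMNS:
            problems.append("the eight fixed columns are missing or out of order")
        if len(cols) > 8:
            if cols[8] != "FORMAT":
                problems.append("ninth column should be FORMAT when genotypes are present")
            samples = cols[9:]  # sample names follow the FORMAT column

    return problems, samples
```

For a file on disk, something like `with open("my_variants.vcf") as fh: problems, samples = check_vcf_header(fh)` would do; the returned sample names can then be compared against the BioSample entries you created. The file name here is only a placeholder.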
2. How can I submit data to GVM?
If you submit raw sequence data, you will receive a BioProject/GSA accession number, and we will identify variants with our standard pipelines (see Standards) and integrate them into the GVM databases at regular intervals. The data source will be recorded for each newly identified variant and each newly added individual.
If you submit variant data in VCF or HapMap format, you will receive a GVM accession number, and we will integrate the data into GVM after data release. The data source will be recorded for each newly identified variant and each newly added individual.
- When you have successfully submitted data to GVM, please consider using the following wording to describe the data deposition in your manuscript.
The variation data reported in this paper have been deposited in the Genome Variation Map in BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, under accession number GVMXXXXXX, and are publicly accessible at http://bigd.big.ac.cn/gvm/getProjectDetail?project=GVMXXXXXX
1. Genome Variation Map: a data repository of genome variations in BIG Data Center, Nucleic Acids Res 2017. [PMID=29069473]
2. Database Resources of the BIG Data Center in 2018, Nucleic Acids Res 2017. [PMID=29036542]
3. How do I cite my data submitted to GVM?
- The Genome Analysis Toolkit (GATK) is used for variant identification. See the Standards for details.
4. How are the variations identified?
- Users can search for variants directly by variation ID or genomic coordinates. Users can also filter SNPs by minor allele frequency and consequence type. We also support querying SNPs within genes or gene functional classes by gene identifier, gene functional term, or phenotype term.
Search results can be browsed in Gbrowser and downloaded directly for further analysis.
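The same kind of filtering can be repeated offline on a downloaded VCF file. The sketch below is an illustration only and makes one important assumption: that each record carries an `AF` (alternate allele frequency) entry in its INFO column, which may not hold for every file. Minor allele frequency is taken as `min(AF, 1 - AF)` for biallelic sites, and multi-allelic records are skipped. The function names are hypothetical.

```python
def parse_info(info):
    """Split a VCF INFO string such as 'AF=0.3;DP=10' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def filter_variants(lines, chrom, start, end, min_maf=0.05):
    """Yield (chrom, pos, id, maf) for biallelic records in chrom:start-end.

    Assumes an AF entry in the INFO column; records without a usable
    AF value are skipped rather than guessed at.
    """
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue  # not a complete VCF record
        rec_chrom, pos, var_id = cols[0], int(cols[1]), cols[2]
        alt, info = cols[4], cols[7]
        if rec_chrom != chrom or not (start <= pos <= end):
            continue  # outside the requested region
        if "," in alt:
            continue  # multi-allelic site, MAF is ambiguous here
        af_text = parse_info(info).get("AF")
        if af_text is None:
            continue
        try:
            af = float(af_text)
        except ValueError:
            continue
        maf = min(af, 1.0 - af)
        if maf >= min_maf:
            yield rec_chrom, pos, var_id, maf
```

Because the function is a generator over text lines, it can be fed an open file handle directly, so a whole-genome VCF does not need to be loaded into memory at once.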