Licenses: GVM is free for academic use only. For any commercial use, please contact us for commercial licensing terms.

    1. What is the GVM (Genome Variation Map)?

  • The Genome Variation Map (GVM) is a data repository and retrieval system for genome variations, including single nucleotide polymorphisms (SNPs) and small insertions and deletions (indels), with a particular focus on human, domesticated animals, and cultivated plants, alongside other species. GVM is one of the core database resources of the BIG Data Center. Users can retrieve variants by genomic coordinate, variant effect, gene name, or gene function, and search results can be downloaded directly. Users can also obtain whole-genome variation data in VCF and FASTA formats via FTP. Both online and offline data submission systems are available.

    2. How can I submit data to GVM?

  • GVM adopts the data submission system deployed consistently across all database resources in the BIG Data Center to accept, archive, and manage VCF files. Users should first register, then enter the data submission system and, if needed, create a BioProject (an overall description of a single research initiative) and a BioSample (a description of biological source material).
    Submissions should consist of VCF file(s) and metadata describing the sample(s), experiment(s), and analysis procedure that led to the variant and/or genotype call(s). There are two ways to submit data to GVM; see Data Submission.
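
Because submissions are built around VCF files, it can help to run a quick local sanity check before uploading. The following is a minimal sketch, not GVM's own validation: the function name `check_vcf` is hypothetical, and it only tests two requirements of the VCF format itself (a leading `##fileformat=VCF...` line and the eight fixed header columns). GVM may apply further checks that are not described here.

```python
import io

# The eight fixed columns that every VCF column-header line must start with.
REQUIRED_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def check_vcf(handle):
    """Return a list of problems found in a VCF text stream (empty if none).

    Hypothetical helper for illustration; it inspects the header only.
    """
    problems = []

    # The VCF format requires the file to open with a fileformat declaration.
    first = handle.readline().rstrip("\n")
    if not first.startswith("##fileformat=VCF"):
        problems.append("first line is not a ##fileformat=VCF declaration")

    # Skip remaining '##' meta lines; the next line must be the column header.
    header = None
    for line in handle:
        if line.startswith("##"):
            continue
        if line.startswith("#"):
            header = line.rstrip("\n").split("\t")
        break

    if header is None:
        problems.append("missing #CHROM header line")
    elif header[:8] != REQUIRED_COLUMNS:
        problems.append("header does not start with the eight fixed VCF columns")
    return problems


# Usage with an in-memory example; a real file would be opened with open(path).
example = (
    "##fileformat=VCFv4.2\n"
    "##source=demo\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n"
    "chr1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\n"
)
print(check_vcf(io.StringIO(example)))
```

An empty list means that no header problems were detected; it does not guarantee that the submission system will accept the file.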

    3. How do I cite my data submitted to GVM?

  • If you submit raw sequence data, you receive a BioProject/GSA accession number; we identify variants with our standard pipelines (see Standards) and integrate them into the GVM databases at regular intervals. The data source is recorded for each newly identified variant and each newly added individual.
    If you submit variant data in VCF or HapMap format, you receive a GVM accession number, and we integrate the data into GVM after data release. The data source is likewise recorded for each newly identified variant and each newly added individual.
    Once you have successfully submitted data to GVM, please consider using the following wording to describe the data deposition in your manuscript.
    The variation data reported in this paper have been deposited in the Genome Variation Map [1] in the National Genomics Data Center [2], China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences, under accession number GVMXXXXXX, and are publicly accessible at

    1. Genome Variation Map: a worldwide collection of genome variations across multiple species. Nucleic Acids Res 2021, 49(D1):D1186-D1191. [PMID=33170268].
    2. Database Resources of the National Genomics Data Center, China National Center for Bioinformation in 2021. Nucleic Acids Res 2021, 49(D1):D18-D28. [PMID=33175170].
    3. GWAS Atlas: a curated resource of genome-wide variant-trait associations in plants and animals. Nucleic Acids Res 2020, 48(D1):D927-D932. [PMID=31566222].
    4. Genome Variation Map: a data repository of genome variations in BIG Data Center. Nucleic Acids Res 2018, 46(D1):D944-D949. [PMID=29069473].

    4. How are the variations identified?

  • The Genome Analysis Toolkit (GATK) is used for variant identification. See the Standards for details.

    5. How can I search the variation?

  • Users can search for variants directly by variation ID or genomic coordinates, and can filter SNPs by minor allele frequency and consequence type. SNPs can also be queried within genes or gene functional classes by gene identifier, gene functional term, or phenotype term.
    Search results can be browsed in Gbrowser and downloaded directly for further analysis.
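
The same kind of coordinate and minor-allele-frequency filtering can be repeated offline on a downloaded VCF file. The sketch below is illustrative only and does not describe how GVM computes its values: the function names are hypothetical, sites are assumed to be biallelic, and the genotype (GT) is assumed to be the first subfield of each sample column, as the VCF format requires when GT is present.

```python
def minor_allele_frequency(genotypes):
    """Return the MAF of a biallelic site from GT strings such as '0/1' or '1|1'.

    Missing alleles ('.') are skipped; returns None if no alleles were called.
    """
    ref = alt = 0
    for gt in genotypes:
        for allele in gt.replace("|", "/").split("/"):
            if allele == "0":
                ref += 1
            elif allele == "1":
                alt += 1
    total = ref + alt
    return min(ref, alt) / total if total else None


def filter_by_region_and_maf(lines, chrom, start, end, min_maf):
    """Keep variants on `chrom` within [start, end] whose MAF is >= min_maf.

    Hypothetical helper; `lines` is any iterable of VCF text lines.
    """
    kept = []
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta and header lines
        fields = line.rstrip("\n").split("\t")
        pos = int(fields[1])
        if fields[0] != chrom or not (start <= pos <= end):
            continue
        # Sample columns start at index 9; GT is taken as the first subfield.
        genotypes = [sample.split(":")[0] for sample in fields[9:]]
        maf = minor_allele_frequency(genotypes)
        if maf is not None and maf >= min_maf:
            kept.append((fields[0], pos, fields[2], maf))
    return kept


# Usage with three made-up records (IDs and genotypes are not real GVM data).
records = [
    "chr1\t100\tid1\tA\tG\t.\tPASS\t.\tGT\t0/1\t0/0\t1/1",
    "chr1\t200\tid2\tC\tT\t.\tPASS\t.\tGT\t0/0\t0/0\t0/1",
    "chr2\t150\tid3\tG\tA\t.\tPASS\t.\tGT\t0/1\t0/1\t0/1",
]
print(filter_by_region_and_maf(records, "chr1", 50, 250, 0.2))
```

Here only the first record is kept: the second lies in the region but has a MAF of 1/6, and the third is on a different chromosome.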