GSA Data Model

Designed for compatibility, the Genome Sequence Archive (GSA) follows the data standards and structures of the International Nucleotide Sequence Database Collaboration (INSDC). GSA data are organized around four concepts: BIOPROJECT (corresponding to PROJECT in the BioProject database), BIOSAMPLE (corresponding to SAMPLE in the BioSample database), EXPERIMENT, and RUN.

Figure 1. Data model in GSA
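
The four concepts can be pictured as a nested structure. The following is a minimal Python sketch, not the GSA schema itself: it assumes that a BioProject groups BioSamples, each BioSample has one or more Experiments, and each Experiment has one or more Runs that hold the data files. All class names, field names, and aliases are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Run:
    alias: str
    files: List[str] = field(default_factory=list)  # sequence data files


@dataclass
class Experiment:
    alias: str
    runs: List[Run] = field(default_factory=list)


@dataclass
class BioSample:
    alias: str
    experiments: List[Experiment] = field(default_factory=list)


@dataclass
class BioProject:
    alias: str
    samples: List[BioSample] = field(default_factory=list)


def count_runs(project: BioProject) -> int:
    """Count every Run reachable from a BioProject (assumed nesting)."""
    return sum(
        len(experiment.runs)
        for sample in project.samples
        for experiment in sample.experiments
    )


# Hypothetical example: one project, one sample, one experiment, one run.
run = Run("run_1", ["reads_R1.fastq.gz", "reads_R2.fastq.gz"])
experiment = Experiment("exp_1", [run])
sample = BioSample("sample_1", [experiment])
project = BioProject("project_1", [sample])
```

Nesting the objects this way keeps each Run attached to exactly one Experiment and each Experiment to exactly one BioSample, which is the assumption stated above rather than a rule taken from the GSA documentation.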

Organization of metadata objects

The following are examples of how metadata can be organized. Submitters can organize metadata objects flexibly.

♦   Comparative genome sequencing of three strains (paired-end). Include the paired-end read files in the same Run (Figure 2).

Figure 2. Comparative genome sequencing of three strains (paired-end)
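
As a rough illustration of the three-strain case, the sketch below builds one metadata row per strain, with both paired-end read files listed under the same Run. It assumes one BioSample, one Experiment, and one Run per strain under a single BioProject. The strain names, aliases, file names, and dictionary keys are hypothetical and are not GSA field names.

```python
from typing import Dict, List


def build_rows(project: str, strains: List[str]) -> List[Dict[str, object]]:
    """Return one row per strain; each Run carries both paired-end files."""
    rows = []
    for strain in strains:
        rows.append({
            "bioproject": project,
            "biosample": strain,              # one sample per strain
            "experiment": f"{strain}_exp",
            "run": f"{strain}_run",
            # Forward and reverse reads stay together in the same Run.
            "files": [f"{strain}_R1.fastq.gz", f"{strain}_R2.fastq.gz"],
        })
    return rows


# Hypothetical project and strain names.
rows = build_rows("comparative_genomics", ["strain_A", "strain_B", "strain_C"])
```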

♦   Technical and biological replicates. Biological replicates should be registered as separate samples; technical replicates should be registered as separate experiments (Figure 3).

Figure 3. Technical and biological replicates
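
The replicate rule can be sketched as a small mapping: each biological replicate gets its own sample, and each technical replicate of that sample gets its own experiment. The function name, the input format (pairs of replicate numbers), and the alias patterns below are hypothetical and serve only to illustrate the rule.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def assign_objects(libraries: List[Tuple[int, int]]) -> Dict[str, List[str]]:
    """Map (biological_replicate, technical_replicate) pairs to aliases.

    Returns a dict of sample alias -> list of experiment aliases.
    """
    layout = defaultdict(list)
    for bio, tech in libraries:
        sample = f"sample_bio{bio}"            # new sample per biological replicate
        experiment = f"{sample}_tech{tech}"    # new experiment per technical replicate
        layout[sample].append(experiment)
    return dict(layout)


# Two biological replicates, each sequenced as two technical replicates.
layout = assign_objects([(1, 1), (1, 2), (2, 1), (2, 2)])
```

With this input, the mapping contains two samples, each holding two experiments.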