| URL: | http://esper.lab.nig.ac.jp/genome-composition-database/ |
| Full name: | Genome Composition Database |
| Description: | Genome Composition Database (GCD) shows how accurately each genome can be approximated by a model. The GCD also provides the sequences of over- and underrepresented DNA words. Its distinguishing feature is that it allows users to compare the compositional complexity of genomes and to analyze the over- or underrepresentation of particular oligonucleotides. |
| Year founded: | 2012 |
| Last update: | |
| Version: | |
| Accessibility: | Unaccessible |
| Country/Region: | Japan |
| Data type: | |
| Data object: | |
| Database category: | |
| Major species: | NA |
| Keywords: |
| University/Institution: | National Institute of Genetics |
| Address: | |
| City: | Mishima |
| Province/State: | |
| Country/Region: | Japan |
| Contact name (PI/Team): | Naruya Saitou |
| Contact email (PI/Helpdesk): | saitounr@lab.nig.ac.jp |
|
A new database (GCD) on genome composition for eukaryote and prokaryote genome sequences and their initial analyses. [PMID: 22417913]
Eukaryote genomes contain many noncoding regions and are quite complex. To understand this complexity, we constructed the Genome Composition Database, which holds whole-genome composition statistics for 101 eukaryote genomes as well as more than 1,000 prokaryote genomes. For each genome, the frequencies of all possible oligonucleotides of length 1 to 10 were counted, and these observed values were compared with expected values computed from the observed frequencies of oligonucleotides of length 1-4. Deviations from expected values were much larger for eukaryotes than for prokaryotes, except for fungal genomes. Mammalian genomes showed the largest deviation among animals. The results of the comparison are available online at http://esper.lab.nig.ac.jp/genome-composition-database/. |
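
The observed-versus-expected comparison described in the abstract can be illustrated with a minimal Python sketch. It covers only the simplest case, in which expected counts are derived from length-1 (single-base) frequencies under an independence assumption. This is an illustration and not the database's actual procedure, which is not specified here. The function names `count_words`, `expected_count`, and `observed_over_expected` are hypothetical.

```python
from collections import Counter
from math import prod


def count_words(seq, k):
    """Count overlapping words of length k, skipping any word with a non-ACGT character."""
    seq = seq.upper()
    valid = set("ACGT")
    return Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid
    )


def expected_count(word, base_freq, n_positions):
    """Expected count of `word` if bases were independent (assumed order-0 model)."""
    return n_positions * prod(base_freq[b] for b in word)


def observed_over_expected(seq, k):
    """Return {word: observed / expected} for every word of length k seen in seq."""
    mono = count_words(seq, 1)
    total = sum(mono.values())
    base_freq = {b: mono[b] / total for b in "ACGT"}
    observed = count_words(seq, k)
    n_positions = sum(observed.values())
    # Every observed word consists of bases with nonzero frequency,
    # so the expected count is never zero here.
    return {
        w: observed[w] / expected_count(w, base_freq, n_positions)
        for w in observed
    }


if __name__ == "__main__":
    ratios = observed_over_expected("ACACGTGTACAC", 2)
    for word, ratio in sorted(ratios.items()):
        print(word, round(ratio, 3))
```

Ratios above 1 mark words that are overrepresented relative to this simple model, and ratios below 1 mark underrepresented words. Using frequencies of length 2 to 4 for the expectation, as the abstract mentions, would require a higher-order model that this sketch does not attempt.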