Database Commons

a catalog of worldwide biological databases

Database Profile

diArk

General information

URL: http://www.diark.org
Full name:
Description: diArk has been developed to provide fast and easy access to all sequenced eukaryotic genomes worldwide.
Year founded: 2007
Last update: 2018
Version: 3.0
Accessibility: Accessible
Country/Region: Germany

Classification & Tag

Data type: DNA
Data object:
Database category:
Major species:
Keywords:

Contact information

University/Institution: Max Planck Institute for Biophysical Chemistry
Address: Abteilung NMR basierte Strukturbiologie, Max-Planck-Institut für Biophysikalische Chemie, Am Fassberg 11, D-37077 Göttingen, Germany
City: Göttingen
Province/State:
Country/Region: Germany
Contact name (PI/Team): Kollmar M
Contact email (PI/Helpdesk): mako@nmr.mpibpc.mpg.de

Publications

diArk--the database for eukaryotic genome and transcriptome assemblies in 2014. [PMID: 25378341]
Kollmar M, Kollmar L, Hammesfahr B, Simm D.

Eukaryotic genomes are the basis for understanding the complexity of life from populations to the molecular level. Recent technological innovations have revolutionized the speed of data generation enabling the sequencing of eukaryotic genomes and transcriptomes within days. The database diArk (http://www.diark.org) has been developed with the aim to provide access to all available assembled genomes and transcriptomes. In September 2014, diArk contains about 2600 eukaryotes with 6000 genome and transcriptome assemblies, of which 22% are not available via NCBI/ENA/DDBJ. Several indicators for the quality of the assemblies are provided to facilitate their comparison for selecting the most appropriate dataset for further studies. diArk has a user-friendly web interface with extensive options for filtering and browsing the sequenced eukaryotes. In this new version of the database we have also integrated species, for which transcriptome assemblies are available, and we provide more analyses of assemblies.

Nucleic Acids Res. 2015:43(Database issue) | 5 Citations (from Europe PMC, 2025-12-20)
diArk 2.0 provides detailed analyses of the ever increasing eukaryotic genome sequencing data. [PMID: 21906294]
Hammesfahr B, Odronitz F, Hellkamp M, Kollmar M.

BACKGROUND: Nowadays, the sequencing of even the largest mammalian genomes has become a question of days with current next-generation sequencing methods. It comes as no surprise that dozens of genome assemblies are released per month now. Since the number of next-generation sequencing machines increases worldwide and new major sequencing plans are announced, a further increase in the speed of releasing genome assemblies is expected. Thus it becomes increasingly important to get an overview as well as detailed information about available sequenced genomes. The different sequencing and assembly methods have specific characteristics that need to be known to evaluate the various genome assemblies before performing subsequent analyses.
RESULTS: diArk has been developed to provide fast and easy access to all sequenced eukaryotic genomes worldwide. Currently, diArk 2.0 contains information about more than 880 species and more than 2350 genome assembly files. Many meta-data like sequencing and read-assembly methods, sequencing coverage, GC-content, extended lists of alternatively used scientific names and common species names, and various kinds of statistics are provided. To intuitively approach the data the web interface makes extensive usage of modern web techniques. A number of search modules and result views facilitate finding and judging the data of interest. Subscribing to the RSS feed is the easiest way to stay up-to-date with the latest genome data.
CONCLUSIONS: diArk 2.0 is the most up-to-date database of sequenced eukaryotic genomes compared to databases like GOLD, NCBI Genome, NHGRI, and ISC. It is different in that only those projects are stored for which genome assembly data or considerable amounts of cDNA data are available. Projects in planning stage or in the process of being sequenced are not included. The user can easily search through the provided data and directly access the genome assembly files of the sequenced genome of interest. diArk 2.0 is available at http://www.diark.org.

BMC Res Notes. 2011:4() | 11 Citations (from Europe PMC, 2025-12-20)
diArk--a resource for eukaryotic genome research. [PMID: 17439643]
Odronitz F, Hellkamp M, Kollmar M.

BACKGROUND: The number of completed eukaryotic genome sequences and cDNA projects has increased exponentially in the past few years although most of them have not been published yet. In addition, many microarray analyses yielded thousands of sequenced EST and cDNA clones. For the researcher interested in single gene analyses (from a phylogenetic, a structural biology or other perspective) it is therefore important to have up-to-date knowledge about the various resources providing primary data.
DESCRIPTION: The database is built around 3 central tables: species, sequencing projects and publications. The species table contains commonly and alternatively used scientific names, common names and the complete taxonomic information. For projects the sequence type and links to species project web-sites and species homepages are stored. All publications are linked to projects. The web-interface provides comprehensive search modules with detailed options and three different views of the selected data. We have especially focused on developing an elaborate taxonomic tree search tool that allows the user to instantaneously identify e.g. the closest relative to the organism of interest.
CONCLUSION: We have developed a database, called diArk, to store, organize, and present the most relevant information about completed genome projects and EST/cDNA data from eukaryotes. Currently, diArk provides information about 415 eukaryotes, 823 sequencing projects, and 248 publications.

BMC Genomics. 2007:8() | 16 Citations (from Europe PMC, 2025-12-20)
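
As an illustration of the three-table layout described in the 2007 abstract (species, sequencing projects, and publications, with every publication linked to a project), the following is a minimal Python sketch. All class names, field names, and sample values are hypothetical placeholders and are not taken from diArk's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Publication:
    # Hypothetical fields; the abstract only states that publications link to projects.
    title: str
    pmid: str


@dataclass
class SequencingProject:
    # The abstract mentions a sequence type and links to project web sites.
    sequence_type: str  # assumed labels, e.g. "genome" or "EST/cDNA"
    project_url: str
    publications: List[Publication] = field(default_factory=list)


@dataclass
class Species:
    # The abstract mentions scientific, alternative, and common names plus taxonomy.
    scientific_name: str
    alternative_names: List[str] = field(default_factory=list)
    common_names: List[str] = field(default_factory=list)
    taxonomy: List[str] = field(default_factory=list)  # lineage, root first
    projects: List[SequencingProject] = field(default_factory=list)


def publications_for(species: Species) -> List[Publication]:
    """Collect every publication reachable through a species' projects."""
    return [pub for project in species.projects for pub in project.publications]


if __name__ == "__main__":
    # Placeholder record, not real diArk data.
    example = Species(
        scientific_name="Example species",
        projects=[
            SequencingProject(
                sequence_type="genome",
                project_url="https://example.org/project",
                publications=[Publication(title="Placeholder title", pmid="0")],
            )
        ],
    )
    print([pub.title for pub in publications_for(example)])
```

Nesting projects under species and publications under projects mirrors the links the abstract describes; a relational implementation would more likely use foreign keys between three tables.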

Ranking

All databases: 4340/6895 (37.07%)
Raw bio-data: 334/582 (42.784%)
Total rank: 4340
Citations: 32
z-index: 1.778

Community reviews

Not Rated
Data quality & quantity:
Content organization & presentation:
System accessibility & reliability:

Record metadata

Created on: 2018-01-27
Curated by:
Lina Ma [2018-11-16]
Huma Shireen [2018-09-02]
Meiye Jiang [2018-02-26]