Database Commons

a catalog of worldwide biological databases

Database Profile

BioC-BioGRID corpus

General information

URL: http://bioc.sourceforge.net/BioC-BioGRID.html
Full name:
Description: The BioC-BioGRID corpus contains human annotations on 120 full text biomedical literature articles for genetic and protein interaction data. The annotated corpus contains 6409 mentions of genes and their Entrez Gene IDs, 186 mentions of organism names and their NCBI Taxonomy IDs, 1867 mentions of protein-protein interactions (PPI) and 701 annotations of PPI experimental evidence statements, 856 mentions of genetic interactions (GI) and 399 annotations of GI evidence statements.
Year founded: 2017
Last update:
Version:
Accessibility: Accessible
Country/Region: United States

Classification & Tag

Data type:
Data object:
Database category:
Major species:
Keywords:

Contact information

University/Institution: National Center for Biotechnology Information
Address: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
City:
Province/State: Maryland
Country/Region: United States
Contact name (PI/Team): Rezarta Islamaj Doğan
Contact email (PI/Helpdesk): Rezarta.Islamaj@nih.gov

Publications

The BioC-BioGRID corpus: full text articles annotated for curation of protein-protein and genetic interactions. [PMID: 28077563]
Islamaj Dogan R, Kim S, Chatr-Aryamontri A, Chang CS, Oughtred R, Rust J, Wilbur WJ, Comeau DC, Dolinski K, Tyers M.

A great deal of information on the molecular genetics and biochemistry of model organisms has been reported in the scientific literature. However, this data is typically described in free text form and is not readily amenable to computational analyses. To this end, the BioGRID database systematically curates the biomedical literature for genetic and protein interaction data. This data is provided in a standardized computationally tractable format and includes structured annotation of experimental evidence. BioGRID curation necessarily involves substantial human effort by expert curators who must read each publication to extract the relevant information. Computational text-mining methods offer the potential to augment and accelerate manual curation. To facilitate the development of practical text-mining strategies, a new challenge was organized in BioCreative V for the BioC task, the collaborative Biocurator Assistant Task. This was a non-competitive, cooperative task in which the participants worked together to build BioC-compatible modules into an integrated pipeline to assist BioGRID curators. As an integral part of this task, a test collection of full text articles was developed that contained both biological entity annotations (gene/protein and organism/species) and molecular interaction annotations (protein-protein and genetic interactions (PPIs and GIs)). This collection, which we call the BioC-BioGRID corpus, was annotated by four BioGRID curators over three rounds of annotation and contains 120 full text articles curated in a dataset representing two major model organisms, namely budding yeast and human. The BioC-BioGRID corpus contains annotations for 6409 mentions of genes and their Entrez Gene IDs, 186 mentions of organism names and their NCBI Taxonomy IDs, 1867 mentions of PPIs and 701 annotations of PPI experimental evidence statements, 856 mentions of GIs and 399 annotations of GI evidence statements. 
The purpose, characteristics and possible future uses of the BioC-BioGRID corpus are detailed in this report. Database URL: http://bioc.sourceforge.net/BioC-BioGRID.html.

Database (Oxford). 2017:2017() | 20 Citations (from Europe PMC, 2025-12-20)

Ranking

All databases: 3641/6895 (47.208%)
Interaction: 672/1194 (43.802%)
Literature: 318/577 (45.061%)
Total Rank: 3641
Citations: 20
z-index: 2.5
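
The percentages in the ranking are not the plain ratio rank/total (3641/6895 is about 52.8%, not 47.208%). The three figures are, however, consistent with (total - rank + 1) / total, that is, the share of databases ranked at or below this one. The z-index of 2.5 is likewise consistent with citations divided by years since founding (20 citations over 2017 to 2025). Both formulas are inferences from the numbers on this page, not definitions taken from the catalog, and the function names below are hypothetical. A minimal sketch under those assumptions:

```python
def rank_percent(rank: int, total: int) -> float:
    """Share of entries ranked at or below `rank`, as a percentage.

    ASSUMPTION: formula inferred from the figures on this page,
    not taken from the catalog's documentation.
    """
    return round((total - rank + 1) / total * 100, 3)


def z_index(citations: int, year_founded: int, current_year: int) -> float:
    """Citations per year since founding.

    ASSUMPTION: inferred from 20 citations, founded 2017, counted in 2025.
    """
    return citations / (current_year - year_founded)


if __name__ == "__main__":
    # Rank/total pairs as listed in the ranking.
    for label, rank, total in [
        ("All databases", 3641, 6895),
        ("Interaction", 672, 1194),
        ("Literature", 318, 577),
    ]:
        print(f"{label}: {rank}/{total} ({rank_percent(rank, total)}%)")
    print("z-index:", z_index(20, 2017, 2025))
```

With the values listed on this page, the sketch reproduces 47.208%, 43.802%, 45.061% and a z-index of 2.5.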

Community reviews

Not Rated

Record metadata

Created on: 2018-01-28
Curated by:
Lina Ma [2018-12-17]
[2018-12-04]
Sidra Younas [2018-04-17]
Yang Zhang [2018-01-28]