| URL: | http://phenominer.mml.cam.ac.uk |
| Full name: | Online database of phenotypes and associated disorders |
| Description: | The PhenoMiner Portal provides a way to search the vocabulary of terms we extracted from the scientific literature. The system that does this extraction is based on text/data-mining technology - natural language processing, machine learning and conceptual analysis. It builds on insights gained from semantic parsing to extract structured information about phenotypes from whole sentences - in contrast to existing techniques which often apply string matching. The system exploits the wealth of scientific data locked within the scientific literature in databases such as PubMed Central and Europe PMC to extract the semantic vocabulary of phenotypes that scientists use. PhenoMiner aims to provide scientists, clinicians and informaticians with the data and tools they need to gain new insights into Mendelian diseases. |
| Year founded: | 2015 |
| Last update: | 2014-12-31 |
| Version: | v1.0 |
| Accessibility: | Accessible |
| Country/Region: | United Kingdom |
| Data type: | |
| Data object: | |
| Database category: | |
| Major species: | |
| Keywords: | |
| University/Institution: | University of Cambridge |
| Address: | Cambridge, CB3 9DB, UK |
| City: | Cambridge |
| Province/State: | Cambridgeshire |
| Country/Region: | United Kingdom |
| Contact name (PI/Team): | Dr. Nigel Collier |
| Contact email (PI/Helpdesk): | nhc30@cam.ac.uk |
|
PhenoMiner: from text to a database of phenotypes associated with OMIM diseases. [PMID: 26507285]
Analysis of scientific and clinical phenotypes reported in the experimental literature has been curated manually to build high-quality databases such as the Online Mendelian Inheritance in Man (OMIM). However, the identification and harmonization of phenotype descriptions struggle with the diversity of human expressivity. We introduce a novel automated extraction approach called PhenoMiner that exploits full parsing and conceptual analysis. Apriori association mining is then used to identify relationships to human diseases. We applied PhenoMiner to the BMC open access collection and identified 13,636 phenotype candidates. We identified 28,155 phenotype-disorder hypotheses covering 4898 phenotypes and 1659 Mendelian disorders. Analysis showed: (i) the semantic distribution of the extracted terms against linked ontologies; (ii) a comparison of term overlap with the Human Phenotype Ontology (HP); (iii) moderate support for phenotype-disorder pairs in both OMIM and the literature; (iv) strong associations of phenotype-disorder pairs to known disease-gene pairs using PhenoDigm. The full list of PhenoMiner phenotypes (S1), phenotype-disorder associations (S2), association-filtered linked data (S3) and user database documentation (S5) is available as supplementary data and can be downloaded at http://github.com/nhcollier/PhenoMiner under a Creative Commons Attribution 4.0 license. Database URL: phenominer.mml.cam.ac.uk. © The Author(s) 2015. Published by Oxford University Press. |