Finding Diagnostically Useful Patterns in Quantitative Phenotypic Data.

Stuart Aitken, Helen V Firth, Jeremy McRae, Mihail Halachev, Usha Kini, Michael J Parker, Melissa M Lees, Katherine Lachlan, Ajoy Sarkar, Shelagh Joss, Miranda Splitt, Shane McKee, Andrea H Németh, Richard H Scott, Caroline F Wright, Joseph A Marsh, Matthew E Hurles, David R FitzPatrick, DDD Study
Author Information
  1. Stuart Aitken: MRC Human Genetics Unit, Institute of Genetic and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK.
  2. Helen V Firth: Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, UK; Clinical Genetic Department, Addenbrooke's Hospital Cambridge University Hospitals, Cambridge, UK.
  3. Jeremy McRae: Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, UK.
  4. Mihail Halachev: MRC Human Genetics Unit, Institute of Genetic and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK; South East Scotland Regional Genetics Services, Western General Hospital, Edinburgh, UK.
  5. Usha Kini: Oxford Centre for Genomic Medicine, Oxford University Hospitals NHS Foundation Trust, Oxford, UK.
  6. Michael J Parker: Sheffield Children's Hospital NHS Foundation Trust, Western Bank, Sheffield, UK.
  7. Melissa M Lees: North East Thames Regional Genetics Service, Great Ormond Street Hospital for Children NHS Foundation Trust, London WC1N 3EH, UK.
  8. Katherine Lachlan: Wessex Clinical Genetics Service, University Hospitals of Southampton NHS Trust, Southampton, UK.
  9. Ajoy Sarkar: Nottingham Regional Genetics Service, City Hospital Campus, Nottingham University Hospitals NHS Trust, The Gables, Hucknall Road, Nottingham NG5 1PB, UK.
  10. Shelagh Joss: West of Scotland Regional Genetics Service, Queen Elizabeth University Hospital, Glasgow G51 4TF, UK.
  11. Miranda Splitt: Northern Genetics Service, Newcastle upon Tyne Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK.
  12. Shane McKee: Northern Ireland Regional Genetics Service, Belfast City Hospital, Belfast BT9 7AB, UK.
  13. Andrea H Németh: Oxford Centre for Genomic Medicine, Oxford University Hospitals NHS Foundation Trust, Oxford, UK; Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK.
  14. Richard H Scott: North East Thames Regional Genetics Service, Great Ormond Street Hospital for Children NHS Foundation Trust, London WC1N 3EH, UK.
  15. Caroline F Wright: University of Exeter Medical School, RILD Level 4, Royal Devon & Exeter Hospital, Barrack Road, Exeter, UK.
  16. Joseph A Marsh: MRC Human Genetics Unit, Institute of Genetic and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK.
  17. Matthew E Hurles: Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, UK.
  18. David R FitzPatrick: MRC Human Genetics Unit, Institute of Genetic and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK. Electronic address: david.fitzpatrick@ed.ac.uk.
  19. DDD Study: Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, UK.

Abstract

Trio-based whole-exome sequence (WES) data have established confident genetic diagnoses in ∼40% of previously undiagnosed individuals recruited to the Deciphering Developmental Disorders (DDD) study. Here we aim to use the breadth of phenotypic information recorded in DDD to augment diagnosis and disease variant discovery in probands. Median Euclidean distances (mEuD) were employed as a simple measure of similarity of quantitative phenotypic data within sets of ≥10 individuals with plausibly causative de novo mutations (DNM) in 28 different developmental disorder genes. 13/28 (46.4%) showed significant similarity for growth or developmental milestone metrics, 10/28 (35.7%) showed similarity in HPO term usage, and 12/28 (42.9%) showed no phenotypic similarity. Pairwise comparisons of individuals with high-impact inherited variants to the 32 individuals with causative DNM in ANKRD11 using only growth z-scores highlighted five likely causative inherited variants and two unrecognized DNM, resulting in an 18% diagnostic uplift for this gene. Using an independent approach, naive Bayes classification of growth and developmental data produced reasonably discriminative models for the 24 DNM genes with sufficiently complete data. An unsupervised naive Bayes classification of 6,993 probands with WES data and sufficient phenotypic information defined 23 in silico syndromes (ISSs) and was used to test a "phenotype first" approach to the discovery of causative genotypes using WES variants strictly filtered on allele frequency, mutation consequence, and evidence of constraint in humans. This highlighted heterozygous de novo nonsynonymous variants in SPTBN2 as causative in three DDD probands.
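
The mEuD similarity measure named above can be illustrated with a minimal sketch. It assumes each individual is represented as a tuple of quantitative values such as growth z-scores. The abstract does not state how significance was assessed, so the permutation test against random same-size subsets of a background cohort is an assumption made here for illustration, as are the function names and the made-up example values.

```python
import itertools
import math
import random
import statistics


def median_euclidean_distance(rows):
    """Median of all pairwise Euclidean distances between phenotype vectors."""
    distances = [math.dist(a, b) for a, b in itertools.combinations(rows, 2)]
    return statistics.median(distances)


def permutation_p_value(group, background, n_perm=1000, seed=0):
    """Empirical p-value for a group being more similar than chance.

    Assumed procedure (not taken from the abstract): draw random subsets of
    the background cohort with the same size as the group and count how often
    their mEuD is at least as small as the group's observed mEuD.
    """
    rng = random.Random(seed)
    observed = median_euclidean_distance(group)
    hits = sum(
        median_euclidean_distance(rng.sample(background, len(group))) <= observed
        for _ in range(n_perm)
    )
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical (height, weight, head circumference) z-scores.
    gene_group = [(-2.1, -1.8, -0.4), (-2.4, -1.5, -0.2), (-1.9, -2.0, -0.6)]
    data_rng = random.Random(1)
    cohort = [tuple(data_rng.gauss(0, 1.5) for _ in range(3)) for _ in range(200)]
    print(median_euclidean_distance(gene_group))
    print(permutation_p_value(gene_group, cohort, n_perm=500))
```

A small p-value under this sketch would suggest that individuals sharing a gene are more alike on the chosen metrics than randomly chosen probands; the median keeps the summary robust to a single atypical individual.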

Grants

  1. MC_UU_00007/3/Medical Research Council
  2. MC_PC_16018/Medical Research Council
  3. MR/M02122X/1/Medical Research Council
  4. /Wellcome Trust
  5. MC_UP_1502/3/Medical Research Council

MeSH Terms

Bayes Theorem
Child
Developmental Disabilities
Dwarfism
Exome
Female
Gene Frequency
Genetic Predisposition to Disease
Heterozygote
Humans
Male
Mutation
Phenotype
Repressor Proteins
Spectrin
Exome Sequencing

Chemicals

Repressor Proteins
Spectrin
