Implementation of Cloud based next generation sequencing data analysis in a clinical laboratory.

Getiria Onsongo, Jesse Erdmann, Michael D Spears, John Chilton, Kenneth B Beckman, Adam Hauge, Sophia Yohe, Matthew Schomaker, Matthew Bower, Kevin A T Silverstein, Bharat Thyagarajan
Author Information
  1. Kevin A T Silverstein: Research Informatics Support Systems, Minnesota Supercomputing Institute, University of Minnesota, Room 599 Walter Library 117 Pleasant St SE, Minneapolis, MN 55455, USA. kats@umn.edu.

Abstract

BACKGROUND: The introduction of next generation sequencing (NGS) has revolutionized molecular diagnostics, but several challenges still limit the widespread adoption of NGS testing in clinical practice. One such challenge is the development of a robust bioinformatics pipeline that can handle the volume of data generated by high-throughput sequencing in a cost-effective manner. Analysis of sequencing data typically requires substantial computing power, which is often cost-prohibitive for most clinical diagnostics laboratories.
FINDINGS: To address this challenge, our institution has developed a Galaxy-based data analysis pipeline that relies on a web-based, cloud-computing infrastructure to process NGS data and identify genetic variants. It provides the additional flexibility needed to control storage costs, resulting in a pipeline that is cost-effective on a per-sample basis. It does not require the use of an EBS disk to run a sample.
CONCLUSIONS: We demonstrate the validation and feasibility of implementing this bioinformatics pipeline in a molecular diagnostics laboratory. Four samples were analyzed in duplicate pairs and showed 100% concordance in the mutations identified. This pipeline is currently being used in the clinic, and all identified pathogenic variants have been confirmed using Sanger sequencing, further validating the software.
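
The duplicate-pair concordance check mentioned in the conclusions can be illustrated with a minimal sketch. The abstract does not say how concordance was computed, so this assumes one common definition: the number of variants called in both replicates divided by the number called in either. The function name `concordance`, the `Variant` tuple layout, and the sample calls are hypothetical and are not taken from the pipeline.

```python
from typing import Iterable, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def concordance(run_a: Iterable[Variant], run_b: Iterable[Variant]) -> float:
    """Fraction of variants shared by two replicate runs.

    Assumed definition: |A intersect B| / |A union B|.
    Two empty call sets are treated as fully concordant.
    """
    a, b = set(run_a), set(run_b)
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


# Made-up calls for one sample analyzed in duplicate.
replicate_1 = [("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")]
replicate_2 = [("chr2", 200, "C", "T"), ("chr1", 100, "A", "G")]

print(f"concordance: {concordance(replicate_1, replicate_2):.0%}")
```

Under this definition, a value of 100% means that both replicates produced exactly the same set of variant calls, regardless of the order in which they were reported.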

MeSH Term

Clinical Laboratory Techniques
High-Throughput Nucleotide Sequencing
Humans
Internet
Reproducibility of Results
Sequence Analysis, DNA
Statistics as Topic
