Introduction

Until recently, sequencing was carried out primarily in large genome centers, which have invested heavily in the computational infrastructure that enables genomic sequence analysis. Recent advances in next-generation sequencing (NGS) have spread sequencing technologies and data widely across highly diverse research groups, and clinical sequencing is expected to become part of diagnostic routines shortly. However, limited access to computational infrastructure and high-quality bioinformatic tools, together with the demand for personnel skilled in data analysis and interpretation, remains a serious bottleneck. Cloud computing and Software-as-a-Service (SaaS) technologies can help address these issues.

We successfully enabled the Atlas2 Cloud pipeline for personal genome analysis on two cloud service platforms: a community cloud via the Genboree Workbench, and a commercial cloud via Amazon Web Services using the Software-as-a-Service model. We report a case study of personal genome analysis using our Atlas2 Genboree pipeline. We also outline a detailed cost structure for running Atlas2 Amazon on whole exome capture data, providing cost projections in terms of storage, compute and I/O when running Atlas2 Amazon on a large data set.

We find that providing a web interface and an optimized pipeline clearly facilitates the use of cloud computing for personal genome analysis, but for it to be routinely used for large-scale projects there needs to be a paradigm shift in the way we develop tools, in standard operating procedures, and in funding mechanisms.

Publications

  1. Atlas2 Cloud: a framework for personal genome analysis in the cloud.
    Evani US, Challis D, Yu J, Jackson AR, Paithankar S, Bainbridge MN, Jakkamsetti A, Pham P, Coarfa C, Milosavljevic A, Yu F, 2012-01-01 - BMC Genomics
  2. An integrative variant analysis suite for whole exome next-generation sequencing data.
    Challis D, Yu J, Evani US, Jackson AR, Paithankar S, Coarfa C, Milosavljevic A, Gibbs RA, Yu F, 2012-01-01 - BMC Bioinformatics

Credits

  1. Uday S Evani
    Developer

    The Human Genome Sequencing Center, Baylor College of Medicine, United States of America

  2. Danny Challis
    Developer

  3. Jin Yu
    Developer

  4. Andrew R Jackson
    Developer

  5. Sameer Paithankar
    Developer

  6. Matthew N Bainbridge
    Developer

  7. Adinarayana Jakkamsetti
    Developer

  8. Peter Pham
    Developer

  9. Cristian Coarfa
    Developer

  10. Aleksandar Milosavljevic
    Developer

  11. Fuli Yu
    Investigator

Summary

Accession: BT000452
Tool Type: Application
Category: (none listed)
Platforms: Linux/Unix
Technologies: (none listed)
User Interface: Terminal Command Line
Download Count: 0
Submitted By: Fuli Yu