CloudPhylo: Phylogeny reconstruction on large-scale datasets
Introduction
Phylogeny reconstruction is a routine analysis in most evolutionary studies: it determines and depicts the evolutionary relationships among genes or species. However, most existing tools for phylogeny reconstruction rely on a single-process model or on traditional parallel paradigms such as Pthreads and OpenMP, and therefore do not scale well with the rapidly increasing size of input datasets. To tackle this challenge, BIGD (Big Data Center) presents CloudPhylo, a Spark-based tool for fast and scalable phylogeny reconstruction on large datasets. Spark is a recently proposed cloud computing framework that incorporates the MapReduce paradigm and efficiently caches intermediate calculation results, which significantly boosts the performance of CloudPhylo and enables its use for large-scale phylogenetic tree inference.
CloudPhylo is not only the world’s first phylogeny reconstruction tool available for large-scale datasets, but also the first Spark-based bioinformatics software in China. According to the comparison results, CloudPhylo achieves high efficiency and good scalability, and is well suited for large-scale phylogenetic tree inference.
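To make the MapReduce-plus-caching idea mentioned above more concrete, here is a minimal sketch in plain Java (no Spark dependency). It is not CloudPhylo's actual implementation: the class `MapReduceSketch`, its method names, the toy sequences, and the choice of a simple p-distance (fraction of differing sites) are all assumptions made for illustration. The "map" step computes a distance for every pair of taxa in parallel, the resulting map stands in for a cached intermediate result, and two separate "reduce" steps reuse it without recomputing anything.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical illustration only; not part of CloudPhylo.
public class MapReduceSketch {

    // Proportion of differing sites between two aligned, equal-length sequences.
    public static double pDistance(String a, String b) {
        if (a.length() != b.length() || a.isEmpty()) {
            throw new IllegalArgumentException("sequences must be aligned and non-empty");
        }
        long mismatches = IntStream.range(0, a.length())
                .filter(i -> a.charAt(i) != b.charAt(i))
                .count();
        return (double) mismatches / a.length();
    }

    // "Map" step: each unordered pair of taxa is mapped to a distance, in parallel.
    // The returned map plays the role of a cached intermediate result.
    public static Map<String, Double> pairwiseDistances(Map<String, String> seqs) {
        List<String> names = new ArrayList<>(new TreeMap<>(seqs).keySet());
        return IntStream.range(0, names.size()).boxed()
                .flatMap(i -> IntStream.range(i + 1, names.size())
                        .mapToObj(j -> new String[] {names.get(i), names.get(j)}))
                .parallel()
                .collect(Collectors.toMap(
                        p -> p[0] + "|" + p[1],
                        p -> pDistance(seqs.get(p[0]), seqs.get(p[1])),
                        (x, y) -> x,
                        TreeMap::new));
    }

    // "Reduce" step: fold the cached distances into a single summary value.
    public static double meanDistance(Map<String, Double> distances) {
        return distances.values().stream()
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(0.0);
    }

    // A second reduction that reuses the same cached map.
    public static String closestPair(Map<String, Double> distances) {
        return distances.entrySet().stream()
                .min(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElseThrow(() -> new IllegalStateException("no pairs"));
    }

    public static void main(String[] args) {
        // Toy aligned sequences, invented for this example.
        Map<String, String> seqs = Map.of(
                "taxonA", "ACGTACGT",
                "taxonB", "ACGTACGA",
                "taxonC", "TCGTTCGA");
        Map<String, Double> cached = pairwiseDistances(seqs);
        System.out.println("distances: " + cached);
        System.out.println("closest pair: " + closestPair(cached));
        System.out.println("mean distance: " + meanDistance(cached));
    }
}
```

In a Spark setting the same shape would typically appear as a distributed collection of pairs that is transformed, cached, and then aggregated more than once. The sketch keeps everything on a single machine so that it runs with a standard JDK (Java 9 or later, because of `Map.of`).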
Publications
Credits
- Xingjian Xu (dotswing@gmail.com), Developer
  Beijing Institute of Genomics (BIG), BIGD, China
Field | Value |
---|---|
Accession | BT000006 |
Tool Type | Application |
Category | |
Platforms | Linux/Unix |
Technologies | Java, Spark |
User Interface | Terminal Command Line |
Latest Release | 1.0 (June 26, 2017) |
Download Count | 1903 |
Submitted By | Xingjian Xu |