Introduction

BACKGROUND: Short-read aligners have recently gained substantial speed by exploiting the massive parallelism of GPUs. An emerging alternative to the GPU is the Intel MIC; Tianhe-2, currently at the top of the TOP500 list, is built with 48,000 MIC boards to offer ~55 PFLOPS. The CPU-like architecture of MIC allows CPU-based software to be parallelized easily; however, the performance is often inferior to that of GPU counterparts, as a MIC card contains only ~60 cores (whereas a GPU card typically has over a thousand cores).

RESULTS: To better utilize MIC-enabled computers for NGS data analysis, we developed a new short-read aligner, MICA, that is optimized in view of MIC's limitations and the extra parallelism inside each MIC core. MICA utilizes the 512-bit vector units in the MIC and implements a new seeding strategy. Experiments on aligning 150 bp paired-end reads show that MICA using one MIC card is 4.9 times faster than BWA-MEM (using 6 cores of a top-end CPU) and slightly faster than SOAP3-dp (using a GPU). Furthermore, MICA's simplicity allows very efficient scale-up when multiple MIC cards are used in a node (3 cards give a 14.1-fold speedup over BWA-MEM).

SUMMARY: MICA can be readily used by MIC-enabled supercomputers for production purposes. We have tested MICA on Tianhe-2 with 90 WGS samples (17.47 tera-bases), which can be aligned in an hour using 400 nodes. MICA has impressive performance even though MIC is only in its initial stage of development.

AVAILABILITY AND IMPLEMENTATION: MICA's source code is freely available at http://sourceforge.net/projects/mica-aligner under GPL v3.

SUPPLEMENTARY INFORMATION: Supplementary information is available as "Additional File 1". Datasets are available at www.bio8.cs.hku.hk/dataset/mica.
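
The figures quoted in the abstract imply a multi-card scaling efficiency and an average per-node throughput that are not stated explicitly. The following is a minimal sketch of that arithmetic, not part of MICA itself. The function and constant names are hypothetical, and "an hour" is assumed to mean exactly 3,600 seconds of wall-clock time.

```python
# Figures quoted in the abstract.
SPEEDUP_ONE_CARD = 4.9      # MICA on 1 MIC card vs. BWA-MEM on 6 CPU cores
SPEEDUP_THREE_CARDS = 14.1  # MICA on 3 MIC cards vs. the same baseline
TOTAL_BASES = 17.47e12      # 90 WGS samples, 17.47 tera-bases
NODES = 400                 # Tianhe-2 nodes used in the test
WALL_SECONDS = 3600         # assumption: "an hour" taken as exactly 1 h


def scaling_efficiency(speedup_one, speedup_n, n_cards):
    """Fraction of ideal linear scaling achieved with n_cards cards."""
    return speedup_n / (n_cards * speedup_one)


def per_node_throughput(total_bases, nodes, seconds):
    """Average number of bases aligned per second on each node."""
    return total_bases / (nodes * seconds)


efficiency = scaling_efficiency(SPEEDUP_ONE_CARD, SPEEDUP_THREE_CARDS, 3)
throughput = per_node_throughput(TOTAL_BASES, NODES, WALL_SECONDS)

print(f"3-card scaling efficiency: {efficiency:.1%}")
print(f"Per-node throughput: {throughput / 1e6:.1f} Mbases/s")
```

Under these assumptions, three cards reach roughly 96% of ideal linear scaling, and each node averages about 12 million bases per second. Both numbers are derived only from the rounded figures in the abstract, so they are indicative rather than exact.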

Publications

  1. MICA: A fast short-read aligner that takes full advantage of Many Integrated Core Architecture (MIC).
    Luo R, Cheung J, Wu E, Wang H, Chan SH, Law WC, He G, Yu C, Liu CM, Zhou D, Li Y, Li R, Wang J, Zhu X, Peng S, Lam TW. BMC Bioinformatics, 2015-01-01.

Credits

  1. Ruibang Luo
    Developer

  2. Jeanno Cheung
    Developer

  3. Edward Wu
    Developer

  4. Heng Wang
    Developer

  5. Sze-Hang Chan
    Developer

  6. Wai-Chun Law
    Developer

  7. Guangzhu He
    Developer

  8. Chang Yu
    Developer

  9. Chi-Man Liu
    Developer

  10. Dazong Zhou
    Developer

  11. Yingrui Li
    Developer

  12. Ruiqiang Li
    Developer

  13. Jun Wang
    Developer

  14. Xiaoqian Zhu
    Developer

  15. Shaoliang Peng
    Developer

  16. Tak-Wah Lam
    Investigator

Summary

  Accession: BT003711
  Tool Type: Application
  Category:
  Platforms: Linux/Unix
  Technologies: C++
  User Interface: Terminal Command Line
  Download Count: 0
  Submitted By: Tak-Wah Lam