Detecting Somatic Insertions/Deletions (Indels) Using Tumor RNA-Seq Data.

Kohei Hagiwara, Jinghui Zhang
Author Information
  1. Kohei Hagiwara: Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA. kohei.hagiwara@stjude.org.
  2. Jinghui Zhang: Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA.

Abstract

Identifying somatic indels remains a major challenge in cancer genomic analysis. It is rarely attempted with tumor-only RNA-Seq, both because matched normal data are lacking and because read alignment is complex: reads must be mapped across splice junctions as well as indels. In this chapter, we introduce RNAIndel, a software tool designed to identify somatic coding indels from tumor-only RNA-Seq. RNAIndel performs indel realignment and uses a machine learning model to estimate the probability that a coding indel is somatic, germline, or an artifact. Its high accuracy has been validated on RNA-Seq data generated from multiple tumor types.

MeSH Terms

Humans
Software
Neoplasms
RNA-Seq
INDEL Mutation
Computational Biology
Machine Learning
Genomics
Sequence Analysis, RNA
