Ultra-accurate microbial amplicon sequencing with synthetic long reads.

Benjamin J Callahan, Dmitry Grinevich, Siddhartha Thakur, Michael A Balamotis, Tuval Ben Yehezkel
Author Information
  1. Benjamin J Callahan: Department of Population Health and Pathobiology, College of Veterinary Medicine, North Carolina State University, Raleigh, NC, USA. benjamin.j.callahan@gmail.com.
  2. Dmitry Grinevich: Department of Population Health and Pathobiology, College of Veterinary Medicine, North Carolina State University, Raleigh, NC, USA.
  3. Siddhartha Thakur: Department of Population Health and Pathobiology, College of Veterinary Medicine, North Carolina State University, Raleigh, NC, USA.
  4. Michael A Balamotis: Loop Genomics, San Jose, CA, USA.
  5. Tuval Ben Yehezkel: Loop Genomics, San Jose, CA, USA.

Abstract

BACKGROUND: Of the many known pathogenic bacterial species, only a fraction can be readily identified directly from a complex microbial community using standard next-generation DNA sequencing. Long-read sequencing offers the potential to identify a wider range of species and to differentiate between strains within a species, but attaining sufficient accuracy in complex metagenomes remains a challenge.
METHODS: Here, we describe and analytically validate LoopSeq, a commercially available synthetic long-read (SLR) sequencing technology that generates highly accurate long reads from standard short reads.
RESULTS: LoopSeq reads are sufficiently long and accurate to identify microbial genes and species directly from complex samples. LoopSeq perfectly recovered the full diversity of 16S rRNA genes from known strains in a synthetic microbial community. Full-length LoopSeq reads had a per-base error rate of 0.005%, an accuracy that exceeds that reported for other long-read sequencing technologies. 18S-ITS and genomic sequencing of fungal and bacterial isolates confirmed that LoopSeq maintains that accuracy for reads up to 6 kb in length. LoopSeq full-length 16S rRNA reads could accurately classify organisms down to the species level in rinsate from retail meat samples, and could differentiate strains within species identified by the CDC as potential foodborne pathogens.
CONCLUSIONS: The order-of-magnitude improvement in length and accuracy over standard Illumina amplicon sequencing achieved with LoopSeq enables accurate species- and strain-level identification in microbiome samples ranging from complex to low-biomass. The ability to generate accurate, long microbiome sequencing reads using standard short-read sequencers will accelerate the building of quality microbial sequence databases and remove a significant hurdle on the path to precision microbial genomics.
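
To put the reported per-base error rate of 0.005% in concrete terms, the following minimal sketch estimates the expected number of errors per read and the fraction of reads expected to be entirely error-free. It is an illustration, not part of the study's analysis. It assumes errors are independent and uniformly distributed along the read, and it assumes a full-length 16S rRNA amplicon of roughly 1,500 bp; neither assumption comes from the abstract. The 6 kb length is the upper read length mentioned in the RESULTS, and the function names are hypothetical.

```python
# Illustrative arithmetic only; not the authors' method.
# Assumption: sequencing errors are independent and uniform along the read.

PER_BASE_ERROR_RATE = 0.005 / 100  # 0.005% per base, as reported in the abstract


def expected_errors(read_length_bp: int, per_base_error_rate: float) -> float:
    """Expected number of erroneous bases in a read of the given length."""
    return read_length_bp * per_base_error_rate


def prob_error_free(read_length_bp: int, per_base_error_rate: float) -> float:
    """Probability that every base in the read is correct."""
    return (1.0 - per_base_error_rate) ** read_length_bp


if __name__ == "__main__":
    read_lengths = [
        ("full-length 16S rRNA (assumed ~1,500 bp)", 1500),  # length is an assumption
        ("longest reads reported (6 kb)", 6000),
    ]
    for label, length in read_lengths:
        errors = expected_errors(length, PER_BASE_ERROR_RATE)
        clean = prob_error_free(length, PER_BASE_ERROR_RATE)
        print(f"{label}: {errors:.3f} expected errors, "
              f"{clean:.1%} of reads expected error-free")
```

Under these assumptions, most full-length 16S reads would contain no errors at all, which is consistent with the abstract's claim that such reads can support strain-level distinctions.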

Grants

  1. R35 GM133745/NIGMS NIH HHS

MeSH Term

High-Throughput Nucleotide Sequencing
Metagenome
Microbiota
RNA, Ribosomal, 16S
Sequence Analysis, DNA

Chemicals

RNA, Ribosomal, 16S
