PipeCoV: a pipeline for SARS-CoV-2 genome assembly, annotation and variant identification.

Renato R M Oliveira, Tatianne Costa Negri, Gisele Nunes, Inácio Medeiros, Guilherme Araújo, Fabricio de Oliveira Silva, Jorge Estefano Santana de Souza, Ronnie Alves, Guilherme Oliveira
Author Information
  1. Renato R M Oliveira: Environmental Genomics, Instituto Tecnológico Vale, Belém, Pará, Brazil.
  2. Tatianne Costa Negri: Environmental Genomics, Instituto Tecnológico Vale, Belém, Pará, Brazil.
  3. Gisele Nunes: Environmental Genomics, Instituto Tecnológico Vale, Belém, Pará, Brazil.
  4. Inácio Medeiros: Programa de Pós-Graduação em Bioinformática, Universidade Federal do Rio Grande do Norte, Natal, Rio Grande do Norte, Brazil.
  5. Guilherme Araújo: Programa de Pós-Graduação em Bioinformática, Universidade Federal do Rio Grande do Norte, Natal, Rio Grande do Norte, Brazil.
  6. Fabricio de Oliveira Silva: Environmental Genomics, Instituto Tecnológico Vale, Belém, Pará, Brazil.
  7. Jorge Estefano Santana de Souza: Programa de Pós-Graduação em Bioinformática, Universidade Federal do Rio Grande do Norte, Natal, Rio Grande do Norte, Brazil.
  8. Ronnie Alves: Environmental Genomics, Instituto Tecnológico Vale, Belém, Pará, Brazil.
  9. Guilherme Oliveira: Environmental Genomics, Instituto Tecnológico Vale, Belém, Pará, Brazil.

Abstract

Motivation: Since the identification of the novel coronavirus (SARS-CoV-2), the scientific community has made a huge effort to understand the biology of the virus and to develop vaccines. Next-generation sequencing strategies have been successful in elucidating the evolution of infectious diseases and in facilitating the development of molecular diagnostics and treatments. Thousands of genomes are being generated weekly to understand the genetic characteristics of this virus, and efficient pipelines are needed to analyze the vast amount of data generated. Here we present a new pipeline designed for genomic analysis and variant identification of the SARS-CoV-2 virus.
Results: PipeCoV performs better than well-established SARS-CoV-2 pipelines, producing assemblies with a lower content of Ns and higher genome coverage relative to the Wuhan reference. It also provides a variant report not offered by the other tested pipelines.
Availability: https://github.com/alvesrco/pipecov.
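
Illustration: the two assembly-quality measures named in the Results (content of Ns and genome coverage relative to the Wuhan reference) can be sketched as below. This is a minimal sketch and not PipeCoV's own code: the function names and the exact metric definitions are assumptions, with coverage taken here to mean called (non-N) bases divided by the reference length. The only external fact used is the 29,903-nucleotide length of the Wuhan-Hu-1 reference (NC_045512.2).

```python
# Illustrative only: these metric definitions are assumptions, not PipeCoV's code.
WUHAN_HU_1_LENGTH = 29903  # length of the NC_045512.2 reference, in nucleotides


def n_content(consensus: str) -> float:
    """Fraction of positions in the consensus that are ambiguous 'N' calls."""
    seq = consensus.upper()
    if not seq:
        raise ValueError("empty sequence")
    return seq.count("N") / len(seq)


def genome_coverage(consensus: str, reference_length: int = WUHAN_HU_1_LENGTH) -> float:
    """Called (non-N) bases as a fraction of the reference length, capped at 1.0."""
    seq = consensus.upper()
    called = len(seq) - seq.count("N")
    return min(called / reference_length, 1.0)
```

Under these assumed definitions, a better assembly yields a smaller n_content value and a genome_coverage value closer to 1.0.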

Grants

  1. BB/P027849/1/Biotechnology and Biological Sciences Research Council

MeSH Term

Humans
SARS-CoV-2
COVID-19
Genome, Viral
Genomics
Viruses
