VirusLab: A Tool for Customized SARS-CoV-2 Data Analysis.

Pietro Pinoli, Anna Bernasconi, Anna Sandionigi, Stefano Ceri
Author Information
  1. Pietro Pinoli: Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133 Milano, Italy.
  2. Anna Bernasconi: Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133 Milano, Italy.
  3. Anna Sandionigi: Quantia Consulting S.r.l., Mariano Comense, 22066 Como, Italy.
  4. Stefano Ceri: Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133 Milano, Italy.

Abstract

Since the beginning of 2020, the COVID-19 pandemic has posed unprecedented challenges to viral data analysis and the associated host disease diagnostic methods. We propose VirusLab, a flexible system for analysing SARS-CoV-2 viral sequences and relating them to metadata or clinical information about the host. VirusLab builds on two existing resources: ViruSurf, a database of public SARS-CoV-2 sequences supporting metadata-driven search, and VirusViz, a tool for visual analysis of search results. VirusLab is designed to take advantage of these resources within a server-side architecture that: (i) covers pipelines based on approaches already in use (ARTIC, Galaxy) but entirely customizable upon user request; (ii) prepackages the analysis of raw sequencing data from different platforms (Oxford Nanopore and Illumina); (iii) gives access to datasets from public archives; (iv) supplies user-friendly reporting, making it a tool that can also be integrated into a business environment. VirusLab can be installed and hosted on the premises of any organization, where information about SARS-CoV-2 sequences can be safely integrated with information about hosts (e.g., clinical metadata). No comparable system is currently available among similar providers. Our results show that VirusLab is a powerful tool for generating tabular/graphical and machine-readable reports that can be integrated into more complex pipelines. We foresee that the proposed system can support many research-oriented and therapeutic scenarios within hospitals, as well as the tracing of viral sequences and their mutational processes within organizations for viral surveillance.

Grants

  1. 20663/European Institute of Innovation and Technology
