Towards the characterization of the hidden world of small proteins in Staphylococcus aureus, a proteogenomics approach.

Stephan Fuchs, Martin Kucklick, Erik Lehmann, Alexander Beckmann, Maya Wilkens, Baban Kolte, Ayten Mustafayeva, Tobias Ludwig, Maurice Diwo, Josef Wissing, Lothar Jänsch, Christian H Ahrens, Zoya Ignatova, Susanne Engelmann
Author Information
  1. Stephan Fuchs: Robert Koch Institute, Methodenentwicklung und Forschungsinfrastruktur (MF), Berlin, Germany.
  2. Martin Kucklick: University of Technical Sciences Braunschweig, Institute for Microbiology, Braunschweig, Germany.
  3. Erik Lehmann: University of Technical Sciences Braunschweig, Institute for Microbiology, Braunschweig, Germany.
  4. Alexander Beckmann: University of Technical Sciences Braunschweig, Institute for Microbiology, Braunschweig, Germany.
  5. Maya Wilkens: Robert Koch Institute, Methodenentwicklung und Forschungsinfrastruktur (MF), Berlin, Germany.
  6. Baban Kolte: University of Hamburg, Institute of Biochemistry and Molecular Biology, Hamburg, Germany.
  7. Ayten Mustafayeva: University of Technical Sciences Braunschweig, Institute for Microbiology, Braunschweig, Germany.
  8. Tobias Ludwig: University of Technical Sciences Braunschweig, Institute for Microbiology, Braunschweig, Germany.
  9. Maurice Diwo: University of Technical Sciences Braunschweig, Institute for Microbiology, Braunschweig, Germany.
  10. Josef Wissing: Helmholtz Center for Infection Research GmbH, Cellular Proteomics, Braunschweig, Germany.
  11. Lothar Jänsch: Helmholtz Center for Infection Research GmbH, Cellular Proteomics, Braunschweig, Germany.
  12. Christian H Ahrens: Agroscope, Research Group Molecular Diagnostics, Genomics and Bioinformatics & SIB Swiss Institute of Bioinformatics, Basel, Switzerland.
  13. Zoya Ignatova: University of Hamburg, Institute of Biochemistry and Molecular Biology, Hamburg, Germany.
  14. Susanne Engelmann: University of Technical Sciences Braunschweig, Institute for Microbiology, Braunschweig, Germany.

Abstract

Small proteins play essential roles in bacterial physiology and virulence; however, automated algorithms for genome annotation are often not yet able to accurately predict the corresponding genes. The accuracy and reliability of genome annotations, particularly for small open reading frames (sORFs), can be significantly improved by integrating protein evidence from experimental approaches. Here we present a highly optimized and flexible bioinformatics workflow for bacterial proteogenomics covering all steps from (i) generation of protein databases, (ii) database searches and (iii) peptide-to-genome mapping to (iv) visualization of results. We used the workflow to identify high-quality peptide spectrum matches (PSMs) for small proteins (≤ 100 aa, SP100) in Staphylococcus aureus Newman. Protein extracts from S. aureus were subjected to different experimental workflows for protein digestion and prefractionation and measured with highly sensitive mass spectrometers. In total, 175 SP100 were identified. Of these, 24 (ranging from 9 to 99 aa) were novel and not contained in the genome annotation used. A total of 144 SP100 are highly conserved and were found in at least 50% of the publicly available S. aureus genomes, while 127 are additionally conserved in other staphylococci. Almost half of the identified SP100 were basic, suggesting a role in binding to more acidic molecules such as nucleic acids or phospholipids.
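
Step (i) of the workflow, generating a protein database that also contains unannotated sORFs, is commonly approached by translating the genome in all six reading frames. The following is a minimal sketch of that general idea and not the authors' implementation: the function names, the choice of ATG/GTG/TTG as start codons, and the 9 to 100 aa length window (taken from the sizes reported in the abstract) are assumptions made for illustration.

```python
from itertools import product

# Standard codon-to-amino-acid mapping in NCBI order (bases T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: the three most common bacterial start codons.
START_CODONS = {"ATG", "GTG", "TTG"}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_small_orfs(genome, min_aa=9, max_aa=100):
    """Enumerate candidate small ORFs in all six reading frames.

    Returns tuples (strand, start, end, protein). Coordinates are 0-based,
    half-open, given on the forward strand, and include the stop codon.
    Only the longest ORF per stop codon (first in-frame start) is reported.
    """
    genome = genome.upper()
    n = len(genome)
    orfs = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            start = None  # index of the first start codon in the open frame
            for i in range(frame, n - 2, 3):
                codon = seq[i:i + 3]
                aa = CODON_TABLE.get(codon)
                if aa is None:
                    # Ambiguous base (e.g. N): discard the current candidate.
                    start = None
                elif start is None and codon in START_CODONS:
                    start = i
                elif aa == "*":
                    if start is not None:
                        coding = seq[start:i]
                        # The initiator codon is translated as methionine.
                        protein = "M" + "".join(
                            CODON_TABLE[coding[j:j + 3]]
                            for j in range(3, len(coding), 3)
                        )
                        if min_aa <= len(protein) <= max_aa:
                            if strand == "+":
                                s, e = start, i + 3
                            else:
                                s, e = n - (i + 3), n - start
                            orfs.append((strand, s, e, protein))
                    start = None
    return orfs


# Toy sequence: ATG, nine lysine codons, then a TAA stop codon.
toy = "ATG" + "AAA" * 9 + "TAA"
for orf in find_small_orfs(toy):
    print(orf)
```

Keeping genome coordinates alongside each translated sequence is what later makes a peptide-to-genome mapping step possible, since a peptide matched to a database entry can be placed back on the chromosome.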

MeSH Term

Bacterial Proteins
Computer Simulation
Databases, Protein
Mass Spectrometry
Molecular Sequence Annotation
Open Reading Frames
Peptide Hydrolases
Phylogeny
Proteogenomics
Staphylococcus aureus

Chemicals

Bacterial Proteins
Peptide Hydrolases
