PanEffect: a pan-genome visualization tool for variant effects in maize.

Carson M Andorf, Olivia C Haley, Rita K Hayford, John L Portwood, Stephen Harding, Shatabdi Sen, Ethalinda K Cannon, Jack M Gardiner, Hye-Seon Kim, Margaret R Woodhouse
Author Information
  1. Carson M Andorf: USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States.
  2. Olivia C Haley: USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States.
  3. Rita K Hayford: USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States.
  4. John L Portwood: USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States.
  5. Stephen Harding: USDA-ARS, Mycotoxin Prevention and Applied Microbiology Research Unit, National Center for Agricultural Utilization Research, Peoria, IL 61604, United States.
  6. Shatabdi Sen: Department of Plant Pathology & Microbiology, Iowa State University, Ames, IA 50011, United States.
  7. Ethalinda K Cannon: USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States.
  8. Jack M Gardiner: Division of Animal Sciences, University of Missouri, Columbia, MO 65211, United States.
  9. Hye-Seon Kim: USDA-ARS, Mycotoxin Prevention and Applied Microbiology Research Unit, National Center for Agricultural Utilization Research, Peoria, IL 61604, United States.
  10. Margaret R Woodhouse: USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States.

Abstract

SUMMARY: Understanding the effects of genetic variants is crucial for accurately predicting traits and functional outcomes. Recent approaches have used artificial intelligence and protein language models to score all possible missense variant effects at the proteome level for a single genome, but a reliable tool is still needed to explore these effects at the pan-genome level. To address this gap, we introduce PanEffect, which we implemented at MaizeGDB to enable a comprehensive examination of the potential effects of coding variants across 50 maize genomes. The tool allows users to visualize over 550 million possible amino acid substitutions in the B73 maize reference genome and to observe the effects of 2.3 million natural variations in the maize pan-genome. Each variant effect score, calculated with the Evolutionary Scale Modeling (ESM) protein language model, is the log-likelihood ratio between B73 and each variant in the pan-genome. The scores are displayed as heatmaps that range from benign outcomes to potential functional consequences. PanEffect also displays secondary structures and functional domains alongside the variant effects, providing additional functional and structural context. PanEffect gives researchers a platform to explore protein variants and identify genetic targets for crop enhancement.
AVAILABILITY AND IMPLEMENTATION: The PanEffect code is freely available on GitHub (https://github.com/Maize-Genetics-and-Genomics-Database/PanEffect). A maize implementation of PanEffect and underlying datasets are available at MaizeGDB (https://www.maizegdb.org/effect/maize/).
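
As an illustration of the log-likelihood ratio scoring described in the summary, the sketch below computes a score for each substitution at a single residue. It is not taken from the PanEffect code. The function name llr_score and the table toy_probs are hypothetical, and the probabilities are made-up stand-ins for what a protein language model such as ESM might assign at one position. It assumes the common definition of the score as the log probability of the variant amino acid minus the log probability of the reference amino acid.

```python
import math


def llr_score(probs, ref_aa, alt_aa):
    """Log-likelihood ratio of the variant amino acid versus the reference.

    probs maps amino-acid letters to model probabilities at one position.
    A score of 0 means the variant is as likely as the reference; negative
    scores mean the model considers the variant less likely.
    """
    return math.log(probs[alt_aa]) - math.log(probs[ref_aa])


# Hypothetical probabilities at one residue (illustrative values only,
# not real ESM output). The reference amino acid here is assumed to be "A".
toy_probs = {"A": 0.60, "G": 0.30, "P": 0.10}

# One row of a substitution heatmap: a score for every candidate amino acid.
scores = {aa: llr_score(toy_probs, "A", aa) for aa in toy_probs}

for aa, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"A -> {aa}: {score:.3f}")
```

Under this assumed definition, repeating the calculation for every position and every candidate amino acid would give the kind of position-by-substitution matrix that a heatmap can display.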

Grants

  1. U.S. Department of Agriculture
  2. Agricultural Research Service

MeSH Term

Zea mays
Databases, Genetic
Artificial Intelligence
Genome, Plant
Phenotype
Software
