Population-Level Cell Trajectory Inference Based on Gaussian Distributions.

Xiang Chen, Yibing Ma, Yongle Shi, Yuhan Fu, Mengdi Nan, Qing Ren, Jie Gao
Author Information
  1. Xiang Chen: School of Science, Jiangnan University, Wuxi 214122, China.
  2. Yibing Ma: School of Science, Jiangnan University, Wuxi 214122, China.
  3. Yongle Shi: School of Science, Jiangnan University, Wuxi 214122, China.
  4. Yuhan Fu: School of Science, Jiangnan University, Wuxi 214122, China.
  5. Mengdi Nan: School of Science, Jiangnan University, Wuxi 214122, China.
  6. Qing Ren: School of Science, Jiangnan University, Wuxi 214122, China.
  7. Jie Gao: School of Science, Jiangnan University, Wuxi 214122, China.

Abstract

In the past decade, inferring developmental trajectories from single-cell data has become a significant challenge in bioinformatics. RNA velocity, which incorporates directional dynamics, has significantly advanced the study of single-cell trajectories. However, as single-cell RNA sequencing technology evolves, it generates complex, high-dimensional data with high noise levels. Existing trajectory inference methods overlook the distributional characteristics of cells and may therefore perform inadequately under such conditions. To address this, we introduce CPvGTI, a Gaussian distribution-based trajectory inference method. CPvGTI uses a Gaussian mixture model, optimized by the Expectation-Maximization algorithm, to construct new cell populations in the original data space. It then integrates RNA velocity and applies Gaussian Process Regression to analyze the differentiation trajectories of these cell populations. We evaluate CPvGTI against several state-of-the-art methods on four structurally diverse simulated datasets and four real datasets. The simulation studies indicate that CPvGTI outperforms existing methods in pseudo-time prediction and structural reconstruction. Furthermore, the discovery of new branch trajectories in the human forebrain and mouse hematopoiesis datasets confirms CPvGTI's superior performance.
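
The population-construction step described in the abstract relies on fitting a Gaussian mixture model with the Expectation-Maximization algorithm. The following is a minimal sketch of that general technique only, not the CPvGTI implementation: it works on a one-dimensional toy "expression" value rather than high-dimensional single-cell data, it omits the RNA velocity and Gaussian Process Regression steps, and every name in it (`normal_pdf`, `fit_gmm_em`, `cells`, `labels`) as well as the quantile-based initialization are assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_em(data, k, n_iter=100):
    """Fit a k-component 1-D Gaussian mixture with plain EM (illustrative only).

    Returns (weights, means, variances, responsibilities), where the
    responsibilities come from the final E-step.
    """
    n = len(data)
    ordered = sorted(data)
    # Assumed initialization: place the means at evenly spaced quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    resp = []

    for _ in range(n_iter):
        # E-step: posterior probability of each component for each cell.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])

        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2
                        for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # keep the variance positive

    return weights, means, variances, resp


# Toy data: two synthetic "cell populations" centred at 0 and 5.
rng = random.Random(0)
cells = ([rng.gauss(0.0, 1.0) for _ in range(200)]
         + [rng.gauss(5.0, 1.0) for _ in range(200)])

weights, means, variances, resp = fit_gmm_em(cells, k=2)

# Assign each cell to the component with the highest responsibility.
labels = [max(range(2), key=lambda j: r[j]) for r in resp]
```

In this sketch each fitted component plays the role of one cell population, and `labels` gives a hard assignment of cells to populations. A practical analysis would more likely use an established multivariate implementation and choose the number of components with a model-selection criterion.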

Grants

  1. 12271216/National Natural Science Foundation of China
  2. 11831015/National Natural Science Foundation of China

MeSH Terms

Normal Distribution
Animals
Mice
Single-Cell Analysis
Algorithms
Humans
Computational Biology
Sequence Analysis, RNA
Cell Differentiation
Hematopoiesis
