scDART: integrating unmatched scRNA-seq and scATAC-seq data and learning cross-modality relationship simultaneously.

Ziqi Zhang, Chengkai Yang, Xiuwei Zhang
Author Information
  1. Ziqi Zhang: School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, 30308, GA, USA.
  2. Chengkai Yang: Department of Electrical Engineering and Information Systems, Graduate School of Engineering, The University of Tokyo, Tokyo, Japan.
  3. Xiuwei Zhang: School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, 30308, GA, USA. xiuwei.zhang@gatech.edu.

Abstract

Integrating scRNA-seq and scATAC-seq data obtained from different batches is challenging. Existing methods tend to use a pre-defined gene activity matrix to convert the scATAC-seq data into scRNA-seq data. This matrix is often of low quality and does not reflect the dataset-specific relationship between the two data modalities. We propose scDART, a deep learning framework that integrates scRNA-seq and scATAC-seq data and learns cross-modality relationships simultaneously. Specifically, the design of scDART allows it to preserve cell trajectories in continuous cell populations, so it can be applied to trajectory inference on integrated data.
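
To make the conversion step concrete, the following minimal sketch shows how a fixed, pre-defined gene activity matrix maps chromatin accessibility onto gene-level pseudo-expression. It illustrates the baseline approach the abstract attributes to existing methods, not scDART's learned model. All data, matrix shapes, and the helper name `to_pseudo_rna` are hypothetical and chosen only for illustration.

```python
# Hypothetical toy data: 2 cells x 4 chromatin regions (binary accessibility).
atac_counts = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

# Hypothetical pre-defined gene activity matrix: 4 regions x 3 genes.
# Entry [r][g] is 1 when region r is assumed to regulate gene g
# (for example, because r overlaps the gene body or promoter of g).
gene_activity = [
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]


def to_pseudo_rna(cells_by_regions, regions_by_genes):
    """Multiply (cells x regions) by (regions x genes) to get (cells x genes)."""
    n_genes = len(regions_by_genes[0])
    return [
        [
            sum(cell[r] * regions_by_genes[r][g] for r in range(len(cell)))
            for g in range(n_genes)
        ]
        for cell in cells_by_regions
    ]


pseudo_rna = to_pseudo_rna(atac_counts, gene_activity)
print(pseudo_rna)  # [[1, 1, 0], [1, 0, 1]]
```

Because the region-to-gene assignments here are fixed in advance, any error in them carries over directly into the pseudo-expression values. That is the limitation the abstract describes, and it is why scDART is proposed to learn the cross-modality relationship from the data instead.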

Grants

  1. R35 GM143070/NIGMS NIH HHS

MeSH Term

Gene Expression Profiling
Sequence Analysis, RNA
Single-Cell Analysis
Exome Sequencing
