scDART: integrating unmatched scRNA-seq and scATAC-seq data and learning cross-modality relationship simultaneously.
Ziqi Zhang, Chengkai Yang, Xiuwei Zhang
Author Information
Ziqi Zhang: School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, 30308, GA, USA.
Chengkai Yang: Department of Electrical Engineering and Information Systems, Graduate School of Engineering, The University of Tokyo, Tokyo, Japan.
Xiuwei Zhang: School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, 30308, GA, USA. xiuwei.zhang@gatech.edu.
Integrating scRNA-seq and scATAC-seq data obtained from different batches is challenging. Existing methods typically use a pre-defined gene activity matrix to convert the scATAC-seq data into scRNA-seq data. This pre-defined matrix is often of low quality and does not reflect the dataset-specific relationship between the two data modalities. We propose scDART, a deep learning framework that integrates scRNA-seq and scATAC-seq data and learns cross-modality relationships simultaneously. Specifically, scDART is designed to preserve cell trajectories in continuous cell populations, and it can be applied to trajectory inference on the integrated data.
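
As a rough illustration of the pre-defined gene activity matrix approach used by existing methods (not of scDART itself), the sketch below converts a toy cell-by-region accessibility matrix into a cell-by-gene pseudo-expression matrix. The function name `to_pseudo_expression`, the toy values, and the binary region-to-gene encoding are all assumptions made for this example; the abstract does not specify how any particular method builds or applies the matrix.

```python
def to_pseudo_expression(atac, gene_activity):
    """Multiply a cells x regions matrix by a regions x genes matrix.

    Both arguments are plain nested lists. This is an illustrative
    helper, not part of any published package.
    """
    n_regions = len(gene_activity)
    n_genes = len(gene_activity[0])
    return [
        [sum(cell[r] * gene_activity[r][g] for r in range(n_regions))
         for g in range(n_genes)]
        for cell in atac
    ]

# Hypothetical toy data: 3 cells x 4 chromatin regions (1 = accessible).
atac = [
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
]

# Hypothetical pre-defined gene activity matrix: 4 regions x 2 genes.
# A 1 means the region is assumed to lie in or near that gene.
gene_activity = [
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
]

# Summing accessibility over each gene's regions gives a cells x genes
# matrix in the same feature space as scRNA-seq counts.
pseudo_rna = to_pseudo_expression(atac, gene_activity)
```

Because the region-to-gene assignments here are fixed in advance, they cannot adapt to a given dataset. This is the limitation the abstract attributes to pre-defined matrices, and it is why scDART learns the cross-modality relationship instead.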