ESICCC: A systematic computational framework for evaluation, selection and integration of cell-cell communication inference methods

Introduction

We benchmark two types of cell-cell communication (CCC) inference methods: methods that predict ligand-receptor (LR) pairs from scRNA-seq data, and methods that predict ligand/receptor-target regulations.

For the first benchmark, we evaluated the accuracy, stability and usability of 18 LR inference methods.

In terms of accuracy, paired ST datasets, CAGE expression/proteomics data and sampled scRNA-seq datasets were used to benchmark the 18 methods. First, 11 scRNA-seq datasets were used as input for the methods to predict intercellular communication, and two indices, the similarity index (SI, a modified Jaccard index) and the rank-based similarity index (RSI), were used to compare the LR pairs predicted by different methods. Furthermore, we benchmarked the 18 methods using 11 paired ST datasets, under the hypothesis that the mutual information (MI) values of LR pairs are greater in the close group than in the distant group. In addition, three PBMC datasets from the 10X Genomics website were used as input for the methods to predict LR pairs, and CAGE expression/proteomics data were used as pseudo gold standards to benchmark the 18 methods.

In terms of stability, we randomly sampled cells at different ratios from all the scRNA-seq datasets, yielding 70 sampled datasets in addition to the 14 original datasets as input for the methods. We calculated the Jaccard index between the LR pairs predicted from the sampled datasets and those predicted from the original datasets, and defined a stability value to test the robustness of the methods to the sampling rate of the scRNA-seq data.

In terms of usability, we recorded the running time and maximum memory usage of the methods on all 84 scRNA-seq datasets.
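The stability comparison described above can be illustrated with a minimal Python sketch. It only shows the plain Jaccard index between the LR pairs predicted from an original dataset and from its sampled versions. The function names, the representation of an LR pair as a (ligand, receptor) tuple, the sampling ratios and the example pairs are all assumptions made for illustration; the modified Jaccard index (SI) and the stability value used by ESICCC are not reproduced here.

```python
def jaccard_index(pairs_a, pairs_b):
    """Plain Jaccard index between two collections of LR pairs."""
    a, b = set(pairs_a), set(pairs_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty predictions score 0.
        return 0.0
    return len(a & b) / len(union)


def jaccard_by_ratio(original_pairs, sampled_pairs_by_ratio):
    """Jaccard index of each sampled prediction against the original one.

    sampled_pairs_by_ratio maps a sampling ratio (hypothetical values)
    to the LR pairs predicted from the dataset sampled at that ratio.
    """
    return {
        ratio: jaccard_index(original_pairs, pairs)
        for ratio, pairs in sampled_pairs_by_ratio.items()
    }


# Toy data, invented for illustration only.
original = {("TGFB1", "TGFBR1"), ("CXCL12", "CXCR4"), ("CCL5", "CCR5")}
sampled = {
    0.9: {("TGFB1", "TGFBR1"), ("CXCL12", "CXCR4"), ("CCL5", "CCR5")},
    0.5: {("TGFB1", "TGFBR1"), ("CXCL12", "CXCR4"), ("IL6", "IL6R")},
}
per_ratio = jaccard_by_ratio(original, sampled)
for ratio in sorted(per_ratio, reverse=True):
    print(f"sampling ratio {ratio}: Jaccard = {per_ratio[ratio]:.2f}")
```

A method whose Jaccard index stays high as the sampling ratio drops would be considered more robust to cell down-sampling under this kind of comparison.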

For the second benchmark, 8 ST datasets were used as input for 5 LR-Targets inference tools to predict ligand/receptor-target regulations. Cell line perturbation datasets were used for evaluation, involving knockout/mutant conditions for 5 receptors and treatment conditions for 10 ligands. The differentially expressed genes (DEGs) in each cell line perturbation dataset were used as the ground truth for ligand/receptor-target regulations. The scores of the ligand/receptor-target regulations predicted by each tool were compared with the differential expression status (DEG or non-DEG) of the corresponding targets to calculate AUROC and AUPRC. In addition, we recorded the running time and maximum memory usage of the methods on all the ST datasets.
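The AUROC/AUPRC evaluation described above can be sketched in pure Python as follows. This is an illustrative sketch, not the ESICCC implementation: the function names and toy values are assumptions, labels are 1 for targets that are DEGs and 0 otherwise, and AUPRC is approximated by average precision, which is one common estimator among several.

```python
def auroc(scores, labels):
    """AUROC as the probability that a DEG target outscores a non-DEG target.

    Tied scores count as half a win (pairwise Mann-Whitney formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one DEG and one non-DEG target")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Average precision, used here as an estimate of AUPRC.

    Targets are ranked by decreasing score; tied scores keep their input
    order, which is a simplification made in this sketch.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one DEG target")
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Toy data, invented for illustration: predicted regulation scores for four
# targets and their DEG status in a hypothetical perturbation dataset.
scores = [0.9, 0.8, 0.3, 0.1]
is_deg = [1, 0, 1, 0]
print("AUROC:", auroc(scores, is_deg))
print("AUPRC (average precision):", average_precision(scores, is_deg))
```

The pairwise AUROC loop is quadratic in the number of targets, which keeps the sketch short; a rank-based computation or an established library routine would be preferable for genome-scale target lists.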

Publications

  1. ESICCC as a systematic computational framework for evaluation, selection, and integration of cell-cell communication inference methods
    Luo J, Deng M, Zhang X, Sun X - Genome Research
    Cited by Luo, Jiaxin et al. “ESICCC as a systematic computational framework for evaluation, selection, and integration of cell-cell communication inference methods.” Genome research vol. 33,10 (2023): 1788-1805. doi:10.1101/gr.278001.123 (Google Scholar as of February 23, 2024)

Credits

  1. Xiaoqiang Sun sunxq6@mail.sysu.edu.cn
    Investigator

    School of Mathematics, Sun Yat-sen University, China

Summary
Accession: BT007417
Tool Type: Pipeline & Protocol
Category: Intercellular signaling data
Platforms: Windows
Technologies: Python2, Python3, R
User Interface: Terminal Command Line
Input Data: BAM
Download Count: 0
Country/Region: China
Submitted By: Xiaoqiang Sun

Funding

National Key R&D Program of China (2021YFF1200903)