Multimodal hierarchical classification of CITE-seq data delineates immune cell states across lineages and tissues.
Daniel P Caron, William L Specht, David Chen, Steven B Wells, Peter A Szabo, Isaac J Jensen, Donna L Farber, Peter A Sims
Author Information
Daniel P Caron: Department of Microbiology and Immunology, Columbia University Irving Medical Center, New York, NY 10032, USA.
William L Specht: Department of Microbiology and Immunology, Columbia University Irving Medical Center, New York, NY 10032, USA.
David Chen: Department of Systems Biology, Columbia University Irving Medical Center, New York, NY 10032, USA.
Steven B Wells: Department of Systems Biology, Columbia University Irving Medical Center, New York, NY 10032, USA.
Peter A Szabo: Department of Microbiology and Immunology, Columbia University Irving Medical Center, New York, NY 10032, USA.
Isaac J Jensen: Department of Microbiology and Immunology, Columbia University Irving Medical Center, New York, NY 10032, USA.
Donna L Farber: Department of Microbiology and Immunology, Columbia University Irving Medical Center, New York, NY 10032, USA; Department of Surgery, Columbia University Irving Medical Center, New York, NY 10032, USA.
Peter A Sims: Department of Systems Biology, Columbia University Irving Medical Center, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University Irving Medical Center, New York, NY 10032, USA. Electronic address: pas2182@cumc.columbia.edu.
Single-cell RNA sequencing (scRNA-seq) is invaluable for profiling cellular heterogeneity and transcriptional states, but transcriptomic profiles do not always delineate subsets defined by surface proteins. Cellular indexing of transcriptomes and epitopes (CITE-seq) enables simultaneous profiling of single-cell transcriptomes and surface proteomes; however, accurate cell-type annotation requires a classifier that integrates multimodal data. Here, we describe multimodal classifier hierarchy (MMoCHi), a marker-based approach for accurate cell-type classification across multiple single-cell modalities that does not rely on reference atlases. We benchmark MMoCHi using sorted T lymphocyte subsets and annotate a cross-tissue human immune cell dataset. MMoCHi outperforms leading transcriptome-based classifiers and multimodal unsupervised clustering in its ability to identify immune cell subsets that are not readily resolved and to reveal subset markers. MMoCHi is designed for adaptability and can integrate annotation of cell types and developmental states across diverse lineages, samples, or modalities.
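As a rough illustration of what a marker-based, hierarchical, multimodal classification can look like, the sketch below walks a cell down a small tree of marker rules that mix surface-protein and transcript features. It is a toy under stated assumptions, not MMoCHi's implementation or API: the `Node` class, the `classify` function, the `adt:`/`rna:` feature prefixes, the marker choices, and the thresholds are all hypothetical and chosen only for readability.

```python
from dataclasses import dataclass, field

# A cell is a flat dict of feature values. Keys carry a modality prefix
# ("adt:" for surface protein, "rna:" for transcript); this naming is an
# illustrative convention, not one taken from the paper.
Cell = dict[str, float]


@dataclass
class Node:
    """One level of a hypothetical classification hierarchy."""
    label: str
    # Marker rules: feature name -> (threshold, expect_positive).
    markers: dict[str, tuple[float, bool]] = field(default_factory=dict)
    children: list["Node"] = field(default_factory=list)

    def matches(self, cell: Cell) -> bool:
        # A cell matches when every marker falls on the expected side of
        # its threshold; features missing from the cell count as 0.0.
        for feature, (threshold, positive) in self.markers.items():
            expressed = cell.get(feature, 0.0) >= threshold
            if expressed != positive:
                return False
        return True


def classify(cell: Cell, node: Node) -> str:
    """Descend the hierarchy and return the deepest label whose markers match."""
    for child in node.children:
        if child.matches(cell):
            return classify(cell, child)
    return node.label


# Toy hierarchy with made-up thresholds (assumed, not from the source).
hierarchy = Node("Unclassified", children=[
    Node("T cell", {"adt:CD3": (1.0, True)}, children=[
        Node("CD4 T cell", {"adt:CD4": (1.0, True), "adt:CD8": (1.0, False)}),
        Node("CD8 T cell", {"adt:CD8": (1.0, True), "adt:CD4": (1.0, False)}),
    ]),
    # This rule combines a protein marker with a transcript marker.
    Node("B cell", {"adt:CD19": (1.0, True), "rna:MS4A1": (1.0, True)}),
])

example = {"adt:CD3": 2.0, "adt:CD4": 1.5, "adt:CD8": 0.1}
print(classify(example, hierarchy))
```

In this toy, a cell that passes a parent rule but none of the child rules keeps the parent label, so ambiguous cells stay at a coarser level of the hierarchy instead of being forced into a subset.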