scRNA-seq papers very commonly compare cell type abundance across groups (e.g., treatment vs. control, responder vs. non-responder).

Let’s create some dummy data.

library(tidyverse)
set.seed(23)
# we have 6 treatment samples and 6 control samples, 3 clusters A, B, C
# but in the treatment samples, cluster C is absent (0 cells) in sample7

sample_id<- c(paste0("sample", 1:6, "_control", rep(c("_A","_B","_C"),each = 6)), paste0("sample", 8:12, "_treatment", …