Ambiguous sample labeling for some souporcell clusters #234

aranham · 2024-05-07T15:46:43Z

Hi,

I’m working with scRNA-seq data from 20 pools, each containing three pooled samples. I ran souporcell for demultiplexing without whole-genome sequencing (WGS) SNP data, as it wasn’t available for all samples at the time. For 17 pools I was able to assign sample labels by matching the souporcell clusters to the WGS data afterwards. However, I ran into issues labeling the samples in the remaining 3 pools. In each of these pools, two souporcell clusters clearly match a single WGS sample each. The remaining cluster partially matches two WGS samples: both matches are above background noise, but neither reaches the level of a clear match. I suspect this ambiguous cluster might be enriched in heterotypic doublets.

To separate these ambiguous cells, I progressively increased the number of souporcell clusters from 3 to 7. This seemed to work for one pool, where all three WGS matches became distinct clusters once 7 clusters were specified. Is increasing the number of clusters a valid approach for resolving ambiguous sample assignments, or are there potential pitfalls? We ruled out the possibility of closely related individuals based on analysis of the WGS SNP data.

Before rerunning the experiment, are there any other solutions or checks you recommend to improve sample labeling accuracy?

Thanks!

Best,
Michelle

plijnzaad · 2024-05-16T12:06:41Z

Not entirely sure what your setup is, but if you concatenate the BAM files that you think contain identical genotypes (possibly alongside other genotypes), the genotype estimates that souporcell makes tend to get better because they have more data to go on. This can be especially important when some genotypes are only present in small numbers in some of the libraries. Be sure to disambiguate the cell barcodes by their library names prior to the concatenation, though.
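
To illustrate the barcode disambiguation, here is a minimal sketch that prefixes the `CB:Z:` tag of each SAM record with a library name, so that identical barcodes from different libraries stay distinct after concatenation. The function names, the `library_barcode` naming scheme, and the example library names are assumptions for illustration only; this is not part of souporcell, and it works on SAM text lines rather than on BAM files directly.

```python
import re

# Matches the cell barcode tag that Cell Ranger style BAM files carry.
CB_TAG = re.compile(r"^CB:Z:(.+)$")


def prefix_barcode(barcode: str, library: str) -> str:
    """Return a barcode made unique by a library prefix (hypothetical scheme)."""
    return f"{library}_{barcode}"


def retag_sam_line(line: str, library: str) -> str:
    """Prefix the CB tag of one SAM record; header lines pass through unchanged."""
    if line.startswith("@"):
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    # The first 11 SAM fields are mandatory; optional tags follow them.
    for i in range(11, len(fields)):
        match = CB_TAG.match(fields[i])
        if match:
            fields[i] = "CB:Z:" + prefix_barcode(match.group(1), library)
    return "\t".join(fields) + newline


record = "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tCB:Z:AAAC-1\tUB:Z:TTT"
print(retag_sam_line(record, "pool01"))
```

In practice the records would be streamed from the BAM file as SAM text, retagged, and written back to BAM before merging. The barcode list passed to souporcell would need the same prefixes so that it still matches the retagged reads.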
