Downsampling output missing #1731
Hi! First of all, thank you very much for developing MiXCR. It's a fantastic tool and the documentation is great! I would like to compare different stats (# clonotypes, # IGH sequences, # TRA sequences, etc.) across bulk RNA-seq samples. I've run the analysis with the rnaseq preset, but coverage differs quite a lot between samples, so I thought the downsampling function might be the best way to compare them. When I run downsampling, it seems to downsample fine, but I don't get the stats I need.
What I get
What I would like to get
Note that if I run mixcr exportReports on the downsampled .clns files, I get exactly the same numbers as with the original files. Is there something I'm missing? Thank you so much in advance!
Hi Laura, The downsample function does not re-align or re-assemble anything. It takes an already processed sample (a .clns file with clones) and downsamples it based on criteria selected by the user (for example, randomly picking a certain number of reads from the clonotype table, NOT from the initial fastq file). That is why the align and assemble reports do not change. What does change is shown in summary_downsampling.csv: the number of clones (nElements) and the number of reads in clones (sumWeight) before and after downsampling. I hope this clarifies things. Feel free to reach out if you have more questions! Sincerely,
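To make the distinction concrete, here is a minimal Python sketch of read-level downsampling applied to a clonotype table rather than to raw reads. It is not MiXCR's implementation: the clone sequences, the counts, and the function names downsample_reads and summary are made up for illustration, and only the column names nElements and sumWeight come from the summary file mentioned above.

```python
import random
from collections import Counter


def downsample_reads(clone_counts, target_reads, seed=0):
    """Randomly keep `target_reads` reads from a clonotype table (no replacement)."""
    total = sum(clone_counts.values())
    if target_reads >= total:
        # Nothing to remove: the sample is already at or below the target depth.
        return dict(clone_counts)
    rng = random.Random(seed)
    # Expand the table into one entry per read, labelled by its clone.
    reads = [clone for clone, n in clone_counts.items() for _ in range(n)]
    return dict(Counter(rng.sample(reads, target_reads)))


def summary(clone_counts):
    """Number of clones and number of reads in clones, as in the summary file."""
    return {"nElements": len(clone_counts), "sumWeight": sum(clone_counts.values())}


# Hypothetical clonotype table: CDR3 sequence -> read count.
clones = {
    "CASSLGQGYEQYF": 500,
    "CASSPTGGNTEAFF": 120,
    "CAWSVGGNQPQHF": 30,
    "CASRDRGNTIYF": 3,
    "CSARDSSYEQYF": 1,
}

before = summary(clones)
downsampled = downsample_reads(clones, target_reads=100)
after = summary(downsampled)
print("before:", before)
print("after: ", after)
```

Rare clones can drop out entirely, so nElements may shrink along with sumWeight, while nothing upstream (alignment or assembly) is touched.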
Thanks so much for your fast answer Mark! I see, then perhaps it isn't exactly what I need. What I need: What I have tried so far:
Not sure if you have done similar analyses. How do you compare inter-sample results in terms of # clones, chains, etc.? What do you recommend?
I think the best approach is to use the mixcr postanalysis function, which includes downsampling:
This will create multiple postanalysis metrics for each chain, including diversity indices. Observed diversity is the number of distinct clonotypes for a given chain.
You can then use mixcr exportPlots with the postanalysis output to generate plots comparing diversity between groups of samples, where the groups are defined by the metadata provided as a table.
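As a rough illustration of why downsampling matters before comparing diversity, the Python sketch below brings two samples to a common read depth and then counts distinct clonotypes. This is only a conceptual sketch, not what mixcr postanalysis does internally: the sample names, clone labels, counts, and the choice of the shallowest sample as the common depth are all assumptions made for the example.

```python
import random
from collections import Counter


def downsample(counts, depth, rng):
    """Randomly keep `depth` reads from a clone -> read-count table."""
    reads = [clone for clone, n in counts.items() for _ in range(n)]
    return Counter(rng.sample(reads, depth))


def observed_diversity(counts):
    """Number of distinct clonotypes with at least one read."""
    return sum(1 for n in counts.values() if n > 0)


# Hypothetical per-sample clonotype tables for a single chain.
samples = {
    "sample_A": {"clone1": 900, "clone2": 80, "clone3": 15, "clone4": 4, "clone5": 1},
    "sample_B": {"clone1": 60, "clone6": 30, "clone7": 10},
}

rng = random.Random(42)
# Assumed strategy: downsample every sample to the depth of the shallowest one.
depth = min(sum(counts.values()) for counts in samples.values())

results = {
    name: observed_diversity(downsample(counts, depth, rng))
    for name, counts in samples.items()
}
for name, diversity in results.items():
    print(f"{name}: {diversity} clonotypes at {depth} reads")
```

Without the common depth, sample_A would look more diverse partly because it simply has ten times as many reads.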