Prune SingleR results? #803

Open
jashapiro opened this issue Nov 21, 2024 · 0 comments

We currently store the full SingleR results table in the processed SCE output, but this object can be quite large in ways we had not previously accounted for.

The results are a DataFrame with several large elements in its metadata. In particular, the metadata holds lists of all genes that are differentially expressed between each pair of labels (de.genes) and a list of all genes that are common among all annotations (common.genes). I think these can probably be removed to reduce output size. While they might occasionally be convenient, they are probably less useful than the other statistics in that object for evaluating label quality, and they take up quite a bit of space. By a quick estimate, over 70% of our SCE metadata is in these two elements.
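For anyone who wants to check the size breakdown on their own objects, here is a rough sketch of one way to do it. It uses a toy DataFrame as a stand-in for real SingleR output; the gene names and the `small.element` entry are invented for illustration, and `object.size()` only gives an approximate in-memory size, not the size on disk.

```r
library(S4Vectors)

# Toy stand-in for a SingleR results table (a real one comes from SingleR::SingleR()).
# Gene names and the "small.element" entry are invented for illustration.
genes <- paste0("gene", seq_len(5000))
singler_results <- DataFrame(labels = c("T cell", "B cell", "T cell"))
metadata(singler_results) <- list(
  de.genes = list(
    `T cell` = list(`B cell` = genes[1:2000]),
    `B cell` = list(`T cell` = genes[2001:4000])
  ),
  common.genes = genes,
  small.element = "placeholder"
)

# Approximate in-memory size (bytes) of each metadata element
meta_bytes <- vapply(
  metadata(singler_results),
  function(x) as.numeric(object.size(x)),
  numeric(1)
)

# Share of the metadata total taken by each element, as a percentage
meta_share <- round(100 * meta_bytes / sum(meta_bytes), 1)
print(meta_share)
```

On a real object, the same `vapply()` call over `metadata(singler_results)` should show which elements dominate.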

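Before adding the SingleR results to the SCE, we can pare them down with the following code:

# Drop the two large gene lists from the DataFrame metadata;
# the rows and columns of the results table are left as they are
metadata(singler_results)$de.genes <- NULL
metadata(singler_results)$common.genes <- NULL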
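
If this ends up being done in more than one place, it could be wrapped in a small helper. The sketch below is only a suggestion: the function name `prune_singler_metadata` and the toy input are hypothetical, and it assumes the results object is an S4Vectors DataFrame as described above.

```r
library(S4Vectors)

# Hypothetical helper: drop the named metadata elements if they are present
# and leave every other metadata element, row, and column untouched
prune_singler_metadata <- function(results, drop = c("de.genes", "common.genes")) {
  keep <- setdiff(names(metadata(results)), drop)
  metadata(results) <- metadata(results)[keep]
  results
}

# Toy stand-in for real SingleR output; the "other" element is invented
singler_results <- DataFrame(
  labels = c("T cell", "B cell"),
  row.names = c("cell1", "cell2")
)
metadata(singler_results) <- list(
  de.genes = list(),
  common.genes = character(0),
  other = "kept"
)

pruned <- prune_singler_metadata(singler_results)

# Remaining metadata element names after pruning
print(names(metadata(pruned)))
```

Using `setdiff()` on the names means the helper is a no-op for objects that have already been pruned.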