New Analysis Example: Simpler example of gene clustering -- k-means for microarray #351

cansavvy · 2020-11-11T22:23:00Z

What are the goals of this new example analysis?

Currently, the microarray ORA example uses differential expression results, and the RNA-seq one (not yet developed, #344) will end up using a gene module from WGCNA.

However, we should give users a more basic way to find gene clusters from data. Some users may find WGCNA a bit daunting, and it also requires some computing power. It may also be more than a user needs for their particular question, so an example that shows something like k-means would be useful.

What kind of dataset will this need?

Something with enough samples that a cluster would make some kind of sense.
I think GSE37382, a medulloblastoma dataset with subgroups that is already used for dimension reduction, seems like a reasonable dataset to use for this too.

What steps should be included in this analysis?

These are the roughest ideas of steps I have right now that will need to be made more specific and further polished when we dig into this example more.

Import data and metadata
Use k-means function
Do some exploration into how "well" k-means ran -- unclear to me without doing a bit more digging what this looks like. It may be as simple as printing out some kind of summary stats.
May want to run more iterations and see if you get the same-ish results?
Get some kind of annotation for the genes that you can use to test whether your gene clustering seems sensible. This could be something like GO terms (but maybe not GO terms, since they overlap so much).
Probably plot gene-wise PCA and label the k-means clusters as colors and another form of gene annotation as shapes and see if it makes sense.
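
As a rough illustration of the steps above, here is a minimal base R sketch. It runs on a simulated gene x sample matrix, not on GSE37382. The matrix, the `make_group()` helper, the choice of `k = 3`, and the seeds are all assumptions made for illustration. A real example would import the expression data and metadata instead, and would add a gene annotation to compare against.

```r
# Simulated stand-in for a gene x sample expression matrix (NOT real data).
set.seed(2020)
n_samples <- 20

# Hypothetical helper: genes in a group share a sample pattern plus noise.
make_group <- function(n_genes, pattern) {
  t(replicate(n_genes, pattern + rnorm(n_samples, sd = 0.5)))
}

expr_mat <- rbind(
  make_group(50, rep(c(2, -2), each = 10)),
  make_group(50, rep(c(-2, 2), each = 10)),
  make_group(50, rep(0, n_samples))
)
rownames(expr_mat) <- paste0("gene_", seq_len(nrow(expr_mat)))
colnames(expr_mat) <- paste0("sample_", seq_len(n_samples))

# k-means on genes (rows). nstart tries several random starts and keeps
# the solution with the lowest total within-cluster sum of squares.
k <- 3 # assumed for this sketch; picking k is its own problem
km_fit <- kmeans(expr_mat, centers = k, nstart = 25)

# Simple summary stats for how "well" k-means ran.
print(km_fit$size)                      # genes per cluster
print(km_fit$betweenss / km_fit$totss)  # share of variance between clusters

# Re-run from a different seed and cross-tabulate the assignments.
# Cluster labels are arbitrary, so look for a one-to-one pattern in the
# table rather than for matching label numbers.
set.seed(351)
km_rerun <- kmeans(expr_mat, centers = k, nstart = 25)
print(table(first_run = km_fit$cluster, rerun = km_rerun$cluster))

# Gene-wise PCA: rows are genes, so each point in the plot is a gene.
pca_fit <- prcomp(expr_mat, center = TRUE, scale. = FALSE)
plot(
  pca_fit$x[, 1:2],
  col = km_fit$cluster,
  pch = 19,
  main = "Gene-wise PCA colored by k-means cluster"
)
```

In the real example, a second gene annotation could be mapped to point shape (for instance through `pch`) to see whether the clusters line up with it.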

What packages/methods do you recommend using or looking into for this analysis?

May not need extra packages besides magrittr and the tidyverse ones (which are assumed everywhere). Both kmeans() and prcomp() ship with base R (in the stats package).

Note if/when this issue is completed, the ORA example should be updated to use this output (this should be its own issue and PR).


cansavvy · 2020-11-11T22:25:01Z

If all goes alright with this example, it can be made into an RNA-seq version as well which will require additional steps for DESeq2 transformation.

cansavvy · 2020-11-11T22:26:43Z

I think this example could just as easily use KNN if we think that would be better for a particular reason.

jaclyn-taroni · 2020-11-11T22:36:02Z

My gut tells me that this is not going to be simpler than WGCNA from an explanation point of view, to be honest. Particularly the part about picking k...

cansavvy · 2020-11-11T22:40:01Z

> My gut tells me that this is not going to be simpler than WGCNA from an explanation point of view, to be honest. Particularly the part about picking k...

I agree it's not simply "plug and chug", but at least it's mainly k and not 4-5 other parameters? I think it's more straightforward than WGCNA, but that's because WGCNA has a lot of pieces in comparison.
If we don't like k-means, do you have an even simpler suggestion for finding gene groups?

jaclyn-taroni · 2020-11-11T22:49:29Z

No, not really. I think whenever you're going to talk about number of clusters or cluster validation it's going to be tricky.
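
For context on why picking k is tricky, one common heuristic (not something this thread has settled on) is an "elbow" plot of total within-cluster sum of squares across candidate values of k. The sketch below uses simulated data with three planted groups. The data, the range of k, and `nstart = 25` are assumptions for illustration only.

```r
# Simulated matrix with three planted groups of rows (NOT real data).
set.seed(2020)
expr_mat <- rbind(
  matrix(rnorm(40 * 10, mean = -3), ncol = 10),
  matrix(rnorm(40 * 10, mean = 0), ncol = 10),
  matrix(rnorm(40 * 10, mean = 3), ncol = 10)
)

# Fit k-means for each candidate k and record the total within-cluster
# sum of squares.
k_values <- 1:8
total_wss <- vapply(
  k_values,
  function(k) kmeans(expr_mat, centers = k, nstart = 25)$tot.withinss,
  numeric(1)
)

# Look for the k after which adding more clusters stops helping much.
plot(
  k_values, total_wss,
  type = "b",
  xlab = "Number of clusters (k)",
  ylab = "Total within-cluster sum of squares"
)
```

On real expression data the bend is often much less clear than on planted groups like these, which is part of why explaining the choice of k is hard.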

cansavvy changed the title ~~New Analysis Example: Microarray simpler example of gene clustering -- k-means~~ New Analysis Example: Simpler example of gene clustering -- k-means for microarray Nov 11, 2020
