Analysis Updates: RNA-seq differential-expression module #137

Closed
cansavvy opened this issue Jul 10, 2020 · 10 comments · Fixed by #183

@cansavvy
Contributor

cansavvy commented Jul 10, 2020

Since we are trying to integrate more DESeq2 functionality into our RNA-seq section, we should make the differential-expression example reflect that as well, since differential expression is DESeq2's main utility.

We can borrow from training-modules steps: https://github.com/AlexsLemonade/training-modules/blob/master/RNA-seq/05-nb_cell_line_DESeq2.Rmd and maybe useful tidbits from here: https://github.com/AlexsLemonade/training-modules/blob/master/RNA-seq/03-gastric_cancer_exploratory.Rmd

I'm debating whether we should still maintain a version of differential-expression that uses QN'ed data from refinebio. One strategy is to keep the current example but reframe it as a "pre-normalized DE" example and make the new notebook a "starting at counts DE" example. Thoughts?

@jaclyn-taroni
Member

The current example isn't QN'd data.

@cansavvy
Contributor Author

Oh, I see that now. Hmm... I guess I was thrown off by the DGEList thing. Why did we choose limma-voom here?

@cansavvy changed the title from "Make RNA-seq differential-expression module more DESeq2-centered" to "Reconsider the examples in RNA-seq differential-expression module" on Jul 10, 2020
@jaclyn-taroni
Member

I do not have a good reason.

@cansavvy
Contributor Author

We should look at the examples listed here and decide what we would like to do (or maybe a few of them): https://bioconductor.org/packages/release/bioc/vignettes/tximport/inst/doc/tximport.html#Downstream_DGE_in_Bioconductor

I do not understand the rationale for choosing limma-voom versus the various DESeq2 models, and I suspect this will also be an area of confusion for our users.

So although we don't want to get too far into the weeds on DGE models, we should evaluate:

@cansavvy self-assigned this on Jul 10, 2020
@cansavvy
Contributor Author

I think I remember reading this before, but it's a great little summary to keep in mind for this: https://www.biostars.org/p/284775/

@cansavvy
Contributor Author

Also this by Mike Love: https://mikelove.wordpress.com/2016/09/28/deseq2-or-edger/

@cansavvy
Contributor Author

cansavvy commented Jul 10, 2020

I don't feel like it's super necessary or helpful to keep an example of each of edgeR, limma-voom, and DESeq2. So what I think might be a good way forward is to change this example to be DESeq2-centric, but briefly acknowledge the others and link out to examples of those as well as the nice summary articles.

Our rationale for choosing DESeq2 would basically be

  • Other methods are all about as good for normalization purposes
  • It has a lot of nice docs and examples
  • Be honest: we have preferences.

So for the current rna-seq_DGE.RMd notebook, we would switch the strategy to DESeq2 so it is more parallel to the rest of the module changes. Basic steps would be:

  1. Intro includes a wordier version of the DESeq2 rationale I described above, plus links out to edgeR and limma-voom examples.
  2. Download non-QN'ed refinebio data
  3. Create DESeq2 object: DESeq2::DESeqDataSetFromMatrix(), where design = ~ mutation.status. (Side note: maybe show how to save it to an RDS file.)
  4. Follow modeling setup steps similar to: https://github.com/AlexsLemonade/training-modules/blob/master/RNA-seq/05-nb_cell_line_DESeq2.Rmd
  5. Keep a volcano plot. Add some interpretation bits.
  6. At the end link to heatmap clustering example because that flows nicely.
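
For reference, steps 2 through 5 could be sketched roughly as below. This is only an illustration under stated assumptions, not the notebook that was eventually written: the counts are simulated with DESeq2's `makeExampleDESeqDataSet()` as a stand-in for the non-QN'ed refinebio download, the group labels `reference` and `mutated` are hypothetical, and the RDS path is a temporary file.

```r
# Illustrative sketch only: simulated counts stand in for the non-QN'ed
# refinebio download, and the group labels below are hypothetical.
library(DESeq2)

set.seed(12345)

# Step 2 stand-in: simulate a genes x samples count matrix with DESeq2's own
# example-data helper (500 genes, 8 samples). This is NOT real data.
counts_mat <- counts(makeExampleDESeqDataSet(n = 500, m = 8, betaSD = 1))

# Sample metadata; `mutation.status` mirrors the design variable in step 3.
# The first four simulated samples form one group, the last four the other.
metadata <- data.frame(
  mutation.status = factor(
    rep(c("reference", "mutated"), each = 4),
    levels = c("reference", "mutated")
  ),
  row.names = colnames(counts_mat)
)

# Step 3: build the DESeqDataSet and save it to an RDS file
dds <- DESeqDataSetFromMatrix(
  countData = counts_mat,
  colData = metadata,
  design = ~ mutation.status
)
rds_file <- file.path(tempdir(), "dds_example.RDS") # hypothetical path
saveRDS(dds, rds_file)

# Step 4: fit the model and extract results (mutated vs. reference)
dds <- DESeq(dds)
results_df <- as.data.frame(results(dds))

# Step 5: a bare-bones volcano plot; genes with NA adjusted p-values are
# dropped by plot()
plot(
  results_df$log2FoldChange,
  -log10(results_df$padj),
  xlab = "log2 fold change (mutated vs. reference)",
  ylab = "-log10 adjusted p-value"
)
```

Setting the factor levels explicitly makes `reference` the baseline, so positive log2 fold changes mean higher expression in the `mutated` group.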

@cbethell
Contributor

cbethell commented Aug 5, 2020

As discussed with @jaclyn-taroni and @cansavvy in the data science team Slack channel.

The changes made in PR #140 seem to be better suited for the differential-expression example notebook. That being said, the current branch in PR #140 (which has been closed, but branch retained) can be merged into a new branch for the purpose of getting started on this ticket.

@jaclyn-taroni edit: the branch name is cbethell/add-clustering-rna-seq-example

@jaclyn-taroni
Member

jaclyn-taroni commented Aug 5, 2020

Expanding on this point a bit - from #140 (review)

What if instead we used much of what we have here (e.g., the dataset and collapsing steps) as a DGE example? We could have multiple RNA-seq DGE examples that vary in designs and this dataset could be an example of collapsing replicates and multifactor designs. If it is presented that way, I would be less concerned about the extent of the metadata cleaning steps required, too.

So more generally, I think multiple DGE examples could be very useful where what is on cbethell/add-clustering-rna-seq-example is not the first example that is presented. I expect that putting multiple examples into the same notebook is going to make an individual notebook pretty long, which is to say that I think there will be multiple differential-expression notebooks and what's on that branch will be 02 or later.

@jaclyn-taroni
Member

I also wanted to note that the dataset currently in use on #140 is also included in one of our exercises for training. I expect that material will also help with getting the DGE example done.

@cansavvy changed the title from "Reconsider the examples in RNA-seq differential-expression module" to "Analysis Updates: RNA-seq differential-expression module" on Aug 17, 2020

3 participants