RNA-Seq Header Section #216

cansavvy · 2020-09-15T15:30:28Z

Purpose:

#125

Strategy

Tried to make sure the major points discussed in the issue #125 are addressed here.

How do we feel about the structure and general information added?
Any sources to add?
I haven't added the links to the individual modules yet (the TODOs); I will do this once we know we are okay with this setup, since the links might change.

Analysis Pull Request Check List (roughly in order):

Content checks

All {{BLANKS}} have been replaced with the correct content.
Sources are cited
Seed is set (if applicable)

Formatting Checks

Removed any manual numbering of sections.
Removed any instances of chunk naming.
Spell checked any Rmd file or md file.
Comments and documentation are up to date.

Add datasets to S3

Added data and metadata files to S3.

Docker/Snakemake rendering components

Added the .html link to the navigation bar.
Any not yet added packages needed for this analysis have been added to the Dockerfile and it successfully builds.
In the Docker container, snakemake was run for rendering.

cbethell

The structure and content in this PR look great to me @cansavvy! I haven't thought of anything else that should be included (yet) that you haven't covered here, but I do have some other suggestions below.

03-rnaseq/00-intro-to-rnaseq.Rmd

cbethell

LGTM! 🚀

jaclyn-taroni

Returning the review so we can talk about #187 and #189

03-rnaseq/00-intro-to-rnaseq.Rmd

jaclyn-taroni · 2020-09-17T12:03:46Z

03-rnaseq/00-intro-to-rnaseq.Rmd

+### RNA-seq data **strengths**:  
+
+- RNA-seq can collect data on more transcripts (it is less bound to a pre-determined set of probes like microarray is). 
+- It's values are considered more dynamic than microarray values which are constrained to the number of probes.


Do you mean the dynamic range of values? Do you have a citation for microarray values which are constrained to the number of probes?

03-rnaseq/00-intro-to-rnaseq.Rmd

jaclyn-taroni

Wrapping up the review that I sent early!

jaclyn-taroni · 2020-09-17T12:30:40Z

03-rnaseq/00-intro-to-rnaseq.Rmd

+### DESeq2 normalization methods
+
+Although DESeq2 has multiple normalization methods, we generally stick to `vst()` (Variance Stablizing Transformation) or `rlog()`. 


I would probably call these transformations, not normalization. You can normalize (e.g., adjust for size factors; `counts(<dataset>, normalized = TRUE)`) without transforming. This could be confusing for someone coming in with some level of experience. We should also talk about what these are specifically doing beyond that normalization.

This comment is more general and probably should be applied in other RNA-seq notebooks (e.g., 03-rnaseq/dimension_reduction_rnaseq_01_pca.Rmd), too.

Filed: #220
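
A minimal sketch of the distinction raised in this comment, assuming DESeq2 is installed. It uses simulated counts from `makeExampleDESeqDataSet()` rather than any dataset from this repository, and the object names (`dds`, `normalized_counts`, `vst_values`) are illustrative only. `counts(dds, normalized = TRUE)` only divides by size factors, while `vst()` also moves the values onto a log2-like scale.

```r
# Sketch only: simulated data, not the refine.bio datasets used in the notebooks
library(DESeq2)

set.seed(12345)

# Hypothetical example data: 5000 genes x 6 samples
dds <- makeExampleDESeqDataSet(n = 5000, m = 6)

# Normalization only: estimate size factors, then divide the counts by them
dds <- estimateSizeFactors(dds)
normalized_counts <- counts(dds, normalized = TRUE)

# Normalization plus transformation: values are no longer on the count scale
vst_data <- vst(dds, blind = TRUE)
vst_values <- assay(vst_data)

# Compare the ranges of the two results
range(normalized_counts)
range(vst_values)
```

The normalized counts keep the original count scale, so they still span several orders of magnitude; the transformed values are compressed, which is why calling `vst()` or `rlog()` "normalization" alone undersells what they do.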

03-rnaseq/00-intro-to-rnaseq.Rmd

cansavvy · 2020-09-17T19:44:58Z

@jaclyn-taroni I think your comments have been addressed. I didn't end up adding a More resources section. I added the table you posted a link to, but I dropped the StatQuest FPKM video because it seemed less relevant. Since we already link to some StatQuest videos, I'm sure interested users will find it anyway.

I also tried to be more articulate about normalize/transform but let me know if I should go into more detail than what is here.
I also did some editing about why genes might not show up, mainly focused on annotation, since it sounds like that might be the main reason.

jaclyn-taroni

Changes look good! I had a couple remaining comments.

jaclyn-taroni · 2020-09-17T20:26:24Z

03-rnaseq/00-intro-to-rnaseq.Rmd

+### RNA-seq data **strengths**  
+
+- RNA-seq can assay unknown transcripts, as it is not bound to a pre-determined set of probes like microarrays [@Zhong2009].
+- Its values are considered more dynamic than microarray values which are constrained to the number of probes [@Zhong2009].


I don't understand this point the way it is currently written - is this about the background signal point in the cited article?

From that paper: (No source to it that I can see)

and a limited dynamic range of detection owing to both background and saturation of signals.

Background and saturation. I'll put the word "saturation" in there to help make the point more clear.

references.bib

03-rnaseq/00-intro-to-rnaseq.Rmd

jaclyn-taroni · 2020-09-17T20:51:00Z

03-rnaseq/00-intro-to-rnaseq.Rmd

+
+To normalize and transform our data with DESeq2, we generally use `vst()` (Variance Stabilizing Transformation) or `rlog()`. 
+[Both methods are very similar](http://master.bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#the-variance-stabilizing-transformation-and-the-rlog).
+Both _normalize_ your data by correcting for library size differences but they also _transform_ your data by altering their distributions.


I was looking for a nod to this point (quoting this section of the vignette):

The point of these two transformations, the VST and the rlog, is to remove the dependence of the variance on the mean, particularly the high variance of the logarithm of count data when the mean is low.

But I'm not sure what the exact right level of detail is here.

I'll add a tad more. I don't think we need to put too much, because a good portion of users won't really care, and the ones that do will probably look at the sources we've put here, but a tad more detail and a tad less vagueness would be good.
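
The mean-variance point quoted from the vignette can be illustrated with a small base R simulation. This is a simplified sketch under stated assumptions: it draws Poisson counts for two hypothetical genes (DESeq2 itself models counts with a negative binomial distribution), and it uses a plain `log2(count + 1)` rather than `vst()` or `rlog()`.

```r
set.seed(12345)

n_samples <- 1000

# Simulated counts for one lowly expressed and one highly expressed gene
low_counts <- rpois(n_samples, lambda = 2)
high_counts <- rpois(n_samples, lambda = 1000)

# Raw counts: the variance grows with the mean
raw_var <- c(low = var(low_counts), high = var(high_counts))

# log2(count + 1): the lowly expressed gene is now the noisier one
log_var <- c(
  low = var(log2(low_counts + 1)),
  high = var(log2(high_counts + 1))
)

print(raw_var)
print(log_var)
```

On the raw scale the highly expressed gene has the larger variance; after a plain log transformation the relationship flips and the lowly expressed gene dominates. Removing that dependence of the variance on the mean is the stated goal of the VST and the rlog.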

03-rnaseq/00-intro-to-rnaseq.Rmd

jaclyn-taroni · 2020-09-18T01:10:40Z

As I mentioned in #212 (comment), I'm playing around with the process for making sure the pull request branches are up to date with master. What I did was check out this branch locally, run `git merge origin/master`, and then resolve the merge conflicts in GitKraken. (I'll push that commit shortly.) For references.bib, I included both references, and for all the HTML files I used the most recent commit to master (d5f6c7a). The process wasn't too bad, but I think we're going to have the HTML issue every time. I imagine it's good that the references.bib was tended to at this point. (Was the other change that "disappeared" in references.bib?)

While I was resolving conflicts, I noticed that the last round of re-rendering wasn't run in the Docker container: https://alexslemonade.github.io/refinebio-examples/03-rnaseq/dimension-reduction_rnaseq_01_pca.html#6_print_session_info

But I think everything will be rerun once the next round of edits go in anyway.

cansavvy · 2020-09-18T14:30:18Z

While I was resolving conflicts, I noticed that the last round of re-rendering wasn't run in the Docker container: https://alexslemonade.github.io/refinebio-examples/03-rnaseq/dimension-reduction_rnaseq_01_pca.html#6_print_session_info

I haven't been running these items on anything but the Docker container, so that is odd... will see if that's resolved.

cansavvy · 2020-09-18T14:34:04Z

The process wasn't too bad, but I think we're going to have the HTML issue every time.

What's the html issue? A merge conflict problem?

jaclyn-taroni · 2020-09-18T14:38:28Z

What's the html issue? A merge conflict problem?

Yes, I think we'll have merge conflicts with the HTML files every time.

cansavvy · 2020-09-18T15:26:25Z

While I was resolving conflicts, I noticed that the last round of re-rendering wasn't run in the Docker container: https://alexslemonade.github.io/refinebio-examples/03-rnaseq/dimension-reduction_rnaseq_01_pca.html#6_print_session_info

I mentioned this on Slack: these differences arise because snakemake doesn't recognize Docker changes, so when I switched to an in-development Docker image for #206 and then switched back, it didn't realize things needed to be re-run.

jaclyn-taroni · 2020-09-18T15:30:03Z

This question is somewhat related to thinking about status checks - when would these ever be run on Mac OS Mojave if people were following the contributing guidelines? The in-development image will always be Ubuntu 20.04. And is this an area where some kind of automation would make our lives easier?

cansavvy · 2020-09-18T15:48:13Z

This question is somewhat related to thinking about status checks - when would these ever be run on Mac OS Mojave if people were following the contributing guidelines? The in-development image will always be Ubuntu 20.04. And is this an area where some kind of automation would make our lives easier?

I've never run it on a non-Ubuntu Docker image? But I see that is in there?

cansavvy · 2020-09-18T15:50:30Z

The only one I'm seeing Mojave on is the most recent annotation microarray PR #212. @cbethell did you forget to run snakemake on the Docker image for your last render? I missed that in my review if so.

cansavvy · 2020-09-18T15:58:41Z

And is this an area where some kind automation would make our lives easier?

We can definitely look into this if it would be helpful. The PR checklist is admittedly long, so if this would help reduce author burden, that seems good. I'm unsure how heavy a lift it is to get this going?

jaclyn-taroni · 2020-09-18T16:00:14Z

Ah, I had assumed it would be all of the ones that went in on the last PR but it makes sense given what you're saying about Snakemake. (All good info for thinking about automation or not.)

cansavvy · 2020-09-18T16:04:28Z

In regards to the RNA-seq content, it's ready for another look, @jaclyn-taroni. I added links in modules with search and replace and tested them.

cbethell · 2020-09-18T16:11:09Z

The only one I'm seeing Mojave on is the most recent annotation microarray PR #212. @cbethell did you forget to run snakemake on the Docker image for your last render? I missed that in my review if so.

Ah, it is quite possible that the last one did not get rendered as it should have been. Although I have been running it on the Docker image thus far, I have recently been moving back and forth between the Docker for OpenPBTA-analysis and Docker for refinebio-examples, so again your theory is quite possible! I'll be sure to look out for this moving forward (and will also try @cansavvy's tip of using different ports for the different repos)!

jaclyn-taroni

LGTM! I found one instance of a word being repeated inside and outside of a link and that was my only remaining suggestion.

03-rnaseq/00-intro-to-rnaseq.Rmd

cansavvy added 7 commits September 10, 2020 12:08

Get it started

bf8d000

I put words down

c278547

moar words and links

3d1ddef

More words and citations

8e31eed

RNA-seq header section a bit more polished

29b73ac

add figure in

a0663af

Few tiny edits

2fa1990

cansavvy changed the title ~~WIP: RNA-Seq Header Section~~ RNA-Seq Header Section Sep 15, 2020

cansavvy marked this pull request as ready for review September 15, 2020 19:16

cansavvy requested a review from cbethell September 15, 2020 19:16

cbethell reviewed Sep 16, 2020

cansavvy added 2 commits September 16, 2020 13:58

Incorporate @cbethell review

19fb0ac

Fix one little wording change

e86cb52

cansavvy requested a review from cbethell September 16, 2020 20:47

cbethell approved these changes Sep 16, 2020

jaclyn-taroni reviewed Sep 17, 2020

Put a TODO for that one link

a63f03e

jaclyn-taroni reviewed Sep 17, 2020

cansavvy added 3 commits September 17, 2020 10:23

Incorporate most of the comments in Jackie's review

7e3950a

Re-render

493f749

Re-render after fixing references.bib

c3e058e

cansavvy mentioned this pull request Sep 17, 2020

RNA-seq modules: Be more clear about normalization vs transformation when using DESeq2 functions #220

Closed

cansavvy added 7 commits September 17, 2020 10:51

More wording changes

5ed3afd

Doctoc and re-render

c93c3c9

rearrange wording about normalization

08eae92

Re-render

f024ffd

A few more minor edits

4778a09

Just a few more wording edits and sentence rearrangments

bf9a12e

Merge branch 'master' into cansavvy/rna-seq-header

be1161b

cansavvy requested a review from jaclyn-taroni September 17, 2020 19:45

jaclyn-taroni reviewed Sep 17, 2020

Merge origin/master into cansavvy/rna-seq-header

8904ffc

jaclyn-taroni added 2 commits September 17, 2020 21:11

Alphabetical order after resolving conflicts

779779d

Merge branch 'master' into cansavvy/rna-seq-header

ddfcef8

Few smaller changes and rerender everything

5a0fb1c

Add links to the RNA-seq header section

56e77e1

Merge branch 'master' into cansavvy/rna-seq-header

ab0c39a

cansavvy requested a review from jaclyn-taroni September 18, 2020 16:03

jaclyn-taroni approved these changes Sep 18, 2020

cansavvy added 2 commits September 18, 2020 14:11

Get rid of the one typo Jackie found

f176ef0

Merge remote-tracking branch 'origin/cansavvy/rna-seq-header' into ca…

bea4e3f

…nsavvy/rna-seq-header

cansavvy merged commit 8a2c52c into master Sep 18, 2020

cansavvy deleted the cansavvy/rna-seq-header branch September 18, 2020 18:13
