Add microarray gene ID conversion example #212

cbethell · 2020-09-10T00:52:42Z

Purpose:

What issue(s) does your PR address?

This PR closes #113 (the issue on creating a microarray gene ID conversion example notebook).

Strategy

What was your strategy for this new or edited analysis?

I followed the comments on issue #113 and found a new (larger than n = 4 samples) mouse microarray dataset.

The dataset used in this notebook is a mouse glioma cancer stem cell dataset ("Cancer Stem Cells Are Enriched In The Side-Population Cells In A Mouse Model Of Glioma") with n = 15 samples, which has been uploaded to the S3 bucket for testing/review.

I then copied the analysis steps from the RNA-seq gene ID conversion notebook, but swapped the annotation package that is loaded for the one relevant to the mouse genome.
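
As a minimal sketch of that swap, assuming the mouse annotation package is `org.Mm.eg.db` (this PR text does not name it) and that it is installed from Bioconductor, the following checks which identifier types the package can translate between before any mapping is attempted:

```r
# Assumption: org.Mm.eg.db is the mouse annotation package; it is not named in this PR
library(org.Mm.eg.db)

# Identifier types that can be supplied as `keys`
available_keytypes <- AnnotationDbi::keytypes(org.Mm.eg.db)

# Identifier types that can be requested as the output `column`
available_columns <- AnnotationDbi::columns(org.Mm.eg.db)

# Confirm Ensembl gene IDs can be used as input before mapping
"ENSEMBL" %in% available_keytypes
```

Checking `keytypes()` and `columns()` first makes it obvious when an organism package does not support the ID type a notebook was originally written for.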

Concerns/Questions for reviewers:

What things should reviewers look out for?

Is the current workflow missing any vital steps?
Do the steps under the Explore mapped data frame section seem helpful? Are the added steps clear? Is there anything that should be added/removed from this section?

Analysis Pull Request Check List (roughly in order):

Content checks

All {{BLANKS}} have been replaced with the correct content.
Sources are cited
Seed is set (if applicable)

Formatting Checks

Removed any manual numbering of sections.
Removed any instances of chunk naming.
Spell checked any Rmd file or md file.
Comments and documentation are up to date.

Add datasets to S3

Added data and metadata files to S3.

Docker/Snakemake rendering components

Added the .html link to the navigation bar.
Any not yet added packages needed for this analysis have been added to the Dockerfile and it successfully builds.
In the Docker container, snakemake was run for rendering.

cansavvy

This is a good start! This doesn't yet discuss the `multiVals` argument that I mentioned here: #212 (comment), so the last check you did doesn't exactly reflect what's going on with the mappings.

I think the meaning of the `multiVals` argument and how it influences the output is worth diving into. If you are confused by this argument, our users will definitely be confused by it too. (I don't like that it has a default; I think that makes it easy for people to miss.) We should aim to make the `multiVals` argument, and what it means, as clear as possible for our users.

So I think next you may want to take another, closer look at the `mapIds()` docs. We can always schedule a Zoom/Google Hangout/Slack/Teams chat to discuss if that would be helpful.
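
To make the effect of the argument concrete, here is a small base R emulation of what the documented `multiVals` options do when one input ID has more than one match. It does not call `mapIds()` itself; the IDs and values in `toy_mappings` are invented for illustration only.

```r
# Hypothetical mappings: ENSMUSG_B stands in for a gene with two matches
toy_mappings <- list(
  ENSMUSG_A = "101",
  ENSMUSG_B = c("202", "203"),
  ENSMUSG_C = "304"
)

# Number of matches per input ID
n_matches <- lengths(toy_mappings)

# Like multiVals = "list": keep everything, one list element per input ID
list_like <- toy_mappings

# Like multiVals = "first" (the default): keep only the first match per ID
first_like <- vapply(toy_mappings, function(x) x[1], character(1))

# Like multiVals = "filter": drop IDs that have more than one match
filter_like <- unlist(toy_mappings[n_matches == 1])

# Like multiVals = "asNA": keep every ID, but return NA for multi-matches
asna_like <- ifelse(n_matches > 1, NA_character_, first_like)
```

In this toy case `first_like` reports only "202" for `ENSMUSG_B` with no warning that "203" was dropped, which is how the default can hide multi-mappings from a user who never sets the argument.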

02-microarray/gene-id-convert_microarray_01_ensembl.Rmd

cbethell · 2020-09-14T19:03:25Z

Per a discussion with @cansavvy re the above comment, the plan moving forward is as follows:

- Run `mapIds()` with the `multiVals` argument set to `"list"`.
- Use `reshape2::melt()` to turn that large list into a more manageable data frame.
- Explore the melted data frame using some of the steps already used in this PR (to look at multi-mappings, etc.).
- Rerun `mapIds()`, this time with `multiVals` set to `"filter"`, which removes all elements with multiple matches.
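
The plan above can be sketched roughly as follows. This is an illustration under assumptions, not the notebook's code: it assumes `org.Mm.eg.db` and `reshape2` are installed, that Ensembl IDs are being mapped to Entrez IDs, and it uses three stand-in Ensembl mouse gene IDs in place of the dataset's gene column.

```r
# Assumption: org.Mm.eg.db is the mouse annotation package used
library(org.Mm.eg.db)

# Stand-in Ensembl mouse gene IDs (the notebook would use the dataset's gene column)
ensembl_ids <- c("ENSMUSG00000000001", "ENSMUSG00000000028", "ENSMUSG00000000031")

# Step 1: return every match for every key as a named list
mapped_list <- AnnotationDbi::mapIds(
  org.Mm.eg.db,
  keys = ensembl_ids,
  keytype = "ENSEMBL",
  column = "ENTREZID", # assumed target ID type
  multiVals = "list"
)

# Step 2: melt the list; reshape2 names the columns `value` and `L1`
melted <- reshape2::melt(mapped_list)
mapped_df <- data.frame(
  Ensembl = melted$L1,
  Entrez = melted$value,
  stringsAsFactors = FALSE
)

# Step 3: count rows per Ensembl ID to find IDs with multiple mappings
mapping_counts <- table(mapped_df$Ensembl)
multi_mapped_ids <- names(mapping_counts[mapping_counts > 1])

# Step 4: rerun, dropping any key that matched more than one value
filtered_vector <- AnnotationDbi::mapIds(
  org.Mm.eg.db,
  keys = ensembl_ids,
  keytype = "ENSEMBL",
  column = "ENTREZID",
  multiVals = "filter"
)
```

Running the `"list"` version first means the multi-mapped IDs are inspected explicitly before `"filter"` removes them.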

cbethell · 2020-09-15T15:55:59Z

@cansavvy, I believe that I implemented the plan that we discussed and outlined in the last comment of this PR.

I also implemented the suggestions you provided through your most recent review of this PR (the main ones being changing the file name, keeping the use of IDs consistent throughout the notebook, and some sentence restructuring).

Please let me know if I misinterpreted any of your suggestions or missed any!

cansavvy

This looks great! I just have a few comments about a couple of things to rearrange. I think this is close!

02-microarray/gene-id-annotation_microarray_01_ensembl.Rmd

cansavvy

Two tiny-ish comments! LGTM after that!

02-microarray/gene-id-annotation_microarray_01_ensembl.Rmd

cansavvy

Just a couple of little comments. It looks like some template changes might need to be made too, including some that got lost: #216 (comment)
But I will probably file a separate PR to change these everywhere, since this will affect every module and I haven't yet figured out the full extent of what needs to be changed.

02-microarray/gene-id-annotation_microarray_01_ensembl.Rmd

jaclyn-taroni

LGTM

jaclyn-taroni · 2020-09-18T00:41:55Z

I'm playing with the settings that require branches to be up to date before merging. I am going to merge this despite the lack of status check (that I do not currently understand).

add microarray gene id conversion notebook

7bdd012

- make appropriate updates to `references.bib`, `Snakefile`, and `_navbar.html` file

cbethell mentioned this pull request Sep 10, 2020

Create microarray gene ID conversion example #113

Closed

cansavvy reviewed Sep 10, 2020

02-microarray/gene-id-convert_microarray_01_ensembl.Rmd (outdated, resolved)

02-microarray/gene-id-convert_microarray_01_ensembl.Rmd (outdated, resolved)

cbethell added 4 commits September 10, 2020 15:42

add section to explore the mapped df and add some context

5a1f27b

add context around multiVals

d39c193

- add references to `references.bib` - update comments and documentation - rerun Snakefile

add annotation package to dockerfile

2b3e324

add new reference

58bd38c

- rerun Snakefile

cbethell changed the title from "WIP: Add microarray gene ID conversion example" to "Add microarray gene ID conversion example" Sep 11, 2020

cbethell added 2 commits September 11, 2020 11:40

fix a sentence

62095e0

add more context around multiVals

a4a8b38

cbethell marked this pull request as ready for review September 11, 2020 15:45

cbethell requested a review from cansavvy September 11, 2020 15:51

add a bit more context around the gene ids used

1c66b52

- fix formatting

cansavvy reviewed Sep 14, 2020

cbethell added 3 commits September 15, 2020 11:40

use list and filter in mapIds() function execution

a0c076b

- rename the file and propagate change in `Snakefile` and `navbar.html` - update references - rearrange notebook and rerun Snakefile

fix some formatting

15dbff1

fix spacing

a40cbae

cbethell requested a review from cansavvy September 15, 2020 15:56

add a sanity check after obtaining new filtered df

f4b3d68

cbethell mentioned this pull request Sep 15, 2020

Update RNA-seq gene ID conversion example #218

Merged

11 tasks

cansavvy reviewed Sep 16, 2020

cbethell added 2 commits September 16, 2020 14:24

Merge branch 'master' of https://github.com/AlexsLemonade/refinebio-e…

251cf96

…xamples into cbethell/add-id-conversion-microarray

incorporate @cansavvy's review suggestions

b739911

cbethell requested a review from cansavvy September 16, 2020 19:35

cansavvy approved these changes Sep 16, 2020

02-microarray/gene-id-annotation_microarray_01_ensembl.Rmd (outdated, resolved)

02-microarray/gene-id-annotation_microarray_01_ensembl.Rmd (resolved)

02-microarray/gene-id-annotation_microarray_01_ensembl.Rmd (resolved)

fix typo and add step to join rest of data to final df

5e52faf

cbethell requested a review from cansavvy September 17, 2020 00:20

fix link in reference file

c843d29

cansavvy approved these changes Sep 17, 2020

02-microarray/gene-id-annotation_microarray_01_ensembl.Rmd (outdated, resolved)

02-microarray/gene-id-annotation_microarray_01_ensembl.Rmd (outdated, resolved)

cansavvy reviewed Sep 17, 2020

02-microarray/gene-id-annotation_microarray_01_ensembl.Rmd (outdated, resolved)

cbethell and others added 3 commits September 17, 2020 08:59

Apply @cansavvy's suggestions from code review

0fadca7

Co-authored-by: Candace Savonen <cansav09@gmail.com>

add reference for AnnotationDbi and rerun

62d29cb

fix order in references.bib

1613e8f

jaclyn-taroni approved these changes Sep 17, 2020

jaclyn-taroni merged commit d5f6c7a into master Sep 18, 2020

jaclyn-taroni mentioned this pull request Sep 18, 2020

RNA-Seq Header Section #216

Merged

11 tasks

cbethell deleted the cbethell/add-id-conversion-microarray branch September 18, 2020 01:11

cansavvy mentioned this pull request Sep 22, 2020

"Modernize" the reshape2::melt step in the microarray gene id conversion notebook #224

Closed
