Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add microarray gene ID conversion example #212

Merged
merged 19 commits into from
Sep 18, 2020

Conversation

cbethell
Copy link
Contributor

@cbethell cbethell commented Sep 10, 2020

Purpose:

What issue(s) does your PR address?

This PR closes #113 (the issue on creating a microarray gene ID conversion example notebook).

Strategy

What was your strategy for this new or edited analysis?

I followed the comments on issue #113 and found a new (larger than n = 4 samples) mouse microarray dataset.

The dataset used in this paper is a [mouse glioma cancer stem cell dataset](Cancer Stem Cells Are Enriched In The Side-Population Cells In A Mouse Model Of Glioma) with n = 15 samples (which has been uploaded to the S3 bucket for testing/review).

I then copied the analysis steps from the RNA-seq gene ID conversion notebook, but changed the annotation package being loaded in to that relevant to the mouse genome.

Concerns/Questions for reviewers:

What things should reviewers look out for?

  • Is the current workflow missing any vital steps?
  • Do the steps under the Explore mapped data frame section seem helpful? Are the added steps clear? Is there anything that should be added/removed from this section?

Analysis Pull Request Check List (roughly in order):

Content checks

  • All {{BLANKS}} have been replaced with the correct content.
  • Sources are cited
  • Seed is set (if applicable)

Formatting Checks

  • Removed any manual numbering of sections.
  • Removed any instances of chunk naming.
  • Spell checked any Rmd file or md file.
  • Comments and documentation are up to date.

Add datasets to S3

Docker/Snakemake rendering components

- make appropriate updates to `references.bib`, `Snakefile`, and `_navbar.html` file
- add references to `references.bib`
- update comments and documentation
- rerun Snakefile
- rerun Snakefile
@cbethell cbethell changed the title WIP: Add microarray gene ID conversion example Add microarray gene ID conversion example Sep 11, 2020
@cbethell cbethell marked this pull request as ready for review September 11, 2020 15:45
@cbethell cbethell requested a review from cansavvy September 11, 2020 15:51
Copy link
Contributor

@cansavvy cansavvy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good start! This doesn't yet discuss the multivals argument that I mentioned here: #212 (comment) so the last check in you did isn't exactly reflective of what's going on with the mappings.

I think the meaning of the multivals argument and how it influences the output is a worthwhile thing for us to dive into. Because if you are confused by this argument, our users will definitely be confused by this too (I don't like that they have a default for this, I think it makes it easy for people to miss it). We should aim for making the multivals argument and what it means, as clear as possible for our users.

So I think next you may want to take another closer look at the mapIds docs and we can always schedule a zoom/Google hangout/Slack/Teams chat to discuss if that would be helpful.

02-microarray/gene-id-convert_microarray_01_ensembl.Rmd Outdated Show resolved Hide resolved
02-microarray/gene-id-convert_microarray_01_ensembl.Rmd Outdated Show resolved Hide resolved
02-microarray/gene-id-convert_microarray_01_ensembl.Rmd Outdated Show resolved Hide resolved
02-microarray/gene-id-convert_microarray_01_ensembl.Rmd Outdated Show resolved Hide resolved
02-microarray/gene-id-convert_microarray_01_ensembl.Rmd Outdated Show resolved Hide resolved
02-microarray/gene-id-convert_microarray_01_ensembl.Rmd Outdated Show resolved Hide resolved
02-microarray/gene-id-convert_microarray_01_ensembl.Rmd Outdated Show resolved Hide resolved
02-microarray/gene-id-convert_microarray_01_ensembl.Rmd Outdated Show resolved Hide resolved
02-microarray/gene-id-convert_microarray_01_ensembl.Rmd Outdated Show resolved Hide resolved
02-microarray/gene-id-convert_microarray_01_ensembl.Rmd Outdated Show resolved Hide resolved
@cbethell
Copy link
Contributor Author

cbethell commented Sep 14, 2020

This is a good start! This doesn't yet discuss the multivals argument that I mentioned here: #212 (comment) so the last check in you did isn't exactly reflective of what's going on with the mappings.

I think the meaning of the multivals argument and how it influences the output is a worthwhile thing for us to dive into. Because if you are confused by this argument, our users will definitely be confused by this too (I don't like that they have a default for this, I think it makes it easy for people to miss it). We should aim for making the multivals argument and what it means, as clear as possible for our users.

So I think next you may want to take another closer look at the mapIds docs and we can always schedule a zoom/Google hangout/Slack/Teams chat to discuss if that would be helpful.

Per a discussion with @cansavvy re the above comment, the plan moving forward is as follows:

  • Run mapIds() and supply the multiVals argument with the "list" option
  • Use reshape2::melt() to get that large list into a more manageable data frame
  • Explore the melted data frame using some of the steps already used in this PR (to look at multi mappings, etc.)
  • Then rerun the mapIds() function, this time supplying the multiVals argument with the"filter" option which will remove all elements with multiple matches

- rename the file and propagate change in `Snakefile` and `navbar.html`
- update references
- rearrange notebook and rerun Snakefile
@cbethell
Copy link
Contributor Author

@cansavvy, I believe that I implemented the plan that we discussed and outlined in the last comment of this PR.

I also implemented the suggestions you provided through your most recent review of this PR (the main ones being changing the file name, keeping the use of IDs consistent throughout the notebook, and some sentence restructuring).

Please let me know if I misinterpreted any of your suggestions or missed any!

@cbethell cbethell requested a review from cansavvy September 15, 2020 15:56
Copy link
Contributor

@cansavvy cansavvy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Just have a few comments about a couple things to rearrange, I think this is close!

02-microarray/gene-id-annotation_microarray_01_ensembl.Rmd Outdated Show resolved Hide resolved
02-microarray/gene-id-annotation_microarray_01_ensembl.Rmd Outdated Show resolved Hide resolved
02-microarray/gene-id-annotation_microarray_01_ensembl.Rmd Outdated Show resolved Hide resolved
02-microarray/gene-id-annotation_microarray_01_ensembl.Rmd Outdated Show resolved Hide resolved
02-microarray/gene-id-annotation_microarray_01_ensembl.Rmd Outdated Show resolved Hide resolved
02-microarray/gene-id-annotation_microarray_01_ensembl.Rmd Outdated Show resolved Hide resolved
02-microarray/gene-id-annotation_microarray_01_ensembl.Rmd Outdated Show resolved Hide resolved
02-microarray/gene-id-annotation_microarray_01_ensembl.Rmd Outdated Show resolved Hide resolved
@cbethell cbethell requested a review from cansavvy September 16, 2020 19:35
Copy link
Contributor

@cansavvy cansavvy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two tiny-ish comments! LGTM after that!

@cbethell cbethell requested a review from cansavvy September 17, 2020 00:20
Copy link
Contributor

@cansavvy cansavvy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple little comments. Looks like there's some template changes that might need to be done too; some that got lost: #216 (comment)
But I will probably file a separate PR to change these everywhere since this will affect every module and I haven't yet figured out the full extent of what needs to be changed.

Copy link
Member

@jaclyn-taroni jaclyn-taroni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jaclyn-taroni
Copy link
Member

I'm playing with the settings that require branches to be up to date before merging. I am going to merge this despite the lack of status check (that I do not currently understand).

@jaclyn-taroni jaclyn-taroni merged commit d5f6c7a into master Sep 18, 2020
@jaclyn-taroni jaclyn-taroni mentioned this pull request Sep 18, 2020
11 tasks
@cbethell cbethell deleted the cbethell/add-id-conversion-microarray branch September 18, 2020 01:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create microarray gene ID conversion example
3 participants