Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adapt CollectRrbsMetrics to SinglePassSamProgram #940

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

tcibinan
Copy link
Contributor

@tcibinan tcibinan commented Sep 28, 2017

Description

CollectRrbsMetrics extends from CommandLineProgram. In order to optimize code and make it more readable SinglePassSamProgram was added.

Implementation

  • CRM extends from SPSP and can be used by CollectMultipleMetrics;
  • Add tests for running CRM by CollectMultipleMetrics;
  • Add more tests for CRM;
  • Argument OUTPUT in SPSP was made optional (it could be a point for a discussion) in case where CRM runs with argument METRICS_FILE_PREFIX instead of argument OUTPUT and 2 other related arguments CHART_OUTPUT and SUMMARY_OUTPUT:
java -jar picard.jar CollectRrbsMetrics
R=reference_sequence.fasta
I=input.bam
M=basename_for_metrics_output_files

Checklist (never delete this)

Never delete this, it is our record that procedure was followed. If you find that for whatever reason one of the checklist points doesn't apply to your PR, you can leave it unchecked but please add an explanation below.

Content

  • Added or modified tests to cover changes and any new functionality
  • Edited the README / documentation (if applicable)
  • All tests passing on Travis

Review

  • Final thumbs-up from reviewer
  • Rebase, squash and reword as applicable

@coveralls
Copy link

coveralls commented Sep 28, 2017

Coverage Status

Coverage increased (+0.01%) to 76.783% when pulling 4e4c0c5 on EpamLifeSciencesTeam:epam-ls_CollectRrbsMetricsToSinglePassSamProgram into d4b04a9 on broadinstitute:master.

Copy link
Contributor

@yfarjoun yfarjoun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this @tcibinan. a few small comments and I think this is good!

@Override
public SinglePassSamProgram makeInstance(final String outbase, final String outext, final File input, final File reference, final Set<MetricAccumulationLevel> metricAccumulationLevel, final File dbSnp, final File intervals) {
final CollectRrbsMetrics program = new CollectRrbsMetrics();
program.OUTPUT = new File(outbase + ".rrbc.detail_metrics" + outext);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rrbc->rrbs (x3)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

"<pre>" +
"java -jar picard.jar CollectRrbsMetrics \\<br />" +
" I=input.bam \\<br />" +
" O=detail_output.bam \\<br />" +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

O=detail_output.txt, not .bam

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

" I=input.bam \\<br />" +
" O=detail_output.bam \\<br />" +
" CHART=chart_output.pdf \\<br />" +
" S=summary_output.bam \\<br />" +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.txt not .bam

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -56,7 +56,7 @@
@Argument(shortName = StandardOptionDefinitions.INPUT_SHORT_NAME, doc = "Input SAM or BAM file.")
public File INPUT;

@Argument(shortName = "O", doc = "File to write the output to.")
@Argument(shortName = "O", doc = "File to write the output to.", optional = true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this optional?

Copy link
Contributor Author

@tcibinan tcibinan Sep 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review @yfarjoun.

CollectRrbsMetrics didn't have OUTPUT argument (only METRICS_FILE_PREFIX). But now it extends from SinglePassSamProgram and you can't run the metrics without mandatory argument OUTPUT.

So OUTPUT was made optional in order to be able to run CollectRrbsMetrics without it.

Also we can use that excessive OUTPUT argument with two more to specify output files explicitly. It can be useful and it is a way to use this unnecessary argument.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking about this more, this isn't "OK". this changes the behavior of all the other programs that extend from SinglePassSamProgram.

can you find a solution that doens't change the behavior/API for the other programs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I owuld almost prefer to simply rename METRICS_FILE_PREFIX -> "OUTPUT" and to do away with the specific metric/chart/summary output variables....the only problem is that it would create an API change...

@cnorman, perhaps you can suggest solutions for this...given your knowledge of the parsing system.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest that you implement it like CollectGcBiasMetrics is implemented, i.e. OUTPUT is for the detailed_metrics, SUMMARY_OUTPUT is for summary_metrics and CHART_OUTPUT for the pdf. You can use METRICS_FILE_PREFIX as Mutex, or remove it and change the API for this program (but not for all of the Single-Pass-Sam programs)

@yfarjoun
Copy link
Contributor

why the change to specifying the outputs explicitly? what's the benefit there?

@tcibinan
Copy link
Contributor Author

Please check the comment above.

@@ -1,102 +0,0 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this test removed? we should still be testing CollectRrbsMetrics directly, not only via collect multiple metrics.

@yfarjoun yfarjoun self-assigned this Jun 6, 2018
@yfarjoun
Copy link
Contributor

yfarjoun commented Sep 5, 2018

please respond to the review here if this is still interesting.

@kbergin
Copy link
Contributor

kbergin commented Apr 16, 2019

Hi @tcibinan - Would you be available to complete this PR?

@tcibinan
Copy link
Contributor Author

Hi! It's been a long time since I created the pull request. I'll try to find time to update the code to the latest changes and add requested changes but I can't be certain when. So if there is somebody who have enough courage to finish the request I'll be very pleased. Otherwise, I'll complete it one day.

@yfarjoun
Copy link
Contributor

@tcibinan I'm wondering if this PR is needed now with the changes I made to CMM...perhaps you could re-tool CollectRrbsMetrics to only use the regular paramters from CMM and take additional parameter via EXTRA_ARGUMENTS?

@kbergin
Copy link
Contributor

kbergin commented Nov 3, 2020

Hey @tcibinan ! Going through and looking at open PRs and issues in Picard and wanted to check in if you had seen Yossi's comment, or had any thoughts about the future of this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants