-
Notifications
You must be signed in to change notification settings - Fork 244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support reference bundles. #1713
base: master
Are you sure you want to change the base?
Conversation
11db598
to
94875b9
Compare
94875b9
to
b0a5f1c
Compare
b0a5f1c
to
ae17645
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cmnbroad This looks good to me. I have a few style questions and typos, feel free to do what you want with those though.
@@ -110,4 +117,47 @@ public HaploidReferenceDecoder getHaploidReferenceDecoder( | |||
return (HaploidReferenceDecoder) resolveForDecoding(inputBundle).getDecoder(inputBundle, HaploidReferenceDecoderOptions); | |||
} | |||
|
|||
/** | |||
* Create q reference bundle given only a fasta path, including an index and a dictionary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: q
@@ -110,4 +117,47 @@ public HaploidReferenceDecoder getHaploidReferenceDecoder( | |||
return (HaploidReferenceDecoder) resolveForDecoding(inputBundle).getDecoder(inputBundle, HaploidReferenceDecoderOptions); | |||
} | |||
|
|||
/** | |||
* Create q reference bundle given only a fasta path, including an index and a dictionary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Create q reference bundle given only a fasta path, including an index and a dictionary | |
* Create a reference bundle given only a fasta path, including an index and a dictionary |
@@ -64,6 +65,21 @@ public FastaSequenceFile(final Path path, final boolean truncateNamesAtWhitespac | |||
this.in = new FastLineReader(IOUtil.openFileForReading(path)); | |||
} | |||
|
|||
/** | |||
* Constructs a FastaSequenceFile that reads from the specified fasta and dictionary file. Makes no | |||
* assumptions that the fata and dict file are in the same directory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* assumptions that the fata and dict file are in the same directory. | |
* assumptions that the fasta and dict file are in the same directory. |
* Open the given indexed fasta sequence file. Throw an exception if the file cannot be opened. | ||
* | ||
* @param path The file to open. | ||
* @param dictPath the dictionar path (may be null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* @param dictPath the dictionar path (may be null) | |
* @param dictPath the dictionary path (may be null) |
if (IOUtil.isBlockCompressed(path.toPath(), true)) { | ||
throw new SAMException("Indexed block-compressed FASTA file cannot be handled: " + path); | ||
} | ||
this.channel = Files.newByteChannel(path.toPath()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should add getChannel methods to IOPath. We have getStream ones.
if (!truncateNamesAtWhitespace) { | ||
throw new RuntimeException("preferIndexed option requires truncateNamesAtWhitespace"); | ||
} | ||
final Optional<BundleResource> indexPathResource = referenceBundle.get(BundleResourceType.CT_REFERENCE_INDEX); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These chains of optional if-present-get are kind of gross. What do you think of flatmap chains instead? I think they're a little more readable, at least once you know what you're looking at.
IOPath indexPath = referenceBundle.get(BundleResourceType.CT_REFERENCE_INDEX)
.flatMap(BundleResource::getIOPath)
.orElse(null);
if (preferIndexed && indexPath != null !Files.exists(indexPath.toPath())) {
throw new RuntimeException(String.format("FASTA index file %s does not exist", indexPath));
}
final IOPath remoteIndex = new HtsPath( | ||
Files.copy( | ||
indexFile.toPath(), | ||
nioFastaDir.resolve(indexFile.getBaseName().get() + indexFile.getExtension().get())).toUri().toString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you meant indexDir here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These copy blocks seem like they could be extracted into a function, it might help avoid this copy paste sorts of errors.
The main feature implemented here is that a bundle can be used in place of a reference, without executing any code paths that assume that the individual resources are located in the same parent directory (there are a LOT of such code paths, since most of the existing code base makes these assumptions). There are a number of tests added that exist purely to catch any regressions where such assumptions are reintroduced, at lease in the code paths where bundles are enabled.