Releases: dpryan79/bison
0.4.0
- Allow lowercase reads in fastq files (previously, this would result in corrupt BAM files).
- HTSlib is now a submodule in the GitHub repository, which simplifies compilation. This is now the only supported compilation method (samtools-0.1.19 is no longer supported).
- Somehow, the methylation extractor was still defaulting to a minimum phred score of 10, when the documentation said it was defaulting to 5.
- CRAM files can now be produced and processed. Both bison and bison_herd will output in CRAM format if the -C option is given.
- The header @PG line is now rewritten to contain the actual command executed and the bison/herd version.
- Excess space allocated to hold the genome is now returned.
- Output BAM/CRAM files can now be sorted on the fly. The method for this is similar to that used by samtools, where temporary files are written and then merged. This merge step is performed in parallel if multiple output files are being written by bison_herd.
- Fixed a bug in bison_CpG_coverage, where previously only the first chromosome was used.
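The on-the-fly sorting described in this release follows the general pattern of an external merge sort: sorted chunks are spilled to temporary files and then merged. The following is a minimal Python sketch of that pattern only, not Bison's actual C implementation. The record layout (chromosome index, position, read name), the `chunk_size` parameter, and the use of pickled temporary files are all assumptions made for illustration.

```python
import heapq
import pickle
import tempfile


def _read_chunk(handle):
    """Yield records back from one temporary file until it is exhausted."""
    handle.seek(0)
    while True:
        try:
            yield pickle.load(handle)
        except EOFError:
            return


def external_sort(records, chunk_size=2):
    """Sort records with bounded memory: spill sorted chunks, then merge.

    `records` is any iterable of comparable tuples; `chunk_size` is the
    (hypothetical) number of records held in memory at once.
    """
    temp_files = []
    chunk = []

    def spill():
        # Write the current in-memory chunk to a temporary file, sorted.
        handle = tempfile.TemporaryFile()
        for rec in sorted(chunk):
            pickle.dump(rec, handle)
        temp_files.append(handle)
        chunk.clear()

    for rec in records:
        chunk.append(rec)
        if len(chunk) >= chunk_size:
            spill()
    if chunk:
        spill()

    try:
        # heapq.merge performs a lazy k-way merge of the sorted chunks.
        return list(heapq.merge(*(_read_chunk(h) for h in temp_files)))
    finally:
        for handle in temp_files:
            handle.close()


if __name__ == "__main__":
    # Hypothetical records: (chromosome index, position, read name).
    reads = [(1, 500, "r3"), (0, 100, "r1"), (1, 20, "r2"), (0, 100, "r0")]
    for rec in external_sort(reads):
        print(rec)
```

When several output files are written, each file's merge is independent of the others, which is what makes it possible to run the merge step in parallel.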
0.3.3
v0.3.2b
v0.3.2
- Added bedGraph2MOABS to convert bedGraph files for use by MOABS. See usage above.
- Added support for HTSlib.
- Fixed a small bug wherein --reorder wasn't being invoked when multiple output BAM files were to be used.
- Fixed a small bug that only manifested in DEBUG mode.
- There is now a tutorial.
- The default minimum MAPQ and Phred scores used by bison_mbias have been updated to match bison_methylation_extractor.
Version 0.3.1
0.3.1
- The various bedGraph files didn't previously have a "track" line. The UCSC
Genome Browser requires this, so bedGraph files produced will now contain
it. Note that this is the minimal line required. Bison does not provide
facilities for changing it; users need to edit it manually or use external
programs. Any changes to the "track" or other header lines should be made
only after all processing with Bison is complete.
- Added conversion scripts for import into MethylSeekR, BiSeq, and BEAT.
- Revamped how bison_markduplicates works. The 3' coordinates are now
ignored, soft-clipped bases on the 5' end are now incorporated in
determining the 5' coordinate, and methylation calls are also used in
determining if reads/pairs are duplicates. This should be a much more
robust (though more resource intensive) method than that previously used.
Whereas the previous version left unmarked the read/pair with the highest
MAPQ, this one does so for the read/pair with the highest summed Phred
score (a la Picard).
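The duplicate-marking rules described for bison_markduplicates can be sketched roughly as follows. This is an illustrative Python sketch for single-end reads only, not the tool's actual code. The `Read` class and its fields are hypothetical, and comparing methylation calls by exact string equality is an assumption, since the notes do not say how the calls are compared.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Read:
    # All fields are hypothetical and exist only for this sketch.
    name: str
    chrom: str
    strand: str        # "+" or "-"
    start: int         # leftmost aligned position
    end: int           # rightmost aligned position
    clip_5p: int       # soft-clipped bases at the read's 5' end
    meth_calls: str    # per-read methylation call string
    quals: List[int]   # per-base Phred scores
    duplicate: bool = False


def five_prime(read: Read) -> int:
    """Return the 5' coordinate as if the soft-clipped bases had aligned."""
    if read.strand == "+":
        return read.start - read.clip_5p
    return read.end + read.clip_5p


def mark_duplicates(reads: List[Read]) -> None:
    """Flag all but the highest summed-Phred read in each duplicate group."""
    groups: Dict[Tuple[str, str, int, str], List[Read]] = {}
    for read in reads:
        # The 3' coordinate is deliberately not part of the key.
        key = (read.chrom, read.strand, five_prime(read), read.meth_calls)
        groups.setdefault(key, []).append(read)
    for group in groups.values():
        best = max(group, key=lambda r: sum(r.quals))
        for read in group:
            read.duplicate = read is not best


if __name__ == "__main__":
    reads = [
        Read("r1", "chr1", "+", 100, 150, 0, "Z.z", [30, 30]),
        Read("r2", "chr1", "+", 102, 140, 2, "Z.z", [35, 35]),
        Read("r3", "chr1", "+", 100, 150, 0, "z.z", [20, 20]),
    ]
    mark_duplicates(reads)
    for read in reads:
        print(read.name, read.duplicate)
```

In this example r1 and r2 share a 5' coordinate once r2's soft-clipping is taken into account, so the one with the lower summed Phred score (r1) is flagged. r3 has different methylation calls and is therefore kept as a separate molecule.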
Version 0.3.0
0.3.0
- Note: The indices produced by previous versions are not guaranteed to be
compatible unless you used a multi-fasta file. There was a serious
implementation problem with how bison_index worked when given multiple
files as input and how multiple files were read into memory in previous
versions. If you used a multi-fasta file, then everything will continue
to work correctly. However, if you used multiple fasta files in a list
then I strongly encourage you to delete the previous indices (just remove
the bisulfite_genome directory) and reindex. The technical reasons for this
issue are that when the bison tools previously read multiple fasta files
into memory, they would do so in whatever order they appeared in the
directory structure, which can change over time and isn't guaranteed to
match the order of files someone specified during indexing. While the
alignments wouldn't be affected by this, the methylation calls could have
been seriously compromised. In this version, bison_index will only accept a
directory, not a list of files, and it will always alphasort() the list of
files in that directory prior to processing. This should eliminate this
problem. My apologies to anyone affected by this.
- Added --genome-size option to a number of the tools. Many of the bison
programs need to read the genome into memory. By default, 3 gigabases worth
of memory are allocated for that and the size increased as needed. For
smaller genomes, this wasted space. For larger genomes, the constant
reallocation of space could seriously slow things down. Consequently, this
option was added to any tool that reads the genome into memory. It's
convenient to overestimate this slightly, so if your genome is 3.8
gigabases, then just use 4000000000 as the genome size.
- bison_merge_CpGs can now take multiple input files at once.
- A number of small bug fixes, such as when "genome_dir" doesn't end in a /.
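Two points from this release lend themselves to a short illustration: reading FASTA files from a directory in a fixed, sorted order, and picking a slightly overestimated value for --genome-size. The Python sketch below is not part of Bison. The function names and the accepted file suffixes are assumptions, and Python's `sorted()` orders names by code point, which may differ from alphasort() under some locales.

```python
import os
from typing import List

FASTA_SUFFIXES = (".fa", ".fasta")  # assumed suffixes, for illustration only


def fasta_files_in_order(genome_dir: str) -> List[str]:
    """List the FASTA files in a directory in a stable, sorted order."""
    names = sorted(n for n in os.listdir(genome_dir) if n.endswith(FASTA_SUFFIXES))
    # os.path.join copes with genome_dir given with or without a trailing "/".
    return [os.path.join(genome_dir, n) for n in names]


def count_bases(paths: List[str]) -> int:
    """Count sequence characters, skipping the '>' header lines."""
    total = 0
    for path in paths:
        with open(path) as handle:
            for line in handle:
                if not line.startswith(">"):
                    total += len(line.strip())
    return total


def suggested_genome_size(n_bases: int, step: int = 1_000_000_000) -> int:
    """Round the base count up to the next multiple of `step`."""
    return ((n_bases + step - 1) // step) * step
```

For a genome of 3.8 gigabases, `suggested_genome_size(3_800_000_000)` returns 4000000000, matching the example value given in the --genome-size note.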