forked from deweylab/RSEM
-
Notifications
You must be signed in to change notification settings - Fork 0
/
WHAT_IS_NEW
327 lines (205 loc) · 18 KB
/
WHAT_IS_NEW
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
RSEM v1.3.0
- Added Prior-Enhanced RSEM (pRSEM) as a submodule
- Introduced `--strandedness <none|forward|reverse>` option, `--strand-specific` and `--forward-prob` are deprecated (but still supported)
- Revised documentation for `rsem-plot-model`, maked it clear that in alignment statistics, isoform-level (instead of genome-level) multi-mapping reads are shown
- Significantly improved the output information of `rsem-sam-validator`: if indels/clippings/skips are detected in alignments or alignments exceed transcript boundaries, `rsem-sam-validator` will report them instead of telling you the input is valid
- Updated the warning message to ask users to make sure that they align their reads agains a set of transcripts instead of genome when RSEM finds less sequences in the BAM file than RSEM's indices
--------------------------------------------------------------------------------------------
RSEM v1.2.31
- Rewrote `rsem-gff3-to-gtf` to handle a more general set of GFF3 files
- Added safety checks to make sure poly(A) tails are not added to the reference when `--star` is set
--------------------------------------------------------------------------------------------
RSEM v1.2.30
- Fixed a bug that can cause SAMtools sort to fail
- Improved the appearance of warning messages: for the same type of warning messages, only show the first 50 messages and then provide the total number of such warnings
--------------------------------------------------------------------------------------------
RSEM v1.2.29
- Reformatted Makefile to be more professional, and `make install` is ready to use
- Enabled `./configure --without-curses` for configuring SAMtools to avoid potential compilation problems due to curses library
- Fixed bugs for installing EBSeq
- Improved the readability of RSEM documentation
--------------------------------------------------------------------------------------------
RSEM v1.2.28
- Fixed a bug in RSEM v1.2.27 that can lead to assertion errors for parsing GTF files
- Fixed a bug in Makefile
--------------------------------------------------------------------------------------------
RSEM v1.2.27
- Upgraded SAMtools to v1.3; RSEM now supports input alignments in SAM/BAM/CRAM format
- '--sam/--bam' options of 'rsem-calculate-expression' are obsoleted; use '--alignments' instead; '--sam/--bam' can still be used for compatibility with previous versions
- Some 'rsem-calculate-expression' options are renamed for better interpretability
- Documents are updated to reflect the SAMtools upgrade
- Fixed a bug for parsing GTF files
- Fixed a bug for generating transcript wiggle plots
--------------------------------------------------------------------------------------------
RSEM v1.2.26
- RSEM supports GFF3 annotation format now
- Added instructions to build RSEM references using RefSeq, Ensembl, or GENCODE annotations
- Added an option to extract only transcripts from certain resources given a GTF/GFF3 file
- Fixed a bug and improved the clarity of error messages for extracting transcripts from genome
--------------------------------------------------------------------------------------------
RSEM v1.2.25
- RSEM will extract gene_name/transcript_name from GTF file when possible; however, it only appends them to the 'sample_name.*.results' files if '--append-names' option is specified; unlike v1.2.24, this version is compatible with STAR aligner even when '--append-names' is set
--------------------------------------------------------------------------------------------
RSEM v1.2.24
- RSEM will extract gene_name/transcript_name from GTF file when possible; if extracted, gene_name/transcript_name will append at the end of gene_id/transcript_id with an underscore in between
- Modified 'rsem-plot-model' to indicate the modes of fragment length and read length distributions
- Modified 'rsem-plot-model' to present alignment statistics better using both barplot and pie chart
- Updated 'EBSeq' to version 1.2.0
- Added coefficient of quartile variation in addition to credibility intervals when '--calc-ci' is turned on
- Added '--single-cell-prior' option to notify RSEM to use a sparse prior (Dir(0.1)) for single cell data; this option only makes sense if '--calc-pme' or '--calc-ci' is set
--------------------------------------------------------------------------------------------
RSEM v1.2.23
- Moved version information from WHAT_IS_NEW to rsem_perl_utils.pm in order to make sure the '--version' option always output the version information
- Fixed a typo in 'rsem-calculate-expression' that can lead an error when '--star' is set and '--star-path' is not set
- Fixed a bug that can occasionally crash the RSEM simulator
- Added user-friendly error messages that are triggered when RSEM detects invalid bases in the input FASTA file during reference building
--------------------------------------------------------------------------------------------
RSEM v1.2.22
- Added options to run the STAR aligner
--------------------------------------------------------------------------------------------
RSEM v1.2.21
- Strip read names of extra words to avoid mismatches of paired-end read names
--------------------------------------------------------------------------------------------
RSEM v1.2.20
- Fixed a problem that can lead to assertion error if any paired-end read's insert size > 32767 (by changing the type of insertL in PairedEndHit.h from short to int)
--------------------------------------------------------------------------------------------
RSEM v1.2.19
- Modified 'rsem-prepare-reference' such that by default it does not add any poly(A) tails. To add poly(A) tails, use '--polyA' option
- Added an annotation of the 'sample_name.stat/sample_name.cnt' file, see 'cnt_file_description.txt'
--------------------------------------------------------------------------------------------
RSEM v1.2.18
- Only generate warning message if two mates of a read pair have different names
- Only parse attributes of a GTF record if its feature is "exon" to avoid unnecessary warning messages
--------------------------------------------------------------------------------------------
RSEM v1.2.17
- Added error detection for cases such as a read's two mates having different names or a read is both alignable and unalignable
--------------------------------------------------------------------------------------------
RSEM v1.2.16
- Corrected a typo in 'rsem-generate-data-matrix', this script extracts 'expected_count' column instead of 'TPM' column
--------------------------------------------------------------------------------------------
RSEM v1.2.15
- Allowed for a subset of reference sequences to be declared in an input SAM/BAM file
- For any transcript not declared in the SAM/BAM file, its PME estimates and credibility intervals are set to zero
- Added advanced options for customizing Gibbs sampler and credibility interval calculation behaviors
- Splitted options in 'rsem-calculate-expression' into basic and advanced options
--------------------------------------------------------------------------------------------
RSEM v1.2.14
- Changed RSEM's behaviors for building Bowtie/Bowtie 2 indices. In 'rsem-prepare-reference', '--no-bowtie' and '--no-ntog' options are removed. By default, RSEM does not build either Bowtie or Bowtie 2 indices. Instead, it generates two index Multi-FASTA files, 'reference_name.idx.fa' and 'reference_name.n2g.idx.fa'. Compared to the former file, the latter one in addition converts all 'N's into 'G's. These two files can be used to build aligner indices for customized aligners. In addition, 'reference_name.transcripts.fa' does not have poly(A) tails added. To enable RSEM build Bowtie/Bowtie 2 indices, '--bowtie' or '--bowtie2' must be set explicitly. The most significant benefit of this change is that now we can build Bowtie and Bowtie 2 indices simultaneously by turning both '--bowtie' and '--bowtie2' on. Type 'rsem-prepare-reference --help' for more information
- If transcript coordinate files are visualized using IGV, 'reference_name.idx.fa' should be imported as a genome (instead of 'reference_name.transcripts.fa'). For more information, see the third subsection of Visualization in 'README.md'
- Modified RSEM perl scripts so that RSEM directory will be added in the beginning of the PATH variable. This also means RSEM will try to use its own samtools first
- Added --seed option to set random number generator seeds in 'rsem-calculate-expression'
- Added posterior standard deviation of counts as output if either '--calc-pme' or '--calc-ci' is set
- Updated boost to v1.55.0
- Renamed makefile as Makefile
- If '--output-genome-bam' is set, in the genome BAM file, each alignment's 'MD' field will be adjusted to match the CIGAR string
- 'XS:A:value' field is required by Cufflinks for spliced alignments. If '--output-genome-bam' is set, in the genome BAM file, first each alignment's 'XS' filed will be deleted. Then if the alignment is an spliced alignment, a 'XS:A:value' field will be added accordingly
- Added instructions for users who want to put all RSEM executables into a bin directory (see Compilation & Installation section of 'README.md')
--------------------------------------------------------------------------------------------
RSEM v1.2.13
- Allowed users to use the SAMtools in the PATH first and enabled RSEM to find its executables via a symbolic link
- Changed the behavior of parsing GTF file. Now if a GTF line's feature is not "exon" and it does not contain a "gene_id" or "transcript_id" attribute, only a warning message will be produced (instead of failing the RSEM)
--------------------------------------------------------------------------------------------
RSEM v1.2.12
- Enabled allele-specific expression estimation
- Added '--calc-pme' option for 'rsem-calculate-expression' to calculate posterior mean estimates only (no credibility intervals)
- Modified the shebang line of RSEM perl scripts to make them more portable
- Added '--seed' option for 'rsem-simulate-reads' to enable users set the seed of random number generator used by the simulation
- Modified the transcript extraction behavior of 'rsem-prepare-reference'. For transcripts that cannot be extracted, instead of failing the whole script, warning information is produced. Those transcripts are ignored
--------------------------------------------------------------------------------------------
RSEM v1.2.11
- Enabled RSEM to use Bowtie 2 aligner (indel, local and discordant alignments are not supported yet)
- Changed option names '--bowtie-phred33-quals', '--bowtie-phred64-quals' and '--bowtie-solexa-quals' back to '--phred33-quals', '--phred64-quals' and '--solexa-quals'
--------------------------------------------------------------------------------------------
RSEM v1.2.10
- Fixed a bug which will lead to out-of-memory error when RSEM computes ngvector for EBSeq
--------------------------------------------------------------------------------------------
RSEM v1.2.9
- Fixed a compilation error problem in Mac OS
- Fixed a problem in makefile that affects 'make ebseq'
- Added 'model_file_description.txt', which describes the format and meanings of file 'sample_name.stat/sample_name.model'
- Updated samtools to version 0.1.19
--------------------------------------------------------------------------------------------
RSEM v1.2.8
- Provided a more detailed description for how to simulate RNA-Seq data using 'rsem-simulate-reads'
- Provided more user-friendly error message if RSEM fails to extract transcript sequences due to the failure of reading certain chromosome sequences
--------------------------------------------------------------------------------------------
RSEM v1.2.7
- 'rsem-find-DE' is replaced by 'rsem-run-ebseq' and 'rsem-control-fdr' for a more friendly user experience
- Added support for differential expression testing on more than 2 conditions in RSEM's EBSeq wrappers 'rsem-run-ebseq' and 'rsem-control-fdr'
- Renamed '--phred33-quals', '--phred64-quals', and '--solexa-quals' in 'rsem-calculate-expression' to '--bowtie-phred33-quals', '--bowtie-phred64-quals', and '--bowtie-solex-quals' to avoid confusion
--------------------------------------------------------------------------------------------
RSEM v1.2.6
- Install the latest version of EBSeq from Bioconductor and if fails, try to install EBSeq v1.1.5 locally
- Fixed a bug in 'rsem-gen-transcript-plots', which makes 'rsem-plot-transcript-wiggles' fail
--------------------------------------------------------------------------------------------
RSEM v1.2.5
- Updated EBSeq from v1.1.5 to v1.1.6
- Fixed a bug in 'rsem-generate-data-matrix', which can cause 'rsem-find-DE' to crash
--------------------------------------------------------------------------------------------
RSEM v1.2.4
- Fixed a bug that leads to poor parallelization performance in Mac OS systems
- Fixed a problem that may halt the 'rsem-gen-transcript-plots", thanks Han Lin for pointing out the problem and suggesting possible fixes
- Added some user-friendly error messages for converting transcript BAM files into genomic BAM files
- Modified rsem-tbam2gbam so that the original alignment quality MAPQ will be preserved if the input bam is not from RSEM
- Added user-friendly error messages if users forget to compile the source codes
--------------------------------------------------------------------------------------------
RSEM v1.2.3
- Fixed a bug in 'EBSeq/rsem-for-ebseq-generate-ngvector-from-clustering-info' which may crash the script
--------------------------------------------------------------------------------------------
RSEM v1.2.2
- Updated EBSeq to v1.1.5
- Modified 'rsem-find-DE' to generate extra output files (type 'rsem-find-DE' to see more information)
--------------------------------------------------------------------------------------------
RSEM v1.2.1
- Added poly(A) tails to 'reference_name.transcripts.fa' so that the RSEM generated transcript unsorted BAM file can be fed into RSEM as an input file. However, users need to rebuild their references if they want to visualize the transcript level wiggle files and BAM files using IGV
- Modified 'rsem-tbam2gbam' to convert users' alignments from transcript BAM files into genome BAM files, provided users use 'reference_name.idx.fa' to build indices for their aligners
- Updated EBSeq from v1.1.3 to v1.1.4
- Corrected several typos in warning messages
--------------------------------------------------------------------------------------------
RSEM v1.2.0
- Changed output formats, added FPKM field etc.
- Fixed a bug related to paired-end reads data
- Added a script to run EBSeq automatically and updated EBSeq to v1.1.3
--------------------------------------------------------------------------------------------
RSEM v1.1.21
- Removed optional field "Z0:A:!" in the BAM outputs
- Added --no-fractional-weight option to rsem-bam2wig, if the BAM file is not generated by RSEM, this option is recommended to be set
- Fixed a bug for generating transcript level wiggle files using 'rsem-plot-transcript-wiggles'
--------------------------------------------------------------------------------------------
RSEM v1.1.20
- Added an option to set the temporary folder name
- Removed sample_name.sam.gz. Instead, RSEM uses samtools to convert bowtie outputted SAM file into a BAM file under the temporary folder
- RSEM generated BAM files now contains all alignment lines produced by bowtie or user-specified aligners, including unalignable reads. Please note that for paired-end reads, if one mate has alignments but the other does not, RSEM will mark the alignable mate as "unmappable" (flag bit 0x4) and append an optional field "Z0:A:!"
--------------------------------------------------------------------------------------------
RSEM v1.1.19
- Allowed > 2^31 hits
- Added some instructions on how to visualize transcript coordinate BAM/WIG files using IGV
- Included EBSeq for downstream differential expression analysis
--------------------------------------------------------------------------------------------
RSEM v1.1.18
- Added some user-friendly error messages
- Added program 'rsem-sam-validator', users can use this program to check if RSEM can process their SAM/BAM files
- Modified 'convert-sam-for-rsem' so that this program will convert users' SAM/BAM files into acceptable BAM files for RSEM
--------------------------------------------------------------------------------------------
RSEM v1.1.17
- Fixed a bug related to parallezation of credibility intervals calculation
- Added --no-bam-output option to rsem-calculate-expression
- The order of @SQ tags in SAM/BAM files can be arbitrary now
--------------------------------------------------------------------------------------------
RSEM v1.1.16
- Added --time option to show time consumed by each phase
- Moved the alignment file out of the temporary folder
- Enabled pthreads for calculating credibility intervals
--------------------------------------------------------------------------------------------
RSEM v1.1.15
- Fixed several bugs causing compilation error
- Modified samtools' Makefile for cygwin. For cygwin users, please uncomment the 4th and 8th lines in sam/Makefile before compiling RSEM
--------------------------------------------------------------------------------------------
RSEM v1.1.14
- Added --chunkmbs option to rsem-calculate-expression (patch contributed by earonesty)
- Added --sampling-for-bam option to rsem-calculate-expression, in the bam file, instead of providing expected weights, for each read RSEM samples one alignment based on the expected weights
- RSEM can generate BAM and Wiggle files in both genomic-coordinate and transcript-coordinate
- Added rsem-plot-transcript-wiggles. This script can generate transcript-coordinate wiggle plots in pdf format. One unique feature is, a stacked plot can be generated, with unique read contribution shown as black and multi-read contribution shown as red
- Added convert_sam_for_rsem script for users do not use bowtie aligner
- Modified RSEM's GTF file parser. Now RSEM does not require "transcript_id" and "gene_id" be the first two attributes shown
- Improved descriptions for thread related errors