Sphinx-4 Speech Recognition System
-------------------------------------------------------------------
Version: 1.0Beta6
Release Date: March 2011
-------------------------------------------------------------------
New Features and Improvements:
* SRGS/GrXML support, more to come soon with support for JSAPI2
* Model layout is unified with Pocketsphinx/Sphinxtrain
* Netbeans project files are included
* Language models can be loaded from URI
* Batch testing application allows testing inside Sphinxtrain
Bug Fixes:
* Flat linguist accuracy issue fixed
* Intelligent sorting in partitioner fixes stack overflow when tokens
have identical scores
* Various bug fixes
Thanks:
Timo Baumann, Nasir Hussain, Michele Alessandrini, Evandro Gouvea,
Stephen Marquard, Larry A. Taylor, Yuri Orlov, Dirk Schnelle-Walka,
James Chivers, Firas Al Khalil
-------------------------------------------------------------------
Version: 1.0Beta5
Release Date: August 2010
-------------------------------------------------------------------
New Features and Improvements:
* Alignment demo and grammar to align long speech recordings to
transcription and get word times
* Lattice grammar for multipass decoding
* Explicit-backoff in LexTree linguist
* Significant LVCSR speedup with proper LexTree compression
* Simple filter to drop zero energy frames
* Graphviz for grammar dump visualization instead of AISee
* Voxforge decoding accuracy test
* Lattice scoring speedup
* JSAPI-free JSGF parser
Bug Fixes:
* Insertion probabilities are counted in lattice scores
* Don't waste resources and memory on dummy acoustic model
transformations
* Small DMP files are loaded properly
* JSGF parser fixes
* Documentation improvements
* Debian package stuff
Thanks:
Antoine Raux, Marek Lesiak, Yaniv Kunda, Brian Romanowski, Tony
Robinson, Bhiksha Raj, Timo Baumann, Michele Alessandrini, Francisco
Aguilera, Peter Wolf, David Huggins-Daines, Dirk Schnelle-Walka.
-------------------------------------------------------------------
Version: 1.0Beta4
Release Date: February 2010
-------------------------------------------------------------------
New Features and Improvements:
* Large arbitrary-order language models
* Simplified and reworked model loading code
* Raw configuration and demos
* HTK model loader
* A lot of code optimizations
* JSAPI-independent JSGF parser
* Noise filtering components
* Lattice rescoring
* Server-based language model
Bug fixes:
* Lots of bug fixes: PLP extraction, race-conditions
in scoring, etc.
Thanks:
Peter Wolf, Yaniv Kunda, Antoine Raux, Dirk Schnelle-Walka,
Yannick Estève, Anthony Rousseau and LIUM team, Christophe Cerisara.
-------------------------------------------------------------------
Version: 1.0Beta3
Release Date: August 2009
-------------------------------------------------------------------
New Features and Improvements:
* BatchAGC frontend component
* Completed transition to defaults in annotations
* ConcatFeatureExtractor to cooperate with cepwin models
* End of stream signals are passed to the decoder to fix cancellation
* Timer API improvement
* Threading policy is changed to TAS
Bug fixes:
* Fixes reading UTF-8 from language model dump.
* Huge memory optimization of the lattice compression
* More stable frontend work with DataStart and DataEnd and optional
SpeechStart/SpeechEnd
Thanks:
Yaniv Kunda, Michele Alessandrini, Holger Brandl, Timo Baumann,
Evandro Gouvea
-------------------------------------------------------------------
Version: 1.0Beta2
Release Date: February 2009
-------------------------------------------------------------------
New Features and Improvements:
* new much cleaner and more robust configuration system
* migrated to java5
* xml-free instantiation of new systems
* improved feature extraction (better voice activity detection, many bugfixes)
* Cleaned up some of the core APIs
* include-tag for configuration files
* better JavaSound support
* fully qualified grammar names in JSGF (Roger Toenz)
* support for dictionary addenda in the FastDictionary (Gregg Liming)
* added batch tools for measuring performance on NIST corpus with CTL files
* many performance and stability improvements
-------------------------------------------------------------------
Version: 1.0Beta
Release Date: September 2004
-------------------------------------------------------------------
New Features:
* Confidence scoring
* Posterior probability computation
* Sausage creation from a lattice
* Dynamic grammars
* Narrow bandwidth acoustic model
* Out-of-grammar utterance rejection
* More demonstration programs
* WSJ5K Language model
Improvements:
* Better control over microphone selection
* JSGF limitations removed
* Improved performance for large, high-perplexity JSGF grammars
* Added Filler support for JSGF Grammars
* Ability to configure microphone input
* Added ECMAScript Action Tags support and demos.
Bug fixes:
* Lots of bug fixes
Documentation:
* Added the Sphinx-4 FAQ
* Added scripts and instructions for building a WSJ5k language model
from LDC data.
Thanks:
* Peter Gorniak, Willie Walker, Philip Kwok, Paul Lamere
-------------------------------------------------------------------
Version: 0.1alpha
Release Date: June 2004
-------------------------------------------------------------------
Initial release