v2.3.0
Features:
- (#203) Added `-tok flores101` and `-tok flores200`, a.k.a. `spbleu`.
These are multilingual tokenizations that make use of the
multilingual SPM models released by Facebook and described in the
following papers:
  - Flores-101: https://arxiv.org/abs/2106.03193
  - Flores-200: https://arxiv.org/abs/2207.04672
- (#213) Added JSON formatting for multi-system output (thanks to Manikanta Inugurthi @me-manikanta)
- (#211) You can now list all test sets for a language pair with `--list SRC-TRG`.
Thanks to Jaume Zaragoza (@ZJaume) for adding this feature.
- Added WMT22 test sets (test set `wmt22`)
- System outputs: included with `wmt22`. Also added `wmt21/systems`, which will produce WMT21 submitted systems.
To see available systems, give a dummy system to `--echo`, e.g., `sacrebleu -t wmt22 -l en-de --echo ?`
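
As a rough illustration of the command-line options mentioned in this release, the sketch below assembles the corresponding `sacrebleu` invocations as argument lists and prints them in shell-quoted form. The helper function names and the file names `ref.txt` and `hyp.txt` are hypothetical and used only for illustration. Nothing is executed or downloaded. It assumes the usual calling convention of positional reference files with `-i` for the system output.

```python
import shlex


def list_test_sets_cmd(src, trg):
    """Command that lists all test sets for a language pair (--list SRC-TRG)."""
    return ["sacrebleu", "--list", f"{src}-{trg}"]


def list_systems_cmd(test_set, src, trg):
    """Command that shows available systems by passing a dummy system to --echo."""
    # "?" is the dummy system name used in the release notes.
    return ["sacrebleu", "-t", test_set, "-l", f"{src}-{trg}", "--echo", "?"]


def score_cmd(ref_path, hyp_path, tokenizer="flores200"):
    """Command that scores a hypothesis file using one of the new tokenizers."""
    # ref_path and hyp_path are placeholder file names (assumptions).
    return ["sacrebleu", ref_path, "-i", hyp_path, "-tok", tokenizer]


if __name__ == "__main__":
    commands = [
        list_test_sets_cmd("en", "de"),
        list_systems_cmd("wmt22", "en", "de"),
        score_cmd("ref.txt", "hyp.txt"),
    ]
    for cmd in commands:
        # shlex.join quotes arguments such as "?" so the shell does not expand them.
        print(shlex.join(cmd))
```

Building the commands as lists keeps the `?` argument from being treated as a glob pattern if they are later passed to `subprocess.run` without a shell.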