Python library for calculating the mean opinion score (MOS) of text-to-speech (TTS) ratings together with its 95% confidence interval (CI), following Ribeiro, F., Florêncio, D., Zhang, C., & Seltzer, M. (2011), "CrowdMOS: An approach for crowdsourcing mean opinion score studies". To determine the CI, the authors use a two-way random effects model with three sources of variance: diversity of intrinsic sentence quality, diversity of rater preference, and subjective uncertainty.
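As a rough illustration of what a two-way random effects model looks like, the generic textbook form is sketched below. The symbols are assumptions chosen for this sketch and are not necessarily the notation of the paper or of the library: r_ij is the rating of sentence i by rater j, s_i the sentence effect, w_j the rater effect, and u_ij the residual subjective uncertainty, all zero-mean and mutually independent. The variance formula holds only for a complete design in which each of M raters rates each of N sentences.

```latex
% Generic two-way random effects model (notation assumed for illustration)
r_{ij} = \mu + s_i + w_j + u_{ij},
\qquad
\operatorname{Var}(s_i) = \sigma_s^2,\quad
\operatorname{Var}(w_j) = \sigma_w^2,\quad
\operatorname{Var}(u_{ij}) = \sigma_u^2

% Variance of the grand mean for a complete N x M design
\hat{\mu} = \frac{1}{NM} \sum_{i=1}^{N} \sum_{j=1}^{M} r_{ij},
\qquad
\operatorname{Var}(\hat{\mu}) = \frac{\sigma_s^2}{N} + \frac{\sigma_w^2}{M} + \frac{\sigma_u^2}{NM}
```

A 95% interval then takes the form of the estimated mean plus or minus a quantile times the square root of this variance. How the variance components are estimated, which quantile is used, and how missing ratings are handled are specified in the paper and are not reproduced here.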
pip install mean-opinion-score --user
import numpy as np
from mean_opinion_score import get_ci95, get_ci95_default, get_mos
_ = np.nan
ratings = np.array([
# columns represent sentences
[4, 5, _, 4, _, 3], # rater 1
[4, 4, 4, 5, _, 4], # rater 2
[_, 3, 5, 4, _, 1], # rater 3
[_, _, _, _, _, _], # rater 4
])
mos = get_mos(ratings)
ci = get_ci95(ratings)
ci_default = get_ci95_default(ratings)
print(f"MOS: {mos:.2f} ± {ci:.4f}")
print(f"MOS: {mos:.2f} ± {ci_default:.4f}")
# MOS: 3.85 ± 1.3316
# MOS: 3.85 ± 0.5579
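The second output line can be reproduced with plain NumPy, which may help when checking results by hand. This is a minimal sketch, not the library's code: it assumes the MOS is the mean of all non-missing ratings and that the commonly used interval is the normal approximation 1.96 · SD / sqrt(n) with the population SD (`ddof=0`). Both assumptions match the numbers printed above, but the actual implementation of `get_mos` and `get_ci95_default` may differ in detail.

```python
import numpy as np

_ = np.nan
ratings = np.array([
    # columns represent sentences
    [4, 5, _, 4, _, 3],  # rater 1
    [4, 4, 4, 5, _, 4],  # rater 2
    [_, 3, 5, 4, _, 1],  # rater 3
    [_, _, _, _, _, _],  # rater 4
])

# keep only the ratings that were actually given (drops every NaN)
valid = ratings[~np.isnan(ratings)]

# assumed MOS: plain mean over all given ratings
mos = valid.mean()

# assumed "commonly used" CI: normal approximation with population SD
z = 1.96
ci_default = z * valid.std(ddof=0) / np.sqrt(valid.size)

print(f"MOS: {mos:.2f} ± {ci_default:.4f}")
# MOS: 3.85 ± 0.5579
```

The CrowdMOS interval returned by `get_ci95` is wider (1.3316 in the example) because it also accounts for sentence and rater effects rather than treating all ratings as independent.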
numpy
scipy
If you notice an error, please don't hesitate to open an issue.
# update
sudo apt update
# install Python 3.6, 3.7, 3.8, 3.9, 3.10 & 3.11 so that the tests can be run on each version
sudo apt install python3-pip \
python3.6 python3.6-dev python3.6-distutils python3.6-venv \
python3.7 python3.7-dev python3.7-distutils python3.7-venv \
python3.8 python3.8-dev python3.8-distutils python3.8-venv \
python3.9 python3.9-dev python3.9-distutils python3.9-venv \
python3.10 python3.10-dev python3.10-distutils python3.10-venv \
python3.11 python3.11-dev python3.11-distutils python3.11-venv
# install pipenv for creation of virtual environments
python3.11 -m pip install pipenv --user
# check out repo
git clone https://github.com/stefantaubert/mean-opinion-score.git
cd mean-opinion-score
# create virtual environment
python3.11 -m pipenv install --dev
# first set up the project as described in "Development setup"
# then, navigate into the directory of the repo (if not already done)
cd mean-opinion-score
# activate environment
python3.11 -m pipenv shell
# run tests
tox
Final lines of test result output:
py36: OK
py37: OK
py38: OK
py39: OK
py310: OK
py311: OK
congratulations :)
MIT License
MOS and CI calculation is taken from:
- Ribeiro, F., Florêncio, D., Zhang, C., & Seltzer, M. (2011). CrowdMOS: An approach for crowdsourcing mean opinion score studies. 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2416–2419. https://doi.org/10.1109/ICASSP.2011.5946971
Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 416228727 – CRC 1410.
If you want to cite this repo, you can use the BibTeX entry generated by GitHub (see About => Cite this repository).
Taubert, S. (2023). mean-opinion-score (Version 0.0.2) [Computer software]. https://doi.org/10.5281/zenodo.8238259
- v0.0.2 (2023-08-11)
  - Added:
    - commonly used 95% confidence interval calculation
- v0.0.1 (2023-02-23)
  - Initial release