Mosaic ML is a Python library for machine learning pipeline configuration using Monte Carlo Tree Search.
The original paper can be found here: https://www.ijcai.org/Proceedings/2019/457
Authors: Herilalaina Rakotoarison, Marc Schoenauer and Michèle Sebag
Requirements:
- Python (3.5 or higher)
- Numy
- Cython
- scipy
- Mosaic (https://github.com/herilalaina/mosaic)
Installation:
pip install cython numpy scipy pytest
sudo apt-get install build-essential swig
pip install git+https://github.com/herilalaina/mosaic@0.1
pip install git+https://github.com/herilalaina/mosaic_ml
The entry script is python examples/run_mosaic_ml.py -h
.
--openml-task-id OPENML_TASK_ID
OpenML Task ID (default 252)
--overall-time-budget OVERALL_TIME_BUDGET
Overall time budget in seconds (default 360)
--eval-time-budget EVAL_TIME_BUDGET
Time budget for each machine learning evaluation
(default 100)
--memory-limit MEMORY_LIMIT
RAM Memory limit (default 3034)
--seed SEED Seed for reproducibility (default 42)
--nb-init-metalearning NB_INIT_METALEARNING
Number of initial configurations from Auto-Sklearn
(default 25)
--ensemble-size ENSEMBLE_SIZE
Size of ensemble set (default 50)
Mosaic ML has three different components:
- vanilla: MCTS for algorithm selection and Bayesian Optimization for hyperparameter tuning
python examples/run_mosaic_ml.py --nb-init-metalearning 0 --ensemble-size 1
- metalearning: initialize with a set of configurations fetched from Auto-Sklearn then apply vanilla setting
python examples/run_mosaic_ml.py --nb-init-metalearning 25 --ensemble-size 1
- ensemble (with metalearning): add an ensemble selection method (Caruana et al, 04) in the top of the metalearning setting
python examples/run_mosaic_ml.py --nb-init-metalearning 25 --ensemble-size 50