Official implementation of the ICML'24 paper "Learning to Remove Cuts in Integer Linear Programming".
- Install the SCIP solver on your machine. The code was tested with SCIP version 8.0.
- Get a working Python installation on your machine. The code was tested with Python 3.8.10.
- Install the project dependencies specified in the 'requirements.txt' file. For example, using virtualenv and pip you could run:
python -m virtualenv <your-venv-name> # create a virtualenv
source <your-venv-name>/bin/activate # activate the virtualenv
python -m pip install -r 'requirements.txt' # install the dependencies
To generate and save multiple instances in the data/instances folder inside the project, run:
python src/script_generate_instances_and_trajectories.py --instances <instance-name-and-dims: str> --n_samples <number-of-instances: int>
The formatting of '<instance name and dims>' for the benchmarks is as follows: packing_<n>_<m>, binpacking_<n>_<m>, maxcut_<N>_<E>, production_planning_<T>, set_cover_<n>_<m>
For example, the command to generate 3000 instances for Max Cut N=14, E=40 would be:
python src/script_generate_instances_and_trajectories.py --instances maxcut_14_40 --n_samples 3000
The instances will be saved inside a subfolder of data/instances named '<instance name and dims>'. Each generated instance gets its own subfolder named 'sample_<i>' containing the A, b, c numpy objects for the optimization problem, an '.mps' representation, and a '.txt' file with SCIP solution data used for the IGC computation and the environment's sanity checks.
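As an illustration, the snippet below is a minimal sketch of how one generated sample could be inspected; the exact file names inside 'sample_<i>' (e.g. 'A.npy' or 'problem.mps') are assumptions, so adjust them to whatever the generation script actually writes.

```python
# Minimal sketch: inspecting one generated instance.
# The file names below are assumptions; check the contents of the sample folder.
import numpy as np
from pyscipopt import Model

sample_dir = "data/instances/maxcut_14_40/sample_0"
A = np.load(f"{sample_dir}/A.npy")  # constraint matrix
b = np.load(f"{sample_dir}/b.npy")  # right-hand-side vector
c = np.load(f"{sample_dir}/c.npy")  # objective coefficients
print(A.shape, b.shape, c.shape)

model = Model()
model.readProblem(f"{sample_dir}/problem.mps")  # the same instance in MPS format
model.optimize()
print("SCIP optimal value:", model.getObjVal())
```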
To generate trajectories of the expert policy for some instances and save them in data/trajectories, run:
python src/script_generate_instances_and_trajectories.py --t --instances <instance-name-and-dims: str> --n_samples <number-of-instances: int>
The formatting is the same as specified in "Generating the Instances".
For example, the command to generate 3000 trajectories for Max Cut N=14, E=40 would be:
python src/script_generate_instances_and_trajectories.py --t --instances maxcut_14_40 --n_samples 3000
Note that the corresponding instances must already exist before their trajectories can be generated.
The trajectories will be saved inside a subfolder of data/trajectories named '<instance name and dims>'. Each generation has a subfolder named 'sample_<i>' containing one folder per iteration '<j>', each with a 'trajectory_datapoint.pkl' holding the data stored for that iteration.
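For example, a single stored datapoint can be opened with a short sketch like the one below; the structure of the pickled object is repository-specific, so this only shows how to load it.

```python
# Minimal sketch: loading one expert-trajectory datapoint.
# The contents of the pickle depend on the repository's data format.
import pickle

path = "data/trajectories/maxcut_14_40/sample_0/0/trajectory_datapoint.pkl"
with open(path, "rb") as f:
    datapoint = pickle.load(f)
print(type(datapoint))
```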
Training a Neural Policy
- Generate the training dataset from the trajectories. We compute and preprocess the X, y datapoints from the trajectories offline by running a single unshuffled pass with batch size 1. This is only required once per trajectory and will generate a 'preprocessed' folder inside data/trajectories/<instance name and dims>/sample_<i>/<j>/ that the PyTorch training dataset will access.
python src/script_train.py --instances <instance-name-and-dims: str> --epochs 1 --shuffle 0 --batch_size 1
- Train the model: afterwards, the model can be trained as desired; see the script arguments for details on the available options.
python src/script_train.py --instances <instance-name-and-dims: str>
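For example, to train a model on the Max Cut N=14, E=40 trajectories generated above you could run:
python src/script_train.py --instances maxcut_14_40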
To benchmark multiple policies against each other and collect the results, run the following command:
python src/script_benchmark.py --benchmarking_dataset <instance-name-and-dims: str> --benchmarking_samples <n> --neural_checkpoints_path_list './data/experiment_results/checkpoints/<run_id>/model_<i>_weights.pth'
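For example, to benchmark trained checkpoints on 30 Max Cut N=14, E=40 instances (the sample count here is illustrative) you would run:
python src/script_benchmark.py --benchmarking_dataset maxcut_14_40 --benchmarking_samples 30 --neural_checkpoints_path_list './data/experiment_results/checkpoints/<run_id>/model_<i>_weights.pth'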
To collect data for the cutpool quality analysis, set the collect_and_save_cpl parameter in the environment class to True.
The pyscipopt interface implementation used to benchmark the NN verification dataset can be found in this script. The dataset is available on DeepMind's GitHub.
You can add your own policies and architectures and re-use the pipelines by extending the base classes in the policies.py, models.py and datasets.py files respectively, and then adding them as supported options in the scripts.
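As a rough, hypothetical sketch of what such an extension could look like (the base class name BasePolicy and the select_cuts method are illustrative; check policies.py for the actual interface to override):

```python
# Hypothetical sketch of a custom cut-selection policy.
# 'BasePolicy' and 'select_cuts' are placeholder names; the actual base class
# and the required methods are defined in policies.py.
import numpy as np
from policies import BasePolicy  # assumed import path


class RandomCutPolicy(BasePolicy):
    """Toy policy that scores candidate cuts uniformly at random."""

    def select_cuts(self, cutpool_features):
        # Return one score per candidate cut; higher scores mean the cut is kept.
        return np.random.rand(len(cutpool_features))
```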
Please reach out to the first author's email for any inquiries or questions; happy to chat about it!