This repository contains the code used in the experimental analysis of "A Theory of Mind Approach as Test-Time Mitigation Against Emergent Adversarial Communication" (Nancirose Piazza & Vahid Behzadan, AAMAS 2023).
- Python 3.7.11
- Torch 1.7.1
- Ray RLlib 1.3.0
- Adversarial Comm Repository
- Auto Encoding Variational Bayes Repository
Clone the original Adversarial Comms GitHub repository from Proroklab.
- From your terminal, run
git clone https://github.com/proroklab/adversarial_comms.git
- Either follow the repo's installation instructions or continue to (3)
- Run
pip install -r requirements.txt
- Run
python setup.py install
- We modified the CoverageEnv configuration file, so adjust it to match your available resources.
- Follow the directions from the
adv_comm
repository or continue to (2).
- Run
python train_policy.py coverage -t 6
to train the cooperative team in the coverage environment for 6 million timesteps with only cooperative agents.
- Run
python continue_policy.py [cooperative checkpoint path] -t 12 -e coverage -o self_interested
to train the self-interested agent for 6 million timesteps against a fixed cooperative policy.
- Run
python continue_policy.py [adversarial checkpoint path] -t 18 -e coverage -o re_adapt
to have the cooperative team perform readaption training in the presence of a fixed adversary policy.
- Note where the model parameters are saved; these paths are needed for the checkpoint variables.
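The three training stages above can be laid out as a small Python helper that only assembles and prints the command lines. This is a sketch, not part of the repository: the checkpoint paths are hypothetical placeholders, and reading `-t` as a cumulative budget in millions of timesteps (6, 12, 18) is inferred from the step descriptions rather than confirmed.

```python
import shlex


def build_training_commands(coop_checkpoint, adv_checkpoint):
    """Return the three training commands from the steps above, in order.

    Both checkpoint arguments are placeholders supplied by the caller; the
    second one only exists after the self-interested stage has finished.
    """
    return [
        # Stage 1: cooperative team only.
        ["python", "train_policy.py", "coverage", "-t", "6"],
        # Stage 2: self-interested agent against the fixed cooperative policy.
        ["python", "continue_policy.py", coop_checkpoint,
         "-t", "12", "-e", "coverage", "-o", "self_interested"],
        # Stage 3: cooperative team readapts against the fixed adversary.
        ["python", "continue_policy.py", adv_checkpoint,
         "-t", "18", "-e", "coverage", "-o", "re_adapt"],
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    commands = build_training_commands("path/to/coop_checkpoint",
                                       "path/to/adv_checkpoint")
    for command in commands:
        print(" ".join(shlex.quote(part) for part in command))
```

Because each stage needs the checkpoint produced by the previous one, the helper prints the commands instead of running them back to back.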
- Open
Generate_Coop_Team_Dataset.py
and replace the directories where the datasets will be saved and the path of the cooperative model to load. Then run
python Generate_Coop_Team_Dataset.py
The cooperative team dataset filename ends with
_dataset_with_label.pkl
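To check that dataset generation produced the expected files, a short helper can look for the `_dataset_with_label.pkl` suffix and unpickle a file. This is a minimal sketch: the function names and directory are hypothetical, and nothing is assumed about the structure of the pickled object.

```python
import pickle
import tempfile
from pathlib import Path

DATASET_SUFFIX = "_dataset_with_label.pkl"


def find_coop_datasets(dataset_dir):
    """Return sorted paths in dataset_dir whose names end with the dataset suffix."""
    return sorted(
        path for path in Path(dataset_dir).iterdir()
        if path.name.endswith(DATASET_SUFFIX)
    )


def load_dataset(path):
    """Unpickle one dataset file. Only load pickles you generated yourself."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    # Demonstration on a throwaway directory with a made-up payload.
    demo_dir = tempfile.mkdtemp()
    demo_file = Path(demo_dir) / ("demo" + DATASET_SUFFIX)
    with open(demo_file, "wb") as handle:
        pickle.dump({"placeholder": [0, 1]}, handle)
    for dataset_path in find_coop_datasets(demo_dir):
        print(dataset_path.name, type(load_dataset(dataset_path)).__name__)
```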
Clone the Auto-Encoding Variational Bayes GitHub repository for some utilities.
- Run
git clone https://github.com/angzhifan/Auto-Encoding_Variational_Bayes
We use some helper functions from this repository to instantiate our VAEB.
- Open
VAEB_Training.ipynb
and replace all directories with directories pointing to your cooperative team's dataset. Also replace the VAEB output path, e.g.
/vae/vaeb_from_coop_dataset.pth
- Run
python ParameterSearchingRho.py
to evaluate our theory of mind mitigation method over various intervals for the parameter rho. We selected rho from an analysis plot of return over episodes for each fixed interval.
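As a numerical companion to reading the plot, the per-interval returns can be summarized by their mean. This is only a sketch and not the selection procedure used in the paper: the rho values and returns below are made up, and it assumes a higher return is better for the metric being compared.

```python
from statistics import mean


def summarize_returns(returns_by_rho):
    """Map each candidate rho to the mean of its per-episode returns."""
    return {rho: mean(returns) for rho, returns in returns_by_rho.items()}


def pick_rho(returns_by_rho):
    """Return the candidate rho with the highest mean return."""
    summary = summarize_returns(returns_by_rho)
    return max(summary, key=summary.get)


if __name__ == "__main__":
    # Hypothetical evaluation results: rho -> returns over episodes.
    example = {
        0.1: [10.0, 12.0, 11.0],
        0.5: [15.0, 14.0, 16.0],
        0.9: [9.0, 13.0, 12.0],
    }
    print(summarize_returns(example))
    print("selected rho:", pick_rho(example))
```

A mean hides variance across episodes, so the plot remains the more informative view when candidates are close.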
For all evaluations, set action selection to be deterministic. This can be done by passing a parameter to the model's evaluation function. In our code, you'll find references to
.compute_action2(*)
which is a replica of
.compute_action(*)
from Ray RLlib, except with manual parameter passing for deterministic action selection.
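For orientation, the sketch below shows one way to request deterministic actions through the `explore` keyword that RLlib's `Trainer.compute_action` accepts in Ray 1.x. It does not reproduce `.compute_action2`, whose internals are not described here. The wrapper name, the `StubTrainer` class, and the `"default_policy"` policy id are assumptions so that the example runs without Ray installed.

```python
def compute_deterministic_action(trainer, observation, policy_id="default_policy"):
    """Query a trainer-like object with exploration disabled.

    `trainer` is expected to expose compute_action(observation, policy_id=...,
    explore=...), as RLlib trainers do in Ray 1.x.
    """
    return trainer.compute_action(observation, policy_id=policy_id, explore=False)


class StubTrainer:
    """Stand-in for an RLlib trainer; records the keyword arguments it receives."""

    def __init__(self):
        self.last_kwargs = None

    def compute_action(self, observation, **kwargs):
        self.last_kwargs = kwargs
        return 0  # fixed dummy action


if __name__ == "__main__":
    stub = StubTrainer()
    action = compute_deterministic_action(stub, observation=[0.0, 0.0])
    print(action, stub.last_kwargs)
```

With a real trainer restored from a checkpoint, the same wrapper would be called once per agent observation during evaluation.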
- Open
Evaluate_VAEB.py
and replace the cooperative team model directory and VAEB model directory with yours. Run
python Evaluate_VAEB.py
to generate the VAEB baseline for the cooperative team's performance before readaption in CoverageEnv.
- Open
Evaluate_VAEB.py
and replace the readapted cooperative team model directory and VAEB model directory with yours. Run
python Evaluate_VAEB.py
to generate the VAEB performance for the readapted cooperative team in CoverageEnv.
- Open
ParameterSearchingRho.py
and comment in
eval_nocomm_adv(mode=0)
and comment out the line above it,
eval_nocomm_adv(mode=1)
Replace the directory of the cooperative team before readaption with yours, and set the evaluation output directory to where you would like to store the evaluation scores. This generates the ToM defense performance for the cooperative team before readaption training.
- Open
ParameterSearchingRho.py
and comment in
eval_nocomm_adv(mode=0)
and comment out the line above it,
eval_nocomm_adv(mode=1)
Replace the readapted cooperative team directory with yours, and set the readaption evaluation output directory to where you would like to store the evaluation scores. This generates the ToM defense performance for the readapted cooperative team.
- To generate the performance baselines (no-defense cooperative performance, cooperative performance without adversary communication, ideal cooperative performance, adversary performance with no communication, and adversary performance with no cooperative team defense), use the adversarial comm repository and follow its evaluation instructions.
- Open
PerformanceComparison.ipynb
and replace the evaluation directories with yours for the cooperative team before readaption training, covering ToM, VAEB, and the other baselines mentioned in the previous section. The first generated graph is the performance comparison prior to readaption training.
- Open
PerformanceComparison.ipynb
and replace the evaluation directories with yours for the readapted cooperative team, covering ToM, VAEB, and the other baselines mentioned in the previous section. The second generated graph is the performance comparison of the defenses given the readapted cooperative team.
- Open
PerformanceComparison.ipynb
and replace the evaluation directories to generate the F1-score, false positive, false negative, true positive, and true negative plots that analyze the ToM defense in comparison to the VAEB.
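The detection metrics named in the last step follow the standard definitions, sketched below in plain Python. The labels are illustrative only: it is assumed here that 1 marks a message flagged as adversarial and 0 a message treated as benign. The notebook may compute these quantities differently.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    return {
        "TP": sum(1 for t, p in pairs if t == 1 and p == 1),
        "TN": sum(1 for t, p in pairs if t == 0 and p == 0),
        "FP": sum(1 for t, p in pairs if t == 0 and p == 1),
        "FN": sum(1 for t, p in pairs if t == 1 and p == 0),
    }


def f1_score(counts):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is zero."""
    denominator = 2 * counts["TP"] + counts["FP"] + counts["FN"]
    return 2 * counts["TP"] / denominator if denominator else 0.0


if __name__ == "__main__":
    # Made-up ground truth and detector decisions.
    truth = [1, 0, 1, 1, 0, 0]
    flagged = [1, 0, 0, 1, 1, 0]
    counts = confusion_counts(truth, flagged)
    print(counts, round(f1_score(counts), 3))
```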