A third year uni project aiming to implement and evaluate the EFR algorithm with different deviation types and explore a potential tradeoff between exploitability and expected value of a strategy in practice.

Jamesflynn1/CS344-Opponent-Exploitation-Poker

CS344-Opponent-Modelling-Poker

Overleaf: https://www.overleaf.com/project/6342ef568275739c63600bb5

For the refactored EFR algorithm code and deviation types, see https://github.com/Jamesflynn1/open_spiel.

Repository Structure:

EFR.py

Main EFR implementation logic, built off of CFR.py from OpenSpiel as a baseline.

Deviation_Types/

Provides an implementation of a deviation, namely the deviate and player_deviation_reach_probability functions alongside the deviation matrix.

Additionally contains generation code of 8 deviation sets.

Notebooks/

Contains all data processing and visualisation steps beyond the EV and exploitability calculations.

Policy/

Contains a stored .csv version of the TabularPolicy object for each algorithm.

Output/

Contains per-iteration information for each run, plus a CFR benchmark for each run. The per-iteration information currently consists of iteration time and cumulative policy exploitability.
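The per-iteration records described above (iteration time and exploitability) could be collected with a small helper like the one below. This is a minimal sketch, not the code used in this repository: the function names `record_iterations` and `write_rows` and the CSV column names are assumptions made for illustration, and the real files in Output/ may be laid out differently.

```python
import csv
import time
from typing import Callable, List, Tuple


def record_iterations(
    step: Callable[[], None],
    evaluate: Callable[[], float],
    iterations: int,
) -> List[Tuple[int, float, float]]:
    """Run `step` `iterations` times, timing each call and evaluating after it."""
    rows = []
    for i in range(1, iterations + 1):
        start = time.perf_counter()
        step()  # one solver update, e.g. a single EFR or CFR iteration
        elapsed = time.perf_counter() - start
        rows.append((i, elapsed, evaluate()))  # evaluation is not included in the timing
    return rows


def write_rows(path: str, rows) -> None:
    """Write the collected rows to a CSV file (hypothetical column names)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["iteration", "seconds", "exploitability"])
        writer.writerows(rows)
```

With OpenSpiel's CFR solver, for instance, `step` could be `solver.evaluate_and_update_policy` and `evaluate` could be `lambda: exploitability.exploitability(game, solver.average_policy())`. Passing both as callables keeps the timing logic independent of the solver being benchmarked.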

How to run the EFR code:

Note: a Linux environment is required to install OpenSpiel.

1. Clone the CS344 repository locally, or copy across the following folders and files:

RunEFRExperiment.py

EFR.py

StoreTabularPolicy.py

Deviation_Types/

Deviation_Types/__init__.py

Deviation_Types/Deviation_Sets.py

Deviation_Types/Deviation.py

Deviation_Types/Swap_Transformation.py

Policy/

Optional:

RunOpponentValue.py (calculates the expected value from Policy/ files)

RunCFR.py (a wrapper to obtain the policy file for OpenSpiel CFR)

RunMCCFR.py (a wrapper to obtain the policy file for OpenSpiel ExternalSamplingMCCFR)

Notebooks/Exploitability graphs.ipynb (data processing and visualisation for exploitability data)

Notebooks/EV Opponent.ipynb (data processing and visualisation for expected value data)

2. Install Python 3 (tested on 3.9.13)

Download Python from https://www.python.org/downloads/ and follow the installation instructions.

3. Install the Python module requirements for the project

python -m pip install -q -r requirements.txt

4. Run the EFR algorithm on 2-player Leduc hold'em

python RunEFRExperiment.py (Filename) (Iterations) (Deviation_Type)

Options:

Filename: the name that will be used to save the policy and iteration data files.

Iterations: the number of update iterations that EFR will perform.

Deviation_Type: the type of deviation set that EFR will use (as defined in Deviation_Types/Deviation_Sets.py).

Deviation options: "blind_action", "informed_action", "blind_cf", "informed_cf", "blind_ps", "cfps", "csps", "tips", "bhv"

Game and player number can be modified in the RunEFRExperiment.py file. See OpenSpiel/games for a list of games that can be chosen.
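The three positional options above map naturally onto an argparse parser. The sketch below is illustrative only and is not taken from RunEFRExperiment.py: the `parse_args` function and the argument names are assumptions, and the deviation list is copied from the options given in this README.

```python
import argparse

# Deviation options as listed in this README.
DEVIATION_TYPES = [
    "blind_action", "informed_action", "blind_cf", "informed_cf",
    "blind_ps", "cfps", "csps", "tips", "bhv",
]


def parse_args(argv=None):
    """Parse (Filename) (Iterations) (Deviation_Type) in that order."""
    parser = argparse.ArgumentParser(
        description="Run an EFR experiment (illustrative parser)."
    )
    parser.add_argument(
        "filename", help="name used for the saved policy and iteration data files"
    )
    parser.add_argument(
        "iterations", type=int, help="number of EFR update iterations"
    )
    parser.add_argument(
        "deviation_type", choices=DEVIATION_TYPES, help="deviation set used by EFR"
    )
    return parser.parse_args(argv)
```

For example, `parse_args(["blind_cf", "10000", "blind_cf"])` yields `iterations == 10000` as an integer, while an unknown deviation type makes argparse exit with a usage message instead of failing later in the run.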

5. (Optional) Install Jupyter Notebook to access the .ipynb notebooks

How to recreate the exploitability results

  1. Generate per-iteration data for EFR with every deviation option.

For all desired deviation types: "python RunEFRExperiment.py (Deviation_Type) 10000 (Deviation_Type)"

  2. Use the exploitability notebook to aggregate and visualise the results.

Rerun all cells in the Notebooks/Exploitability graphs.ipynb notebook to obtain the graphs found in the final report.
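Running the experiment once per deviation type can be scripted rather than typed by hand. The following is a hedged sketch under the assumption that RunEFRExperiment.py sits in the current directory and accepts the three positional arguments described earlier; `build_commands` and `run_all` are hypothetical helper names, not part of the repository.

```python
import subprocess
import sys

# Deviation options as listed in this README.
DEVIATION_TYPES = [
    "blind_action", "informed_action", "blind_cf", "informed_cf",
    "blind_ps", "cfps", "csps", "tips", "bhv",
]


def build_commands(iterations=10000, script="RunEFRExperiment.py"):
    """Build one command per deviation type, using the type as the filename."""
    return [
        [sys.executable, script, dev, str(iterations), dev]
        for dev in DEVIATION_TYPES
    ]


def run_all(dry_run=True):
    """Print each command; execute it as well when dry_run is False."""
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing run
```

Calling `run_all()` only prints the commands, which is a cheap way to check them before committing to 10000 iterations per deviation type; `run_all(dry_run=False)` runs them sequentially.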

How to recreate the expected value results

  1. Generate strategy data for EFR with all deviation options.

For all desired deviation types: "python RunEFRExperiment.py (Deviation_Type) 10000 (Deviation_Type)"

All policies are saved to Policy/; use the Deviation_Type as the filename.

  2. Run all opponent strategy generation code.

Current opponents (in addition to the different EFR strategies):

MCCFR (low)

MCCFR (med)

MCCFR (high)

MCCFR (higher)

All saved to Policy/

Note that strategy data from the EFR strategy types has already been generated in step 1.

  3. Generate expected value data for all (EFR type, opponent type) combinations from policy files.

Run "python RunOpponentValue.py", ensuring that all EFR strategies and opponent strategies are defined in that file.

Uses strategy data from Policy/.

  4. Use the EV notebook to aggregate and visualise the results.

Rerun all cells in the Notebooks/EV Opponent.ipynb notebook to obtain the graphs found in the final report.
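The set of (EFR type, opponent type) combinations evaluated in step 3 can be enumerated as below. This is a sketch for orientation only: the `matchups` helper is hypothetical, the opponent labels are copied from the list above rather than from RunOpponentValue.py, and including each EFR strategy against itself is an assumption.

```python
from itertools import product

# EFR deviation options as listed in this README.
EFR_TYPES = [
    "blind_action", "informed_action", "blind_cf", "informed_cf",
    "blind_ps", "cfps", "csps", "tips", "bhv",
]

# Additional opponents named in this README (labels only, not file names).
MCCFR_OPPONENTS = ["MCCFR (low)", "MCCFR (med)", "MCCFR (high)", "MCCFR (higher)"]


def matchups(efr_types=EFR_TYPES, extra_opponents=MCCFR_OPPONENTS):
    """Pair every EFR type with every opponent (MCCFR opponents plus all EFR types)."""
    opponents = list(extra_opponents) + list(efr_types)
    return list(product(efr_types, opponents))
```

With the nine deviation options and four MCCFR opponents this gives 9 × 13 = 117 pairs, which is a quick check that no policy file has been left out before running the expected value calculation.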
