ssm-lab/Opinion-Guided-Reinforcement-Learning

A framework for human-informed reinforcement learning by subjective logic

Repository structure

  • /01-experiment-setup - Input files (maps and advice) for the experiments reported in the paper.

  • /02-maps - Map files: .xlsx

  • /03-input - Input files: maps and human advice

  • /04-src - Source code

    • Main
      • runner.py - Main module
      • model.py - Model classes
    • Advice/SL modules
      • advice_parser.py - Parses human input from /03-input. Input file naming convention: advice-[SIZE]x[SIZE]-seed[SEED].txt. Format:
       grid size [1]  
       advice [*] 
      • sl.py - Subjective logic utilities
    • Map module
      • map_tools.py - Generator, renderer, and parser for maps. Saves maps under /02-maps as .xlsx files.

  • /05-experiments-output - Experiment data as .csv files

  • /06-analysis-output - Analysis of experiment data from /05-experiments-output as .pdf files

  • /tests - Unit tests.

Setup guide

  • Clone this repository.
  • Install requirements via pip install -r requirements.txt.

How to use

⚠️ All scripts must be run from the root directory. ⚠️

Running experiment

✏️ To replicate the experiment results as seen in the paper, follow the below steps with [SIZE] = 12 and [SEED] = 63 ✏️

  1. Generate a map by running python .\04-src\map_tools.py with either --generate --size [SIZE] --seed [SEED] (optionally adding --render) or -default. Replace [SIZE] and [SEED] with the integer values you need. With the -default option, the default 4x4 map is generated. The generated map files are saved in the folder 02-maps.
  2. Create all twelve advice files in the 03-input folder, named advice-[SIZE]x[SIZE]-seed[SEED]-[QUOTA].txt (e.g., advice-6x6-seed10-all.txt). Quota = {'all', 'holes', 'human10', 'human5', 'coop5-A1-topleft', 'coop5-A1-topright', 'coop5-A2-bottomleft', 'coop5-A2-bottomright', 'coop10-A1-topleft', 'coop10-A1-topright', 'coop10-A2-bottomleft', 'coop10-A2-bottomright'}
    • Synthetic advice files can be generated by running python .\04-src\advice_tools.py --size [SIZE] --seed [SEED] -g [ALL|HOLES]. ALL generates advice for all cells; HOLES generates advice for the holes and the goal only. All other files must be created manually.
      • Advice values for frozen tiles in ALL: +1 if no neighboring holes; 0 if one neighboring hole; -1 otherwise.
      • ⚠️ Generated files will be in the folder 02-maps, and must be moved to the folder 03-input before the next step. ⚠️
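
The rule for frozen tiles in ALL can be illustrated with a minimal sketch. This is not the code in advice_tools.py; the function names, the tile letters (S, F, H, G), and the choice of a 4-cell neighborhood (up, down, left, right) are assumptions made for illustration only.

```python
# Hypothetical sketch of the ALL advice rule for frozen tiles.
# Assumptions: tiles are encoded as letters (H = hole, F = frozen),
# and "neighboring" means the four orthogonally adjacent cells.
HOLE = "H"


def count_neighboring_holes(grid, row, col):
    """Count holes among the up/down/left/right neighbors of (row, col)."""
    count = 0
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        # Skip neighbors that fall outside the map.
        if 0 <= r < len(grid) and 0 <= c < len(grid[r]) and grid[r][c] == HOLE:
            count += 1
    return count


def frozen_tile_advice(grid, row, col):
    """Return +1 for no neighboring holes, 0 for exactly one, -1 otherwise."""
    holes = count_neighboring_holes(grid, row, col)
    if holes == 0:
        return 1
    if holes == 1:
        return 0
    return -1


example_map = [
    "SFFF",
    "FHFH",
    "FFFH",
    "HFFG",
]
print(frozen_tile_advice(example_map, 0, 2))  # no adjacent hole
print(frozen_tile_advice(example_map, 0, 1))  # one adjacent hole
print(frozen_tile_advice(example_map, 1, 2))  # two adjacent holes
```

If the actual generator also counts diagonal neighbors, only the offset tuple in count_neighboring_holes would need to change.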

✏️ The files generated by the steps above for the experiments as seen in the paper are located in the folder 01-experiment-setup. Copy the files from the 01-experiment-setup folder to the 03-input folder to skip the previous steps. ✏️
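
Before running the experiment, it can help to check that every expected advice file is present in 03-input. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical and not part of this repository, and the sketch assumes the file naming pattern and quota list given in step 2.

```python
# Hypothetical helpers for preparing the 03-input folder.
# Assumes the naming pattern advice-[SIZE]x[SIZE]-seed[SEED]-[QUOTA].txt.
import shutil
from pathlib import Path

QUOTAS = [
    "all", "holes", "human10", "human5",
    "coop5-A1-topleft", "coop5-A1-topright",
    "coop5-A2-bottomleft", "coop5-A2-bottomright",
    "coop10-A1-topleft", "coop10-A1-topright",
    "coop10-A2-bottomleft", "coop10-A2-bottomright",
]


def expected_advice_files(size, seed):
    """Build the twelve advice file names for a given map size and seed."""
    return [f"advice-{size}x{size}-seed{seed}-{quota}.txt" for quota in QUOTAS]


def copy_setup_files(src="01-experiment-setup", dst="03-input"):
    """Copy every file from the setup folder into the input folder."""
    dst_dir = Path(dst)
    dst_dir.mkdir(exist_ok=True)
    copied = []
    for path in sorted(Path(src).iterdir()):
        if path.is_file():
            shutil.copy2(path, dst_dir / path.name)
            copied.append(path.name)
    return copied


def missing_advice_files(size, seed, folder="03-input"):
    """List the expected advice files that are not yet in the folder."""
    return [
        name for name in expected_advice_files(size, seed)
        if not (Path(folder) / name).is_file()
    ]


if __name__ == "__main__":
    # Run from the repository root, with the values used in the paper.
    copy_setup_files()
    print(missing_advice_files(size=12, seed=63))
```

An empty list printed at the end means all twelve advice files for the chosen size and seed are in place.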

  3. Run the experiment using python .\04-src\runner.py.
  • Mandatory parameter:
    • --mode [MODE] -- The [MODE] value is one of the following: random, noadvice, synthetic, coop.
  • Optional parameters:
    • --log [LOG_LEVEL] -- The [LOG_LEVEL] value is one of the following: critical, error, warn, warning, info, debug.
    • --name [STRING] -- The name of the experiment, which is used as the name of the top-level results folder. If not provided, the folder is named after datetime.now(), formatted as "%Y%m%d-%H%M%S".
  • Settings (size, seed, numexperiments, maxepisodes) can be set in runner.__name__.
  • Results will be generated into /05-experiments-output, under a timestamped folder, with the following folder structure:
 - [maxepisodes]  
	- policy_data 
		- advice-coop5-topleft-bottomright 
			- One .csv file named after the map size and seed. 
		- advice-coop5-topright-bottomleft 
			- ... 
		- advice-coop10-topleft-bottomright 
			- ... 
		- advice-coop10-topright-bottomleft 
			- ... 
		- advice-synthetic-all 
			- Multiple .csv files named after the map size, seed, and the _u_ parameter used in the specific experiment. 
		- advice-synthetic-holes 
			- ... 
		- advice-synthetic-human5 
			- ... 
		- advice-synthetic-human10 
			- ... 
		- noadvice 
			- One .csv file named after the map size and seed. 
		- random 
			- ... 
	- reward_data 
		- ...  
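
The folder naming and layout described above can be sketched as follows. This is an illustration, not the code in runner.py: the function names are hypothetical, and the layout is assumed to be exactly /05-experiments-output/[NAME]/[maxepisodes]/policy_data/[CONDITION]/*.csv as shown in the tree.

```python
# Hypothetical helpers for locating experiment results.
from datetime import datetime
from pathlib import Path


def experiment_folder_name(name=None, now=None):
    """Use the given name, or fall back to a timestamp like 20240102-030405."""
    if name:
        return name
    now = now or datetime.now()
    return now.strftime("%Y%m%d-%H%M%S")


def policy_csvs_by_condition(results_root, max_episodes):
    """Map each condition folder under policy_data to its sorted .csv files."""
    policy_dir = Path(results_root) / str(max_episodes) / "policy_data"
    grouped = {}
    if policy_dir.is_dir():
        for condition_dir in sorted(policy_dir.iterdir()):
            if condition_dir.is_dir():
                grouped[condition_dir.name] = sorted(condition_dir.glob("*.csv"))
    return grouped


if __name__ == "__main__":
    folder = experiment_folder_name()
    print(folder)
    # The max_episodes value below is a placeholder, not a repository default.
    root = Path("05-experiments-output") / folder
    print(policy_csvs_by_condition(root, max_episodes=100))
```

Passing --name on the command line corresponds to calling experiment_folder_name with a non-empty name, in which case no timestamp is generated.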

Analysis and plotting

  • Run python .\04-src\analysis.py -a [METHOD_NAME] -s [True|False] --log [LOG_LEVEL].
  • Optional parameters:
    • -a [METHOD_NAME] -- The [METHOD_NAME] value is one of the following: cumulative_reward, heatmap.
    • -s [True|False] -- Stash folder results
    • --log [LOG_LEVEL] -- The [LOG_LEVEL] value is one of the following: critical, error, warn, warning, info, debug.
  • Results will be generated into /06-analysis-output
