v2.0.0 #10

ChrisMcCarthyDev · 2023-08-15T15:15:53Z

PrimAITE v2.0.0

✨ What's New

Command Line Interface

PrimAITE now comes with a CLI (built with Typer) that serves as the main entry point for those using PrimAITE out of the box.

To run the default PrimAITE Session out of the box, run:

primaite session

Application Directories

To enable PrimAITE to be used as an installed Python package, and to be used as is out-of-the-box without reliance on the repository, a collection of application directories has been created. These back-end/hidden and user-facing directories are used to store things like application log files, users' config files, users' Jupyter Notebooks, PrimAITE session outputs etc. The user needs to call primaite setup after doing pip install to perform the directory setup. The directories are structured as follows:

Windows

~/
├─ AppData/
│  ├─ primaite/
│  │  ├─ 2.0.0/
│  │  │  ├─ config/
│  │  │  ├─ logs/
│  │  │  │  ├─ primaite.log
├─ primaite/
│  ├─ 2.0.0/
│  │  ├─ config/
│  │  │  ├─ example_config/
│  │  │  │  ├─ lay_down/
│  │  │  │  ├─ training/
│  │  ├─ notebooks/
│  │  │  │  ├─ primaite_demo_notebooks/
│  │  ├─ sessions/

Linux

~/
├─ .cache/
│  ├─ primaite/
│  │  ├─ 2.0.0/
│  │  │  ├─ logs/
│  │  │  │  ├─ primaite.log
├─ .config/
│  ├─ primaite/
│  │  ├─ 2.0.0/
├─ .local/
│  ├─ share/
│  │  ├─ primaite/
│  │  │  ├─ 2.0.0/
├─ primaite/
│  ├─ 2.0.0/
│  │  ├─ config/
│  │  │  ├─ example_config/
│  │  │  │  ├─ lay_down/
│  │  │  │  ├─ training/
│  │  ├─ notebooks/
│  │  │  │  ├─ primaite_demo_notebooks/
│  │  ├─ sessions/

MacOS

~/
├─ Library/
│  ├─ Application Support/
│  │  ├─ Logs/
│  │  │  ├─ primaite/
│  │  │  │  ├─ 2.0.0/
│  │  │  │  │  ├─ log/
│  │  │  │  │  │  ├─ primaite.log
│  │  ├─ Preferences/
│  │  │  ├─ primaite/
│  │  │  │  ├─ 2.0.0/
│  │  ├─ primaite/
│  │  │  ├─ 2.0.0/
├─ primaite/
│  ├─ 2.0.0/
│  │  ├─ config/
│  │  │  ├─ example_config/
│  │  │  │  ├─ lay_down/
│  │  │  │  ├─ training/
│  │  ├─ notebooks/
│  │  │  │  ├─ primaite_demo_notebooks/
│  │  ├─ sessions/
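
As a rough illustration of the user-facing part of these trees, the sketch below builds the ~/primaite/<version>/ paths with pathlib. The helper name user_facing_dirs is hypothetical and not part of PrimAITE; the directories themselves are created by primaite setup, not by this code.

```python
from pathlib import Path
from typing import Dict


def user_facing_dirs(version: str, home: Path) -> Dict[str, Path]:
    """Return the user-facing directories shown in the trees above.

    Hypothetical helper for illustration only; it computes paths and
    does not create anything on disk.
    """
    root = home / "primaite" / version
    return {
        "config": root / "config",
        "example_config": root / "config" / "example_config",
        "notebooks": root / "notebooks",
        "sessions": root / "sessions",
    }


# Print where each user-facing directory would live for the current user.
for name, path in user_facing_dirs("2.0.0", Path.home()).items():
    print(f"{name}: {path}")
```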

Support for Ray RLlib

PrimAITE now supports the training of PPO and A2C agents using both Stable Baselines3 and Ray RLlib. The RL framework and agent algorithm to be used for training are determined by the agent_framework and agent_identifier configurable items in the training config file. If agent_framework=RLLIB, the backend Ray RLlib RL framework can be configured to use either TensorFlow, TensorFlow 2.x, or PyTorch using the deep_learning_framework item.

# Sets which agent algorithm framework will be used.
# Options are:
# "SB3" (Stable Baselines3)
# "RLLIB" (Ray RLlib)
# "CUSTOM" (Custom Agent)
agent_framework: RLLIB

# Sets which deep learning framework will be used (by RLlib ONLY).
# Default is TF (Tensorflow).
# Options are:
# "TF" (Tensorflow)
# TF2 (Tensorflow 2.X)
# TORCH (PyTorch)
deep_learning_framework: TF2

# Sets which Agent class will be used.
# Options are:
# "A2C" (Advantage Actor-Critic coupled with either SB3 or RLLIB agent_framework)
# "PPO" (Proximal Policy Optimization coupled with either SB3 or RLLIB agent_framework)
# "HARDCODED" (The HardCoded agents coupled with an ACL or NODE action_type)
# "DO_NOTHING" (The DoNothing agents coupled with an ACL or NODE action_type)
# "RANDOM" (primaite.agents.simple.RandomAgent)
# "DUMMY" (primaite.agents.simple.DummyAgent)
agent_identifier: PPO
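
The option values listed in the comments above can be checked before a session starts. The following is a minimal sketch under that assumption; check_training_config is a hypothetical helper, not a PrimAITE function, and it only validates the three items shown here.

```python
from typing import Dict, Optional

# Option values copied from the config comments above.
AGENT_FRAMEWORKS = {"SB3", "RLLIB", "CUSTOM"}
DEEP_LEARNING_FRAMEWORKS = {"TF", "TF2", "TORCH"}
AGENT_IDENTIFIERS = {"A2C", "PPO", "HARDCODED", "DO_NOTHING", "RANDOM", "DUMMY"}


def check_training_config(config: Dict[str, Optional[str]]) -> None:
    """Raise ValueError if an option is not one of the documented values.

    Hypothetical helper for illustration; not part of the PrimAITE API.
    """
    framework = config.get("agent_framework")
    if framework not in AGENT_FRAMEWORKS:
        raise ValueError(f"Unknown agent_framework: {framework}")

    identifier = config.get("agent_identifier")
    if identifier not in AGENT_IDENTIFIERS:
        raise ValueError(f"Unknown agent_identifier: {identifier}")

    # deep_learning_framework is only used by RLlib; the default is TF.
    if framework == "RLLIB":
        dl_framework = config.get("deep_learning_framework") or "TF"
        if dl_framework not in DEEP_LEARNING_FRAMEWORKS:
            raise ValueError(f"Unknown deep_learning_framework: {dl_framework}")


# The example config above passes the check without raising.
check_training_config(
    {"agent_framework": "RLLIB", "deep_learning_framework": "TF2", "agent_identifier": "PPO"}
)
```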

Random red agent

A random red agent has been provided to train the blue agent against. The random red agent will choose a random number of nodes to attack, as well as randomly choosing the actions to perform on the environment.

# Sets whether Red Agent POL and IER are randomised.
# Default is False.
# Options are:
# True
# False
random_red_agent: False

Repeatability of sessions

A seed can now be provided in the training configuration file. The seed will be used across PrimAITE so that a repeatable run is achievable. The seed needs to be an integer value and by default is set to null.

The ability to set PrimAITE to use deterministic or stochastic evaluation is also added.

# The (integer) seed to be used in random number generation
# Default is None (null)
seed: null

# Set whether the agent evaluation will be deterministic instead of stochastic
# Default is False (stochastic).
# Options are:
# True
# False
deterministic: False
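
To show what an integer seed buys in general, the sketch below draws values from a seeded generator twice and gets the same sequence. It uses only the standard library random module and makes no claim about how PrimAITE propagates the seed internally; sample_actions is a hypothetical name.

```python
import random
from typing import List, Optional


def sample_actions(seed: Optional[int], count: int = 5) -> List[int]:
    """Draw `count` pseudo-random action indices from a dedicated generator.

    With an integer seed the sequence is repeatable; with seed=None the
    generator is seeded from system entropy and runs will differ.
    """
    rng = random.Random(seed)
    return [rng.randrange(10) for _ in range(count)]


# Two runs with the same seed produce the same sequence.
print(sample_actions(42) == sample_actions(42))
```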

Session loading

PrimAITE can now load previously run sessions for SB3 agents (SB3 agents only, see Known Issues). This can be done via:

CLI

primaite session --load "<PREVIOUS_SESSION_DIRECTORY>"

Python

from primaite.main import run

run(session_path="<PREVIOUS_SESSION_DIRECTORY>")

The output for the loaded session will be in the target directory i.e. the PREVIOUS_SESSION_DIRECTORY. While most of the outputs won't be overwritten, the agent zip file will be overwritten.

Agent Session Classes

An AgentSessionABC class with SB3Agent and RLlib subclasses and a HardCodedAgentSessionABC class with various hard-coded agent subclasses have been created. The Agent Session classes act as a wrapper around various RL and hard-coded agents. They help to standardise how agents are trained in PrimAITE using a common interface. They also provide a suite of standardised session outputs.

Standardised Session Output

When a session is run, a session output sub-directory is created in the user's app sessions directory (~/primaite/2.0.0/sessions). The sub-directory is formatted as such: ~/primaite/2.0.0/sessions/<yyyy-mm-dd>/<yyyy-mm-dd>_<hh-mm-ss>/. This session directory is populated with four types of outputs:

Session Metadata
Results
Diagrams
Saved agents (training checkpoints and a final trained agent)
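
The timestamped layout described above can be reproduced with datetime.strftime. This is a minimal sketch; session_dir is a hypothetical helper name and the sessions root is passed in rather than looked up.

```python
from datetime import datetime
from pathlib import Path


def session_dir(sessions_root: Path, started: datetime) -> Path:
    """Build <root>/<yyyy-mm-dd>/<yyyy-mm-dd>_<hh-mm-ss> for a session start time.

    Hypothetical helper for illustration; it only computes the path.
    """
    date_str = started.strftime("%Y-%m-%d")
    time_str = started.strftime("%H-%M-%S")
    return sessions_root / date_str / f"{date_str}_{time_str}"


# Start time taken from the example session directory in these notes.
path = session_dir(Path("sessions"), datetime(2023, 7, 18, 11, 6, 4))
print(path.as_posix())  # sessions/2023-07-18/2023-07-18_11-06-04
```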

Example Session Directory Structure

~/
└── primaite/
    └── 2.0.0/
        └── sessions/
            └── 2023-07-18/
                └── 2023-07-18_11-06-04/
                    ├── evaluation/
                    │   ├── all_transactions_2023-07-18_11-06-04.csv
                    │   ├── average_reward_per_episode_2023-07-18_11-06-04.csv
                    │   └── average_reward_per_episode_2023-07-18_11-06-04.png
                    ├── learning/
                    │   ├── all_transactions_2023-07-18_11-06-04.csv
                    │   ├── average_reward_per_episode_2023-07-18_11-06-04.csv
                    │   ├── average_reward_per_episode_2023-07-18_11-06-04.png
                    │   ├── checkpoints/
                    │   │   └── sb3ppo_5.zip
                    │   ├── SB3_PPO.zip
                    │   └── tensorboard_logs/
                    │       ├── PPO_1/
                    │       │   └── events.out.tfevents.1689674765.METD-9PMRFB3.42960.0
                    │       ├── PPO_2/
                    │       │   └── events.out.tfevents.1689674766.METD-9PMRFB3.42960.1
                    │       ├── PPO_3/
                    │       │   └── events.out.tfevents.1689674766.METD-9PMRFB3.42960.2
                    │       ├── PPO_4/
                    │       │   └── events.out.tfevents.1689674767.METD-9PMRFB3.42960.3
                    │       └── PPO_5/
                    │           └── events.out.tfevents.1689674767.METD-9PMRFB3.42960.4
                    ├── network_2023-07-18_11-06-04.png
                    └── session_metadata.json

PrimaiteSession Class

The PrimaiteSession class acts as a single wrapper around the Agent Session classes. It is both an entry point and a broker for the individual Agent Session classes.

Action Space

Discrete Action Space

The NODE and ACL action spaces have been changed from multi-discrete to discrete action spaces.

NODE and ACL action spaces are both dictionaries where a single number reflects an entire action an agent can take.

The below code block is an example of a dictionary entry for the NODE action space:

{
    1: [1, 1, 1, 0],
}

The below code block is an example of a dictionary entry for the ACL action space:

{
    1: [1, 0, 1, 2, 1, 0, 3],
}
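
One way to picture the move from a multi-discrete to a discrete space is to enumerate every combination of per-field choices and key each one with a single integer. The sketch below does this with itertools.product. The field sizes are made up for illustration and enumerate_actions is a hypothetical function, so the resulting indices will not match PrimAITE's real dictionaries.

```python
from itertools import product
from typing import Dict, List, Sequence


def enumerate_actions(field_sizes: Sequence[int]) -> Dict[int, List[int]]:
    """Map one integer to each combination of per-field choices.

    field_sizes[i] is the number of options for field i (illustrative values).
    """
    combos = product(*(range(size) for size in field_sizes))
    return {index: list(combo) for index, combo in enumerate(combos)}


actions = enumerate_actions([2, 2, 2, 1])
print(len(actions))  # 8
print(actions[7])    # [1, 1, 1, 0]
```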

Combined Action Spaces

A new ANY action space option has been introduced. This allows the agent to do both NODE actions and ACL actions in the same episode (e.g., scan a node in Step 1 and create an ACL rule in Step 2).

The below code block is an example of a dictionary entry for the ANY action space:

{
    0: [1, 0, 0, 0], 
    1: [1, 1, 1, 0], 
    2: [1, 0, 1, 2, 1, 0, 3]
}

is_valid_acl_action_extra, is_valid_node_action and is_valid_acl_action in the primaite.agents.utils module help to slim down the dictionary to contain the relevant actions only.

For example, a node action to do PATCHING on a node's hardware state CANNOT happen so it is not added to the dictionary.
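
The combining and slimming-down steps can be sketched as follows. The signatures of the real validity functions are not shown in these notes, so the sketch takes a generic predicate instead; build_any_action_space and the example predicate are hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Action = List[int]


def build_any_action_space(
    node_actions: Iterable[Action],
    acl_actions: Iterable[Action],
    is_valid: Callable[[Action], bool],
) -> Dict[int, Action]:
    """Merge NODE and ACL actions into one dictionary, dropping invalid ones.

    `is_valid` stands in for the validity checks in primaite.agents.utils,
    whose real signatures are not assumed here.
    """
    candidates = list(node_actions) + list(acl_actions)
    return dict(enumerate(action for action in candidates if is_valid(action)))


node_actions = [[1, 0, 0, 0], [1, 1, 1, 0]]
acl_actions = [[1, 0, 1, 2, 1, 0, 3]]

# Accept everything: reproduces the ANY example above.
print(build_any_action_space(node_actions, acl_actions, lambda action: True))
# {0: [1, 0, 0, 0], 1: [1, 1, 1, 0], 2: [1, 0, 1, 2, 1, 0, 3]}
```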

transform_action_node_readable and transform_action_acl_readable in the primaite.agents.utils module convert an enumerated node or ACL action into a more readable form. The readable form is used by functions such as is_valid_node_action to determine whether the action is valid.

An example using transform_action_node_readable:

{ 
    # Converts node action into readable form
    [1, 3, 1, 0] -> [1, 'SERVICE', 'PATCHING', 0]
}

ACL Action Space

Previously, the ACL action space was made up of 6 items: [Decision, Permission, Source, Dest, Protocol, Port].

Now the agent can specifically choose where to place the ACL in the Access Control List:

  `[Decision, Permission, Source, Dest, Protocol, Port, Position]`

Position -> [0 to length of ACL List]
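
A position in the range 0 to the current length of the list maps naturally onto list.insert. The sketch below assumes rules are plain tuples and that an out-of-range position is an error; both are assumptions for illustration, and insert_rule is a hypothetical helper.

```python
from typing import List, Tuple

# (permission, source, dest, protocol, port) with illustrative values.
Rule = Tuple[str, str, str, str, str]


def insert_rule(acl: List[Rule], rule: Rule, position: int) -> None:
    """Insert `rule` at `position`, where 0 <= position <= len(acl)."""
    if not 0 <= position <= len(acl):
        raise ValueError(f"position must be between 0 and {len(acl)}")
    acl.insert(position, rule)


acl: List[Rule] = [("ALLOW", "192.168.1.2", "192.168.1.3", "TCP", "80")]
insert_rule(acl, ("DENY", "ANY", "ANY", "ANY", "ANY"), 0)
print(acl[0][0])  # DENY
```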

Observation Space

Configurable observation space

Management of the observation space in Primaite env has been outsourced to a dedicated class, ObservationHandler. This is part of a new submodule, primaite.environment.observation. The observation space is now built from smaller components. The components can take optional user parameters at initialisation. Each component is responsible for querying the Primaite env and formatting information into a numpy array. The handler orchestrates the updating of components' data and combines the data from multiple components to create a composite observation space and observations.

The observation space configuration is set in the training config file. The below code block is an example of an observation space configuration:

   - item_type: OBSERVATION_SPACE
     components:
     - name: LINK_TRAFFIC_LEVELS
     - name: NODE_STATUSES
     - name: LINK_TRAFFIC_LEVELS
     - name: ACCESS_CONTROL_LIST
       options:
         combine_service_traffic: false
         quantisation_levels: 8
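
The handler-and-components arrangement can be pictured with a small sketch. To stay dependency-free it uses plain lists where the real components produce numpy arrays, and the class name SketchObservationHandler and its methods are hypothetical rather than the ObservationHandler API.

```python
from typing import Callable, List

# Each component is modelled as a callable returning a flat list of ints.
Component = Callable[[], List[int]]


class SketchObservationHandler:
    """Collects components and concatenates their data into one observation."""

    def __init__(self) -> None:
        self.components: List[Component] = []

    def register(self, component: Component) -> None:
        """Add a component; order of registration is the order of the output."""
        self.components.append(component)

    def observe(self) -> List[int]:
        """Query every component and join the results into a single list."""
        observation: List[int] = []
        for component in self.components:
            observation.extend(component())
        return observation


handler = SketchObservationHandler()
handler.register(lambda: [1, 1, 0])  # stand-in for node statuses
handler.register(lambda: [3, 7])     # stand-in for link traffic levels
print(handler.observe())  # [1, 1, 0, 3, 7]
```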

New observation types

Observations about node hardware and software states can now be formatted as MultiDiscrete gym spaces.
The same thing is possible for link traffic amounts.

Changes to the Access Control List

The acl dictionary in AccessControlList is now a list. This accommodates changes made to ACL action space and the positioning of ACLRules inside the list to signal their level of priority.

Blue Agent actions on Nodes

Previously, when a node's hardware state == HARDWARE_STATE.OFF there was no validation in active_node.py and service_node.py. This meant the agent could change a Node's running service_state or software_state even though the Node was OFF.

Logic has now been implemented to check if a node is OFF before executing any actions on the Node by the blue agent such as a change of service state, file system state or software state.
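
The guard amounts to checking the hardware state before applying any change. The sketch below assumes a two-value state enum and a string software state; GuardedNode, its method name and the state values are hypothetical and simplified from the real node classes.

```python
from enum import Enum


class HardwareState(Enum):
    """Simplified hardware states for this sketch."""

    ON = 1
    OFF = 2


class GuardedNode:
    """Node that ignores state changes while its hardware is OFF."""

    def __init__(self, hardware_state: HardwareState, software_state: str) -> None:
        self.hardware_state = hardware_state
        self.software_state = software_state

    def set_software_state(self, new_state: str) -> bool:
        """Apply the change only if the node is not OFF; report whether it was applied."""
        if self.hardware_state is HardwareState.OFF:
            return False
        self.software_state = new_state
        return True


node = GuardedNode(HardwareState.OFF, "GOOD")
print(node.set_software_state("PATCHING"), node.software_state)  # False GOOD
```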

Reduced default reward values

Reward values in the default config files have been decreased by a factor of 10000.

Fixing the Functionality of Resetting a Node

A "SHUTTING DOWN" operating state and a "BOOTING" operating state have been added. Each lasts for a configurable number of steps and is applied when the blue agent issues a "node off" or "node on" action respectively. Issuing a reset command is equivalent to issuing these two instructions in sequence (off, then on), and it takes the sum of their step counts. This lets the blue agent learn to issue a single instruction (i.e. reset) rather than clunkily issuing a shutdown and then a startup at precisely the right time, which is probably harder to learn.
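
The timing of a reset can be sketched as a small state machine. The class ResettableNode, its state names as strings, and the assumption that both step counts are at least 1 are illustrative choices, not the PrimAITE implementation.

```python
class ResettableNode:
    """Node whose reset is a shutdown followed by a boot.

    Assumes shutdown_steps >= 1 and boot_steps >= 1.
    """

    def __init__(self, shutdown_steps: int, boot_steps: int) -> None:
        self.shutdown_steps = shutdown_steps
        self.boot_steps = boot_steps
        self.state = "ON"
        self._remaining = 0
        self._boot_after_shutdown = False

    def reset(self) -> None:
        """Start shutting down and remember to boot afterwards."""
        self.state = "SHUTTING_DOWN"
        self._remaining = self.shutdown_steps
        self._boot_after_shutdown = True

    def step(self) -> None:
        """Advance one time step, moving through the transitional states."""
        if self.state not in ("SHUTTING_DOWN", "BOOTING"):
            return
        self._remaining -= 1
        if self._remaining > 0:
            return
        if self.state == "SHUTTING_DOWN" and self._boot_after_shutdown:
            self.state = "BOOTING"
            self._remaining = self.boot_steps
            self._boot_after_shutdown = False
        elif self.state == "SHUTTING_DOWN":
            self.state = "OFF"
        else:
            self.state = "ON"


node = ResettableNode(shutdown_steps=2, boot_steps=3)
node.reset()
for _ in range(5):  # 2 shutdown steps + 3 boot steps
    node.step()
print(node.state)  # ON
```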

🐛 Bug Fixes

The active and service node objects now check that their hardware state is not OFF before changing any states, including the File System State, the Software State and the states of any running services.
The reset function for the operating state of the node now sets compromised or overwhelmed services or operating systems back to a good state.

♻️ Refactoring

Package Structure

The overall PrimAITE repository and package structure have been refactored to enable the build, distribution, and installation of Python wheels without reliance on the repository itself. To make this happen, the following work has been done:

All source code now sits inside the src/ directory.
The PRIMAITE Python package was renamed to primaite to adhere to PEP-8 Package & Module Names.
A src/primaite/VERSION file exists to act as a single source of truth for the PrimAITE version.
Docs now sit outside of the src/ directory.
Tests now sit outside of the src/ directory.
Non-python files (example config files, Jupyter notebooks, etc.) now sit inside a */_package_data/ directory in their respective sub-package.
A MANIFEST.in file was added to define which non-python files are to be included in the build (e.g. package data and VERSION file).
All dependencies are now defined in the pyproject.toml file. See Engineering Notes/pyproject.toml below for more info.

Enum classes now in Pascal Case

All Enum classes in primaite.common.enums have been re-written in Pascal Case to comply with PEP-8 Class Names.

Config keys now in snake case

All config keys have been re-written in snake case. This enables faster instantiation as keys can automatically be passed to a constructor as the YAML keys match the config class keys. The config load functions are backwards compatible with legacy config files (pass legacy=True as an arg).

To run a PrimAITE session using legacy files, pass the --legacy-tc and/or the --legacy-ldc options to primaite session:

primaite session --tc legacy_training_config.yaml --legacy-tc --ldc legacy_lay_down_config.yaml --legacy-ldc

Config Files Decoupled

Previously, the lay down config file needed to be in the same directory as the training config, or the training config needed to hard-code the full path of the lay down config. This coupling was restrictive and cumbersome to manage. Also, there was no benefit in having a 1:1 mapping between training and lay down config files. The two configs have been decoupled and can be used interchangeably as they do not depend on one another.

Training config items moved from lay down to training

The lay down config file had:

- item_type: ACTIONS
  type: NODE

and

- item_type: STEPS
  steps: 256

While the training had:

# Number of episodes to run per session
num_episodes: 10

This meant that to configure a training session you had to have values on both the training config and lay down config. These two values from the lay down config have now been moved over to the training config.

In addition, there is a new configuration where you can set different time steps and episode counts depending on whether the session running is a training or an evaluation session. There are two different config values for the number of episodes for training and for evaluation (the same applies to the number of time steps). These can be explicitly specified in the config or a default value will be assigned:

# Sets How the Action Space is defined:
# "NODE"
# "ACL"
# "ANY" node and acl actions
action_type: NODE

# Number of episodes for training to run per session
num_train_episodes: 10

# Number of time_steps for training per episode
num_train_steps: 256

# Number of episodes for evaluation to run per session
num_eval_episodes: 1

# Number of time_steps for evaluation per episode
num_eval_steps: 256

Update to Node attribute Naming

An explicit and consistent naming convention has been enforced on the Node class and its children, ActiveNode, PassiveNode, and ServiceNode. These naming changes have also been carried through to where the Nodes are used in primaite.environment.primaite_env.Primaite.

Constructor params

_id -> node_id
_name -> name
_type -> node_type
_priority -> priority
_state -> hardware_state
_ip_address -> ip_address
_os_state -> software_state
_file_system_state -> file_system_state
_config_values -> config_values

Instance variables

self.type -> self.node_type
self.operating_state -> self.hardware_state
self.os_state -> self.software_state

Lay Down config file

id -> node_id
portsList -> ports_list
serviceList -> service_list
baseType -> base_type
nodeType -> node_type
hardwareState -> hardware_state
ipAddress -> ip_address
softwareState -> software_state
fileSystemState -> file_system_state
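
The lay down key renames above are a camelCase to snake_case conversion, with id -> node_id as the one special case. The sketch below shows that mapping with a regular expression; to_snake_case and convert_legacy_keys are hypothetical helpers and are not the legacy loader PrimAITE ships.

```python
import re
from typing import Any, Dict

# Renames that are not a pure case conversion (taken from the list above).
_SPECIAL_CASES = {"id": "node_id"}


def to_snake_case(key: str) -> str:
    """Convert a legacy camelCase key to its snake_case equivalent."""
    if key in _SPECIAL_CASES:
        return _SPECIAL_CASES[key]
    # Insert an underscore before every capital letter that is not the first character.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", key).lower()


def convert_legacy_keys(item: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a legacy node item with all keys renamed."""
    return {to_snake_case(key): value for key, value in item.items()}


legacy_node = {"id": "1", "ipAddress": "192.168.1.2", "fileSystemState": "GOOD"}
print(convert_legacy_keys(legacy_node))
# {'node_id': '1', 'ip_address': '192.168.1.2', 'file_system_state': 'GOOD'}
```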

Primaite instance variables

_id -> node_id
_name -> name
_type -> node_type
_priority -> priority
_state -> hardware_state
_ip_address -> ip_address
_os_state -> software_state
_file_system_state -> file_system_state

Transactions

The transactions file no longer reports both the pre-action and post-action observations; only the pre-action observation is saved. The CSV header has more human-readable descriptions for columns relating to the observations.

Misc

PrimAITE is now fully type hinted.

Docstring documentation coverage was greatly increased and docstring formatting has been standardised to follow ReST syntax in most submodules.

🚦 Tests

The engineering team has made great strides towards increasing the test code coverage. This is an ongoing process.

📚 Docs

Currently, while the repository is private, we can't host the docs in the repo pages. As a temporary measure, they've been converted to PDF and added here:
PrimAITE v2.0.0 Docs.pdf

Documentation has been overhauled with the following changes:

PrimAITE API is automatically documented using recursive Sphinx auto-summary.
PrimAITE tests are automatically documented using recursive Sphinx auto-summary.
Furo theme is used to enable a responsive light/dark theme.
sphinx-code-tabs and sphinx-copybutton now make it easier to navigate between code blocks and copy the code for use.

🛠 Engineering Notes

pyproject.toml

Migrated from the legacy setup.py to pyproject.toml following pypa/pip #8368 ("Deprecate call to setup.py install when building a wheel failed for source distributions without pyproject.toml").
As PrimAITE still has a dependency on openai/gym, we've had to force specific versions of build, setuptools, and wheel when installing PrimAITE from source with the dev extra. A future release will see PrimAITE migrate to Farama-Foundation/Gymnasium as soon as Stable Baselines3 and Ray RLlib are aligned on the version of Farama-Foundation/Gymnasium they depend on. Once this has happened, the dependency on specific versions of build, setuptools, and pip will be dropped.

⚡️ Performance Notes

PrimAITE v2.0.0 was benchmarked automatically upon release. Learning rate metrics were captured to be referenced during system-level testing and user acceptance testing (UAT).

The benchmarking process consists of running 10 training sessions using the same training and lay down config files. Each session trains an agent for 500 episodes, with each episode consisting of 256 steps. The mean reward per episode from each session is captured. This is then used to calculate a combined average reward per episode from the 10 individual sessions for smoothing. Finally, a 25-window rolling average of the combined average reward per session is calculated for further smoothing.

Metric	Result
Total Sessions	10
Total Episodes	5000
Total Steps	1280000
Av Session Duration (s)	1005.8384
Av Step Duration (s)	0.0079
Av Duration per 100 Steps per 10 Nodes (s)	0.8731
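
The totals in the table and the two smoothing steps can be sketched in a few lines. The function names are hypothetical, and the treatment of the first incomplete windows (dropped here) is an assumption, since the notes do not say how the benchmark handles them.

```python
from statistics import mean
from typing import List, Sequence


def combined_average(sessions: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-episode mean reward across sessions, episode by episode."""
    return [mean(values) for values in zip(*sessions)]


def rolling_average(values: Sequence[float], window: int) -> List[float]:
    """Trailing rolling mean; emits one value per full window (assumption)."""
    return [mean(values[i - window + 1 : i + 1]) for i in range(window - 1, len(values))]


# Totals from the benchmark description: 10 sessions x 500 episodes x 256 steps.
sessions, episodes, steps = 10, 500, 256
print(sessions * episodes)                     # 5000
print(sessions * episodes * steps)             # 1280000
print(round(1005.8384 / (episodes * steps), 4))  # 0.0079

# Tiny made-up rewards to show the smoothing (the benchmark uses a window of 25).
rewards = [[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]]
print(rolling_average(combined_average(rewards), window=2))  # [2.5, 3.5, 4.5]
```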

PrimAITE 2.0.0 Learning Benchmark Plot

The full benchmark report is available here:
PrimAITE v2.0.0 Learning Benchmark.pdf

⚠️ Known Issues

Attempting to run an untrained RLlib agent in EVAL only mode will result in a RLlibAgentError with the error message Cannot evaluate an RLlib agent that hasn't been through training yet.
Attempting to load a previous RLlib session will result in a NotImplementedError. This feature will be added in a future release.

💫 Install & Run

Currently, the PrimAITE wheel can only be installed from GitHub. This may change in the future with a release to PyPI.

Windows (PowerShell)

Prerequisites:

Manual install of Python >= 3.8 < 3.11

Install:

mkdir ~\primaite\2.0.0
cd ~\primaite\2.0.0
python3 -m venv .venv
attrib +h .venv /s /d # Hides the .venv directory
.\.venv\Scripts\activate
python -m pip install pip==23.0.1 --upgrade 
pip install wheel==0.38.4 --upgrade
pip install setuptools==66 --upgrade
pip install https://github.com/Autonomous-Resilient-Cyber-Defence/PrimAITE/releases/download/v2.0.0/primaite-2.0.0-py3-none-any.whl
primaite setup

Unix

Prerequisites:

Manual install of Python >= 3.8 < 3.11

sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt install python3.10
sudo apt-get install python3-pip
sudo apt-get install python3-venv

Install:

mkdir -p ~/primaite/2.0.0
cd ~/primaite/2.0.0
python3 -m venv .venv
source .venv/bin/activate
python -m pip install pip==23.0.1 --upgrade 
pip install wheel==0.38.4 --upgrade
pip install setuptools==66 --upgrade
pip install https://github.com/Autonomous-Resilient-Cyber-Defence/PrimAITE/releases/download/v2.0.0/primaite-2.0.0-py3-none-any.whl
primaite setup

Contributors

@jamesshort1 - PrimAITE product owner
@ChrisMcCarthyDev - PrimAITE engineering team lead
@czar-ec-envitia - PrimAITE engineering team
@marek-methods - PrimAITE engineering team
@sunilsamra - PrimAITE engineering team
@briankanyora - PrimAITE engineering team

Full Changelog: https://github.com/Autonomous-Resilient-Cyber-Defence/PrimAITE/commits/v2.0.0

This discussion was created from the release v2.0.0.
