marekpetrik/craam2

CRAAM: Robust And Approximate Markov decision processes

CRAAM is a header-only C++ library for solving Markov decision processes with support for uncertainty in transition probabilities. The library can handle this uncertainty using either robust or optimistic objectives.

The library includes Python and R interfaces. See below for detailed installation instructions.

When using the robust objective, adversarial nature chooses the worst plausible realization of the uncertain values. When using the optimistic objective, collaborative nature chooses the best plausible realization of the uncertain values.
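
The difference between the two objectives can be illustrated with a minimal, self-contained C++ sketch. It does not use the CRAAM API: the names `expected_value`, `robust_value`, and `optimistic_value` are hypothetical, and the sketch assumes that the plausible realizations form a finite list of transition distributions.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <numeric>
#include <vector>

// Illustration only; these names are hypothetical and not part of CRAAM.
using Distribution = std::vector<double>;

// Expected next-state value under one plausible transition distribution.
double expected_value(const Distribution& p, const std::vector<double>& v) {
    assert(p.size() == v.size());
    return std::inner_product(p.begin(), p.end(), v.begin(), 0.0);
}

// Robust objective: adversarial nature picks the worst plausible realization.
double robust_value(const std::vector<Distribution>& plausible,
                    const std::vector<double>& v) {
    assert(!plausible.empty());
    double worst = std::numeric_limits<double>::infinity();
    for (const auto& p : plausible)
        worst = std::min(worst, expected_value(p, v));
    return worst;
}

// Optimistic objective: collaborative nature picks the best plausible realization.
double optimistic_value(const std::vector<Distribution>& plausible,
                        const std::vector<double>& v) {
    assert(!plausible.empty());
    double best = -std::numeric_limits<double>::infinity();
    for (const auto& p : plausible)
        best = std::max(best, expected_value(p, v));
    return best;
}
```

For a fixed set of plausible distributions, the robust value never exceeds the optimistic value; the two coincide when the set contains a single distribution.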

The library also provides tools for basic simulation, for constructing MDPs from samples, and for value function approximation. The supported objectives are infinite-horizon discounted MDPs, finite-horizon MDPs, and stochastic shortest path problems [Puterman2005]; for stochastic shortest path problems, only some basic methods are supported. The library assumes maximization over actions. The number of states and actions must be finite.

The library is based on two main data structures: MDP and MDPO. MDP is the standard model that consists of states S, actions A, and transition probabilities P(s, a, ⋅). Note that robust solutions are constrained to be absolutely continuous with respect to P(s, a, ⋅); that is, nature cannot assign a positive probability to a transition that has zero probability under P(s, a, ⋅). This is a hard requirement for all choices of ambiguity (or uncertainty).

The MDPO model adds a set of outcomes that model the actions that nature can take. Outcomes make it more convenient to capture correlations between the ambiguity in rewards and the uncertainty in transition probabilities. They also make it much easier to represent uncertainties that lie in low-dimensional vector spaces. Constraints on nature's distributions over outcomes are also supported.

The available algorithms are value iteration and modified policy iteration. The library supports both the plain worst-case outcome method and a worst case with respect to a base distribution.
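
The two ways of choosing nature's response can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the text above does not say how the distance from the base distribution is measured, so the sketch assumes an L1 ball of radius `budget` around the base distribution. The function names `worst_outcome` and `worst_case_l1` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <vector>

// Illustration only; these names are hypothetical and not part of CRAAM.

// Plain worst case: nature may put all probability on any single outcome.
double worst_outcome(const std::vector<double>& z) {
    assert(!z.empty());
    return *std::min_element(z.begin(), z.end());
}

// Worst case with respect to a base distribution (ASSUMED L1 ambiguity set):
// returns the distribution p minimizing sum_i p[i] * z[i]
// subject to ||p - base||_1 <= budget and p being a probability distribution.
std::vector<double> worst_case_l1(const std::vector<double>& z,
                                  const std::vector<double>& base,
                                  double budget) {
    assert(!z.empty() && z.size() == base.size() && budget >= 0.0);
    std::vector<double> p = base;
    const std::size_t lowest = static_cast<std::size_t>(
        std::distance(z.begin(), std::min_element(z.begin(), z.end())));

    // Moving a mass of eps changes the L1 distance by 2 * eps.
    double remaining = std::min(budget / 2.0, 1.0 - base[lowest]);
    p[lowest] += remaining;

    // Take the moved mass away from the outcomes with the highest values first.
    std::vector<std::size_t> order(z.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::sort(order.begin(), order.end(),
              [&z](std::size_t a, std::size_t b) { return z[a] > z[b]; });
    for (std::size_t i : order) {
        if (remaining <= 0.0) break;
        if (i == lowest) continue;
        const double moved = std::min(remaining, p[i]);
        p[i] -= moved;
        remaining -= moved;
    }
    return p;
}
```

With a budget of 0 the result equals the base distribution, and with a budget of 2 the sketch reduces to the plain worst-case outcome.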

Installing R Package

The R package exposes most of the functions of the library. Method signatures are expected to change. The package should work on Linux, Mac, and Windows (with RTools 4.0+). R version 4.0 is required, and the C++ compiler must support the C++20 standard.

Gurobi: To enable methods that use Gurobi, you must install Gurobi (with a license) and set GUROBI_PATH to the Gurobi directory that has the subdirectories include and lib. Also libgurobi90.so (on Linux) or equivalent (on Windows/Mac) must be in the library directory (or set LD_LIBRARY_PATH).

Linux and Mac

A stable (and possibly stale) version of the package can be installed directly from the GitHub repository using remotes:

install.packages("remotes")
remotes::install_github("marekpetrik/craam2/rcraam")

A development version can be installed from GitLab as follows:

install.packages("remotes")
remotes::install_gitlab("RLsquared/craam2", "rcraam")

To download and install a local development version, run:

git clone git@gitlab.com:RLsquared/craam2.git
cd craam2/rcraam
R CMD INSTALL . --preclean

Windows

You also need to install Rtools 4.0 or later. To avoid having to configure the compilation paths manually, also install pkgbuild. The following code should install everything automatically:

install.packages(c("remotes","pkgbuild"))
remotes::install_github("marekpetrik/craam2/rcraam")

R Development

The C++ sources in the directories craam and includes are currently replicated in rcraam/inst/includes. We do not use symlinks because they are not supported on Windows, which would make it impossible to use remotes::install_.... The script rcraam/copy_libs.sh (run with bash or a similar shell) copies the latest version of the appropriate C++ files to rcraam/inst/includes.

Installing C++ Library

It is sufficient to copy the entire root directory to a convenient location.

Numerous asserts are enabled in the code by default. To disable them, make sure to insert the following line before including any files:

#define NDEBUG

Or use the -DNDEBUG compiler switch.

To make sure that asserts are disabled, you may also want to double check the file /craam/config.hpp which is auto-generated by cmake.

The library has minimal dependencies and was tested on Linux. It also compiles on macOS using recent Xcode versions. It has not been tested on Windows.

Requirements

  • A compiler that supports at least C++17; tested with a C++20-compatible compiler (GCC 8+)

Optional Dependencies

  • CMake 3.17.3 to build the tests, the command-line executable, and the documentation
  • Gurobi 9 for using robust objectives that require a linear program solver. Set GUROBI_PATH to the location of the Gurobi files (with subdirectories include and lib).
  • OpenMP to enable parallel computation
  • Doxygen 1.8.0+ to generate documentation
  • Boost for compiling and running unit tests (boost-devel package, libboost-all-dev package on some distributions)

Documentation

The project uses Doxygen for the documentation. To generate the documentation after generating the build files with cmake, run:

    $ cmake --build . --target docs

This automatically generates both HTML and PDF documentation in the folder out.

Run unit tests

Note that Boost must be present in order to build the tests in the first place.

    $ cmake .
    $ cmake --build . --target testit

C++ Development

The instructions above generate a release version of the project. The release version is optimized for speed, but it lacks debugging symbols and omits many intermediate checks. For development purposes, it is better to use the Debug version of the code, which can be generated as follows:

    $ cmake -DCMAKE_BUILD_TYPE=Debug .
    $ cmake --build .

The release version that omits many of the time-consuming debugging checks can be compiled as:

    $ cmake -DCMAKE_BUILD_TYPE=Release .
    $ cmake --build .

Gurobi: To enable methods that use Gurobi, you must install Gurobi (with a license) and set GUROBI_PATH to the Gurobi directory that has the subdirectories include and lib. Also libgurobi90.so (on Linux) or equivalent (on Windows/Mac) must be in the system library path (or set LD_LIBRARY_PATH).

Qt Creator is a nice IDE that can parse and run CMake projects directly. As an alternative, CMake can also generate CodeBlocks project files:

  $ cmake . -G "CodeBlocks - Ninja"

To list other types of projects that CMake can generate, call:

  $ cmake . -G

Pull/Merge Request Guidelines

  1. Do not use git add to add files to commits. Instead, call git commit -a in order to avoid adding spurious files.

  2. All C++ code should be formatted using clang-format and the style file craam/.clang-format. For example:

         clang-format -i -style=file Action.hpp

  3. Please do not add any files that are proprietary or not licensed under a permissive license (MIT/BSD).

  4. Do not remove any of the libraries that are already included in the repository (eigen3 and others). They are included in the repository for a purpose.

  5. Do not include any of the auto-generated configuration files in the repository.

  6. Make sure that all unit tests pass and that rcraam installs and loads correctly.

  7. See the R Development section above to make sure that the changes to the C++ code are reflected in the R package (that is, run the script described in that section).

Other Steps

Build and Run Command-line Executable

To run a benchmark problem, download and decompress one of the following test files:

These two benchmark problems were generated from a uniform random distribution.

Download the code.

    $ git clone --depth 1 https://gitlab.com/RLsquared/craam2

Optionally, you can (re)install Eigen in the includes directory (requires bash or Cygwin on Windows). This is not necessary since the correct Eigen distribution is already included in the project git repository.

    $ ./install_eigen.sh

To install it manually, download the latest version from http://eigen.tuxfamily.org/ and install it under include/eigen3. A file include/eigen3/Eigen/Core should exist.

We can now build the project as follows:

    $ cmake -DCMAKE_BUILD_TYPE=Release .
    $ cmake --build . --target craam-cli

Finally, download and solve a simple benchmark problem:

    $ mkdir data
    $ cd data
    $ wget https://www.dropbox.com/s/b9x8sz7q5ow1vm4/ss.zip
    $ unzip ss.zip
    $ cd ..
    $ bin/craam-cli -i data/smallsize_test.csv -o data/smallsize_policy.csv

To see the list of command-line options, run:

    $ bin/craam-cli -h

C++ Library

Unit tests provide some examples of how to use the library. For simple end-to-end examples, see tests/benchmark.cpp and test/dev.cpp. Targets BENCH and DEV build them respectively.

The main models supported are:

  • craam::MDP : plain MDP with no specific definition of ambiguity (can be used to compute robust solutions anyway)
  • craam::RMDP : an augmented model that adds nature's actions (so-called outcomes) to the model for convenience
  • craam::impl::MDPIR : an MDP with implementability constraints. See [Petrik2016].

The regular value-function based methods are in the header algorithms/values.hpp and the robust versions are in the header algorithms/robust_values.hpp. There are 4 main value-function based methods:

  • solve_vi: Gauss-Seidel value iteration; runs in a single thread.
  • solve_mpi: Jacobi modified policy iteration; parallelized with OpenMP. Generally, modified policy iteration is vastly more efficient than value iteration.
  • rsolve_vi: Like the value iteration above, but also supports robust, risk-averse, or optimistic objectives.
  • rsolve_mpi: Like the modified policy iteration above, but also supports robust, risk-averse, or optimistic objectives.

These methods can be applied to either an MDP or an RMDP.
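
To make the Gauss-Seidel and Jacobi terminology concrete, here is a minimal sketch of Gauss-Seidel value iteration for a discounted MDP. It is self-contained and does not call solve_vi or any other CRAAM function; the `TinyMDP` struct and the functions `bellman_update` and `gauss_seidel_vi` are hypothetical names introduced only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical dense MDP used only for this illustration; not a CRAAM type.
struct TinyMDP {
    // transition[s][a][t]: probability of moving from state s to state t under action a
    std::vector<std::vector<std::vector<double>>> transition;
    // reward[s][a]: reward for taking action a in state s
    std::vector<std::vector<double>> reward;
    double discount = 0.9;
};

// Bellman update for a single state, maximizing over actions.
double bellman_update(const TinyMDP& mdp, std::size_t s, const std::vector<double>& v) {
    double best = -std::numeric_limits<double>::infinity();
    for (std::size_t a = 0; a < mdp.reward[s].size(); ++a) {
        double q = mdp.reward[s][a];
        for (std::size_t t = 0; t < v.size(); ++t)
            q += mdp.discount * mdp.transition[s][a][t] * v[t];
        best = std::max(best, q);
    }
    return best;
}

// Gauss-Seidel sweeps update v in place, so each state immediately sees the
// values already updated in the current sweep. A Jacobi sweep would instead
// read only from a copy of the previous iterate, which makes the per-state
// updates independent and therefore easy to parallelize.
std::vector<double> gauss_seidel_vi(const TinyMDP& mdp, double tolerance,
                                    std::size_t max_sweeps) {
    assert(mdp.discount >= 0.0 && mdp.discount < 1.0);
    assert(mdp.transition.size() == mdp.reward.size());
    std::vector<double> v(mdp.reward.size(), 0.0);
    for (std::size_t sweep = 0; sweep < max_sweeps; ++sweep) {
        double residual = 0.0;
        for (std::size_t s = 0; s < v.size(); ++s) {
            const double updated = bellman_update(mdp, s, v);
            residual = std::max(residual, std::abs(updated - v[s]));
            v[s] = updated;  // in-place update: the Gauss-Seidel step
        }
        if (residual < tolerance) break;
    }
    return v;
}
```

The sketch stops when the largest change in a sweep drops below `tolerance`; the stopping rule is a choice made for this example rather than a description of the library's behavior.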

The header algorithms/occupancies.hpp provides tools for converting the MDP to a transition matrix and computing the occupancy frequencies.
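
As a rough illustration of these two steps, the sketch below builds the transition matrix induced by a deterministic policy and then computes discounted occupancy frequencies d, which satisfy d = p0 + γ Pᵀ d for an initial distribution p0. It does not use the functions in algorithms/occupancies.hpp: the names `policy_matrix` and `occupancy` are hypothetical, and the fixed-point iteration is an assumption made to keep the example free of dependencies.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Illustration only; these names are hypothetical and not part of CRAAM.
using Matrix = std::vector<std::vector<double>>;

// transition[s][a][t] is the probability of moving from s to t under action a.
// Row s of the result is the transition row of the action chosen by the policy.
Matrix policy_matrix(const std::vector<Matrix>& transition,
                     const std::vector<std::size_t>& policy) {
    assert(transition.size() == policy.size());
    Matrix result;
    for (std::size_t s = 0; s < policy.size(); ++s)
        result.push_back(transition[s].at(policy[s]));
    return result;
}

// Discounted occupancy frequencies: the fixed point of d = initial + discount * P^T d,
// computed here by simple fixed-point iteration.
std::vector<double> occupancy(const Matrix& P, const std::vector<double>& initial,
                              double discount, double tolerance = 1e-12) {
    assert(P.size() == initial.size());
    assert(discount >= 0.0 && discount < 1.0);
    std::vector<double> d = initial;
    for (int iteration = 0; iteration < 100000; ++iteration) {
        std::vector<double> next = initial;
        for (std::size_t s = 0; s < P.size(); ++s)
            for (std::size_t t = 0; t < P[s].size(); ++t)
                next[t] += discount * d[s] * P[s][t];
        double change = 0.0;
        for (std::size_t s = 0; s < d.size(); ++s)
            change = std::max(change, std::abs(next[s] - d[s]));
        d.swap(next);
        if (change < tolerance) break;
    }
    return d;
}
```

The entries of d sum to 1 / (1 − γ), which is a convenient sanity check on the result.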

There are tools for building simulators and sampling from simulations in the header Simulation.hpp and methods for handling samples in Samples.hpp.

References

  • [Filar1997] Filar, J., & Vrieze, K. (1997). Competitive Markov decision processes. Springer.
  • [Puterman2005] Puterman, M. L. (2005). Markov decision processes: Discrete stochastic dynamic programming. Handbooks in operations research and management …. John Wiley & Sons, Inc.
  • [Iyengar2005] Iyengar, G. N. (2005). Robust dynamic programming. Mathematics of Operations Research, 30(2), 257–280.
  • [Petrik2014] Petrik, M., & Subramanian, D. (2014). RAAM: The benefits of robustness in approximating aggregated MDPs in reinforcement learning. In Neural Information Processing Systems (NIPS).
  • [Petrik2016] Petrik, M., & Luss, R. (2016). Interpretable Policies for Dynamic Product Recommendations. In Uncertainty in Artificial Intelligence (UAI).
