rlenvscpp

rlenvscpp is an effort to provide implementations and wrappers of environments suitable for training reinforcement learning agents using C++. In addition, the library provides various utilities such as experiment tracking, trajectory representation via waypoints, and simple implementations of common dynamics models such as quadrotor dynamics.

Environments

Currently, rlenvscpp provides the following environments:

Environment	Use REST	Example
FrozenLake 4x4 map	Yes	example_1
FrozenLake 8x8 map	Yes	TODO
Blackjack	Yes	example_1
CliffWalking	Yes	example_1
CartPole	Yes	TODO
MountainCar	Yes	TODO
Taxi	Yes	example_1
Pendulum	Yes	example_6
Acrobot	Yes	TODO
GymWalk	Yes	TODO
gym-pybullet-drones	TODO	TODO
GridWorld	No	example_5
Connect2	No	example_7

The Gymnasium (formerly OpenAI Gym) environments use a REST API to exchange requests and responses between the environment and rlenvscpp.

Some environments have a vector implementation, meaning that multiple instances of the same environment are managed together. Currently, rlenvscpp provides the following vector environments:

Environment	Use REST	Example
AcrobotV	Yes	example_8

Various RL algorithms using the environments can be found at cuberl.

How to use

The following example shows how to use the FrozenLake environment from Gymnasium.

#include "rlenvs/rlenvs_types_v2.h"
#include "rlenvs/envs/gymnasium/toy_text/frozen_lake_env.h"
#include "rlenvs/envs/api_server/apiserver.h"

#include <iostream>
#include <string>
#include <unordered_map>
#include <any>

namespace example_1{

const std::string SERVER_URL = "http://0.0.0.0:8001/api";

using rlenvscpp::envs::gymnasium::FrozenLake;
using rlenvscpp::envs::RESTApiServerWrapper;


void test_frozen_lake(const RESTApiServerWrapper& server){

    FrozenLake<4> env(server);

    std::cout<<"Environment URL: "<<env.get_url()<<std::endl;

    // make the environment
    std::unordered_map<std::string, std::any> options;
    options.insert({"is_slippery", false});
    env.make("v1", options);

    std::cout<<"Is environment created? "<<env.is_created()<<std::endl;
    std::cout<<"Is environment alive? "<<env.is_alive()<<std::endl;
    std::cout<<"Number of valid actions? "<<env.n_actions()<<std::endl;
    std::cout<<"Number of states? "<<env.n_states()<<std::endl;

    // reset the environment
    auto time_step = env.reset(42, std::unordered_map<std::string, std::any>());

    std::cout<<"Reward on reset: "<<time_step.reward()<<std::endl;
    std::cout<<"Observation on reset: "<<time_step.observation()<<std::endl;
    std::cout<<"Is terminal state: "<<time_step.done()<<std::endl;

    //...print the time_step
    std::cout<<time_step<<std::endl;

    // take an action in the environment (2 = RIGHT)
    auto new_time_step = env.step(2);

    std::cout<<new_time_step<<std::endl;

    // get the dynamics of the environment for the given state and action
    auto state = 0;
    auto action = 1;
    auto dynamics = env.p(state, action);

    std::cout<<"Dynamics for state="<<state<<" and action="<<action<<std::endl;

    for(auto item:dynamics){

        std::cout<<std::get<0>(item)<<std::endl;
        std::cout<<std::get<1>(item)<<std::endl;
        std::cout<<std::get<2>(item)<<std::endl;
        std::cout<<std::get<3>(item)<<std::endl;
    }
	
    action = env.sample_action();
    new_time_step = env.step(action);

    std::cout<<new_time_step<<std::endl;

    // synchronize the environment
    env.sync(std::unordered_map<std::string, std::any>());

    // create a copy of the environment with copy index 1
    auto copy_env = env.make_copy(1);
    copy_env.reset();

    std::cout<<"Org env cidx: "<<env.cidx()<<std::endl;
    std::cout<<"Copy env cidx: "<<copy_env.cidx()<<std::endl;

    copy_env.close();

    // close the environment
    env.close();

}

}


int main(){

    using namespace example_1;

    RESTApiServerWrapper server(SERVER_URL, true);

    std::cout<<"Testing FrozenLake..."<<std::endl;
    example_1::test_frozen_lake(server);
    std::cout<<"===================="<<std::endl;
    return 0;
}
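The tuples printed in the dynamics loop above can feed model-based computations. The following is a minimal sketch, assuming each tuple is ordered as (probability, next state, reward, done), which is the convention Gymnasium uses for its toy-text environments. The `Transition` alias and the `expected_backup` helper are hypothetical names introduced for illustration and are not part of rlenvscpp.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical alias: (probability, next_state, reward, done).
// The ordering is an assumption borrowed from Gymnasium's P[state][action].
using Transition = std::tuple<double, std::size_t, double, bool>;

// One-step Bellman backup for a fixed (state, action) pair:
//   q = sum_i p_i * (r_i + gamma * V(s'_i))
// where the bootstrap term is dropped when s'_i is terminal.
double expected_backup(const std::vector<Transition>& dynamics,
                       const std::vector<double>& values,
                       double gamma){

    assert(gamma >= 0.0 && gamma <= 1.0);

    double q = 0.0;
    for(const auto& [prob, next_state, reward, done] : dynamics){
        const double bootstrap = done ? 0.0 : gamma * values.at(next_state);
        q += prob * (reward + bootstrap);
    }
    return q;
}
```

With the real library, `dynamics` would be filled from the result of `env.p(state, action)`, provided its element type matches the assumed ordering.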

In general, the environments exposed by the library follow the semantics described in the Environment API and Semantics specification. For more details, see the rlenvscpp environment specification document.

The general use case is to build the library and link it with your driver code to access its functionality. The environments marked as using REST in the tables above, that is, all Gymnasium, gym-pybullet-drones and GymWalk environments, are accessed via a client/server pattern: they are exposed through an API developed using FastAPI. You need to start the FastAPI server (see Dependencies) before using these environments in your code. To do so, run:

./start_uvicorn.sh

By default, the uvicorn server listens on port 8001. Change this if needed. You can access the OpenAPI specification at

http://0.0.0.0:8001/docs

Note that the implementation is currently not thread/process safe: if multiple threads/processes access the same environment, they all manipulate a single global instance of it, and no session-based environments exist. However, you can create copies of the same environment and access each copy via its dedicated index. As long as only one thread/process touches a given copy, you should be fine. Notice that the FastAPI server uses a single process to manage all the environments. In addition, if you need multiple instances of the same environment, you can also use one of the existing vectorised environments (see the table above).

Finally, you can choose to launch several instances of uvicorn (listening on different ports). However, in this case you need to implement all the interaction logic yourself, as currently no implementation exists to handle such a scenario.
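As a starting point for that interaction logic, the following is a minimal sketch of how workers could be mapped to several server instances. It assumes the instances listen on consecutive ports and share the `/api` prefix used by `SERVER_URL` in the example above. The helpers `make_server_urls` and `url_for_worker` are hypothetical and not part of rlenvscpp.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Build one API URL per uvicorn instance.
// Assumption: instances listen on consecutive ports starting at base_port.
std::vector<std::string> make_server_urls(const std::string& host,
                                          unsigned int base_port,
                                          std::size_t n_servers){
    std::vector<std::string> urls;
    urls.reserve(n_servers);
    for(std::size_t i = 0; i < n_servers; ++i){
        urls.push_back("http://" + host + ":" +
                       std::to_string(base_port + i) + "/api");
    }
    return urls;
}

// Round-robin assignment of a worker to one of the servers.
const std::string& url_for_worker(const std::vector<std::string>& urls,
                                  std::size_t worker_idx){
    assert(!urls.empty());
    return urls[worker_idx % urls.size()];
}
```

Each worker could then construct its own `RESTApiServerWrapper` from the URL it is assigned, so that no two workers share a server-side environment instance.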

Dynamics

Apart from the exposed environments, rlenvscpp exposes classes that describe the dynamics of some popular rigid bodies:

Dynamics	Example
Differential drive	example_9
Quadrotor	example_10
Bicycle vehicle	TODO
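
To give a flavour of what such a dynamics class computes, the following is a minimal sketch of the textbook differential-drive kinematic model integrated with one explicit Euler step. It is not the rlenvscpp implementation; the `Pose2D` struct, the `integrate_diff_drive` function and all parameter names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Planar pose: position (x, y) and heading theta in radians.
struct Pose2D{
    double x;
    double y;
    double theta;
};

// One explicit Euler step of the differential-drive kinematics:
//   v     = r * (w_right + w_left) / 2      (linear velocity)
//   omega = r * (w_right - w_left) / L      (angular velocity)
//   x' = v cos(theta), y' = v sin(theta), theta' = omega
Pose2D integrate_diff_drive(const Pose2D& pose,
                            double w_left, double w_right,
                            double wheel_radius, double axle_length,
                            double dt){

    assert(axle_length > 0.0);

    const double v = 0.5 * wheel_radius * (w_right + w_left);
    const double omega = wheel_radius * (w_right - w_left) / axle_length;

    return Pose2D{pose.x + dt * v * std::cos(pose.theta),
                  pose.y + dt * v * std::sin(pose.theta),
                  pose.theta + dt * omega};
}
```

Equal wheel speeds move the robot along its current heading, while opposite wheel speeds rotate it in place.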

Miscellaneous

Item	Example
Environment trajectory	example_3
WaypointTrajectory	example_11
TensorboardServer	example_12

Dependencies

The library has the following general dependencies

A compiler that supports C++20 e.g. g++-11
Boost C++
CMake >= 3.10
Gtest (if configured with tests)
Eigen3

Using the Gymnasium environments requires Gymnasium installed on your machine. In addition, you need to install

Installing the requirements listed in requirements.txt should set up your Python environment correctly.

In addition, the library also incorporates, see (src/extern), the following libraries

There are extra dependencies if you want to generate the documentation. Namely,

Doxygen
Sphinx
sphinx_rtd_theme
breathe
m2r2

Installation

The usual CMake-based installation process is used, namely:

mkdir build && cd build && cmake ..
make install

You can toggle the following variables

CMAKE_BUILD_TYPE (default is RELEASE)
ENABLE_TESTS_FLAG (default is OFF)
ENABLE_EXAMPLES_FLAG (default is OFF)
ENABLE_DOC_FLAG (default is OFF)

For example, to enable the examples:

cmake -DENABLE_EXAMPLES_FLAG=ON ..
make install

Run the tests

You can execute all the tests by running the helper script execute_tests.sh.

Issues

Could not find `boost_system`

It is likely that the boost_system library is missing from your local Boost installation. This may be the case if you installed Boost via a package manager. On an Ubuntu machine, the following should resolve the issue:

sudo apt-get update -y
sudo apt-get install -y libboost-system-dev

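FastAPI throws 422 Unprocessable Entity

Typically, this indicates a problem with how the client specified the data sent to the server (400-range errors are client errors). FastAPI returns 422 when a request fails validation against the schema the endpoint expects, for example when a field is missing or has the wrong type.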
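
One way such a mismatch can arise is when an option value is serialised with the wrong JSON type, for example a boolean sent as a string. The following is a minimal sketch of a type-preserving serialiser for an options map; it is a hypothetical helper, not how rlenvscpp builds its requests. It uses `std::map` rather than `std::unordered_map` to get a deterministic key order, handles only three value types, and does not escape special characters in strings.

```cpp
#include <any>
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <typeinfo>

// Serialise an options map to JSON text, keeping each value's JSON type.
// Unsupported value types raise an exception instead of being guessed.
std::string options_to_json(const std::map<std::string, std::any>& options){

    std::string json = "{";
    bool first = true;

    for(const auto& [key, value] : options){
        assert(!key.empty());
        if(!first){ json += ","; }
        first = false;
        json += "\"" + key + "\":";

        if(value.type() == typeid(bool)){
            // JSON boolean, not the string "true"/"false"
            json += (std::any_cast<bool>(value) ? "true" : "false");
        }
        else if(value.type() == typeid(int)){
            json += std::to_string(std::any_cast<int>(value));
        }
        else if(value.type() == typeid(std::string)){
            // Note: no escaping of quotes or backslashes in this sketch
            json += "\"" + std::any_cast<std::string>(value) + "\"";
        }
        else{
            throw std::invalid_argument("Unsupported option type for key: " + key);
        }
    }
    return json + "}";
}
```

Comparing the text produced this way with the request schema shown on the server's /docs page is a quick way to spot the offending field.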
