Implementation of an Advantage Actor-Critic using Artificial Neural Networks
This is a very simple implementation of a Deep Reinforcement Learning Advantage Actor-Critic (A2C) agent. It uses two independent artificial neural networks to approximate the policy function (Actor) and the state-value function (Critic). To test the implementation, I use the Lunar Lander environment provided by OpenAI Gym.
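To make the roles of the two networks concrete, here is a minimal sketch of how an advantage can be computed from the Critic's value estimates and turned into Actor and Critic losses for a single transition. It assumes a one-step temporal-difference advantage estimate, which is a common choice but not necessarily the one used in the notebook. The function names `one_step_advantage` and `a2c_losses`, the discount factor `gamma`, and all numbers are illustrative assumptions.

```python
import math


def one_step_advantage(reward, value_s, value_next, gamma=0.99, done=False):
    """One-step TD advantage estimate: r + gamma * V(s') - V(s).

    value_s and value_next stand in for the Critic's outputs; when the
    episode has ended (done=True) there is no next state to bootstrap from.
    """
    bootstrap = 0.0 if done else gamma * value_next
    return reward + bootstrap - value_s


def a2c_losses(action_prob, advantage):
    """Losses for a single transition (hypothetical helper).

    action_prob stands in for the probability the Actor assigned to the
    action that was taken. The advantage is treated as a constant in the
    Actor loss and as the regression error in the Critic loss.
    """
    actor_loss = -math.log(action_prob) * advantage
    critic_loss = advantage ** 2
    return actor_loss, critic_loss


# Illustrative numbers only
adv = one_step_advantage(reward=1.0, value_s=0.5, value_next=0.6, gamma=0.9)
actor_loss, critic_loss = a2c_losses(action_prob=0.25, advantage=adv)
print(adv, actor_loss, critic_loss)
```

A positive advantage means the action turned out better than the Critic expected, so minimizing the Actor loss raises the probability of that action, while minimizing the Critic loss pulls V(s) toward the bootstrapped target.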
If you want a deeper understanding of the Actor-Critic algorithm, I strongly recommend taking a look at the documents References/A2C_Summary/A2C_Summary.pdf and References/A2C_Presentation.pdf. In References/A2C_Summary/
you can also find the original
To get a local copy up and running, follow these simple steps.
You need a running installation of Anaconda. If you haven't installed Anaconda yet, you can follow this tutorial:
Anaconda Installation
- Clone the repo
git clone https://github.com/andresbecker/Deep_RL_Actor_Critic.git
- Install the environment
# For Ubuntu 20.04
conda env create -f conda_environment.yml
# For Ubuntu 22.04
conda deactivate # Only if base conda environment is loaded
python3 -m venv ~/venv/dl_seminar
. ~/venv/dl_seminar/bin/activate
pip3 install tensorflow tensorflow-probability matplotlib numpy pandas jupyterlab
pip3 install gym==0.17.3
pip3 install box2d-py==2.3.8
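After installing, a quick check that the packages can be found by Python can save time before opening the notebook. The sketch below only uses the standard library. The import names in `REQUIRED_MODULES` are assumptions derived from the packages listed in the install step (for example, `box2d-py` is assumed to be importable as `Box2D`), and `find_missing` is a hypothetical helper, not part of this repository.

```python
from importlib.util import find_spec

# Assumed import names for the packages installed above
REQUIRED_MODULES = [
    "tensorflow",
    "tensorflow_probability",
    "matplotlib",
    "numpy",
    "pandas",
    "jupyterlab",
    "gym",
    "Box2D",
]


def find_missing(modules):
    """Return the top-level module names that cannot be located."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run it with the environment activated. If a module is reported as missing, repeat the corresponding install command.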
To train and test this implementation, simply activate the environment,
# For Ubuntu 20.04
conda activate A2C_env
# For Ubuntu 22.04
. ~/venv/dl_seminar/bin/activate
then open JupyterLab
jupyter-lab
and open the notebook A2C.ipynb.
Then, just follow the steps inside the notebook.
Have fun!
Andres Becker - LinkedIn - andres.becker@tum.de
Project Link: https://github.com/andresbecker/Deep_RL_Actor_Critic