This repository contains the code for a policy gradient algorithm that incorporates a credit assignment mechanism.
- Install PyTorch
pip install torch torchvision
- Install TensorFlow 2
pip install tensorflow==2.2
or
pip install tensorflow-gpu==2.2
- Install OpenAI Baselines (TensorFlow 2 version)
git clone https://github.com/openai/baselines.git -b tf2 && \
cd baselines && \
pip install -e .
Note: I haven't tested the code on TensorFlow 1 yet, but it should work as well.
- Install gym
pip install 'gym[atari]'
- Install the Park platform. I modified the platform slightly to make it compatible with OpenAI Baselines.
git clone https://github.com/lehduong/park -b openai_baseline &&\
cd park && \
pip install -e .
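After the install steps above, a quick check can confirm that each dependency is visible to the Python interpreter. The following is a minimal sketch and is not part of this repository: the helper `check_packages` is hypothetical, and the import names in `REQUIRED` (in particular `baselines` and `park`) are assumed to match the packages installed above.

```python
import importlib.util

# Import names assumed to correspond to the install steps above
# (an assumption, not taken from the repository itself).
REQUIRED = ["torch", "torchvision", "tensorflow", "baselines", "gym", "park"]


def check_packages(names=REQUIRED):
    """Return a dict mapping each package name to True if it can be located."""
    status = {}
    for name in names:
        try:
            # find_spec locates a top-level package without importing it.
            status[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Treat lookup errors as "not installed".
            status[name] = False
    return status


if __name__ == "__main__":
    for name, found in check_packages().items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

Any package reported as MISSING points to the corresponding install step that needs to be repeated. The check only locates packages and does not verify their versions.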
- Train an agent, for example A2C on Atari Pong
python main.py --algo a2c --env-name PongNoFrameskip-v4
The starter code is based on ikostrikov's repository.