Parameters that actually lead to good results (200) in CartPole with C51.

PiperOrigin-RevId: 257675436
Marlos C. Machado authored and psc-g committed Jul 22, 2019
1 parent b9e932a commit f5f971f
Showing 1 changed file with 9 additions and 7 deletions.
16 changes: 9 additions & 7 deletions dopamine/agents/rainbow/configs/c51_cartpole.gin
@@ -11,26 +11,28 @@ RainbowAgent.observation_shape = %gym_lib.CARTPOLE_OBSERVATION_SHAPE
RainbowAgent.observation_dtype = %gym_lib.CARTPOLE_OBSERVATION_DTYPE
RainbowAgent.stack_size = %gym_lib.CARTPOLE_STACK_SIZE
RainbowAgent.network = @gym_lib.cartpole_rainbow_network
-RainbowAgent.num_atoms = 51
-RainbowAgent.vmax = 10.
+RainbowAgent.num_atoms = 201
+RainbowAgent.vmax = 100.
RainbowAgent.gamma = 0.99
RainbowAgent.epsilon_eval = 0.
RainbowAgent.epsilon_train = 0.01
RainbowAgent.update_horizon = 1
RainbowAgent.min_replay_history = 500
-RainbowAgent.update_period = 4
-RainbowAgent.target_update_period = 100
+RainbowAgent.update_period = 1
+RainbowAgent.target_update_period = 1
RainbowAgent.epsilon_fn = @dqn_agent.identity_epsilon
RainbowAgent.replay_scheme = 'uniform'
RainbowAgent.tf_device = '/gpu:0' # use '/cpu:*' for non-GPU version
RainbowAgent.optimizer = @tf.train.AdamOptimizer()

-tf.train.AdamOptimizer.learning_rate = 0.001
-tf.train.AdamOptimizer.epsilon = 0.0003125
+tf.train.AdamOptimizer.learning_rate = 0.00001
+tf.train.AdamOptimizer.epsilon = 0.00000390625

create_gym_environment.environment_name = 'CartPole'
create_gym_environment.version = 'v0'
create_agent.agent_name = 'rainbow'
Runner.create_environment_fn = @gym_lib.create_gym_environment
-Runner.num_iterations = 500
+Runner.num_iterations = 400
Runner.training_steps = 1000
Runner.evaluation_steps = 1000
Runner.max_steps_per_episode = 200 # Default max episode length.
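
One way to see why the `num_atoms` and `vmax` change matters is to compare the range of returns each setting can represent against the best return CartPole allows. The sketch below is illustrative only and is not part of this commit. It assumes the agent places its atoms evenly on a symmetric interval `[-vmax, vmax]`, and it uses CartPole's reward of +1 per step with the 200-step episode cap set above. The helper names `c51_support` and `max_discounted_return` are hypothetical.

```python
import numpy as np


def c51_support(vmax, num_atoms):
    """Evenly spaced atoms on [-vmax, vmax] (assumed symmetric support)."""
    return np.linspace(-vmax, vmax, num_atoms)


def max_discounted_return(gamma, horizon, reward=1.0):
    """Discounted sum of a constant per-step reward over `horizon` steps."""
    return reward * (1.0 - gamma ** horizon) / (1.0 - gamma)


# Values taken from the old and new lines of the config.
old_support = c51_support(vmax=10.0, num_atoms=51)
new_support = c51_support(vmax=100.0, num_atoms=201)

# gamma = 0.99 and max_steps_per_episode = 200, as in the config.
best_return = max_discounted_return(gamma=0.99, horizon=200)

print("old atom spacing:", old_support[1] - old_support[0])
print("new atom spacing:", new_support[1] - new_support[0])
print("best discounted return:", best_return)
print("fits old support:", best_return <= old_support[-1])
print("fits new support:", best_return <= new_support[-1])
```

Under these assumptions, the best discounted return (roughly 86.6) lies well outside the old support, which tops out at 10, and inside the new one, which tops out at 100. That is consistent with the commit message, though the commit itself does not state this reasoning.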