What
Each agent has a built-in epsilon-greedy mechanism. There is likely little need to duplicate the same code in every agent, so it should be moved to a single place.
Consideration
As it is right now, and as intended, each agent needs to touch (`agent.act`) a state in order to provide an action. This touch may involve incrementing an internal counter or producing additional values, e.g. entropy. Only a touched state (the whole data tuple) is used in `step` to learn something.
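One possible shape for the shared mechanism is sketched below. It is only an illustration, not the project's actual API: the `EpsilonGreedyWrapper` and `CountingAgent` names, the `n_actions` parameter, and the assumption of a discrete action space are all hypothetical. The point it shows is that the wrapper always calls `agent.act` first, so the state is still touched even when a random action is returned.

```python
import random
from typing import Any, Optional


class EpsilonGreedyWrapper:
    """Hypothetical wrapper that adds epsilon-greedy exploration to any agent.

    Assumes the wrapped agent exposes `act(state)` and that actions are
    integers in `range(n_actions)` (a discrete action space).
    """

    def __init__(self, agent: Any, n_actions: int, epsilon: float = 0.1,
                 seed: Optional[int] = None) -> None:
        self.agent = agent
        self.n_actions = n_actions
        self.epsilon = epsilon
        self._rng = random.Random(seed)

    def act(self, state: Any) -> int:
        # Always call the agent first so it "touches" the state
        # (counters, extra values such as entropy), even when exploring.
        agent_action = self.agent.act(state)
        if self._rng.random() < self.epsilon:
            # Explore: replace the agent's choice with a uniform random action.
            return self._rng.randrange(self.n_actions)
        return agent_action


class CountingAgent:
    """Toy agent used only for illustration: counts touches, always picks 0."""

    def __init__(self) -> None:
        self.touched = 0

    def act(self, state: Any) -> int:
        self.touched += 1
        return 0
```

With this layout each agent would keep only its own policy in `act`, and the exploration schedule would live in one place. Whether the random action or the agent's own action should be stored in the data tuple passed to `step` is a design decision this sketch leaves open.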