Inquiries on Modifying Reward Settings in OpenSpiel #1290

Open
Legendarzy opened this issue Nov 22, 2024 · 5 comments

Comments

@Legendarzy

I have recently been using the OpenSpiel codebase for a research project and need to modify the reward settings in the games. However, the rewards are compiled into pyspiel.so, so I cannot alter them directly.

To work around this, I tried an indirect approach: I store {state: rewards} pairs in a dictionary and use them in place of the original rewards wherever the program (primarily Leduc Poker within PSRO) calls state.rewards() or state.returns().
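A minimal sketch of that dictionary-based override, assuming the lookup key is the state's history string and that only terminal returns are replaced. The class name `RewardOverrideState` and the `overrides` table are hypothetical and not part of OpenSpiel; the wrapper relies only on the `history_str()`, `is_terminal()`, `returns()` and `child()` methods that OpenSpiel states expose.

```python
class RewardOverrideState:
    """Hypothetical wrapper: looks up terminal returns in a user-supplied table.

    `state` is any object with the OpenSpiel-style methods history_str(),
    is_terminal(), returns() and child(action).
    `overrides` maps a history string to a list of per-player returns.
    """

    def __init__(self, state, overrides):
        self._state = state
        self._overrides = overrides

    def returns(self):
        # Only terminal states are overridden; anything missing from the
        # table falls back to the game's own returns.
        key = self._state.history_str()
        if self._state.is_terminal() and key in self._overrides:
            return list(self._overrides[key])
        return self._state.returns()

    def child(self, action):
        # Re-wrap children so the override survives tree traversal.
        return RewardOverrideState(self._state.child(action), self._overrides)

    def __getattr__(self, name):
        # Every other method (legal_actions, current_player, ...) is
        # forwarded to the wrapped state unchanged.
        return getattr(self._state, name)
```

A wrapper like this only affects Python code that receives the wrapped object. Any code path that obtains a fresh state from the game, or that runs inside the compiled library, will still see the original rewards, which is why patching individual call sites can miss cases.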

So far, I have made modifications in a total of five files:

open_spiel/python/algorithms/exploitability.py
open_spiel/python/algorithms/best_response.py
open_spiel/python/algorithms/psro_v2/abstract_meta_trainer.py
open_spiel/python/rl_environment.py
open_spiel/python/algorithms/psro_v2/rl_policy.py

My concern is that such changes might not be comprehensive and could potentially lead to non-convergence in subsequent training. I would like to ask if there are any methods that can help me modify the game's reward settings more effectively. If not, could my aforementioned modifications lead to any severe errors?

Thank you.

@lanctot
Collaborator

lanctot commented Nov 22, 2024

Can I ask: are you not building from source?

And can I clarify that you want to make arbitrary reward changes to a game? E.g. ideally such that state.rewards() and state.returns() return something custom?

@lanctot
Collaborator

lanctot commented Nov 22, 2024

Also, are you working in Python-only?

The easiest recommendation would be to simply copy leduc_poker.h and leduc_poker.cc to custom_leduc.h and custom_leduc.cc, give the game a different name, modify LeducState::Returns, and then just use that new game. I suppose that doesn't work for you if you're using Python and not building from source?

@lanctot
Collaborator

lanctot commented Nov 22, 2024

We have game wrappers that could do what you want... but I don't think they can work from Python. Ideally we'd have Leduc implemented as a Python game and you could just modify it directly. It may not be too hard to implement from the example of Kuhn poker?

@Legendarzy
Author

Thank you very much for your prompt feedback. Following your advice, I believe copying and modifying leduc_poker.cc and leduc_poker.h would better suit my current needs. May I ask if the steps I need to take are: after modifying custom_leduc.h and custom_leduc.cc, I should generate a new pyspiel.so file as described in the developer_guide, and then replace the original file with this new one?

However, in my research, I need to frequently modify and update the reward settings. If I have to make these changes in the source code rather than in the Python code each time, it may prevent my program from running automatically and make each run quite time-consuming. In this scenario, would you recommend that I write a custom_leduc.py similar to how kuhn_poker.py is written?

My research primarily revolves around PSRO, which is mainly demonstrated in open_spiel/python/examples/psro_v2_example.py in your code. This algorithm currently only supports the games 'kuhn_poker' and 'leduc_poker'. Could you briefly explain how I can apply the PSRO algorithm to other games?

@lanctot
Collaborator

lanctot commented Jan 8, 2025

Hi @Legendarzy.

Sorry for the very late reply.

> May I ask if the steps I need to take are: after modifying custom_leduc.h and custom_leduc.cc, I should generate a new pyspiel.so file as described in the developer_guide, and then replace the original file with this new one?

Yes, in this case you would need to recompile (by running make again; no need to rerun CMake, etc.).

> However, in my research, I need to frequently modify and update the reward settings. If I have to make these changes in the source code rather than in the Python code each time, it may prevent my program from running automatically and make each run quite time-consuming. In this scenario, would you recommend that I write a custom_leduc.py similar to how kuhn_poker.py is written?

Yes, correct -- that would probably be easier in your case.
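As a rough illustration of why a Python game suits frequent reward changes, here is a minimal sketch in which the payoff transform lives in a small config object that a Python state's `returns()` method could consult. `RewardConfig` and `shaped_returns` are hypothetical names, not OpenSpiel APIs, and the win/loss scaling is just one example of a reward setting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardConfig:
    """Hypothetical per-run reward settings (not an OpenSpiel type)."""
    win_scale: float = 1.0   # multiplier applied to positive returns
    loss_scale: float = 1.0  # multiplier applied to zero or negative returns


def shaped_returns(base_returns, config):
    """Applies the configured scaling to a list of per-player returns.

    A Python game's state class could call this from its returns() method,
    passing the unmodified game returns as `base_returns`.
    """
    return [
        r * config.win_scale if r > 0 else r * config.loss_scale
        for r in base_returns
    ]
```

With this shape, each experiment only constructs a different `RewardConfig`, with no rebuild of pyspiel.so. Note that unequal scales make the payoffs non-zero-sum, so the game's declared utility type and its minimum and maximum utilities would presumably need to be updated to match.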

> My research primarily revolves around PSRO, which is mainly demonstrated in open_spiel/python/examples/psro_v2_example.py in your code. This algorithm currently only supports the games 'kuhn_poker' and 'leduc_poker'. Could you briefly explain how I can apply the PSRO algorithm to other games?

It should be as simple as using a different game string here:

flags.DEFINE_string("game_name", "kuhn_poker", "Game name.")

But please follow up if you run into any difficulties.
