Inquiries on Modifying Reward Settings in OpenSpiel #1290

Legendarzy · 2024-11-22T09:22:36Z

I have recently been using the OpenSpiel codebase for a research project and need to modify the reward settings in the games. However, the rewards are compiled into pyspiel.so, so I cannot alter them directly.

To work around this, I tried an indirect approach: I store {state: rewards} key-value pairs in a dictionary and, wherever the program (primarily Leduc Poker within PSRO) calls state.rewards() or state.returns(), I use the dictionary value in place of the original reward.
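A minimal sketch of this dictionary-override idea, assuming the Python-side algorithms reach rewards only through state.returns(). The names RewardOverrideState, custom_returns, and FakeTerminalState are hypothetical and not part of OpenSpiel; a real pyspiel state would take the place of FakeTerminalState, and a wrapper like this would not affect code paths implemented in C++.

```python
class RewardOverrideState:
    """Delegates to a wrapped state but looks up returns in a dict first."""

    def __init__(self, state, custom_returns):
        self._state = state
        # Hypothetical layout: {str(state): [return for each player]}.
        self._custom_returns = custom_returns

    def returns(self):
        # Fall back to the game's own returns when no override is stored.
        return self._custom_returns.get(str(self._state), self._state.returns())

    def child(self, action):
        # Re-wrap children so the override survives tree traversal.
        return RewardOverrideState(self._state.child(action), self._custom_returns)

    def __str__(self):
        return str(self._state)

    def __getattr__(self, name):
        # Forward everything else (is_terminal, rewards, ...) unchanged.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._state, name)


class FakeTerminalState:
    """Stand-in for a game state, used only to make the sketch runnable."""

    def __init__(self, history=()):
        self._history = tuple(history)

    def returns(self):
        return [1.0, -1.0]

    def is_terminal(self):
        return True

    def child(self, action):
        return FakeTerminalState(self._history + (action,))

    def __str__(self):
        return "h:" + ",".join(map(str, self._history))


overrides = {"h:0": [5.0, -5.0]}
root = RewardOverrideState(FakeTerminalState(), overrides)
overridden = root.child(0).returns()  # key "h:0" is in the dict
fallback = root.child(1).returns()    # key "h:1" is not, so the original is used
```

Wrapping the state once at the root keeps the change in one place instead of five files, although any code that constructs states itself (for example through the game object) would still see the original rewards.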

So far, I have made modifications in a total of five files:

open_spiel/python/algorithms/exploitability.py
open_spiel/python/algorithms/best_response.py
open_spiel/python/algorithms/psro_v2/abstract_meta_trainer.py
open_spiel/python/rl_environment.py
open_spiel/python/algorithms/psro_v2/rl_policy.py

My concern is that these changes may not be comprehensive and could lead to non-convergence in subsequent training. Is there a more effective way to modify a game's reward settings? If not, could the modifications above cause any severe errors?

Thank you.

lanctot · 2024-11-22T18:10:58Z

Can I ask: are you not building from source?

And can I clarify that you want to make arbitrary reward changes to a game? E.g. ideally such that state.rewards() and state.returns() return something custom?

lanctot · 2024-11-22T18:12:36Z

Also, are you working in Python-only?

The easiest recommendation would be to copy leduc_poker.h and leduc_poker.cc to custom_leduc.h and custom_leduc.cc, register the copy under a different game name, modify LeducState::Returns, and then use that new game. I suppose that doesn't work for you if you're using Python and not building from source?

lanctot · 2024-11-22T18:15:56Z

We have game wrappers that could do what you want... but I don't think they can work from Python. Ideally we'd have Leduc implemented as a Python game and you could just modify it directly. It may not be too hard to implement from the example of Kuhn poker?

Legendarzy · 2024-11-23T13:32:18Z

Thank you very much for your prompt feedback. Following your advice, I believe copying and modifying leduc_poker.cc and leduc_poker.h would better suit my current needs. May I ask if the steps I need to take are: after modifying custom_leduc.h and custom_leduc.cc, I should generate a new pyspiel.so file as described in the developer_guide, and then replace the original file with this new one?

However, in my research, I need to frequently modify and update the reward settings. If I have to make these changes in the source code rather than in the Python code each time, it may prevent my program from running automatically and make each run quite time-consuming. In this scenario, would you recommend that I write custom_leduc.py in the same way kuhn_poker.py is written?

My research primarily revolves around PSRO, which is mainly demonstrated in open_spiel/python/examples/psro_v2_example.py in your code. This algorithm currently only supports the games 'kuhn_poker' and 'leduc_poker'. Could you briefly explain how I can apply the PSRO algorithm to other games?

lanctot · 2025-01-08T20:03:05Z

Hi @Legendarzy.

Sorry for the very late reply.

> May I ask if the steps I need to take are: after modifying custom_leduc.h and custom_leduc.cc, I should generate a new pyspiel.so file as described in the developer_guide, and then replace the original file with this new one?

Yes, in this case you would need to recompile by running make again; there is no need to rerun CMake.

> However, in my research, I need to frequently modify and update the reward settings. If I have to make these changes in the source code rather than in the Python code each time, it may prevent my program from running automatically and make each run quite time-consuming. In this scenario, would you recommend that I write custom_leduc.py in the same way kuhn_poker.py is written?

Yes, correct -- that would probably be easier in your case.
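Since the rewards need to change between runs, one way to keep runs automatic is to have the Python game read its reward adjustment from a config file instead of hard-coding it. The following is a minimal sketch under that assumption: the game computes its base returns as usual and passes them through a transform. The function load_reward_transform and the JSON keys "scale" and "shift" are hypothetical, chosen only for illustration.

```python
import json
import os
import tempfile


def load_reward_transform(path):
    """Builds a per-player returns transform from a JSON config file."""
    with open(path) as f:
        cfg = json.load(f)
    scale = cfg.get("scale", 1.0)  # hypothetical key, defaults to no scaling
    shift = cfg.get("shift", 0.0)  # hypothetical key, defaults to no shift

    def transform(returns):
        # Apply the same affine map to every player's return.
        return [scale * r + shift for r in returns]

    return transform


# Demo: write a config, load it, and reshape a pair of base returns.
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = os.path.join(tmp, "rewards.json")
    with open(cfg_path, "w") as f:
        json.dump({"scale": 2.0, "shift": 0.5}, f)
    transform = load_reward_transform(cfg_path)

shaped = transform([1.0, -1.0])
```

A custom Python game could call such a transform inside its own returns() method, so each experiment only swaps the JSON file. Note that a nonzero shift makes the returns no longer sum to zero, which may matter for evaluation code that assumes a zero-sum game.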

> My research primarily revolves around PSRO, which is mainly demonstrated in open_spiel/python/examples/psro_v2_example.py in your code. This algorithm currently only supports the games 'kuhn_poker' and 'leduc_poker'. Could you briefly explain how I can apply the PSRO algorithm to other games?

It should be as simple as using a different game string here:

open_spiel/open_spiel/python/examples/psro_v2_example.py, line 54 in d99705d:

flags.DEFINE_string("game_name", "kuhn_poker", "Game name.")

But please follow up if you run into any difficulties.
