Inquiries on Modifying Reward Settings in OpenSpiel #1290
Comments
Can I ask: are you not building from source? And can I clarify that you want to make arbitrary reward changes to a game? E.g. ideally such that state.rewards() and state.returns() return something custom?
Also, are you working in Python-only? The easiest recommendation would be to simply copy leduc.h and leduc.cc to custom_leduc.h and custom_leduc.cc, name them something different, modify LeducState::Returns and then just use that new game. I suppose that doesn't work for you if you're using Python and not building from source?
We have game wrappers that could do what you want... but I don't think they can work from Python. Ideally we'd have Leduc implemented as a Python game and you could just modify it directly. It may not be too hard to implement from the example of Kuhn poker?
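To illustrate the wrapper idea on the Python side, here is a minimal sketch, not OpenSpiel's own wrapper API. `RewardOverrideState`, `_StubState`, and the `transform` callback are hypothetical names introduced for illustration. The sketch assumes only that the wrapped object exposes `returns()`, `rewards()`, and `child(action)`; the stub stands in for a real state so the example runs without pyspiel.

```python
class RewardOverrideState:
    """Hypothetical wrapper: delegates to a state-like object, rewrites payoffs."""

    def __init__(self, state, transform):
        self._state = state
        # transform: callable (state, list_of_payoffs) -> list_of_payoffs
        self._transform = transform

    def returns(self):
        return self._transform(self._state, list(self._state.returns()))

    def rewards(self):
        return self._transform(self._state, list(self._state.rewards()))

    def child(self, action):
        # Re-wrap the successor so the override follows the game tree.
        return RewardOverrideState(self._state.child(action), self._transform)

    def __getattr__(self, name):
        # Called only for attributes the wrapper does not define itself.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._state, name)


class _StubState:
    """Stand-in for a real game state (assumption: one move ends the game)."""

    def __init__(self, history=()):
        self.history = tuple(history)

    def is_terminal(self):
        return len(self.history) >= 1

    def returns(self):
        return [1.0, -1.0] if self.is_terminal() else [0.0, 0.0]

    def rewards(self):
        return self.returns()

    def child(self, action):
        return _StubState(self.history + (action,))


def doubled(state, values):
    """Example transform: scale every player's payoff by two."""
    return [2.0 * v for v in values]


root = RewardOverrideState(_StubState(), doubled)
terminal = root.child(0)
terminal_returns = terminal.returns()
```

A wrapper like this only affects Python code that receives the wrapped object. Anything that obtains states another way (for example through a method that returns a fresh unwrapped state, or through compiled code) would still see the original payoffs, which is why a Python implementation of the game itself is the more robust route.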
Thank you very much for your prompt feedback. After following your advice, I believe copying and modifying leduc_poker.cc and leduc_poker.h would better suit my current needs. May I ask whether these are the steps I need to take: after modifying custom_leduc.h and custom_leduc.cc, generate a new pyspiel.so file as described in the developer_guide, and then replace the original file with the new one?
However, my research requires me to modify and update the reward settings frequently. If I have to make these changes in the source code rather than in the Python code each time, my program may not be able to run automatically, and each run would be quite time-consuming. In this scenario, would you recommend that I write a custom_leduc.py modeled on kuhn_poker.py?
My research primarily revolves around PSRO, which is mainly demonstrated in open_spiel/python/examples/psro_v2_example.py in your code. This algorithm currently only supports the games 'kuhn_poker' and 'leduc_poker'. Could you briefly explain how I can apply the PSRO algorithm to other games?
Hi @Legendarzy. Sorry for the very late reply.
Yes, in this case you would need to recompile (by running make again; there is no need to rerun CMake, etc.).
Yes, correct -- that would probably be easier in your case.
It should be as simple as using a different game string here:
But please follow up if you run into any difficulties.
I have recently been using the OpenSpiel codebase for a research project and need to modify the reward settings in the games. However, I found that the rewards are encapsulated within pyspiel.so, making it inaccessible for me to directly alter.
To circumvent this, I attempted an indirect modification approach by saving key-value pairs of {state: rewards} in a dictionary format wherever the program (primarily in Leduc Poker within PSRO) utilizes state.rewards() or state.returns(), and using this to replace the original rewards settings.
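The dictionary approach described above can be kept in one place rather than repeated at every call site. The following is a minimal sketch under stated assumptions: `custom_returns`, `non_zero_sum_keys`, and `_StubState` are hypothetical names, and the sketch assumes that `str(state)` identifies a state uniquely, which should be verified for the game in use. Some solution concepts, exploitability among them, are usually defined for zero-sum games, so the sketch also flags override entries whose payoffs do not sum to zero.

```python
def custom_returns(state, overrides, num_players=2):
    """Return overridden payoffs for `state`, else fall back to state.returns().

    `overrides` maps a state key (assumed here to be str(state)) to a list
    with one payoff per player.
    """
    values = overrides.get(str(state))
    if values is None:
        return list(state.returns())
    if len(values) != num_players:
        raise ValueError(
            f"override for {state} has {len(values)} entries, "
            f"expected {num_players}"
        )
    return list(values)


def non_zero_sum_keys(overrides, tol=1e-9):
    """List the keys whose overridden payoffs do not sum to (about) zero."""
    return [key for key, values in overrides.items() if abs(sum(values)) > tol]


class _StubState:
    """Stand-in for a real terminal state so the sketch runs without pyspiel."""

    def __init__(self, label, payoffs):
        self._label = label
        self._payoffs = payoffs

    def __str__(self):
        return self._label

    def returns(self):
        return list(self._payoffs)


overrides = {"terminal_a": [3.0, -3.0]}
state_a = _StubState("terminal_a", [1.0, -1.0])
state_b = _StubState("terminal_b", [-1.0, 1.0])

overridden = custom_returns(state_a, overrides)   # entry found in the table
fallback = custom_returns(state_b, overrides)     # no entry, original payoffs
suspicious = non_zero_sum_keys({"x": [1.0, 1.0], "y": [2.0, -2.0]})
```

Routing every payoff lookup through a single helper like this makes it easier to audit which call sites have been patched, although it does not reach code paths that read payoffs from the compiled library directly.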
So far, I have made modifications in a total of five files:
open_spiel/python/algorithms/exploitability.py
open_spiel/python/algorithms/best_response.py
open_spiel/python/algorithms/psro_v2/abstract_meta_trainer.py
open_spiel/python/rl_environment.py
open_spiel/python/algorithms/psro_v2/rl_policy.py
My concern is that such changes might not be comprehensive and could potentially lead to non-convergence in subsequent training. I would like to ask if there are any methods that can help me modify the game's reward settings more effectively. If not, could my aforementioned modifications lead to any severe errors?
Thank you.