
[Bug?] Issues and Suggestions for Using PGMORL in MORL-Baselines #131

Closed · hamod-kh opened this issue Jan 2, 2025 · 2 comments

@hamod-kh commented Jan 2, 2025

Dear MORL-Baselines Maintainers,

As part of my bachelor thesis, I am exploring the application of multi-objective reinforcement learning (MORL). To avoid the tedious work of implementing an algorithm from scratch, I searched for libraries similar to Stable-Baselines3 and came across MORL-Baselines. For no particular reason, I decided to start with PGMORL to familiarize myself with MORL. However, I have encountered a few issues:

  1. Environment Resource Limitation:
    Due to restricted licensing, I can only use one instance of the environment. Unfortunately, the constructor of the PGMORL agent creates a tmp_env object to extract environment information. This causes an issue when an existing env object is passed to the constructor, as the tmp_env creation raises a simulation-related error (I am using a simulation program for the environment). However, passing env=None instead resolves the issue.

  2. Incompatibility Between Training and Evaluation Environments:
    The issue mentioned above also leads to another problem: the environments used for training and evaluation cannot share the same wrapper. Specifically, the training environment is wrapped with MOSyncVectorEnv, which causes the step function to fail during policy evaluation in the eval_mo function. I worked around this by adding the line env = env.envs[0], which I assume extracts the underlying environment that still carries the MORecordEpisodeStatistics wrapper (see the sketch after this list). It would be more practical if the same environment instance could be used seamlessly for both training and evaluation.

  3. Model Saving and Testing:
    I am unable to test the different models at the end of training because I do not know how to save them. My understanding is that the training process produces multiple agents, each optimized for different objective weightings. To test these agents, I need to be able to save them. However, I could not find a save_model method similar to that in Stable-Baselines3. This might be a misunderstanding on my part, but I would greatly appreciate clarification on this.
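
For reference, here is a minimal sketch of my setup after these workarounds (the environment id is a placeholder, and some attribute names are taken from my local copy of the code, so please treat them as approximate):

```python
import numpy as np

from morl_baselines.common.evaluation import eval_mo
from morl_baselines.multi_policy.pgmorl.pgmorl import PGMORL

# Passing env=None avoids the tmp_env creation that fails with my simulator;
# the vectorized training environments are then built from env_id alone.
agent = PGMORL(
    env_id="my-simulator-env-v0",  # placeholder id of my registered environment
    env=None,
    origin=np.array([0.0, 0.0]),   # reference point (problem dependent)
    # ... remaining hyperparameters left at their defaults
)

# Evaluation workaround: the training environment is a MOSyncVectorEnv, so I
# unwrap the first sub-environment (which still carries the
# MORecordEpisodeStatistics wrapper) before evaluating a policy on it.
eval_env = agent.env.envs[0]
policy = agent.agents[0]  # one policy from the population (attribute name from my local copy)
returns = eval_mo(policy, eval_env, w=np.array([0.5, 0.5]))
```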

Thank you in advance for your response and support!

Best Regards,
hamod-kh

@ffelten (Collaborator) commented Jan 14, 2025

Hello @hamod-kh,

Sorry, this got lost in the holiday emails :).

First of all, welcome to the community! Now to answer your points:

  1. Ah, I see. The env and env_id kinda duplicate the information indeed. This is to stay compatible with algorithms that do not use vectorized envs, i.e., all other algos. Anyways, if you found a way it's all good. :)
  2. For the first part, are you saying there is currently a bug in the code? For the practical suggestion, this would mess up the learning: PPO needs the training rollouts to be contiguous, and you would "break the chain" if you evaluated on the same environment you use for training; see for example https://ai.stackexchange.com/questions/38232/why-is-it-recommended-to-use-a-separate-test-environment-when-evaluating-a-mod.
  3. There is currently no save model method for PGMORL. Implementing this can be a bit tricky as we would effectively need to store the Pareto archive of models (not just one model).
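
In the meantime, a rough (untested) workaround is to persist the weights of each policy in the population yourself with torch.save. The attribute names in the sketch below are placeholders, so check the PGMORL source for the actual fields:

```python
import torch

# Untested sketch: manually save every policy produced by PGMORL.
# `agent.agents` and `policy.networks` are placeholder names -- look up the
# actual attributes holding each policy's nn.Modules in the PGMORL source.
for i, policy in enumerate(agent.agents):
    torch.save(policy.networks.state_dict(), f"pgmorl_policy_{i}.pt")

# Later, rebuild a policy with the same architecture and load the weights back:
# policy.networks.load_state_dict(torch.load("pgmorl_policy_0.pt"))
```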

That being said, if you can only instantiate one environment, I would use a more sample-efficient algorithm than PGMORL. I assume it is a continuous problem, so I'd tend to advise GPI-LS, which has the save-model feature implemented :).
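
Roughly, the usage would look something like this (the module path, class name, and argument names are from memory, so check the examples in the repo for the exact API):

```python
import numpy as np
import mo_gymnasium as mo_gym

from morl_baselines.multi_policy.gpi_pd.gpi_pd_continuous_action import GPIPDContinuousAction

# Indicative sketch only -- verify the class, constructor arguments and
# train/save signatures against the repo's examples.
env = mo_gym.make("mo-hopper-v4")  # stand-in for your simulator environment
agent = GPIPDContinuousAction(env, gpi_pd=False)  # gpi_pd=False gives the GPI-LS variant (assumed flag)
agent.train(
    total_timesteps=100_000,
    eval_env=env,  # ideally a separate instance, see point 2 above
    ref_point=np.array([0.0, 0.0]),
)
agent.save()  # saving is supported for this algorithm (assumed method name)
```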

I hope this helps and sorry for the late answer.

Cheers,

@hamod-kh (Author) commented

Hello @ffelten,

Thank you so much for your reply.

As for this issue, it can be closed!

Regards,
