Questions on the Transport task #29
Comments
Hi! Could you share the commit you are running with, and also the exact files/configs? I can try to help if you can send me those. For the second question, do you mean showing the GUI live instead of recording the video? I've done that before by not using the vectorized environment, but I don't have the script available right now. You can take a look at the robomimic/robosuite docs. |
The above are the modifications I made to the relevant configuration files during the training process. Do you need any other information? I sincerely appreciate your help and also wish you a Happy New Year! |
Hmm, I see your logs but I don't have a clue. Maybe you can look at the weights of the pre-trained and saved fine-tuned policies, and make sure they are not the same? Since the reward numbers exactly match, I would suspect something is off with saving/loading the policy. Maybe my code has a bug, but I would need to run the training myself if you still can't figure it out. |
Thank you very much for your answer, but I'm sorry that I couldn't find the weights of the pre-trained and saved fine-tuned policies in the YAML configuration file. |
Oh, I meant loading the weights of the two policies in torch and checking whether they are the same. It is really suspicious that the pre-trained and fine-tuned policies show exactly the same value for the evaluation reward (80.595), so I wonder if the policies are actually identical. |
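A minimal sketch of that check, assuming both checkpoints are files readable by torch.load and that the weights sit either at the top level or under a key such as "model" (the key name and the helper names below are assumptions, not taken from the repository):

```python
import torch


def load_state_dict(path, key="model"):
    """Load a checkpoint on CPU and return its state dict.

    Assumption: the weights are stored under `key` if the checkpoint is a
    dict containing that key; otherwise the loaded object is returned as-is.
    """
    ckpt = torch.load(path, map_location="cpu")
    if isinstance(ckpt, dict) and key in ckpt:
        return ckpt[key]
    return ckpt


def diff_state_dicts(sd_a, sd_b, atol=0.0):
    """Return the sorted names of entries that are missing or differ."""
    differing = []
    for name in sorted(set(sd_a) | set(sd_b)):
        if name not in sd_a or name not in sd_b:
            differing.append(name)  # present in only one checkpoint
            continue
        a, b = sd_a[name], sd_b[name]
        if a.shape != b.shape:
            differing.append(name)
        elif not torch.allclose(a.float(), b.float(), rtol=0.0, atol=atol):
            differing.append(name)
    return differing
```

Calling diff_state_dicts(load_state_dict(pretrained_path), load_state_dict(finetuned_path)) with your own two paths returns an empty list only if every tensor matches, which would suggest that evaluation is effectively running the same weights for both policies.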
Hi! I encountered the same problem when training the one_leg task. I directly used the pre-trained model you released for fine-tuning. When testing the pre-training results with the eval_diffusion_mlp code, the success_rate was 40%. After fine-tuning for 200 epochs, the test result in the fine-tuning code was over 90%. |
I see, I will look into this later today and run some training! Thanks for raising the issue
On Thu, Jan 2, 2025 at 9:42 AM ltl520 wrote:
Hi! I encountered the same problem when training the one_leg task. I directly used the pre-trained model you released for fine-tuning. When testing the pre-training results with the eval_diffusion_mlp code, the success_rate was 40%. After fine-tuning for 200 epochs, the test result in the fine-tuning code was over 90%.
image.png <https://github.com/user-attachments/assets/0acf9374-b507-4b87-9ea7-027e206a1277>
However, when testing the fine-tuned result with the eval_diffusion_mlp code, the success_rate was 45%.
image.png <https://github.com/user-attachments/assets/3176ff22-e287-4bd6-a2b4-45a2a5cf7f7d>
Same as @Knight-xiao <https://github.com/Knight-xiao>, I only modified the base_policy_path in the eval_diffusion_mlp configuration file too.
image.png <https://github.com/user-attachments/assets/16f9de2b-8d5f-416d-a5c5-094929498d51>
Is this situation normal? What could be the cause? Thank you very much.
|
Thanks for your reply! |
@Knight-xiao @ltl520 Hey guys, really sorry about this, but I think I have a bug in saving and loading the fine-tuned checkpoints in eval. Basically here: dppo/model/diffusion/diffusion.py Line 85 in e7f73df
I will make sure to fix this tomorrow (including which parameters to save in training); it is really late in my time zone now. Meanwhile, you can try it yourself. |
Thanks! I fixed it by adding
|
@ltl520 Cool! Yeah, this works if all denoising steps are fine-tuned, which is the case with the furniture DDIM setup. For @Knight-xiao, the pre-trained policy (frozen during fine-tuning) also needs to be set up for inference for the early denoising steps. @Knight-xiao, would you like to try out this branch for me? #31 Thanks very much! |
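To illustrate why both sets of weights matter at evaluation time, here is a toy sketch of how the denoising steps might be split between the frozen pre-trained network and the fine-tuned one. It is not the repository's code: the function name, the argument names, and the convention that steps count down from denoising_steps - 1 to 0 with only the last ft_denoising_steps fine-tuned are all assumptions made for illustration.

```python
def network_schedule(denoising_steps, ft_denoising_steps):
    """List which network handles each denoising step, in sampling order.

    Assumed convention: step indices run from denoising_steps - 1 down to 0,
    and only the final ft_denoising_steps steps use the fine-tuned network.
    """
    if not 0 <= ft_denoising_steps <= denoising_steps:
        raise ValueError("ft_denoising_steps must lie in [0, denoising_steps]")
    return [
        "finetuned" if t < ft_denoising_steps else "pretrained"
        for t in reversed(range(denoising_steps))
    ]
```

Under these assumptions, a setup that fine-tunes every step never touches the frozen network, so loading only the fine-tuned weights is enough. A setup that fine-tunes only some steps needs both networks loaded correctly, or the early steps run with the wrong weights.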
Thank you very much for your excellent work. I have a few questions that I would like to ask.
Firstly, I tried to reproduce the Transport task in robomimic. After fine-tuning, the success rate displayed in the terminal reached over 90%. However, when I executed the eva file, I found that the success rate was not high, far from 90%. I also tested the policy obtained from pre-training, and surprisingly, the pre-trained and fine-tuned policies performed exactly the same. Do you have any solutions for this? Note that I used pre_diffusion_mlp for pre-training and ft_ppo_diffusion_mlp for fine-tuning, and in ft_ppo_diffusion_mlp, I changed the base_policy_path to the policy obtained from pre-training.
Additionally, I successfully recorded the Transport task by following your tutorial, mainly by modifying render_num, env.n_envs, and env.save_video in eval_diffusion_mlp. However, what should I do to directly display the simulation process of the Transport task?