Hi, thanks for your great work on ANL. I'm trying to reproduce it in a scanned 3D environment (such as Gibson). Unfortunately, I've run into some problems, and I'd appreciate any help you could offer.
I built my project on the released code in this repo.
To test the environment module, I set the perceptual model to a pretrained VGG network and gave the agent a random action at each step. The success rate (the posterior map converges within 200 steps and the converged grid cell is correct) looks reasonable when enough memory images are given. A visualization of the random walk in the environment is attached.
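For concreteness, the success criterion I use can be sketched as follows. This is a minimal toy reconstruction, not the repo's code: the `posterior_converged` helper, the 4x4 grid size, and the fake likelihood are all my own illustrative assumptions.

```python
import numpy as np

def posterior_converged(posterior, threshold=0.95):
    """Hypothetical helper: the posterior counts as converged once any
    element of the posterior grid exceeds the threshold."""
    return float(posterior.max()) > threshold

# Toy posterior over a 4x4 grid of candidate locations.
posterior = np.full((4, 4), 1.0 / 16)  # uniform prior over grid cells
true_cell = (2, 1)

step = 0
for step in range(200):
    # Fake perceptual likelihood that slightly favors the true cell,
    # standing in for the VGG-based perceptual model.
    likelihood = np.full((4, 4), 0.2)
    likelihood[true_cell] = 0.3
    posterior = posterior * likelihood   # Bayesian update
    posterior /= posterior.sum()         # renormalize
    if posterior_converged(posterior):
        break

converged_cell = np.unravel_index(posterior.argmax(), posterior.shape)
# Success = converged within 200 steps AND the peak is the correct cell.
success = posterior_converged(posterior) and converged_cell == true_cell
print(step, converged_cell, success)
```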
To train the RL algorithm:
- I modified the network architecture as described in the supplementary material.
- I give the agent a large reward (100) when the posterior finally converges (any element of the posterior matrix is larger than 0.95).
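The reward scheme above can be sketched like this. Again, this is a minimal illustration under my own assumptions (the `localization_reward` name and the sparse zero-elsewhere shaping are hypothetical, not from the repo):

```python
import numpy as np

def localization_reward(posterior, threshold=0.95, terminal_reward=100.0):
    """Hypothetical reward: a large terminal reward once any element of the
    posterior matrix exceeds the threshold, and zero otherwise (sparse)."""
    done = float(posterior.max()) > threshold
    return (terminal_reward if done else 0.0), done

# Not yet converged: a nearly uniform posterior gives zero reward.
r0, done0 = localization_reward(np.full((3, 3), 1.0 / 9))

# Converged: one cell holds almost all the probability mass (0.96 > 0.95).
peaked = np.full((3, 3), 0.005)
peaked[1, 1] = 1.0 - 0.005 * 8
r1, done1 = localization_reward(peaked)
print(r0, done0, r1, done1)
```

With a reward this sparse, the agent sees almost no learning signal until the first convergence events, which may be related to the collapse described below.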
However, the policy learned by RL quickly collapses to one of the three actions, i.e., the agent always applies one specific action. The chosen action may differ across training runs.
I tried decreasing the learning rate to 5e-5 / 5e-6, but it didn't change much.
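One way I quantify the collapse (a diagnostic of my own, not from the repo) is to log the entropy of the empirical action distribution during training; it drops to near zero as the policy degenerates to a single action:

```python
import numpy as np

def action_entropy(action_counts):
    """Entropy (in nats) of the empirical action distribution.
    Near zero means the policy has collapsed onto one action."""
    p = np.asarray(action_counts, dtype=float)
    p = p / p.sum()
    p = p[p > 0]  # drop unused actions so log() is well-defined
    return float(-(p * np.log(p)).sum())

# A healthy 3-action policy vs. the collapsed behavior described above.
healthy = action_entropy([34, 33, 33])   # close to log(3) ~ 1.099 nats
collapsed = action_entropy([198, 1, 1])  # close to 0 nats
print(healthy, collapsed)
```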
anl_random_walk.mp4