Try increasing the weight decay, e.g. to 5e-4, and apply it to the BN and bias parameters as well. I have also tried adding the shuffling BN from MoCo, which helps a lot. The paper you mentioned uses a weight decay of 1e-4 without LARS.
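In case it helps, here is a rough sketch of how the weight-decay suggestion could be wired up. It is not this repo's code: `build_param_groups` and the `decay_norm_and_bias` flag are names I made up, and treating every parameter with `ndim <= 1` as "BN or bias" is only a heuristic. The function takes `(name, param)` pairs, which is what `model.named_parameters()` yields in PyTorch; the demo uses stand-in objects so it runs without any framework.

```python
from types import SimpleNamespace


def build_param_groups(named_params, weight_decay=5e-4, decay_norm_and_bias=True):
    """Split parameters into two optimizer groups (hypothetical helper).

    named_params: iterable of (name, param) pairs. Each param only needs
    an `ndim` attribute for this sketch.
    """
    regular, norm_and_bias = [], []
    for name, param in named_params:
        # Assumption: 1-D parameters are BN scales/shifts or layer biases.
        if param.ndim <= 1 or name.endswith(".bias"):
            norm_and_bias.append(param)
        else:
            regular.append(param)
    return [
        {"params": regular, "weight_decay": weight_decay},
        {
            "params": norm_and_bias,
            # Keep the same decay on BN/bias when the flag is on, else disable it.
            "weight_decay": weight_decay if decay_norm_and_bias else 0.0,
        },
    ]


# Stand-in parameters; only the shapes' dimensionality matters here.
demo = [
    ("conv1.weight", SimpleNamespace(ndim=4)),
    ("bn1.weight", SimpleNamespace(ndim=1)),
    ("bn1.bias", SimpleNamespace(ndim=1)),
    ("fc.weight", SimpleNamespace(ndim=2)),
    ("fc.bias", SimpleNamespace(ndim=1)),
]
groups = build_param_groups(demo, weight_decay=5e-4)
no_decay_groups = build_param_groups(demo, weight_decay=5e-4, decay_norm_and_bias=False)
```

The returned list has the per-group dict form that PyTorch optimizers such as `torch.optim.SGD` accept as their first argument, so switching `decay_norm_and_bias` on and off is an easy way to compare the two settings.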
I noticed the paper "Momentum2 Teacher: Momentum Teacher with Momentum Statistics for Self-Supervised Learning". It is interesting work.
Its results using all BN do not collapse.
I suspect the results may come from the all-view L2Norm. Should we split the two views for testing?