Training Reproducibility #8
Comments
Hi @reginehartwig! Which experiments are you having trouble reproducing? As stated in the README, for complex real images like birds and horses, we observed that the model can still converge to a bad local minimum where the prototypical shape is wrong; in that case you should try another random seed and check the results after the first stage. This kind of experiment is difficult to reproduce exactly even when the random seed is set: it can depend on the issue you pointed out, or on the hardware and the versions of the libraries you installed.
Hi @monniert! Thanks for the fast reply! Later on in training, the results can become very different. This means I cannot run the code twice (with the same seed) and expect the same outcome.
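To narrow down where two same-seed runs start to differ, one option is to log a scalar per iteration (for example the loss) in both runs and look for the first step at which the logs disagree. The helper below is only a sketch: `first_divergence` and the example loss values are hypothetical and not part of the repository.

```python
from typing import Optional, Sequence


def first_divergence(
    run_a: Sequence[float], run_b: Sequence[float], tol: float = 0.0
) -> Optional[int]:
    """Return the first step where two logged series differ by more than tol.

    Returns None if the series agree over their common length
    (zip stops at the shorter of the two).
    """
    for step, (a, b) in enumerate(zip(run_a, run_b)):
        if abs(a - b) > tol:
            return step
    return None


# Hypothetical per-iteration losses from two runs with the same seed.
losses_a = [0.91, 0.74, 0.63, 0.58]
losses_b = [0.91, 0.74, 0.61, 0.49]
print(first_divergence(losses_a, losses_b))
```

With `tol=0.0` the comparison is bitwise-strict on the logged values, which is what full determinism would require; a small positive `tol` instead shows when the runs start to differ meaningfully.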
From what I remember, in my case the beginning of the training runs was mostly identical when fixing the seed; I am not really sure about performance in the long run, though. Are you always running the experiments on the same machine? The randomness can come from many small things; you should investigate the common sources of randomness listed at this link (https://pytorch.org/docs/stable/notes/randomness.html), in particular you should set … It could also be related to the issue you mentioned. I do not plan to work on this, but I would be interested to hear about the root cause if you manage to make it completely deterministic.
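For reference, the settings described in the linked PyTorch randomness notes can be gathered in one helper. This is a minimal sketch, not the repository's code: the name `seed_everything` is hypothetical, and it is an assumption that these are the settings meant in the comment above. NumPy and PyTorch are imported lazily so the function also runs where they are not installed.

```python
import os
import random


def seed_everything(seed: int = 0) -> None:
    """Seed the usual random number generators and request deterministic ops."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
    except ImportError:
        return
    # Some cuBLAS operations need this variable to behave deterministically
    # (CUDA 10.2 or later); it must be set before those operations run.
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
    torch.manual_seed(seed)  # seeds the CPU and all CUDA devices
    torch.backends.cudnn.benchmark = False  # no run-dependent autotuning
    torch.backends.cudnn.deterministic = True
    # Raises a RuntimeError when an op without a deterministic
    # implementation is called, which helps locate the source.
    torch.use_deterministic_algorithms(True)


seed_everything(123)
```

Note that `torch.use_deterministic_algorithms` only covers PyTorch's own operations; custom CUDA kernels in third-party libraries are not checked by it, so nondeterminism there would not be reported.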
I am currently analyzing the training process of your model.
I noticed that results are only partially reproducible, as there seems to be some randomness in the training.
Do you know which parts of the code affect reproducibility? Could it be PyTorch3D, as in the related issue facebookresearch/pytorch3d#659?
It would be great if you could tell me more about it, the tests you might have run, and whether you plan to work on this.