Try to make random sampling for unbinding reproducible #16

Draft
wants to merge 5 commits into
base: master
from

jchelly
Collaborator

@jchelly jchelly commented Feb 15, 2024

This makes the seed for the random shuffle used in unbinding depend on the particle IDs, so that the random number sequence no longer depends on the (unpredictable) order in which threads process halos. It also adds a parameter that can be used to change the seed so that we can run multiple realizations. Particle IDs are XORed with the supplied parameter and then used to initialize the random number generator. I'm not sure whether some other operation would make more sense.
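
As a rough illustration of the seeding idea described above, here is a minimal sketch. The helper names (`UnbindingSeed`, `ShuffledForUnbinding`), the choice of the smallest particle ID as the value that gets XORed, and the use of `std::mt19937_64` are all assumptions made for this example, not necessarily what the PR does.

```cpp
#include <algorithm>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: build a seed from the particle IDs and a user parameter.
// Assumption: the smallest ID is used, so the seed does not depend on the
// order in which the particles happen to be stored.
std::uint64_t UnbindingSeed(const std::vector<std::uint64_t>& ids,
                            std::uint64_t user_seed) {
    if (ids.empty()) return user_seed;
    const std::uint64_t min_id = *std::min_element(ids.begin(), ids.end());
    return min_id ^ user_seed;  // XOR the ID with the run-time parameter
}

// Shuffle a copy of the IDs with a generator that is local to this call, so
// the result depends only on the input and the seed parameter, not on which
// thread runs the call or on what other threads have done before.
std::vector<std::uint64_t> ShuffledForUnbinding(std::vector<std::uint64_t> ids,
                                                std::uint64_t user_seed) {
    std::mt19937_64 rng(UnbindingSeed(ids, user_seed));
    std::shuffle(ids.begin(), ids.end(), rng);
    return ids;
}
```

Changing `user_seed` gives a different realization, while repeating a call with the same input and seed reproduces the same permutation (for a given standard library, since the exact permutation produced by `std::shuffle` is implementation-defined).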

I think this change also fixes two problems with the original code:

  • The random number generator was never seeded, so we always got the same default seed on all MPI ranks
  • Calling std::random_shuffle from multiple threads without specifying the random number generator may have used a default generator that is not thread-safe
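
To make the order dependence concrete, the sketch below contrasts one generator shared by all halos with a generator created per halo. It runs serially and only varies the visiting order, which stands in for unpredictable thread scheduling. The `Halo` alias, the function names, and seeding from the first particle ID are assumptions for illustration only.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

using Halo = std::vector<std::uint64_t>;  // particle IDs of one halo (illustrative)

// One default-seeded generator shared by all halos: the permutation each halo
// receives depends on how many random numbers were consumed before it, i.e. on
// the visiting order. In a real multi-threaded run, unsynchronized access to a
// shared generator would also be a data race.
std::vector<Halo> ShuffleWithSharedGenerator(std::vector<Halo> halos,
                                             const std::vector<std::size_t>& visit_order) {
    std::mt19937_64 shared_rng;  // default seed, identical on every rank
    for (std::size_t i : visit_order)
        std::shuffle(halos[i].begin(), halos[i].end(), shared_rng);
    return halos;
}

// One generator per halo, seeded from that halo's own data (assumption: the
// first particle ID XOR a user parameter). The visiting order no longer matters.
std::vector<Halo> ShuffleWithLocalGenerators(std::vector<Halo> halos,
                                             const std::vector<std::size_t>& visit_order,
                                             std::uint64_t user_seed) {
    for (std::size_t i : visit_order) {
        if (halos[i].empty()) continue;
        std::mt19937_64 local_rng(halos[i].front() ^ user_seed);
        std::shuffle(halos[i].begin(), halos[i].end(), local_rng);
    }
    return halos;
}
```

Calling `ShuffleWithLocalGenerators` with visiting orders `{0, 1, 2}` and `{2, 0, 1}` yields the same per-halo permutations, whereas `ShuffleWithSharedGenerator` generally does not.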

There might still be some non-deterministic behaviour because rounding errors in reduction operations depend on the order of execution.
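
The reduction issue comes from floating-point addition not being associative. A small sketch, with the function name `SumInOrder` and the values chosen purely for illustration, shows the same three numbers summing to different results depending on the order:

```cpp
#include <numeric>
#include <vector>

// Sum the values strictly left to right, in the order given.
double SumInOrder(const std::vector<double>& values) {
    return std::accumulate(values.begin(), values.end(), 0.0);
}

// With IEEE double precision (and without options such as -ffast-math):
//   SumInOrder({1e20, -1e20, 1.0}) gives 1.0, because the large terms cancel first.
//   SumInOrder({1e20, 1.0, -1e20}) gives 0.0, because 1.0 is lost when added to 1e20.
```

A parallel reduction combines partial sums in whatever order the threads finish, so the last bits of the result can vary from run to run even when the inputs are identical.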

@jchelly
Collaborator Author

jchelly commented Feb 15, 2024

This still doesn't seem to be enough to make runs deterministic with more than one thread (tested by running L1000N0900/DMO_FIDUCIAL twice).

@jchelly
Collaborator Author

jchelly commented Apr 29, 2024

It looks like TrackIds somehow get assigned in a non-deterministic way when we use multiple threads. In the small Colibre test I get the same halo properties in snapshot 000, but the TrackIds differ.

@jchelly
Collaborator Author

jchelly commented Apr 29, 2024

FeedCentrals() assigns subhalo indexes in a non-deterministic way using a critical section. It could be fixed by using an ordered section or by just removing the OpenMP directives.
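
For reference, this is roughly what the ordered-section variant looks like. The function `AssignIndicesOrdered` is a hypothetical stand-in written for this comment, not the actual FeedCentrals() code.

```cpp
#include <vector>

// Hypothetical stand-in for the index assignment: each loop iteration takes
// the next free index from a shared counter.
std::vector<int> AssignIndicesOrdered(int n_hosts) {
    std::vector<int> assigned(n_hosts, -1);
    int next_index = 0;

    #pragma omp parallel for ordered schedule(static, 1)
    for (int i = 0; i < n_hosts; ++i) {
        // Independent per-host work could still run in parallel here.

        // The ordered block executes in loop-iteration order, so iteration i
        // always receives index i. With "#pragma omp critical" instead, the
        // counter would still be protected from races, but the index each
        // iteration gets would depend on which thread arrives first.
        #pragma omp ordered
        {
            assigned[i] = next_index++;
        }
    }
    return assigned;
}
```

If the loop body does little besides the protected block, the ordered section effectively serializes the loop, in which case removing the OpenMP directives gives the same result with less overhead.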

@jchelly
Collaborator Author

jchelly commented Apr 29, 2024

With the ordered section fix, two Colibre test runs now generate identical output for snapshots 0-10 (of 15) but diverge after that.
