Improve reproducibility of artifacts #2140
Comments
Thanks, @msarahan! Also thanks for not just posting that notebook... it's of course broken 😈 . In summary: My early work has a lot of shortcomings:
Observations in no particular order:
In working on this, I have found |
Good to see someone paying attention to this stuff! |
An update: diffoscope PR on conda-forge:
|
@mandeep you are a reproducibility-enhancing MACHINE. 🎸 on. Been a bit heads-down, but just updated conda-forge/staged-recipes#3281 and conda-forge/staged-recipes#3282 with some naming issues, extra deps, and typos on my part. Also started sneaking in:
export DETERMINISTIC_BUILD=1
export PYTHONHASHSEED=0
...just to see what happens. Of course, I'd love to remove those things if Additionally, I've started up a reprotest recipe: conda-forge/staged-recipes#3358 |
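For readers wondering what PYTHONHASHSEED=0 changes: the sketch below starts fresh interpreters and compares the hash of one string across them. The helper name `string_hash_in_fresh_interpreter` is made up for illustration and is not part of conda-build. It only shows that string hashing becomes repeatable, which is one ingredient of a reproducible build and not a guarantee of one.

```python
import os
import subprocess
import sys


def string_hash_in_fresh_interpreter(text, hash_seed=None):
    """Return hash(text) as computed by a newly started Python process.

    hash_seed=None leaves hash randomization at its default;
    any other value is exported as PYTHONHASHSEED for the child.
    """
    env = dict(os.environ)
    env.pop("PYTHONHASHSEED", None)  # start from a clean slate
    if hash_seed is not None:
        env["PYTHONHASHSEED"] = str(hash_seed)
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


if __name__ == "__main__":
    pinned = {string_hash_in_fresh_interpreter("conda-build", 0) for _ in range(3)}
    print("distinct hashes with PYTHONHASHSEED=0:", len(pinned))
```

With the seed pinned, every run should report the same value. Without it, each interpreter normally picks its own random seed.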
Ha, again reading too much internet will make me dumb. export DETERMINISTIC_BUILD=1 Apparently, this is a nix-specific thing, which they use (with a patch to deep bits of cpython) to clobber the timestamp |
@bollwyvl I've been reading a lot into this as well and it seems there's a bit more to do on our side regarding timestamps. I checked the pyc files to see if the magic number was being changed and everything looked okay there, which makes me believe that the difference in bytecode is due to timestamps. #2234 won't fix this unfortunately. I think we need to introduce the |
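The magic-number and timestamp check described above can be sketched as a small header reader. This is an illustration, not the script used in the thread. It assumes the 16-byte header layout of Python 3.7 and later (magic, flags, then either mtime and size or a source hash). Python 3.6, current when this comment was written, has a 12-byte header with no flags field. `read_pyc_header` is a made-up name.

```python
import importlib.util
import struct


def read_pyc_header(pyc_path):
    """Decode the 16-byte header of a Python 3.7+ .pyc file."""
    with open(pyc_path, "rb") as f:
        header = f.read(16)
    if len(header) < 16:
        raise ValueError("file too short to be a Python 3.7+ .pyc")

    magic = header[:4]
    (flags,) = struct.unpack("<I", header[4:8])
    info = {
        # Compare against the magic number of the running interpreter.
        "magic_matches": magic == importlib.util.MAGIC_NUMBER,
        # Bit 0 of the flags word marks a hash-based pyc.
        "hash_based": bool(flags & 0x1),
    }
    if info["hash_based"]:
        info["source_hash"] = header[8:16].hex()
    else:
        # Timestamp-based pyc: source mtime and size, both 32-bit little-endian.
        mtime, size = struct.unpack("<II", header[8:16])
        info["source_mtime"] = mtime
        info["source_size"] = size
    return info
```

Two builds of the same source that differ only in `source_mtime` point at the timestamp, not the bytecode, as the source of the difference.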
@mandeep the rabbit hole is deep indeed! Unfortunately I can't contribute a whole lot right now, but I definitely want to revisit my reproducibly-building-conda-build once some more of your stuff lands! |
Did some more work on the proposed Those seem reasonable; however, the recipe is a bit tougher. If you try to reuse the If you use the Starting from (and including) the "human-centric" (multi-)recipe (and
|
So some more thinking on this:
So in the near term, it might make sense to start an experimental wrapper, e.g. Continuing down the pipedream: The "pop quiz" would be that such a tool could, for at least Once that was demonstrated, and which I am not sure has even been explored yet, might be a variant of The "midterm exam" from this would be a I'm not sure what the "final exam" would be, presumably other architectures! |
Looks like Python 3.7 may provide a method to create deterministic .pyc files via the accepted PEP 552 -- Deterministic pycs |
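A minimal sketch of what PEP 552 offers, assuming Python 3.7 or later and the standard library's `py_compile` module with its `invalidation_mode` parameter. The helper names `compile_header` and `headers_after_touch` are made up for illustration. The sketch compiles a file, changes only its modification time, compiles again, and returns both 16-byte headers. It does not check the marshalled bytecode that follows the header.

```python
import os
import py_compile
import tempfile


def compile_header(source_path, pyc_path, mode):
    """Compile source_path to pyc_path and return the 16-byte pyc header."""
    py_compile.compile(
        source_path, cfile=pyc_path, doraise=True, invalidation_mode=mode
    )
    with open(pyc_path, "rb") as f:
        return f.read(16)


def headers_after_touch(mode):
    """Compile a file, change only its mtime, compile again; return both headers."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "example.py")
        pyc = os.path.join(tmp, "example.pyc")
        with open(src, "w") as f:
            f.write("answer = 42\n")

        os.utime(src, (1_000_000_000, 1_000_000_000))
        first = compile_header(src, pyc, mode)

        # Same content, different modification time.
        os.utime(src, (1_500_000_000, 1_500_000_000))
        second = compile_header(src, pyc, mode)
    return first, second


if __name__ == "__main__":
    for mode in (
        py_compile.PycInvalidationMode.TIMESTAMP,
        py_compile.PycInvalidationMode.CHECKED_HASH,
    ):
        first, second = headers_after_touch(mode)
        print(mode.name, "headers identical:", first == second)
```

In timestamp mode the headers differ because the mtime is embedded. In the hash-based modes the header carries a hash of the source instead, so touching the file does not change it.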
Very exciting! I've been looking for a chance to pick this back up! Perhaps there's a confluence of py37, conda-build and conda-forge and something like miniforge that will be a better "midterm" than a 3.6-based solution could be, given the number of python recipes that would need to be updated. |
Hi there, thank you for your contribution! This issue has been automatically marked as stale because it has not had recent activity. It will be closed automatically if no further activity occurs. If you would like this issue to remain open please:
NOTE: If this issue was closed prematurely, please leave a comment. Thanks! |
In my opinion this issue should remain open, as conda-build doesn't provide reproducible builds as of today. |
xref: #4762 |
@bollwyvl raised this privately. He would like conda packages to be completely verifiable. There are some good guidelines on this at https://reproducible-builds.org/
@bollwyvl has some preliminary work implemented in a jupyter notebook that I'm happy to share if anyone would like to see.