-
-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support PEP-658 #139
Support PEP-658 #139
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I don't like adding gz files to repositories as this is the way XZ backdoor worked.
micropip/wheelinfo.py
Outdated
@@ -165,7 +196,7 @@ async def _fetch_bytes(self, fetch_kwargs: dict[str, Any]): | |||
raise e | |||
else: | |||
raise ValueError( | |||
f"Can't fetch wheel from '{self.url}'. " | |||
f"Can't fetch wheel from '{url}'. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f"Can't fetch wheel from '{url}'. " | |
f"Can't fetch wheel from {url!r}. " |
Yeah, we've removed gz files in this repository so I think I should remove it in this PR too. |
@hoodmane @agriyakhetarpal Can one of you review this when you have time? I would like to include this in the next release. |
micropip/transaction.py
Outdated
# we just log the error and continue, as the metadata can be fetched extracted from the wheel. | ||
logger.debug("Metadata not available for %s", wheel.name) | ||
|
||
await wheel_download_task | ||
await self.gather_requirements(wheel.requires(extras)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If download_pep658_metadata()
succeeds is it possible to exchange the order here and call gather_requirements()
before awaiting wheel_download_task
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, good catch. I originally implemented in that way in #90... probably I did something wrong while cherry-picking changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me implementation wise. Though it's not clear whether by itself it has any benefits yet since we block for downloading the wheel before resolving out the dependencies. So before we wanted one request to succeed before continuing and now if metadata is present we want two. If we can resolve the dependencies before the wheel arrives hopefully that's a performance win.
@mscolnick was asking about behavior where we would resolve and make a lock file without actually installing any wheels which would be a use case for this.
Yes, that's what I hope to achieve. How much it actually helps performance will depend on the network condition though. |
Thanks for the ping! Yes, I can help review it too. I am not acquainted with this PEP, so I'll need to read a bit about it. |
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've read the PEP now; these changes mostly look good to me. Thank you, @ryanking13! I hope this will give some beneficial resolution speedups before the full wheel downloads kick in:
However, I think this PEP should be accompanied by PEP 714 as well, which fixed a few bugs in PEP 658's implementation because PyPI and pip
both had bugs that caused the metadata not to be parsed correctly. While I am confident that we haven't introduced that bug here, we should remain closer to PyPI regarding the implementation, so we should account for that change here. The key changes are as follows (please let me know if I missed something):
- for the HTML API, rename to
data-core-metadata
, and for the JSON API, rename tocore-metadata
, and - fall back to the currently proposed
data-dist-info-metadata
if they don't exist
Details: what I understand from that PEP is that PyPI kept the current/broken keys but also added these new keys for backward compatibility, from this line in the PEP:
To support clients that used the previous key names, the HTML representation MAY also be emitted using the
data-dist-info-metadata
, and if it does so it MUST match the value ofdata-core-metadata
.
And for the changes to add to micropip
(client-side):
Clients consuming the JSON representation of the Simple API MUST read the PEP 658 metadata from the key
core-metadata
if it is present. They MAY optionally use the legacydist-info-metadata
key if it is present butcore-metadata
is not.
I don't intend to block the merge in case the release is urgently needed (apologies for the delay in the review); but I do feel that we should incorporate both PEPs in the same PR, since I don't think a lot of further changes would be required; just a few renames to the new APIs added, some fallbacks, and a few changes to the testing indices (which I am not familiar with the code for, though).
Here is pip
's fix: pypa/pip#12078, and similarly for Warehouse: pypi/warehouse#13706
Also, one slightly related point for self.parsed_url.hostname
: the Anaconda.org index does not seem to support this PEP on the server side, which is an aspect that we should put into some consideration when we switch to using https://anaconda.org/pyodide/simple
in pyodide/pyodide-build#43 (while that is still okay because it's just for the build-time dependencies) and if we start using it for https://github.com/pyodide/pyodide-recipes later on to host most (if not all) wheels post pyodide/pyodide#4918:
pip
from PyPI:
Tap to expand
$ pip install cython --dry-run --no-cache
Collecting cython
Downloading Cython-3.0.11-py2.py3-none-any.whl.metadata (3.2 kB)
Downloading Cython-3.0.11-py2.py3-none-any.whl (1.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 37.1 MB/s eta 0:00:00
Would install Cython-3.0.11
However, when downloading from Anaconda.org:
Tap to expand
$ pip install -i https://pypi.anaconda.org/pyodide/simple cython --dry-run --no-cache
Looking in indexes: https://pypi.anaconda.org/pyodide/simple
Collecting cython
Downloading https://pypi.anaconda.org/pyodide/simple/cython/3.0.11/Cython-3.0.11-py2.py3-none-any.whl (1.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 1.8 MB/s eta 0:00:00
Would install Cython-3.0.11
Unrelated point, but similar to @Carreau's suggestion above: we should remove the existing wheel files from tests/test_data/wheel/
, since they will only bloat the repository as we proceed to add more commits to the default branch. They can be stored elsewhere (maybe in a GitHub release in a new repository and downloaded into the directory using pooch
before running the test suite).
Also, the current |
Oh, that's a good point indeed. I think updating the mousebender would work, so let me update the PR. |
Yes, maybe let's discuss it in a separate issue. I don't like to put bunch of big files in the repository for testing, but I am also not sure whether putting it in a separate place is a good way to go. |
Co-authored-by: Agriya Khetarpal <74401230+agriyakhetarpal@users.noreply.github.com>
…into pep-658-take2
I just checked the test file and it seems like it contains both |
for more information, see https://pre-commit.ci
Oops, sorry! Yes, PyPI contains both the new key and the fallback. I was searching for
Yes, opened a separate issue: #165 Please feel free to re-request a review once you are done here, thanks! |
@agriyakhetarpal Thanks, I think it is ready now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, looks great, @ryanking13!
This PR adds PEP 658: Serve Distribution Metadata in the Simple Repository API support to micropip.
When downloading package using Simple API (which is not a default), micropip will now check for distribution metadata and download the metadata file simultaneously which (hopefully) speeds up the dependency resolution.