Caveat emptor! #5

Open
woodruffw opened this issue Oct 17, 2022 · 5 comments

Comments

@woodruffw

woodruffw commented Oct 17, 2022

Edit: Whoops, hit "publish" prematurely.

First of all, hello! I'm filing this issue because I saw this project on HN, in this thread.

I understand the desire to reserve names for future use (and to avoid typosquatting), but I think it'd be worth adding a warning to your README, letting users know that using this tool may put them at risk of index removal, per PEP 541.

Specifically, PEP 541 forbids name squatting via empty or otherwise non-functional packages:

project is name squatting (package has no functionality or is empty);

@mattsb42
Owner

it uploads projects with no functionality

I don't entirely agree with this statement, but yes, as with most tools there are valid and invalid uses for this tool. I think the readme presents the case for valid uses, but I'm happy to review any changes you think might make it more clear.

Since PEP 541 does explicitly mention name squatting, I agree that addressing that in the readme would add value. Specifically:

project is name squatting (package has no functionality or is empty);

This is precisely the problem that pypi-parker is meant to address. Since PyPI lacks namespaces (for now), publishers have no mechanism available to them to protect the people using their packages from malicious name squatting. The purpose of this package is not primarily to reserve names for future use (though IMO there are valid uses there too), but instead to give publishers a mechanism to protect their consumers.

Obviously, someone could abuse this tool to maliciously park large numbers of packages, but I don't think it lowers the bar very much compared with manually generating those packages, so I don't consider that worth worrying about in this context. Incidentally, that is why I ended up deciding not to add combinatoric or reverse-regex name expansion.

@mattsb42
Owner

/me checks submitter's membership

Reading a bit deeper: If there were an officially recommended path for publishers to proactively protect consumers against malicious parking that PyPA endorsed, that would be even better. ;)

@woodruffw
Author

I don't entirely agree with this statement, but yes, as with most tools there are valid and invalid uses for this tool. I think the readme presents the case for valid uses, but I'm happy to review any changes you think might make it more clear.

Right -- I'm not claiming that all uses of this tool are invalid (in fact, AFAICT, all uses are legitimate!).

The claim is that the mechanism this tool uses (uploading an empty package explicitly for the purpose of reserving the name) is forbidden under PEP 541, meaning that using the tool potentially exposes its users to surprise name takeovers and other admin actions.

I agree that the status quo is extremely non-ideal, and I also think this is a legitimate use case that package indices should address. This was mostly just me putting on my "standards lawyer" hat to make sure that there's no expectation that it's a pattern that PyPI supports or encourages 🙂

In terms of improving the status quo, we're currently looking at allowing PyPI users to temporarily "pre-register" project names (pypi/warehouse#11296). That doesn't handle this use case exactly since it's only temporary (with the expectation that a package will be uploaded shortly after), but it's a step in the right general direction. I believe there's also some work proceeding on both organizations and namespace support on PyPI, although I'm not a part of that work.

In the meantime (with the same "standards lawyer" hat on), this tool would technically not violate PEP 541 if, instead of an empty package, it created and uploaded "pass-through" packages that install the intended package instead (similar to bundle => bundler on RubyGems, but by the same owner). I don't know if you'd consider that a desirable change, however; just something to consider.
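As a rough illustration of the pass-through idea, the sketch below builds the metadata for a parked package whose only job is to depend on the intended package. The helper `passthrough_setup_kwargs`, the package names, and the version are hypothetical and are not part of pypi-parker; `install_requires` is the standard setuptools keyword for declaring dependencies.

```python
def passthrough_setup_kwargs(parked_name, target_name, version="0.0.1"):
    """Build setup() keyword arguments for a hypothetical pass-through package.

    Installing `parked_name` would pull in `target_name` as its only dependency.
    """
    return {
        "name": parked_name,
        "version": version,
        "description": f"Pass-through package; installs {target_name}.",
        "packages": [],  # ships no code of its own
        "install_requires": [target_name],
    }


# In a real setup.py the result would be handed to setuptools, for example:
#   from setuptools import setup
#   setup(**passthrough_setup_kwargs("exmaple-pkg", "example-pkg"))
kwargs = passthrough_setup_kwargs("exmaple-pkg", "example-pkg")
print(kwargs["install_requires"])
```

Whether a package like this counts as having "functionality" under PEP 541 is a judgment for the index administrators; the sketch only shows the mechanics.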

@mattsb42
Owner

Gotcha. :)

Great to hear on the plans for "pre-register"; that would cover one of the major (and more grey-area) use-cases.

Thinking through this, I think what I'll do is add an "I want to do X" section, with discussion about what are and are not reasonable things to do. I'll also add a section on "why does it not do this other thing", specifically to your point about pass-through packages. I did originally consider doing pass-throughs but decided against that because it would encourage persistence of incorrect consumption, make version handling more complicated, and in general increase the maintenance burden for package owners. In the spirit of "Explicit is better than implicit", the packages that pypi-parker generates always fail at install time; the only thing that you can configure is what error message they provide. Parking the name protects against malicious actors exploiting consumer confusion; erroring on install attempts protects confused package consumers from accidentally using the wrong thing.
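To make the "always fail at install time" behavior concrete, here is a minimal sketch of one way a parked package's setup.py could refuse every command except building its own source distribution. This is an assumption for illustration, not pypi-parker's actual generated code: the function name `fail_unless_building` and the default message are hypothetical, and the check assumes the legacy `python setup.py <command>` style of invocation.

```python
# Hypothetical default text; the thread only says the message is configurable.
DEFAULT_MESSAGE = "This name is parked; install the intended package instead."


def fail_unless_building(argv, message=DEFAULT_MESSAGE):
    """Raise RuntimeError unless "sdist" is among the requested commands.

    `argv` is expected to look like sys.argv, e.g. ["setup.py", "install"].
    """
    commands = argv[1:]
    if "sdist" not in commands:
        raise RuntimeError(message)


# A generated setup.py could run the guard before calling setup(), e.g.:
#   import sys
#   fail_unless_building(sys.argv)
```

With a guard like this, the publisher can still build and upload the source distribution, while any attempt to install it stops with the configured message.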

I would be open to adding guardrails that restrict allowed names to only those within a certain distance of the actual package name. I'd have to take another look at whether/how the original package name gets passed around to see how much we could do without requiring additional configuration.
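One possible shape for such a guardrail is sketched below, assuming Levenshtein edit distance as the metric; the thread does not say which distance would be used. The names `normalize`, `edit_distance`, `allowed_park_names`, and the `max_distance` default are all hypothetical. The normalization step follows the PEP 503 rule that runs of `-`, `_`, and `.` are equivalent and names are case-insensitive.

```python
import re


def normalize(name):
    """Normalize a project name as PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def edit_distance(a, b):
    """Levenshtein distance via the classic row-by-row dynamic program."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def allowed_park_names(real_name, candidates, max_distance=2):
    """Keep candidates that differ from real_name but lie within max_distance."""
    target = normalize(real_name)
    return [
        candidate
        for candidate in candidates
        if 0 < edit_distance(target, normalize(candidate)) <= max_distance
    ]


print(allowed_park_names("requests", ["reqeusts", "request", "numpy", "Requests"]))
```

Candidates that normalize to the real name itself are dropped, since the index would treat them as the same project. Plain Levenshtein counts a swapped pair of letters as two edits, so a threshold of 1 would miss common transposition typos.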

@woodruffw
Author

Thinking through this, I think what I'll do is add an "I want to do X" section, with discussion about what are and are not reasonable things to do. I'll also add a section on "why does it not do this other thing", specifically to your point about pass-through packages.

That sounds great to me!

Your reasoning about pass-throughs makes sense; explicit is definitely more Pythonic.
