Rollout plan for critical projects promo #11625

Closed
8 tasks done
di opened this issue Jun 21, 2022 · 34 comments · Fixed by #12307
Labels
meta Meta issues (rollouts, etc)

Comments

@di
Member

di commented Jun 21, 2022

The following steps should be followed to roll out the critical projects promo:

Launch

  • Merge Models, views, tasks, etc for critical projects promo #10856
    • At this point, no projects will be considered critical, and the promo will be unavailable
    • https://pypi.org/security-key-giveaway/ will 404
    • Overall, no functional changes will be enabled at this point
  • Merge Add 2FA metrics task #11626 which emits relevant metrics
    • Number of projects marked critical
    • Number of projects manually requiring 2FA
    • Number of critical project maintainers
    • Number of critical project maintainers with 2FA enabled
    • Number of users with 2FA enabled
  • Load discount codes into code column of user_titan_codes table
  • Set TWOFACTORREQUIREMENT_ENABLED to True.
    • This enables the opt-in 2FA requirement feature
  • Set TWOFACTORMANDATE_AVAILABLE to True.
    • This enables the job which flips the pypi_mandates_2fa bit to True and emails maintainers.
    • However, the cohort size will be zero, so this has the effect of 'soft-launching' to only our own dependencies
    • This also makes https://pypi.org/security-key-giveaway/ viewable
    • At this point, we should make sure everything is working OK and codes are redeemable before moving on
  • Set TWOFACTORMANDATE_COHORTSIZE to a non-zero value
    • We are targeting the top 1% of projects, so this should be around 3800.
  • Tweets/announcements about the promo can go out at this time
    • These should include details about the giveaway, which projects are considered critical and why, as well as new features which allow projects to opt-in to their own 2FA mandate
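
The launch steps above amount to three settings flipped in order. A minimal sketch of that progression, assuming the settings are changed in the order listed; `RolloutSettings`, `rollout_stage`, and `giveaway_page_visible` are hypothetical names for illustration and are not Warehouse code (only the setting names in the comments come from the plan):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolloutSettings:
    """Hypothetical container mirroring the three settings named in the plan."""

    requirement_enabled: bool = False  # TWOFACTORREQUIREMENT_ENABLED
    mandate_available: bool = False    # TWOFACTORMANDATE_AVAILABLE
    cohort_size: int = 0               # TWOFACTORMANDATE_COHORTSIZE


def rollout_stage(settings: RolloutSettings) -> str:
    """Name the launch step that a combination of settings corresponds to.

    Assumes the settings are flipped in the order given in the plan.
    """
    if not settings.requirement_enabled:
        return "merged, nothing enabled"
    if not settings.mandate_available:
        return "opt-in 2FA requirement only"
    if settings.cohort_size == 0:
        return "soft launch (empty download cohort)"
    return f"full launch (top {settings.cohort_size} projects by downloads)"


def giveaway_page_visible(settings: RolloutSettings) -> bool:
    """Per the plan, the giveaway page 404s until the mandate is available."""
    return settings.mandate_available
```

Keeping the cohort size as a separate knob is what makes the soft launch possible: the job and the giveaway page can be exercised before any project is selected by download count.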

Post-launch (After Oct 1, 2022)

Next steps are in #12308.

@di
Member Author

di commented Jun 23, 2022

There's now a public dashboard for the relevant metrics here: https://p.datadoghq.com/sb/7dc8b3250-389f47d638b967dbb8f7edfd4c46acb1 (h/t @ewdurbin for beautifying this).

@davidism

davidism commented Jul 8, 2022

Heads up, you sent out an email with http://localhost in the URLs instead of https://pypi.org. This happens in Flask when you don't configure it to know where it is when a request isn't active, such as generating emails; probably similar in Pyramid.
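
To illustrate the general failure mode described here (this is a generic sketch, not Warehouse or Pyramid code): when no request is active, an absolute link has to be built from a configured base URL, and a default or missing value yields links like the ones in the email. `SITE_BASE_URL` and `absolute_url` are hypothetical names.

```python
from urllib.parse import urljoin

# Hypothetical setting: the public base URL the site is served from.
SITE_BASE_URL = "https://pypi.org/"


def absolute_url(path: str, base_url: str = SITE_BASE_URL) -> str:
    """Join a site-relative path onto an explicitly configured base URL."""
    return urljoin(base_url, path.lstrip("/"))


# With the configured base, links point at the real site.
good_link = absolute_url("/security-key-giveaway/")

# With a development default as the base, the same code emits localhost links.
bad_link = absolute_url("/security-key-giveaway/", base_url="http://localhost/")
```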

@di
Member Author

di commented Jul 8, 2022

Thanks for the report, we're working on it 🙂

@tomato42

tomato42 commented Jul 8, 2022

Titan keys are only approved for sale in certain geographic regions, and thus can only be shipped to the following countries: Austria, Belgium, Canada, France, Germany, Italy, Japan, Spain, Switzerland, United Kingdom, and the United States.

Since when are Germany, Italy, France, etc. part of a different regulatory regime than the rest of the EU?

@di
Member Author

di commented Jul 8, 2022

@tomato42 Unfortunately this is out of our control, these are the only countries in which Google is able to sell the product, and I don't have an explanation as to why.

@mdmintz

mdmintz commented Jul 8, 2022

I received the "[PyPI] A project you maintain has been designated as critical" email, but it would be helpful to know the criteria for that designation. Number of downloads? Number of GitHub Stars? Number of other projects that have my project as a dependency? A combination of the above?

@davidism

davidism commented Jul 8, 2022

It's also a bit inconsistent. Jinja2 didn't get marked as critical, even though it's the most downloaded of my projects. Flask didn't get marked, but the less used Quart did.

Never mind, I think it's currently limited to some libraries that Warehouse uses, although not sure where Quart came from.

@underyx

underyx commented Jul 8, 2022

Never mind, I think it's currently limited to some libraries that Warehouse uses, although not sure where Quart came from.

https://pypi.org/project/semgrep/ got marked as critical and it doesn't seem to be used by Warehouse (yet!)

@di
Member Author

di commented Jul 8, 2022

I received the "[PyPI] A project you maintain has been designated as critical" email, but it would be helpful to know the criteria for that designation. Number of downloads? Number of GitHub Stars? Number of other projects that have my project as a dependency? A combination of the above?

Answers to this and many more questions are included at https://pypi.org/security-key-giveaway/

@di
Member Author

di commented Jul 8, 2022

It's also a bit inconsistent. Jinja2 didn't get marked as critical, even though it's the most downloaded of my projects. Flask didn't get marked, but the less used Quart did.

This does surprise me, I wonder if we have an issue with name normalization happening.

Never mind, I think it's currently limited to some libraries that Warehouse uses, although not sure where Quart came from.

We expanded it to the top 1% by downloads. The query is here:

"""
SELECT
    COUNT(*) AS num_downloads,
    file.project AS project_name
FROM
    {table}
WHERE
    DATE(timestamp) BETWEEN DATE_TRUNC(
        DATE_SUB(CURRENT_DATE(), INTERVAL 6 MONTH), MONTH
    )
    AND CURRENT_DATE()
GROUP BY
    file.project
ORDER BY
    num_downloads DESC
LIMIT
    {cohort_size}
"""

@dstufft
Member

dstufft commented Jul 8, 2022

Yes, BigQuery stores the names normalized IIRC, that query is using Project.name not Project.normalized_name.
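
A small sketch of why matching on the display name misses projects, assuming (as recalled above) that the download dataset holds normalized names. The `normalize` function follows the PEP 503 rule; the two matching helpers and the sample data are hypothetical and are not the Warehouse task itself.

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def match_by_name(query_names, project_names):
    """Match query results against raw display names (the buggy behavior)."""
    wanted = set(query_names)
    return [p for p in project_names if p in wanted]


def match_by_normalized_name(query_names, project_names):
    """Match after normalizing both sides, so capitalization no longer matters."""
    wanted = {normalize(n) for n in query_names}
    return [p for p in project_names if normalize(p) in wanted]


# Names as the query would return them (normalized) vs. display names.
from_query = ["jinja2", "pillow", "semgrep"]
display_names = ["Jinja2", "Pillow", "semgrep"]

missed_some = match_by_name(from_query, display_names)
found_all = match_by_normalized_name(from_query, display_names)
```

Here only the all-lowercase name survives the raw comparison, which is consistent with capitalized projects being skipped in the first run.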

@di
Member Author

di commented Jul 8, 2022

Yeah, we've only flipped the bit for 3381 projects, this should be >3800. Will address this.

@di
Member Author

di commented Jul 8, 2022

@davidism #11796 should fix this, and the bit should get flipped for these projects in ~8 hours.

@tedmiston

tedmiston commented Jul 8, 2022

I am a bit confused by how / what projects are getting marked as critical as well.

One of my projects that got marked as "critical" (https://pypi.org/project/boa-str/) is an old, very small and simple string manipulation library last released in 2017. It was basically a small internal dependency made external for convenience. Nowhere near the level of a project like Flask or Jinja... I would be surprised if it had any external users at all, let alone met these criteria from the page linked above:

What determines if project is a critical project?

PyPI determines project eligibility based on download counts derived from PyPI's public dataset of download statistics. Any project in the top 1% of downloads over the prior 6 months is designated as critical.

I tried to access the public BigQuery dataset to run a simple query (below), but the very first query was denied with a free-tier quota error.

SELECT COUNT(1)
FROM `bigquery-public-data.pypi.file_downloads`
WHERE
  project = "boa-str"
  AND timestamp >= "2022-01-01"
GROUP BY `project`
LIMIT 10;

The error:

Quota exceeded: Your project exceeded quota for free query bytes scanned. For more information, see https://cloud.google.com/bigquery/docs/troubleshoot-quotas

There is a small possibility that it's still in use as a dependency and e.g., being pulled in some Docker containers running at scale given that it was written for a startup which has grown massively.

Are there other ways to access this data e.g., a JSON export of the 3800 projects to check whether this is a mistake?

@alex
Member

alex commented Jul 8, 2022

@tedmiston https://pepy.tech/project/boa-str or https://pypistats.org/packages/boa-str are both good ways to view this data. Looks like it gets quite a few downloads.

@tedmiston

tedmiston commented Jul 8, 2022

@alex Thank you! It turns out xkcd was right after all.

updates resume

@hugovk
Contributor

hugovk commented Jul 8, 2022

Are there other ways to access this data e.g., a JSON export of the 3800 projects to check whether this is a mistake?

I expect the top ~3,800 projects (over 6 months) will be somewhat similar to those on the monthly list at https://hugovk.github.io/top-pypi-packages/

@davidism Jinja2 is number 36 (75 million monthly downloads) so should be included, and likewise Pillow at 60 (43m). Pillow also isn't currently marked as critical, but both have a capital initial so I expect the normalisation fix will sort that in a few hours 👍

@tedmiston boa-str is at number 1,340 with 878k downloads!

@CaselIT

CaselIT commented Jul 8, 2022

There still seem to be some problems with the query. For example, sqlalchemy is not marked as critical even though it's both a top 1% project and used by Warehouse.

@dstufft
Member

dstufft commented Jul 8, 2022

The query hasn't re-run yet, it runs once a day.

@hugovk
Contributor

hugovk commented Jul 9, 2022

Pillow is now marked as critical, and there's the bump from 3.38k to 3.82k critical projects on the dashboard:

image

Thanks!

@tedmiston

tedmiston commented Jul 9, 2022

One note from a UX perspective: I enabled 2FA via an app preemptively, ahead of getting the hardware key. But as soon as one does this, the page at https://pypi.org/security-key-giveaway/ decides you're not eligible for the hardware key. It was trivial to remove it, request the hardware key, and re-enable it, but it would be nice if 3800 of us didn't have to do that 🙃.

Edit: Never mind about getting the order through... it looks like Google is sold out of both keys in the U.S. now. [The USB-C key says in stock on the product page, but out of stock once added to cart. The USB-A key says out of stock on product page.]

@ssbarnea

ssbarnea commented Jul 11, 2022

I am a maintainer of 16 projects marked as critical, but I am still not eligible to get a hardware key because I did the right thing and adopted (software-based) 2FA previously. That is hilarious.

I am not sure if @tedmiston's trick still works, but I can see how this program could easily have the opposite effect of the desired one.

@di
Member Author

di commented Jul 11, 2022

Our goal is to get as many people as possible to use 2FA. Our constraint is that we have a limited number of hardware keys to give away.

While I agree that hardware keys should be preferred over TOTP, if you already have 2FA enabled via TOTP, but take a pair of free keys, that potentially means that one less person can enable 2FA.

That said, the discount codes expire Oct 1. If it looks like we'll have a surplus of discount codes by then, I'd support adjusting this policy to allow TOTP users to acquire hardware keys as well.
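
The eligibility rule discussed in this exchange can be sketched as follows. This is only an illustration of the stated policy, not the actual Warehouse check; `Maintainer`, `eligible_for_keys`, and the `allow_existing_2fa` switch (modeling the possible relaxation mentioned above) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Maintainer:
    """Hypothetical, simplified view of a user for this sketch."""

    maintains_critical_project: bool
    has_two_factor: bool


def eligible_for_keys(user: Maintainer, allow_existing_2fa: bool = False) -> bool:
    """Return True if the user may redeem a discount code under the policy.

    By default, only critical-project maintainers without any 2FA qualify.
    Setting allow_existing_2fa=True models extending the offer to TOTP users.
    """
    if not user.maintains_critical_project:
        return False
    return allow_existing_2fa or not user.has_two_factor


totp_user = Maintainer(maintains_critical_project=True, has_two_factor=True)
new_user = Maintainer(maintains_critical_project=True, has_two_factor=False)
```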

@ssbarnea

If someone has never used hardware keys, I would recommend the software approach to them, as it is very easy to stick the TOTP into your preferred password manager or just use an app like Google Authenticator. Using a hardware token is considerably more inconvenient.

Forcing 2FA is a no-brainer, and I would support even more aggressive rollout methods (1% is quite low). I think those who fight it are very few and fall into the category that does not give a (dime) about security for users; in the end, nobody is immune from being hacked.

A big thank you to all those that made the 1% group!

@di
Member Author

di commented Jul 11, 2022

@hugovk, when you got the email for Pillow, did it have HTTP or HTTPS links? I believe it should have been HTTPS and #11802 is just a side-effect from us running the task via CLI instead of via cron.

@hugovk
Contributor

hugovk commented Jul 11, 2022

It had HTTP. The first email was for projects with lowercase names, the second was for Pillow:

image

@memsharded

Hi, we have also been designated as a critical project.
We have been deploying/publishing releases to PyPI directly from our CI (running in the cloud), fully automated.
It is not clear to me, and I cannot find, how this can still be achieved: both the physical key and the authenticator apps seem to work only for manual publishing. Am I missing something? Many thanks!

@alex
Member

alex commented Jul 12, 2022

API keys can be used to accomplish this: https://pypi.org/help/#apitoken
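
For a CI job, token-based upload with twine typically comes down to two environment variables: the literal username `__token__` and the token itself as the password. A minimal sketch, assuming twine is the upload tool; the helper name `twine_upload_env` is hypothetical, and how the token reaches the job (for example a CI secret) is left to the CI system.

```python
import os


def twine_upload_env(api_token: str, base_env=None) -> dict:
    """Return an environment in which twine authenticates with an API token."""
    if not api_token.startswith("pypi-"):
        raise ValueError("PyPI API tokens start with 'pypi-'")
    # Copy the base environment so the caller's mapping is not modified.
    env = dict(os.environ if base_env is None else base_env)
    env["TWINE_USERNAME"] = "__token__"  # literal username for token auth
    env["TWINE_PASSWORD"] = api_token
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run` when invoking `twine upload` on the built distribution files. No interactive second factor is involved, which is why this flow keeps working for automated releases.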

@memsharded

API keys can be used to accomplish this: https://pypi.org/help/#apitoken

This is what we were already using, and it started failing today; we assumed it was because 2FA was being enabled. We have also tried manually enabling 2FA, and it is still failing with "Backend is unhealthy". It might be a temporary issue; we will try again tomorrow and report. Thanks!

@alex
Member

alex commented Jul 12, 2022

Backend is unhealthy means the CDN is having trouble talking to the application servers. https://status.python.org/ shows some spikes in error metrics, not sure if that's related. In any event, it's unrelated to 2FA requirements :-)

@FirefoxMetzger

FirefoxMetzger commented Jul 18, 2022

Hm, perhaps not the right place for this, but would it be useful to display a "critical package" badge on the pypi page, or make a badge for it to add to the repo if desired?

At the moment it mostly feels like another hurdle to jump through when we perform the release dance that happens somewhere deep in dev/maintainer land. I see how it may benefit security in general, but as far as I understand the main reason for this promo is to show that PyPI is taking security seriously so that users (and downstream packages) can trust their dependencies a bit more. Would be nice to have something to show for that.

@ssbarnea

@FirefoxMetzger I am very much in favor of starting to add badges, but it is not so simple. For example, I still find "critical" misleading, because what was actually used to determine it was download traffic over the last 6 months. I would say that criticality is more likely related to how many other projects use a package, something that PyPI cannot yet determine.

For example, I would support making public the "Sole Owner" badge, since as far as I am concerned that is a security and maintenance risk too, as it means "only one person can publish". That person might go rogue at some point, or just become permanently unavailable. For me that might be a very good reason for marking a package as risky/problematic in a public way. In fact, lack of use of bot accounts with tokens for uploading packages is another red flag, but that is currently close to impossible for PyPI to determine. Still, let's open a discussion thread as this issue is not the right place to discuss these.

AFAIK, nobody should ever publish packages using personal credentials. The only exception is when you bootstrap a new project, so you reserve the namespace, but even this can be done with tokens.

@FirefoxMetzger

FirefoxMetzger commented Jul 18, 2022

Still, let's open a discussion thread as this issue is not the right place to discuss these.

@ssbarnea Sure, feel free to ping me and I'm happy to chime in.

I would support making public the "Sole Owner" badge, since as far as I am concerned that is a security and maintenance risk [...] marking a package as risky/problematic in a public way [...] lack of use of bot accounts with tokens for uploading packages is another red flag [...] nobody should ever publish packages using personal credentials

Those are all very valid points from a security perspective and I agree that those are concerns to keep in mind. At the same time, I doubt that many maintainers are "sole owners" because they want to be, but rather because they haven't yet found others to join them in maintaining the package. In my (perhaps limited) experience, this change usually happens through increased adoption of the package because you'll eventually run into motivated individuals that volunteer to help out. I'm not entirely convinced that more pressure on sole maintainers (in the form of a "this repo is risky to use because there is only one person maintaining it" badge) will help improve the situation.

Instead, I was thinking that a "your project is a critical piece of infra, keep up the good work" badge doesn't cost much, shows appreciation for people spending their free time on this, might encourage sole maintainers to adhere to best practice (you want to live up to the expectation others have of you), and will at worst do nothing. It's also complementary to any crackdown actions on packages that could be maintained better (e.g., enforced 2FA), so I figured I could at least suggest it :)

@di
Member Author

di commented Oct 3, 2022

I separated the rollout plan for the 2FA requirement for critical projects into #12308.

@di di closed this as completed in #12307 Oct 3, 2022
@di di added the meta Meta issues (rollouts, etc) label Oct 3, 2022