I think the discussion here supersedes this and points to a better way to implement this - closing!
Introduction 😄
Per storacha/project-tracking#140, https://github.com/storacha/RFC/blob/feat/egress-billing/rfc/egress-billing.md and storacha/freeway#109, we are adding the notion of "auth tokens" to freeway. Once the implementation is complete, users of the `w3s.link` gateway will need to include an "authorization token" with their request or be subject to fairly strict rate limits. These rate limits are necessary because Storacha currently pays all costs associated with the egress of data through our gateway, and this is not financially sustainable.
Problem 😞
According to the Egress Billing RFC, auth tokens should be created by including them in the `pol` field of a delegation. This effectively creates a restriction on the UCAN invocations used to authorize data egress out of our backends, and is a very flexible and extensible mechanism for limiting egress now and in the future. Unfortunately, `pol` is a feature of the UCAN 1.0 specification, and does not exist in the version that we use. As a result, we need to figure out where to record and track auth tokens in the short-to-medium term, before we upgrade to UCAN 1.0. We suggest two possible paths forward.
Possible Solutions 💁
1. Include an auth token in the `nb` field of "content claim" delegations
The proposed delegations to the gateway look like this (some fields elided for clarity):
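As a rough sketch only, assuming UCAN 1.0-style field names (`aud`, `sub`, `cmd`, `pol`): the `.token` selector, the `Delegation` type, and the `satisfies` helper are hypothetical names chosen for illustration and are not taken from the RFC.

```typescript
// Hypothetical UCAN 1.0-style delegation payload; most fields elided.
// A policy statement here is [operator, selector, expected value].
type PolicyStatement = ["==", string, string];

interface Delegation {
  aud: string;            // who may invoke the command
  sub: string;            // resource the command acts on
  cmd: string;            // command being delegated
  pol: PolicyStatement[]; // constraints on the invocation arguments
}

const delegation: Delegation = {
  aud: "did:web:w3s.link",
  sub: "did:web:asia.web3.storage",
  cmd: "/assert/location",
  pol: [["==", ".token", "zrptvx"]], // ".token" is an assumed selector name
};

// Simplified check (not a full policy evaluator): every statement must
// match a top-level invocation argument named by its selector.
function satisfies(d: Delegation, args: Record<string, string>): boolean {
  return d.pol.every(
    ([, selector, expected]) => args[selector.slice(1)] === expected,
  );
}
```

With this shape, an invocation carrying `{ token: "zrptvx" }` passes the check and one without the token does not.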
This says, effectively, that `did:web:w3s.link` is authorized to invoke `/assert/location` on `did:web:asia.web3.storage` AS LONG AS it includes the "auth token" `zrptvx`.
Contrast this with the current implementation of content claims, described by the following schema:
One possible solution to the current problem is to add an optional `token` field to the `nb` struct. The upload service will then need to generate location claims targeted at the gateway that include this new field and save them (using the currently in-progress decentralized indexing service), and the gateway will find these new claims when it queries for content claims relevant to a request for a particular CID.
Benefits
Drawbacks
a. https://github.com/storacha/content-claims/blob/main/packages/core/src/client/api.ts#L5
b. https://github.com/storacha/content-claims/blob/main/packages/core/src/capability/assert.js#L13
c. https://github.com/storacha/blob-fetcher/blob/main/src/api.ts#L29
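To make option 1 a little more concrete, here is a minimal sketch of how the gateway might pick out a token-bearing location claim for a requested CID. The types are simplified stand-ins, not the actual content-claims schema: only the optional `token` field comes from the proposal, and `LocationClaim`, `findTokenClaim`, and the sample data are hypothetical.

```typescript
// Hypothetical, simplified shapes; only `token` is the proposed addition.
interface LocationCaveats {
  content: string;    // CID of the content (string form for simplicity)
  location: string[]; // URLs the content can be fetched from
  token?: string;     // proposed optional auth token
}

interface LocationClaim {
  audience: string; // DID the claim is delegated to
  nb: LocationCaveats;
}

// Return a claim for `cid`, addressed to the gateway, whose token matches
// the one sent with the request. No token on the request means no match.
function findTokenClaim(
  claims: LocationClaim[],
  gateway: string,
  cid: string,
  token: string | undefined,
): LocationClaim | undefined {
  if (token === undefined) return undefined;
  return claims.find(
    (c) => c.audience === gateway && c.nb.content === cid && c.nb.token === token,
  );
}

// Made-up sample data for illustration.
const sampleClaims: LocationClaim[] = [
  {
    audience: "did:web:w3s.link",
    nb: { content: "bafy-example", location: ["https://example.com/blob"], token: "zrptvx" },
  },
  {
    audience: "did:web:w3s.link",
    nb: { content: "bafy-example", location: ["https://example.com/blob"] },
  },
];
```

A request for `bafy-example` carrying `zrptvx` would match the first claim; a request with no token or a different token would fall through to the rate-limited path.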
2. Implement a temporary "auth token" service
In this solution we'd add a table (probably in Dynamo?) to store "auth tokens" generated by our clients. The gateway would make a request to a lightweight (HTTP?) service to determine whether an auth token is valid - if it is, the request would be served with no rate limits, if not it would be subject to normal CID rate limits.
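A minimal sketch of that check, assuming an in-memory `Map` as a stand-in for the token table and skipping the HTTP layer entirely. `AuthTokenTable`, `limitFor`, and the `owner` field are hypothetical names for illustration; a real service would also need to decide how tokens expire or are revoked.

```typescript
// In-memory stand-in for the token table (the post suggests Dynamo).
class AuthTokenTable {
  private readonly tokens = new Map<string, { owner: string; revoked: boolean }>();

  // Record a token generated by a client.
  put(token: string, owner: string): void {
    this.tokens.set(token, { owner, revoked: false });
  }

  // Mark a token as no longer usable; unknown tokens are ignored.
  revoke(token: string): void {
    const entry = this.tokens.get(token);
    if (entry) entry.revoked = true;
  }

  // A token is valid if it exists and has not been revoked.
  isValid(token: string): boolean {
    const entry = this.tokens.get(token);
    return entry !== undefined && !entry.revoked;
  }
}

type Limit = "none" | "cid-rate-limit";

// Gateway-side decision: valid token means no rate limit, anything else
// falls back to the normal CID rate limits.
function limitFor(table: AuthTokenTable, token?: string): Limit {
  return token !== undefined && table.isValid(token) ? "none" : "cid-rate-limit";
}
```

Keeping the decision in one small function like `limitFor` would make it easy to swap the lookup for a different backing store later.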
Benefits
Drawbacks
Recommendations 🧙
Having rolled this around in my head for a few days, I'm partial to option (2) - I think it's conceptually simpler, easier to implement, and gives us more flexibility to iterate on the design of the authorization token and rate limiting system. Both systems likely require the implementation of a new `auth-token/create` capability that will allow users to create new auth tokens, but the implementation (especially for "bulk" use-cases) of this capability feels much simpler with option (2). While it would be nice to move closer to the eventual steady-state "decentralized" implementation of this functionality, I'm not sure that comes with many benefits at the moment, and the migration from an auth token "service" feels like less work than migrating the "hacky" location claim UCAN v.current implementation to the `pol`-based UCAN v1.0 service of the future.
I can definitely be convinced that it's worth going with (1) - would love to hear arguments in that direction!
One final note - either of these should support "private" data equally well. I believe that is effectively orthogonal to this conversation: once we have the content claims indexing service up and running, the existence of any content claim for a particular CID delegated to the gateway will determine whether a CID is considered "public" by the gateway. I could be wrong here, and if I am, I do think that's potentially a strong argument to go with (1).