feat(dag-protobuf): cache dag pb directory structure and block indexes #147

Open
wants to merge 6 commits into
base: main

@fforbeck fforbeck commented Jan 23, 2025

Context

A request that fetches a DAG Protobuf directory structure by CID executes the following steps:

  1. Look up all content claims to identify where the data is pulled from
  2. Fetch CID bafy...cid, which represents the folder containing the target file, to determine the verifiable CID of the file (call it bafy...file)
  3. Fetch CID bafy...file to get the root block of the file, which in UnixFS contains no raw data but is a list of the sub-blocks that make up the file (call them bafy...bytes1 and bafy...bytes2)
  4. Fetch the first raw data blocks to send the first byte

This PR enables caching for steps 2 to 4. Instead of fetching the directory structure from the locator and traversing the DAG on every request, the DAG is cached when it has a Protobuf structure and its content size is <= 2MB.
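
The caching strategy described above follows a cache-aside pattern. The sketch below is a minimal illustration under stated assumptions, not the code in this PR: `KVLike`, `CacheConfig`, `getWithCache`, `MemoryKV`, and `fetchFromLocator` are hypothetical names, and the KV interface is only loosely modeled on a key-value store with a TTL option. The one external fact it relies on is that `0x70` is the multicodec code for dag-pb.

```typescript
// Minimal subset of a key-value store API. This interface is an assumption
// made for the sketch, not the gateway's real KV binding.
interface KVLike {
  get(key: string): Promise<Uint8Array | null>;
  put(key: string, value: Uint8Array, opts: { expirationTtl: number }): Promise<void>;
}

// Hypothetical config shape mirroring the three cache settings.
interface CacheConfig {
  enabled: boolean;
  ttlSeconds: number;
  maxSizeBytes: number;
}

const DAG_PB_CODEC = 0x70; // multicodec code for dag-pb

// Hypothetical helper: return cached bytes when present; otherwise run the
// slow path (steps 2 to 4) and store the result if it qualifies for caching.
async function getWithCache(
  kv: KVLike,
  cid: string,
  codec: number,
  config: CacheConfig,
  fetchFromLocator: () => Promise<Uint8Array>
): Promise<{ bytes: Uint8Array; cacheHit: boolean }> {
  const cacheable = config.enabled && codec === DAG_PB_CODEC;
  if (cacheable) {
    const cached = await kv.get(cid);
    if (cached !== null) return { bytes: cached, cacheHit: true };
  }
  const bytes = await fetchFromLocator();
  // Only content within the size limit is written back to the cache.
  if (cacheable && bytes.byteLength <= config.maxSizeBytes) {
    await kv.put(cid, bytes, { expirationTtl: config.ttlSeconds });
  }
  return { bytes, cacheHit: false };
}

// In-memory stand-in for the KV store; it ignores TTL expiry for brevity.
class MemoryKV implements KVLike {
  private store = new Map<string, Uint8Array>();
  async get(key: string): Promise<Uint8Array | null> {
    return this.store.get(key) ?? null;
  }
  async put(key: string, value: Uint8Array): Promise<void> {
    this.store.set(key, value);
  }
}
```

With this shape, the first request for a CID pays the full cost of the locator fetch and DAG traversal, and later requests within the TTL are served from the store. Non-dag-pb content and oversized content always take the slow path.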

Changes

  • Updated withContentClaimsDagula middleware to cache DAG PB content requests
  • New KV Store
    • DAGPB_CONTENT_CACHE
  • Caching rules
    • FF_DAGPB_CONTENT_CACHE_TTL_SECONDS: The time, in seconds from now, after which the key-value pair expires. The minimum value is 60 seconds; any value less than 60 seconds will not be used. The default TTL is 5 minutes.
    • FF_DAGPB_CONTENT_CACHE_MAX_SIZE_MB: The maximum size of the key-value pair in MB. The minimum value is 1 MB; any value less than 1 MB will not be used. The default maximum file size is 2 MB.
    • FF_DAGPB_CONTENT_CACHE_ENABLED: The flag that enables the DAGPB content cache. The cache is disabled in prod by default.
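
The caching rules above can be read as a small config parser. The following is a sketch under assumptions, not the PR's implementation: `readDagPbCacheConfig` and `parseAtLeast` are hypothetical names, the flag is assumed to be the string "true" when enabled, and a value below the minimum is assumed to fall back to the default (the description only says such values "will not be used").

```typescript
const DEFAULT_TTL_SECONDS = 300; // 5 minutes, per the caching rules
const MIN_TTL_SECONDS = 60;
const DEFAULT_MAX_SIZE_MB = 2;
const MIN_MAX_SIZE_MB = 1;

// Hypothetical parsed form of the three settings.
interface DagPbCacheConfig {
  enabled: boolean;
  ttlSeconds: number;
  maxSizeMB: number;
}

// Parse a numeric setting; missing, non-numeric, or below-minimum values
// fall back to the default (fallback behavior is an assumption).
function parseAtLeast(raw: string | undefined, min: number, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  return Number.isFinite(n) && n >= min ? n : fallback;
}

// Hypothetical reader over an environment-style record of string values.
function readDagPbCacheConfig(env: Record<string, string | undefined>): DagPbCacheConfig {
  return {
    // Assumption: the flag is enabled only by the exact string "true".
    enabled: env["FF_DAGPB_CONTENT_CACHE_ENABLED"] === "true",
    ttlSeconds: parseAtLeast(
      env["FF_DAGPB_CONTENT_CACHE_TTL_SECONDS"],
      MIN_TTL_SECONDS,
      DEFAULT_TTL_SECONDS
    ),
    maxSizeMB: parseAtLeast(
      env["FF_DAGPB_CONTENT_CACHE_MAX_SIZE_MB"],
      MIN_MAX_SIZE_MB,
      DEFAULT_MAX_SIZE_MB
    ),
  };
}
```

For example, an environment with no cache settings yields a disabled cache with a 300 second TTL and a 2 MB limit, and a TTL setting of "30" is ignored in favor of the 300 second default.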

Samples

2MB file - no cache
(screenshot: 2mb-file-no-cache)

  • 5.2 seconds

2MB file - cached
(screenshot: 2mb-file-cached)

  • 1.4 seconds

KV Limits

  • Reads: unlimited
  • Writes (different keys): unlimited
  • Writes (same key): 1 write per second (rate limited)
  • Storage/account & Storage/namespace: unlimited
  • Key size: <= 512 bytes
  • Value size: <= 25MiB
  • Minimum cache TTL: 60 seconds
  • To request higher limits: https://forms.gle/ukpeZVLWLnKeixDu7
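
The limits listed above can be checked before a write is attempted. This is a minimal sketch only: `checkKvWrite` is a hypothetical helper that is not part of this PR or of any KV client API, and the constants are copied from the list above.

```typescript
const KV_MAX_KEY_BYTES = 512;
const KV_MAX_VALUE_BYTES = 25 * 1024 * 1024; // 25 MiB
const KV_MIN_TTL_SECONDS = 60;

// Hypothetical pre-flight check: returns a list of violated limits,
// or an empty list when the write fits within all of them.
function checkKvWrite(key: string, valueByteLength: number, ttlSeconds: number): string[] {
  const problems: string[] = [];
  // Key size is measured in bytes, so encode as UTF-8 rather than counting characters.
  const keyBytes = new TextEncoder().encode(key).byteLength;
  if (keyBytes > KV_MAX_KEY_BYTES) {
    problems.push(`key is ${keyBytes} bytes, limit is ${KV_MAX_KEY_BYTES}`);
  }
  if (valueByteLength > KV_MAX_VALUE_BYTES) {
    problems.push(`value is ${valueByteLength} bytes, limit is ${KV_MAX_VALUE_BYTES}`);
  }
  if (ttlSeconds < KV_MIN_TTL_SECONDS) {
    problems.push(`ttl is ${ttlSeconds} seconds, minimum is ${KV_MIN_TTL_SECONDS}`);
  }
  return problems;
}
```

A 2MB value with a 5 minute TTL passes all three checks, which is consistent with the defaults proposed in this PR sitting well inside the store's limits.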

resolves storacha/project-tracking#301

@fforbeck fforbeck self-assigned this Jan 23, 2025
@fforbeck fforbeck marked this pull request as ready for review January 23, 2025 19:27
@fforbeck fforbeck changed the title from "feat(dag-protobuf): cache dag pb content" to "feat(dag-protobuf): cache dag pb directory structure and block indexes" Jan 23, 2025
Development

Successfully merging this pull request may close these issues.

Cache small blocks on the gateway seperately from responses