Support for Gitea and other forges #47

Open
piegamesde opened this issue Jul 31, 2023 · 3 comments

Comments

@piegamesde
Collaborator

piegamesde commented Jul 31, 2023

I somehow just remembered that there are ways to host your software other than GitHub and GitLab. Of course, we support all generic Git repositories; nevertheless, there are more forges that could benefit from dedicated support.

Potential forges:

  • Gitea
  • ...?

Questions that need answering for each one:

  • Is it self-hostable or fixed domain?
  • Does it have user/repo structure for all repositories like GitHub or is it more free-form like GitLab?
  • Does it host tarball artifacts for all commits?
    • Given a ref, what URL can its tarball be downloaded from?
    • Given a tag, what URL can its tarball be downloaded from?
  • In anticipation of #44 (Support specifying release artifact to prehash): given a release with uploaded artifacts, how can they be downloaded?

To generally keep the scope down, the current restrictions will also be required for new candidates:

  • Finding appropriate versions is done through the generic Git API (git ls-remote etc.)
  • Downloading files must be as simple as filling in a URL template and calling curl. Anything that requires adding a library for a specific API implementation is out of scope.
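
To make these two restrictions concrete, here is a minimal sketch of what "generic Git plus a URL template" could look like. The template follows the Forgejo-style archive URL mentioned later in this thread; the names `ARCHIVE_TEMPLATE`, `parse_ls_remote`, and `archive_url` are hypothetical and not part of the project, and whether a given forge serves archives at this path is an assumption that has to be checked per forge.

```python
# Hypothetical template, modelled on https://some-forgejo/user/repo/archive/main.tar.gz
ARCHIVE_TEMPLATE = "https://{host}/{owner}/{repo}/archive/{ref}.tar.gz"


def parse_ls_remote(output: str) -> dict[str, str]:
    """Map ref names to commit IDs from the text printed by `git ls-remote`."""
    refs = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        # Each line has the form "<commit id><TAB><ref name>".
        sha, ref = line.split("\t", 1)
        refs[ref] = sha
    return refs


def archive_url(host: str, owner: str, repo: str, ref: str) -> str:
    """Fill in the URL template; the result can be handed to curl as-is."""
    return ARCHIVE_TEMPLATE.format(host=host, owner=owner, repo=repo, ref=ref)


# Sample input standing in for real `git ls-remote <url>` output.
sample = ("1" * 40) + "\trefs/heads/main\n" + ("2" * 40) + "\trefs/tags/v1.0\n"
refs = parse_ls_remote(sample)
print(archive_url("some-forgejo", "user", "repo", refs["refs/heads/main"]))
```

Pinning to the commit ID rather than the branch name keeps the download reproducible, at the cost of assuming the forge accepts a full commit ID in place of a ref name.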
@lf-

lf- commented May 5, 2024

I think the cleanest, most generic way to do this is to implement Nix's immutable tarball protocol, as used by FlakeHub and, as of yesterday, Forgejo. It is a trivial amount of code that could possibly just be upstreamed into all the forges.

With this you can just give https://some-forgejo/user/repo/archive/main.tar.gz, which is a URL that does not need to be inspected at all.

Here is some Python I hastily wrote today to implement it:

import subprocess
import tempfile
from pathlib import Path
import re
import dataclasses
from typing import Literal
import urllib.parse
import json


@dataclasses.dataclass
class PinSerialized:
    kind: str
    rev: str | None
    nar_hash: str


@dataclasses.dataclass
class TarballPinSerialized(PinSerialized):
    kind: Literal['tarball']
    locked_url: str
    url: str


class PinSpec:

    def do_pin(self) -> PinSerialized:
        # Subclasses must override this and return their serialized pin.
        raise NotImplementedError


@dataclasses.dataclass
class TarballPinSpec(PinSpec):
    url: str

    def do_pin(self) -> TarballPinSerialized:
        return lock_tarball(self.url)


@dataclasses.dataclass
class LinkHeader:
    url: str
    rev: str | None


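# [^>]* together with search() finds the rel="immutable" entry even when the
# header lists several links and the immutable one is not first.
LINK_HEADER_RE = re.compile(r'<(?P<url>[^>]*)>\s*;\s*rel="immutable"')


def parse_link_header(header: str) -> LinkHeader | None:
    matched = LINK_HEADER_RE.search(header)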
    if not matched:
        return None

    url = matched.group('url')
    parsed_url = urllib.parse.urlparse(url)
    parsed_qs = urllib.parse.parse_qs(parsed_url.query)

    return LinkHeader(url=url, rev=next(iter(parsed_qs.get('rev', [])), None))


def lock_tarball(url) -> TarballPinSerialized:
    """
    Prefetches a tarball using the Nix immutable tarball protocol
    """
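    import requests
    # Stream the body so large tarballs are not buffered in memory,
    # and fail early on HTTP errors instead of untarring an error page.
    resp = requests.get(url, stream=True)
    resp.raise_for_status()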
    with tempfile.TemporaryDirectory() as td:
        td = Path(td)
        proc = subprocess.Popen(["tar", "-C", td, "-xvzf", "-"],
                                stdin=subprocess.PIPE)
        assert proc.stdin
        for chunk in resp.iter_content(64 * 1024):
            proc.stdin.write(chunk)
        proc.stdin.close()
        if proc.wait() != 0:
            raise RuntimeError("untarring failed")

        children = list(td.iterdir())
        # FIXME: allow different tarball structures
        assert len(children) == 1

        child = children[0].rename(children[0].parent.joinpath('source'))
        sri_hash = subprocess.check_output(
            ["nix-hash", "--type", "sha256", "--sri", child]).decode().strip()
        path = subprocess.check_output(
            ["nix-store", "--add-fixed", "--recursive", "sha256",
             child]).decode().strip()

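    # Servers that do not implement the protocol send no Link header at all.
    link_header = resp.headers.get('Link')
    link_info = parse_link_header(link_header) if link_header else None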

    print(sri_hash, path)
    return TarballPinSerialized(kind='tarball',
                                nar_hash=sri_hash,
                                locked_url=link_info.url if link_info else url,
                                rev=link_info.rev if link_info else None,
                                url=url)

@andir
Owner

andir commented May 5, 2024

I'd not be opposed to supporting this. Do we know who else supports it in this way? Also, it is worth keeping in mind that a Link header field can contain multiple links (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Link#specifying_multiple_links).
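
As a rough sketch of handling the multi-link case, the following scans every link in the header value and returns the first one whose `rel` includes `immutable`. The function name `find_immutable_link` and the example URLs are made up for illustration, and the parsing is deliberately simplified: it assumes no `<` appears inside quoted parameter values.

```python
import re
from typing import Optional

# One "<url>" followed by its parameters, up to the start of the next link.
LINK_VALUE_RE = re.compile(r'<(?P<url>[^>]*)>(?P<params>[^<]*)')
# The rel parameter, quoted or unquoted; it may hold several space-separated types.
REL_RE = re.compile(r'\brel\s*=\s*"?(?P<rels>[^";,]*)"?', re.IGNORECASE)


def find_immutable_link(header: str) -> Optional[str]:
    """Return the URL of the first link whose rel includes "immutable", if any."""
    for link in LINK_VALUE_RE.finditer(header):
        rel = REL_RE.search(link.group('params'))
        if rel and 'immutable' in rel.group('rels').lower().split():
            return link.group('url')
    return None


header = ('<https://example.org/next>; rel="next", '
          '<https://example.org/archive/abc.tar.gz?rev=abc>; rel="immutable"')
print(find_immutable_link(header))
```

A header with no `rel="immutable"` entry yields `None`, so a caller can fall back to the original URL.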

@lf-

lf- commented May 5, 2024

I'm not sure if there are any client implementations besides Nix and the software I wrote. Server-wise, the ones I know of are Forgejo and FlakeHub.
