Granular build time sources
This is an idea for how we could reuse the `lib.sources` functions to also work with build-time sources (derivation files). This then allows creating filtered derivations containing only a subset of files, while ensuring that the store path only changes when the included files are updated.
The implementation effort would be low (maybe a week, or two with tests and docs), but the use cases aren't very clear. In addition, it might not be fully applicable to nixpkgs, as full use introduces a lot of extra derivations, which add to the evaluation overhead. RFC 92 might change the landscape a bit, though I am not sure whether it would speed up this use case.
Possible use cases:
- Accessing individual files of derivations without requiring a full download (nerdfonts is a good example)
- GitHub fetcher for individual files
- File-incremental build tooling like snack
- With the above, being able to patch sources without requiring a full rebuild
- More generally it could allow something like lazy trees for build-time
The general idea is that we can implement a `builtins.path` (which supports file filtering) that works on derivations instead of eval-time paths. The main problem is that if you just filter files from a derivation into another, the resulting derivation changes every time any file changes, not just the ones that were filtered for.
To get around this, we need to know the hash of each individual file in advance, so that we can create a fixed-output derivation for each file. While these fixed-output derivations depend on the original derivation we're trying to filter, their output hashes and store paths won't change, because Nix computes the store path of a fixed-output derivation from its name and declared output hash rather than from its inputs.
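As a rough illustration of the per-file fixed-output derivations described above, the following sketch copies a single file out of a source derivation. The helper name `pinnedFile` and its arguments are assumptions made for this example and are not part of the proposed interface; `pkgs.runCommand` is used only as one convenient way to write such a derivation.

```nix
# Sketch under assumptions: `pinnedFile` is a hypothetical helper, not part of
# the proposed pkgs.granularSource interface.
{ pkgs ? import <nixpkgs> { } }:

{
  # src:  the derivation (or store path) to take the file from
  # path: path of the file relative to the root of `src`
  # hash: SRI hash of that file, as recorded in a pin file
  pinnedFile = { src, path, hash, name ? baseNameOf path }:
    pkgs.runCommand name {
      # Declaring an output hash makes this a fixed-output derivation, so its
      # store path does not change when unrelated files of `src` change.
      outputHashMode = "flat"; # the output is a single regular file
      outputHash = hash;       # the algorithm is encoded in the SRI string
    } ''
      cp "${src}/${path}" "$out"
    '';
}
```

If `src` changes but the selected file does not, the derivation is rebuilt at most to confirm the hash, while everything depending on its output keeps the same store path.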
The three main primitives that are introduced for this are:
- `pkgs.granularSource.pin`, which hashes all the individual files of a derivation and writes those to a JSON file, which can then either be used directly with IFD, or committed locally to be used without IFD.
- `pkgs.granularSource.create`, which takes a derivation and a JSON file created by `pin` and annotates that derivation with the file hashes from the JSON file.
- `pkgs.granularSource._path`, which takes a derivation with annotated file information created by `create` and returns a store path that contains only the files that were selected with a filter.
Finally, `pkgs.granularSource.lib` re-exports all the functions from `pkgs.lib.sources`, but swaps the underlying `builtins.path` for `pkgs.granularSource._path`. This essentially allows all of the `pkgs.lib.sources` functions to be used for both eval-time and build-time sources.
`pkgs.granularSource.pin` pins the files of a derivation by writing the hashes and types of all files to a JSON file in a derivation.
The resulting file is suitable to be passed to `pkgs.granularSource.create`, either as a derivation (which then leads to IFD, which is disallowed in nixpkgs), or as a file path when copied locally.
This function isn't made for local eval-time sources, because in that case the `builtins.path` primitive can be used without requiring such pinned hashes in advance.
Arguments:
- `src`: The derivation whose files to pin.
- `hashAlgo`: The hashing algorithm to use, either `sha256` or `sha512`.
Returns a derivation for a JSON file with the following format:
```json
{
  "treeHashes": {
    "<someFile>": {
      "file": {
        "hash": "sha256-1rCVS1wK2D9lL22rbmdE6Wg2PwXRZxgFw8CVqnw3txM="
      }
    },
    "<someDir>": {
      "directory": {
        "entries": {
          "<someNestedFile>": {
            "file": {
              "hash": "sha256-5rAeSHaY8qfcdxHvb9XWZCYX35hYBssalFbZiYjoJV0="
            }
          }
        }
      }
    },
    "<someSymlink>": {
      "symlink": {
        "target": "<somePath>"
      }
    }
  }
}
```
The hashes use the SRI hash format.
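To show how this format could be consumed at evaluation time, the following sketch flattens the nested `treeHashes` structure into a mapping from relative file paths to hashes. It uses only Nix builtins. The function name `flatten` and the file names `README`, `src`, `main.c` and `link` are made up for the example; the hashes are copied from the format example above.

```nix
let
  # Example pin data in the format described above; file names are hypothetical.
  pin = builtins.fromJSON ''
    {
      "treeHashes": {
        "README": { "file": { "hash": "sha256-1rCVS1wK2D9lL22rbmdE6Wg2PwXRZxgFw8CVqnw3txM=" } },
        "src": { "directory": { "entries": {
          "main.c": { "file": { "hash": "sha256-5rAeSHaY8qfcdxHvb9XWZCYX35hYBssalFbZiYjoJV0=" } }
        } } },
        "link": { "symlink": { "target": "README" } }
      }
    }
  '';

  # Recursively collect an attribute set { "<relative path>" = "<hash>"; }
  # for all regular files below `entries`.
  flatten = prefix: entries:
    builtins.foldl' (acc: name:
      let
        entry = entries.${name};
        path = if prefix == "" then name else "${prefix}/${name}";
      in
      if entry ? file then
        acc // { ${path} = entry.file.hash; }
      else if entry ? directory then
        acc // flatten path entry.directory.entries
      else
        acc # symlinks carry a target instead of a hash, so they are skipped
    ) { } (builtins.attrNames entries);
in
flatten "" pin.treeHashes
```

Evaluating this yields an attribute set with the keys `README` and `src/main.c`, which is the per-file information a filtering primitive would need to create one fixed-output derivation per file.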
`pkgs.granularSource.create` turns a derivation and associated granular file information generated using `pkgs.granularSource.pin` into a source value that can be used with the `pkgs.granularSource.lib` functions.
Implementation note: Just use `pkgs.granularSource.{_path,_pathSymlinks}` with a filter that always returns true, thereby returning the derivation files unchanged.
Arguments:
- `src`: Derivation whose files to use as the source.
- `pinFile`: Path to the file generated using `pkgs.granularSource.pin`. This file is imported at evaluation time, meaning that if this file is a derivation path, import-from-derivation is necessary. To prevent that, copy the pregenerated file to a project-local path.
- `symlink`: Whether files should be symlinked instead of copied, defaults to `false`. Enabling this requires less store space, but increases access time and might confuse some tools.
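The two ways of supplying `pinFile` described above could look roughly like this. This is a sketch against the proposed interface, which does not exist in nixpkgs; `upstream` stands for any source derivation and `./pin.json` is a hypothetical project-local copy of a previously built pin file.

```nix
# Sketch only: pkgs.granularSource is the interface proposed in this document.
{ pkgs, upstream }:

{
  # Variant 1: pin at build time and import the result during evaluation.
  # This requires import-from-derivation.
  withIFD = pkgs.granularSource.create {
    src = upstream;
    pinFile = pkgs.granularSource.pin {
      src = upstream;
      hashAlgo = "sha256";
    };
  };

  # Variant 2: use a pin file that was built once and then committed next to
  # this expression, so no import-from-derivation is needed.
  withLocalPin = pkgs.granularSource.create {
    src = upstream;
    pinFile = ./pin.json; # hypothetical path
  };
}
```

The second variant trades convenience for evaluation purity: the committed file has to be regenerated whenever `upstream` changes.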
In nixpkgs this can be used for builders that can benefit from file-level build granularity, such as `c2nix.buildCPP`, like this:
```nix
c2nix.buildCPP {
  src = granularSource.create {
    # `create` takes the derivation as `src` (see the arguments above)
    src = fetchFromGitHub { ... };
    pinFile = ./pinFile.json;
  };
}
```
Implementation note: `buildCPP` needs to use the `pkgs.granularSource.lib` functions with `src` to make use of the additional granularity.
TODO: Add a way to validate the hashes.
`pkgs.granularSource._path` is like `builtins.path`, but for derivation paths.
Only files not removed by the `filter` will have an influence on the output hash.
Falls back to `builtins.path` if `path` is not a derivation.
All arguments are optional except `path`:
- `path`: The underlying derivation. Needs to be a value returned from `pkgs.granularSource.create`.
- `name` (optional): The name of the derivation.
- `filter` (optional): A function of the type expected by `builtins.filterSource`, with the same semantics.
The result is a derivation containing the files from `path` but filtered according to `filter`.
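A direct call to this primitive might look like the sketch below. It assumes the argument names listed above and that `source` is a value returned from `pkgs.granularSource.create`; the name `headers-only` is chosen only for the example, and none of this interface exists in nixpkgs yet.

```nix
# Sketch only: pkgs.granularSource._path is the primitive proposed here.
# source: a value returned from pkgs.granularSource.create
{ pkgs, source }:

pkgs.granularSource._path {
  path = source;
  name = "headers-only"; # example name
  # Same shape as a builtins.filterSource filter: it receives the path as a
  # string and the file type ("regular", "directory", "symlink", "unknown").
  # Directories are kept so that nested headers remain reachable.
  filter = path: type:
    type == "directory" || pkgs.lib.hasSuffix ".h" path;
}
```

Under the proposal, changing a `.c` file in the underlying derivation would leave the resulting store path untouched, since only the header files pass the filter.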
Note: The `recursive` and `sha256` arguments of `builtins.path` are not implemented because they aren't needed for the `lib.sources` interface.
Implementation note: This function needs to implement validation of the hashes.
`pkgs.granularSource._pathSymlinks` is like `pkgs.granularSource._path`, but it creates symlinks to the original source instead of copying the files.
TODO: Somehow export the result as a bash script doing the symlinking
`pkgs.granularSource.lib` provides the same functions as `lib.sources`, but acting on granular build-time sources created using `pkgs.granularSource.create`.
Implementation note: Allow `lib.sources` to be generic over the `builtins.path` used.
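One way to picture this genericity is a library constructor that takes the path primitive as an argument. The sketch below is a heavily simplified stand-in and not the actual `lib.sources` code: `mkSources` and `pathImpl` are hypothetical names, and the real `cleanSourceWith` does more (for example composing filters) than shown here.

```nix
let
  # Hypothetical constructor: builds a small sources library on top of
  # whichever path primitive is passed in as `pathImpl`.
  mkSources = { pathImpl }: {
    # Simplified cleanSourceWith: forwards directly to the primitive.
    cleanSourceWith = { src, filter ? (path: type: true), name ? "source" }:
      pathImpl {
        path = src;
        inherit filter name;
      };
  };

  # Eval-time instance, backed by the real builtin.
  evalTime = mkSources { pathImpl = builtins.path; };

  # A build-time instance would pass the proposed primitive instead:
  # buildTime = mkSources { pathImpl = pkgs.granularSource._path; };
in
evalTime.cleanSourceWith {
  src = ./.;
  # Drop the .git directory from the imported source.
  filter = path: type: baseNameOf path != ".git";
}
```

With this shape, the function bodies are written once and the choice between eval-time and build-time sources is made where the library is instantiated.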
These functions are from my proposal in the source combinators PR; they would be useful for getting individual files from the granular source. The resulting store path is only influenced by the files it actually contains.