Now that it isn't just used for `Value`, it doesn't really belong in
there.
Rename `static-string-data.hh` to share the same prefix to keep them
close together when sorting lines, also.
I found this because of a test failure on cygwin in
nix_api_expr_test.nix_eval_state_lookup_path:
'std::filesystem::__cxx11::filesystem_error'
what(): filesystem error: cannot remove all: Device or resource busy
[...]
[.../my_state/db/db.sqlite]
LocalState was never getting destroyed due to a reference leak. These
_free functions use an 'operator delete' which doesn't call the
destructor for the type.
Fixes: 309d55807c
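The difference between releasing storage and destroying an object can be sketched as follows. `Resource`, `freeStorageOnly` and `freeProperly` are hypothetical names for illustration, not the actual C API functions.

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for an object whose destructor releases something
// (e.g. an open database handle).
struct Resource
{
    int & destroyed;
    explicit Resource(int & counter) : destroyed(counter) {}
    ~Resource() { ++destroyed; }
};

// Leaky pattern: calling the deallocation function directly only gives the
// storage back; ~Resource() never runs.
inline void freeStorageOnly(Resource * r)
{
    assert(r != nullptr);
    ::operator delete(r);
}

// Correct pattern: a delete-expression runs the destructor, then frees.
inline void freeProperly(Resource * r)
{
    assert(r != nullptr);
    delete r;
}
```

With the first variant anything the destructor would have released (file handles, reference counts) stays alive, which is consistent with the "Device or resource busy" symptom above.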
Currently, --gtest_filter=nix_api_store_test.nix_eval_state_lookup_path
will result in:
terminating due to unexpected unrecoverable internal error: Assertion
'gcInitialised' failed in void nix::assertGCInitialized() at
../src/libexpr/eval-gc.cc:138
Changing the test fixture to _expr_test causes GC to be initialised.
Typically PosixSourceAccessor can be used from multiple threads,
but mtime is not updated atomically (i.e. with compare_exchange_weak),
so mtime gets raced. It's only needed in dumpPathAndGetMtime and mtime
tracking can be gated behind that.
Also start using getLastModified interface instead of dynamic casts.
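For reference, an atomic "keep the newest mtime" update with `compare_exchange_weak` would look roughly like this. It is a minimal sketch of the technique mentioned above, not the code in PosixSourceAccessor; `updateMaxMtime` is a hypothetical name.

```cpp
#include <atomic>
#include <cassert>
#include <ctime>

// Raise `latest` to `candidate` if `candidate` is newer, without losing
// concurrent updates from other threads.
inline void updateMaxMtime(std::atomic<std::time_t> & latest, std::time_t candidate)
{
    std::time_t seen = latest.load(std::memory_order_relaxed);
    // On failure (including spurious failure) compare_exchange_weak reloads
    // `seen`, so the loop ends once we stored our value or someone else
    // stored a value that is at least as new.
    while (seen < candidate
           && !latest.compare_exchange_weak(seen, candidate, std::memory_order_relaxed)) {
    }
}

// Usage: a stale candidate never lowers the stored value.
inline void updateMaxMtimeExample()
{
    std::atomic<std::time_t> latest{10};
    updateMaxMtime(latest, 5);
    assert(latest.load() == 10);
    updateMaxMtime(latest, 20);
    assert(latest.load() == 20);
}
```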
This makes the output easier to compare with the new machine-generated
lists in #9732.
The hand-curated order did have the advantage of putting more important
attributes at the top, but I don't think it is worth preserving that
when `std::map` is so much easier to work with. The right solution to
leading the reader to the more important attributes is to call them out
in the intro texts.
This makes the proto serializer characterisation test data be
accompanied by JSON data.
This is arguably useful for a few reasons:
- The JSON data is human-readable while the binary data is not, so it
provides some indication of what the test data means beyond the C++
literals.
- The JSON data is language-agnostic, and so can be used to quickly rig
up tests for implementation in other languages, without having source
code literals at all (just go back and forth between the JSON and the
binary).
- Even though we have no concrete plans to replace the binary protocol 1-1
with JSON, it is still nice to ensure that the JSON serializers and
binary protocols have (near) equal coverage over data types, to help
ensure we didn't forget a JSON (de)serializer.
Make instances for them that share code with `nix path-info`, but do a
slightly different format without store paths containing store dirs
(matching the other latest JSON formats).
Progress on #13570.
If we depend on the store dir, our JSON serializers/deserializers take
extra arguments, and that interferes with the likes of various
frameworks for associating these with types (e.g. nlohmann in C++, Serde
in Rust, and Aeson in Haskell).
For now, `nix path-info` still uses the previous format, with store
dirs. We may yet decide to "rip off the band-aid", and just switch it
over, but that is left as a future PR.
The variant has on the left-hand side the topologically sorted vector
and the right-hand side is a pair showing the path and its parent that
represent a cycle in the graph making the sort impossible.
This change prepares for enhanced cycle error messages that can provide
more context about the cycle. The variant approach allows callers to
handle cycles more flexibly, enabling better error reporting that shows
the full cycle path and which files are involved.
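A minimal sketch of such a variant-returning sort, assuming string node names and a depth-first traversal. `Graph`, `Cycle`, `TopoSortResult` and the ordering (references before referrers) are assumptions for illustration, not the actual signature.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <variant>
#include <vector>

// Hypothetical cycle description: `path` was reached again, via `parent`,
// while it was still being visited.
struct Cycle
{
    std::string path;
    std::string parent;
};

using Graph = std::map<std::string, std::set<std::string>>;
// Left: the sorted nodes. Right: the edge that closes a cycle.
using TopoSortResult = std::variant<std::vector<std::string>, Cycle>;

namespace detail {
// Post-order DFS. Returns false and fills `cycle` when a back edge is found.
inline bool visit(
    const Graph & refs, const std::string & node, const std::string & parent,
    std::set<std::string> & started, std::set<std::string> & inProgress,
    std::vector<std::string> & sorted, Cycle & cycle)
{
    if (inProgress.count(node)) {
        cycle = Cycle{node, parent};
        return false;
    }
    if (!started.insert(node).second)
        return true; // already fully processed
    inProgress.insert(node);
    if (auto it = refs.find(node); it != refs.end())
        for (const auto & child : it->second)
            if (!visit(refs, child, node, started, inProgress, sorted, cycle))
                return false;
    inProgress.erase(node);
    sorted.push_back(node);
    return true;
}
} // namespace detail

inline TopoSortResult topoSort(const Graph & refs)
{
    std::vector<std::string> sorted;
    std::set<std::string> started, inProgress;
    Cycle cycle;
    for (const auto & entry : refs)
        if (!detail::visit(refs, entry.first, "", started, inProgress, sorted, cycle))
            return cycle;
    assert(inProgress.empty());
    return sorted;
}
```

A caller can then branch with `std::holds_alternative` or `std::visit` and build whatever error message it wants from the `Cycle` value, instead of catching an exception.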
Adapted from Lix commit f7871fcb5.
Change-Id: I70a987f470437df8beb3b1cc203ff88701d0aa1b
Co-Authored-By: Maximilian Bosch <maximilian@mbosch.me>
Add support for configuring S3 storage class via the storage-class
parameter for S3BinaryCacheStore. This allows users to optimize costs
by selecting appropriate storage tiers (STANDARD, GLACIER,
INTELLIGENT_TIERING, etc.) based on access patterns.
The storage class is applied via the x-amz-storage-class header for
both regular PUT uploads and multipart upload initiation.
Replace the null-terminated C-style strings in Value with hybrid C /
Pascal strings, where the length is stored in the allocation before the
data, and there is still a null byte at the end for the sake of C
interop.
Co-Authored-By: Taeer Bar-Yam <taeer@bar-yam.me>
Co-Authored-By: Sergei Zimmerman <sergei@zimmerman.foo>
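The hybrid layout (length prefix, data, trailing null byte in one allocation) can be sketched like this. `LengthPrefixedString` is a hypothetical owning wrapper written for illustration; the real type in `Value` is laid out and allocated differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string_view>

// One allocation laid out as: [size_t length][length bytes]['\0']
class LengthPrefixedString
{
    std::unique_ptr<char[]> buf;

public:
    explicit LengthPrefixedString(std::string_view s)
        : buf(new char[sizeof(std::size_t) + s.size() + 1])
    {
        const std::size_t len = s.size();
        std::memcpy(buf.get(), &len, sizeof len); // Pascal-style length prefix
        if (len)
            std::memcpy(buf.get() + sizeof len, s.data(), len); // may contain '\0'
        buf[sizeof len + len] = '\0'; // terminator for C APIs
    }

    // O(1): read the prefix instead of scanning for the terminator.
    std::size_t size() const
    {
        std::size_t len;
        std::memcpy(&len, buf.get(), sizeof len);
        return len;
    }

    // Still usable wherever a `const char *` is expected.
    const char * c_str() const
    {
        assert(buf != nullptr);
        return buf.get() + sizeof(std::size_t);
    }

    std::string_view view() const { return {c_str(), size()}; }
};
```

Unlike a plain C string, `view()` keeps embedded null bytes, while `c_str()` stays compatible with C callers.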
These steps are done (originally in order, but I squashed it as the end
result is still pretty small, and the churn in the code comments was a
bit annoying to keep straight).
1. Create proper struct type for string contexts on the heap
This will make it easier to change this type in the future.
2. Make `Value::StringWithContext` iterable
This makes some for loops a lot more terse.
3. Encapsulate `Value::StringWithContext::Context::elems`
It turns out the iterators we just exposed are sufficient.
4. Make `StringWithContext::Context` length-prefixed instead
Rather than having a null pointer at the end, have a `size_t` at the
beginning. This is the exact same size (note that null pointer is
longer than null byte) and thus takes no more space!
Also, see the new TODO on naming. The thing we already so-named is a
builder type for string contexts, not the on-heap type. The
`fromBuilder` static method reflects what the names ought to be too.
Co-authored-by: Sergei Zimmerman <sergei@zimmerman.foo>
Otherwise PosTable grows indefinitely for each reload. Since
the total input size is limited to 4GB (uint32_t for byte offset PosIdx)
it can get exhausted pretty quickly. This ensures that we don't waste memory
on reloads as well.
Update all channel URLs from https://nixos.org/channels/ to
https://channels.nixos.org/ to use the more reliable subdomain.
The nixos.org domain apex lacks IPv6 support due to DNS hoster
limitations. Using the subdomain allows better CDN distribution
and improved reliability.
Updated files:
- Installation scripts (multi-user and tarball installers)
- Channel URL resolution in eval-settings.cc
- Documentation and examples
- Docker image default channel URL
- Release notes (added note about URL change)
Fixes #14517
I've run into this quite a few times when working with characterization test
infra. It would print an invalid command:
_NIX_TEST_ACCEPT=1 meson test main/lang
Which you'd then proceed to run and it would fail. This commit makes it
be honest about the command you need to run:
_NIX_TEST_ACCEPT=1 meson test --suite main lang
I would run `nix flake show` on a flake and then hit:
===
├───ihaskell: package 'ihaskell-wrapper'
├───ihaskell-96: package 'ihaskell-wrapper'
├───ihaskell-96-dev: package 'ghc-shell-for-ihaskell-0.10.4.0'
error: expected a derivation
===
and it is not obvious which package is the culprit here since nix stops
right away.
Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
boost::concurrent_flat_map (used in libutil and libstore) includes the
C++17 <execution> header. GCC's libstdc++ implements parallel algorithms
using Intel TBB as the backend, which creates a link-time dependency on
libtbb even though we don't actually use any parallel algorithms.
Disable the TBB backend for libstdc++ by setting
_GLIBCXX_USE_TBB_PAR_BACKEND=0. This makes parallel algorithms fall back
to serial execution, which is acceptable since we don't use them anyway.
This only affects libstdc++ (GCC's standard library); other standard
libraries like libc++ (LLVM) are unaffected.
`allowedReferences` and friends support not only store paths (and
placeholders, but only because those will be rewritten to store paths);
they also support referring to other outputs in the derivation by
name.
We update the tests in order to cover for that.
(While we are at it, also introduce some scratch variables for paths and
placeholders to make the C++ literals for this test more concise.)
Since we haven't released v2 yet (2.32 has v1) we can just update this
in-place and avoid version churn.
Note that as a nice side effect of using the standard `Hash` JSON impl,
we don't need this `hashFormat` parameter anymore.
It turns out this code path is only used for unit tests (to ensure our
JSON formats are possible to parse by other code, elsewhere). No
user-facing functionality consumes this format.
Therefore, let's drop the old version parsing support.
We now have functional tests for these. The unit tests added negligible
value while imposing a much higher maintenance cost.
The maintenance cost is high:
- No automatic accept option
- They broke 5+ times during this session due to implementation changes (trace count, ordering)
- They require understanding ANSI escape codes, Uncolored() wrappers, trace reversal
- They test empty traces HintFmt("") from withTrace(pos, "") - pure implementation detail
- They're fragile: adding any trace anywhere breaks the exact count assertions
The additional value over functional tests is minimal:
- Functional tests already verify the error message
- Functional tests already show trace order and content (as users see it, helps review)
- Unit tests verify "exactly 3 traces, not 2 or 4" - but users don't count traces
- Unit tests verify empty traces exist - but users never see them
The white-box testing catches the wrong things:
- It catches "you added helpful context" as a failure
- It doesn't catch "the context is confusing" (which functional tests would show)
- It enforces implementation details that should be allowed to evolve
Show which element(s) are involved at each error point:
- When an element is missing the "key" attribute, show the element
- When an element is not an attribute set, show the element
- When comparing keys fails, show both elements being compared
- When calling operator fails, show which element was being processed
This provides concrete context using ValuePrinter with errorPrintOptions.
Note: errorPrintOptions uses maxDepth=10 by default, which may print
quite deeply nested structures in error messages. This could potentially
be overwhelming, but follows the existing default for error contexts.
The old string format is a holdover from the pre JSON days. It is not
friendly to users who need to get the information out of it.
Also introduce the sort of versioning we have for derivation for this
format too.
- Use canonical content address JSON format for floating content
addressed derivation outputs
This keeps it more consistent.
- Reorganize inputs into nested structure (`inputs.srcs` and
`inputs.drvs`)
This will allow for an easier to use, but less compact, alternative
where `srcs` is just a list of derived paths.
It also allows for other experiments for derivations with a different
input structure, as I suspect will be needed for secure build traces.
This was already dropped in `inputFromURL()`, but not in
`inputFromAttrs()`. Now it's done in `fixGitURL()`, which is used by
both.
In principle, `git+` shouldn't be used in the `url` attribute, since
we already know that it's a Git URL. But since it currently works, we
don't want to break it.
Fixes #14429.
Have one to that instead of one to `Derivation`. `DerivationBuilder`
doesn't need `inputDrvs`, so `BasicDerivation` suffices.
(In fact, it doesn't need `inputSrcs` either, but we don't yet have a
type to exclude that.)
We were calling git with `--quiet` in order not to mess up Nix's
progress bar. However, `runProgram()` already suspends the progress
bar (since git may be interactive) so that's no longer an issue. So we
can just run with `--progress` instead.
Fix #14480
This method is not well-defined for arbitrary stores, which do not have
a notion of a "real path" -- it is only well-defined for local file
systems stores, which do have exactly that notion, and so it is moved to
that sub-interface instead.
Some call-sites had to be fixed up for this, but in all cases the
changes are positive. Using `getFSSourceAccessor` allows more
stores to work properly. `nix-channel` was straight-up wrong in the case
of redirected local stores. And the building logic with remote building
and a non-local store is also fixed, properly gating some deletions on
store type.
Co-authored-by: Robert Hensing <robert@roberthensing.nl>
The assumption that no unknown paths can be returned is incorrect. It
can happen if a derivation has outputs that are substitutable, but
that have references that cannot be substituted (i.e. an incomplete
closure in the binary cache). This can easily happen with
magic-nix-cache.
Previously, only shared memory segments were cleaned up.
This could lead to leaked message queues and semaphore sets when builds use System V IPC, exhausting kernel IPC limits over time.
This commit extends the cleanup to all three System V IPC types:
1. Shared memory segments
2. Message queues
3. Semaphores
Additionally, we stop removing IPC objects during iteration, as it could corrupt the kernel's iterator state and cause some objects to be skipped. The new implementation uses a two-pass approach where we list first and then remove them in a separate pass.
The IPC IDs are now extracted during iteration using actual system calls (shmget, msgget, semget) rather than being looked up later, ensuring the objects exist when we capture their IDs.
In Linux, IPC objects are automatically cleaned up when the IPC namespace is destroyed.
On Darwin, since there are no IPC namespaces, the IPC objects may sometimes persist after the build user's processes are killed.
This patch modifies the cleanup logic to use sysctl calls to identify and remove left over shm segments associated with the build user.
Fixes: #12548
For repos with a lot of non-linearity in the commit graph (like
Nixpkgs), this speeds up getting the revcount a lot, e.g. `nix flake
metadata /path/to/nixpkgs?rev=9dc7035bbee85ffc740d893e02cb64460f11989f` went
from 9.1s to 3.7s.
Warning:
```
[39/483] Generating src/kaitai-struct-checks/kaitai-generated-sources with a custom command
../src/kaitai-struct-checks/nar.ksy: /types/padded_str/seq/1/encoding:
warning: use canonical encoding name `ASCII` instead of `ascii` (see https://doc.kaitai.io/ksy_style_guide.html#encoding-name)
```
This will allow us to more accurately test dropping support for
dependent realisations, by separating the tests that should not change
from the tests that should.
I do that change in PR #14247, but even if for some reasons we don't end
up doing this soon, I think it is still good to separate the test data
this way so we have the option of doing that at some point.
Progress on #13405, which asks for an explicit characterisation of the
equivalence relation like the one given here.
Also progress on #11895, because we're using the term "build trace
entry" instead of "realisation".
Mention #9259, a future work item.
Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
It is better to avoid null termination for performance and memory
safety, wherever possible.
These are good cleanups extracted from the Pascal String work that we
can land by themselves first, shrinking the diff in that PR.
Co-Authored-By: Aspen Smith <root@gws.fyi>
Co-Authored-By: Sergei Zimmerman <sergei@zimmerman.foo>
Add three configuration settings to `S3BinaryCacheStoreConfig` to control
multipart upload behavior:
- `bool multipart-upload` (default `false`): Enable/disable multipart uploads
- `uint64_t multipart-chunk-size` (default 5 MiB): Size of each upload part
- `uint64_t multipart-threshold` (default 100 MiB): Minimum file size for multipart
The feature is disabled by default.
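How the three settings could interact is sketched below. `MultipartSettings`, `useMultipart` and `partCount` are hypothetical names, and treating the threshold as inclusive is an assumption; only the defaults are taken from the list above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the three settings, with the documented defaults.
struct MultipartSettings
{
    bool multipartUpload = false;                               // multipart-upload
    std::uint64_t chunkSize = std::uint64_t{5} * 1024 * 1024;   // multipart-chunk-size
    std::uint64_t threshold = std::uint64_t{100} * 1024 * 1024; // multipart-threshold
};

// Multipart is used only when enabled and the file reaches the threshold
// (assumed inclusive here).
inline bool useMultipart(const MultipartSettings & s, std::uint64_t fileSize)
{
    return s.multipartUpload && fileSize >= s.threshold;
}

// Number of parts for a multipart upload: ceil(fileSize / chunkSize).
inline std::uint64_t partCount(const MultipartSettings & s, std::uint64_t fileSize)
{
    assert(s.chunkSize > 0);
    return (fileSize + s.chunkSize - 1) / s.chunkSize;
}
```

With the defaults and the feature switched on, a 100 MiB file would be sent as 20 parts of 5 MiB each.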
The rule of silence can be a little surprising. As a compromise to
changing the default behavior, this adds printing a success message in
verbose mode, where we don't really have a reason to be silent about
our success.
Refactor `ExprConcatStrings::eval` by inlining two only-called-once
closures into the call-site, so that the code is easier to reason about
locally (especially since the variables that were closed over were
mutated all over the place within this function).
Also use curly braces with each branch for consistency in the
resulting code.
This is a pure refactor, but also arguably causes us to depend less on
the optimizer; now, we don't have to make sure that this closure is
inlined.
3a3c062982 introduced a buffer overflow for the
case when there are more than 65535 formal arguments. It is a perfectly reasonable
limitation, but we *must* not corrupt memory or otherwise crash the process.
Add a test for the graceful behavior and switch to using an explicit uninitialized_copy_n
to further guard against buffer overflows.
Stop delegating to `HttpBinaryCacheStore::upsertFile` and instead
handle compression in the S3 store's `upsertFile` override, then call
our own `upload()` method. This separation is necessary for future
multipart upload support.
Introduce protected `upload` method overloads in `HttpBinaryCacheStore`
that handle the actual upload after compression has been applied. This
separates compression concerns (in `upsertFile`) from upload mechanics
(in `upload`).
Two overloads are provided:
1. `upload(path, RestartableSource &, sizeHint, mimeType, contentEncoding)`
2. `upload(path, CompressedSource &, mimeType)`
Introduce a `CompressedSource` class in libutil's `serialise.hh` that
compresses a `RestartableSource` and owns the compressed data. This is a
general-purpose utility that can be used anywhere compressed data needs
to be treated as a source.
We were getting this flex lexer warning during build:
```
../src/libexpr/lexer.l:333: warning, -s option given but default rule can be matched
```
The lexer uses `%option nodefault` but the `PATH_START` state only had
rules for specific patterns (`PATH_SEG` and `HPATH_START`) without a
catch-all rule to handle unexpected input.
Added a catch-all rule with `unreachable()`. This code path should never
be reached in normal operation since `PATH_START` is only entered after
matching `PATH_SEG` or `HPATH_START`, and we immediately rewind to
re-parse those same patterns. The catch-all exists solely to satisfy
flex's `%option nodefault` requirement.
Make uploads run in constant memory. Also change the callbacks to be
noexcept, since we really don't want to be unwinding the stack in the
curl thread. That will definitely corrupt that stack and make nix/curl
crash in very bad ways.
Fix a race condition where interrupting a download (via Ctrl-C) during a
retry attempt could cause a crash. When `enqueueItem()` throws because the
download thread is shutting down, the exception would propagate without
setting `done=true`, causing the `TransferItem` destructor to invoke the
callback a second time.
This triggered an assertion failure in `Callback::rethrow()` with:
`Assertion '!prev' failed` and the error message `cannot enqueue download
request because the download thread is shutting down`.
The fix catches the exception from `enqueueItem()` and calls `fail()` to
properly complete the transfer, ensuring the callback is invoked exactly
once.
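The shape of the fix, reduced to a toy model: `Transfer`, `enqueue` and `retry` are hypothetical simplifications of `TransferItem`, `enqueueItem()` and the retry path. The model does not reproduce the original double invocation; it only shows how routing the exception through `fail()` keeps the "exactly once" invariant.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>

// Toy transfer whose completion callback must fire exactly once.
struct Transfer
{
    std::function<void(const std::string &)> callback; // gets an error message
    bool done = false;

    void fail(const std::string & error)
    {
        assert(!done); // completing twice would be a bug
        done = true;
        callback(error);
    }

    ~Transfer()
    {
        // Safety net: an unfinished transfer still notifies its owner.
        if (!done)
            callback("transfer destroyed before completion");
    }
};

// Stand-in for an enqueue operation that refuses work during shutdown.
inline void enqueue(Transfer &, bool shuttingDown)
{
    if (shuttingDown)
        throw std::runtime_error("download thread is shutting down");
}

// Retry path with the fix applied: the exception is turned into a proper
// completion, so the destructor sees done == true and stays silent.
inline void retry(Transfer & t, bool shuttingDown)
{
    try {
        enqueue(t, shuttingDown);
    } catch (const std::exception & e) {
        t.fail(e.what());
    }
}
```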
Some zsh setups (including mine) do not load the
completion if `#compdef` is not on the first line.
So we move the `# shellcheck` comment to the
second line to avoid this issue.
This continues the work for formalizing our current JSON docs. Note that
in the process, a few bugs were caught:
- `closureSize` was repeated twice, forgot `closureDownloadSize`
- `file*` fields should be `download*`. They are in fact called that in
the line-oriented `.narinfo` file, but were renamed in the JSON
format.
We immediately use this in the JSON schemas for Derivation and Deriving
Path, but we cannot yet use it in Store Object Info because those paths
*do* include the store dir currently.
- Uses the more explicit `@ingroup` most of the time, to avoid problems
with nested groups, and to make group membership more explicit.
The division into headers is not great for documentation purposes,
so this helps.
- More attention for memory management details
- Various other improvements to doc comments
Per #7591, the `nix-store --gc --print-dead` command does not provide
any feedback about the amount of disk space that is used by dead store
paths. It looks like this has been the case since 7ab68961e (* Garbage
collector: added an option `--use-atime' to delete paths in...,
2008-09-17).
Update the nix-store documentation to remove the claim that this is a
function that `nix-store --gc --print-dead` performs.
Implement `uploadPart()` for uploading individual parts in S3 multipart
uploads:
- Constructs URL with `?partNumber=N&uploadId=ID` query parameters
- Uploads chunk data with `application/octet-stream` mime type
- Extracts and returns `ETag` from response
Add concurrency group configuration to the CI workflow to automatically
cancel outdated runs when a PR receives new commits or is force-pushed.
This prevents wasting CI resources on superseded code.
This is a good default (the methods that allow for an arbitrary choice
of source accessor are generally preferable both to implement and to
use). And it also pays its way by allowing us to delete *both* the
`DummyStore` and `LocalStore` implementations.
Introduces `scanForReferencesDeep` to provide per-file granularity when
scanning for store path references, enabling better diagnostics for
cycle detection and `nix why-depends --precise`.
Implement `abortMultipartUpload()` for cleaning up incomplete multipart
uploads on error:
- Constructs URL with `?uploadId=ID` query parameter
- Issues `DELETE` request to abort the multipart upload
With #14314, in some places in the parser we started using C++ objects
directly rather than pointers. In those places lines like `$$ = $1` now
imply a copy when we don't need one. This commit changes those to `$$ =
std::move($1)` to avoid those copies.
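The effect can be seen outside of the parser with a type that counts its copies and moves. `Counted`, `assignByCopy` and `assignByMove` are hypothetical names; the two helpers stand in for the semantic actions `$$ = $1` and `$$ = std::move($1)`.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Value type that records how it was transferred.
struct Counted
{
    static inline int copies = 0;
    static inline int moves = 0;
    std::vector<int> payload;

    Counted() = default;
    explicit Counted(std::vector<int> p) : payload(std::move(p)) {}
    Counted(const Counted & o) : payload(o.payload) { ++copies; }
    Counted(Counted && o) noexcept : payload(std::move(o.payload)) { ++moves; }
    Counted & operator=(const Counted & o)
    {
        payload = o.payload; // duplicates the whole vector
        ++copies;
        return *this;
    }
    Counted & operator=(Counted && o) noexcept
    {
        payload = std::move(o.payload); // steals the buffer
        ++moves;
        return *this;
    }
};

// Analogue of `$$ = $1`: the operand is an lvalue, so this copies.
inline void assignByCopy(Counted & result, Counted & operand)
{
    result = operand;
}

// Analogue of `$$ = std::move($1)`: the operand is not needed afterwards,
// so its contents can be moved out.
inline void assignByMove(Counted & result, Counted & operand)
{
    result = std::move(operand);
}
```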
Previously it used the `ThreadPool` default,
i.e. `std::thread::hardware_concurrency()`. But copying signatures is
not primarily CPU-bound so it makes more sense to use the
`http-connections` setting (since we're typically copying from/to a
binary cache).
The `showBytes()` function was redundant with `renderSize()` as the
latter automatically selects the appropriate unit (KiB, MiB, GiB, etc.)
based on the value, whereas `showBytes()` always formatted as MiB
regardless of size.
Co-authored-by: Bernardo Meurer Costa <beme@anthropic.com>
Instead of iterating over the newly built bindings we can
do a cheaper set_intersection to count duplicates or fall back
to a per-element binary search over the "base" bindings.
This speeds up `hello` evaluation by around 10ms (0.196s -> 0.187s) and
`nixos.closures.ec2.x86_64-linux` by 140ms (2.744s -> 2.609s).
This addresses a somewhat steep performance regression from 82315c3807
that reduced memory requirements of attribute set merges. With this patch
we get back around to 2.31 level of eval performance while keeping the memory
usage optimization.
Also document the optimization a bit more.
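The counting strategy can be sketched on plain sorted vectors of integer keys. `countCommon` and the factor-of-8 cutoff are assumptions for illustration; the real code works on `Bindings` and uses its own heuristic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Counts keys present in both sorted, duplicate-free ranges.
inline std::size_t countCommon(const std::vector<int> & base, const std::vector<int> & update)
{
    assert(std::is_sorted(base.begin(), base.end()));
    assert(std::is_sorted(update.begin(), update.end()));

    // Hypothetical cutoff: if the update is much smaller than the base, a
    // binary search per element (O(m log n)) beats a linear merge (O(n + m)).
    if (update.size() * 8 < base.size()) {
        return static_cast<std::size_t>(
            std::count_if(update.begin(), update.end(), [&](int key) {
                return std::binary_search(base.begin(), base.end(), key);
            }));
    }

    // Otherwise walk both ranges once. For brevity the matches are collected
    // in a vector; a counting output iterator would avoid the allocation.
    std::vector<int> common;
    std::set_intersection(
        base.begin(), base.end(), update.begin(), update.end(), std::back_inserter(common));
    return common.size();
}
```

The size of the merged set is then `base.size() + update.size() - countCommon(base, update)`, which is what allows the result to be allocated up front.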
In particular
- Remove `get`, it is redundant with `valueAt` and the `get` in
`util.hh`.
- Remove `nullableValueAt`. It is morally just the function composition
`getNullable . valueAt`, not an orthogonal combinator like the others.
- `optionalValueAt` returns a pointer, not `std::optional`. This also
expresses optionality, but without creating a needless copy. This
brings it in line with the other combinators which also return
references.
- Delete `valueAt` and `optionalValueAt` taking the map by value, as we
did for `get` in 408c09a120, which
prevents bugs / unnecessary copies.
`adl_serializer<DerivationOptions::OutputChecks>::from_json` was the one
use of `getNullable`. I give it a little static function for the
ultimate creation of a `std::optional` it does need to do (after
switching it to using `getNullable . valueAt`). That could go in
`json-utils.hh` eventually, but I didn't bother for now since only one
thing needs it.
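The pointer-returning style can be sketched with a plain `std::map` standing in for a JSON object. `Object` is a hypothetical alias, and deleting the rvalue overload is one possible way to rule out temporaries, not necessarily what `json-utils.hh` does.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a JSON object.
using Object = std::map<std::string, int>;

// Optionality is expressed by the pointer itself: nullptr means "absent",
// anything else points at the element inside `obj` (no copy is made).
inline const int * optionalValueAt(const Object & obj, const std::string & key)
{
    auto it = obj.find(key);
    return it == obj.end() ? nullptr : &it->second;
}

// A pointer into a temporary map would dangle, so calls with an rvalue are
// rejected at compile time.
const int * optionalValueAt(Object && obj, const std::string & key) = delete;
```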
Co-authored-by: Sergei Zimmerman <sergei@zimmerman.foo>
S3 buckets support object versioning to prevent unexpected changes,
but Nix previously lacked the ability to fetch specific versions of
S3 objects. This adds support for a `versionId` query parameter in S3
URLs, enabling users to pin to specific object versions:
```
s3://bucket/key?region=us-east-1&versionId=abc123
```
This has already been implemented in 1e709554d5
as a side-effect of mounting the accessors in storeFS. Let's test this so it
doesn't regress.
(cherry-picked from https://github.com/NixOS/nix/pull/12915)
Move HttpBinaryCacheStore class from .cc file to header to enable
inheritance by S3BinaryCacheStore. Create S3BinaryCacheStore class that
overrides upsertFile() to implement multipart upload logic.
Add a sizeHint parameter to BinaryCacheStore::upsertFile() to enable
size-based upload decisions in implementations. This lays the groundwork
for reintroducing S3 multipart upload support.
Add support for HTTP DELETE requests to FileTransfer infrastructure:
This enables S3 multipart upload abort functionality via DELETE requests
to S3 endpoints.
This reverts commit 90d1ff4805.
The initial issue with EPIPE was solved in 9f680874c5.
Now this patch does more harm than good by eating up boost::io::format_error exceptions that are
bugs.
addToStore(): Don't parse the NAR
* StringSource: Implement skip()
This is slightly faster than doing a read() into a buffer just to
discard the data.
* LocalStore::addToStore(): Skip unnecessary NARs rather than parsing them
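Why `skip()` is cheaper than a throwaway `read()` for an in-memory source can be sketched as follows. `InMemorySource` is a hypothetical minimal class written for illustration, not the actual `StringSource`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string_view>

// Minimal in-memory byte source with a read cursor.
class InMemorySource
{
    std::string_view data;
    std::size_t pos = 0;

public:
    explicit InMemorySource(std::string_view s) : data(s) {}

    // Copies up to `len` bytes into `out`; returns how many were copied.
    std::size_t read(char * out, std::size_t len)
    {
        const std::size_t n = std::min(len, data.size() - pos);
        if (n)
            std::memcpy(out, data.data() + pos, n);
        pos += n;
        return n;
    }

    // Discards `len` bytes by advancing the cursor: no buffer, no memcpy.
    void skip(std::size_t len)
    {
        if (len > data.size() - pos)
            throw std::out_of_range("skip past end of source");
        pos += len;
    }

    std::size_t remaining() const
    {
        assert(pos <= data.size());
        return data.size() - pos;
    }
};
```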
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
A few changes had cropped up with `_NIX_TEST_ACCEPT=1`:
1. Blake hashing test JSON had a different indentation
2. Store URI had improper non-quoted spaces
(1) was just fixed, as we trust nlohmann JSON to parse JSON
correctly, regardless of whitespace.
For (2), the existing URL was made a read-only test, since we very much
wish to continue parsing such invalid URLs directly. And then the
original read/write test was updated to properly percent-encode the
space, as the normal form should be.
Since 2.32, nix now needs boost 1.87 or later to build,
due to using unordered::concurrent_flat_map try_emplace_and_cvisit
../src/libexpr/eval.cc: In member function ‘void nix::EvalState::evalFile(const nix::SourcePath&, nix::Value&, bool)’:
../src/libexpr/eval.cc:1096:20: error: ‘class boost::unordered::concurrent_flat_map<nix::SourcePath, nix::Value*, std::hash<nix::SourcePath>, std::equal_to<nix::SourcePath>, traceable_allocator<std::pair<const nix::SourcePath, nix::Value*> > >’ has no member named ‘try_emplace_and_cvisit’; did you mean ‘try_emplace_or_cvisit’?
1096 | fileEvalCache->try_emplace_and_cvisit(
| ^~~~~~~~~~~~~~~~~~~~~~
| try_emplace_or_cvisit
See 834580b539
The s3:ListBucket permission is required for read operations on S3
binary caches, not just for writes. Without this permission, users get
"Access Denied" errors when running nix-build.
Extract the path-based compression method determination logic into a
protected method that returns std::optional<std::string>. This allows
subclasses to reuse the logic and makes the semantics clearer (nullopt
means no compression, not empty string).
This prepares for S3BinaryCacheStore to apply the same compression
rules when implementing multipart uploads.
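The `std::optional<std::string>` convention can be sketched like this. `CompressionSettings`, `compressionMethodFor` and the concrete path rules (`.narinfo`, `.ls`, `log/`) are assumptions for illustration and may not match the real method.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical per-category settings; an empty string means "not configured".
struct CompressionSettings
{
    std::string narinfo;
    std::string ls;
    std::string log;
};

// Returns the compression method to apply to `path`, or std::nullopt when
// the file should be stored uncompressed.
inline std::optional<std::string>
compressionMethodFor(const CompressionSettings & s, std::string_view path)
{
    auto endsWith = [&](std::string_view suffix) {
        return path.size() >= suffix.size()
            && path.substr(path.size() - suffix.size()) == suffix;
    };
    auto startsWith = [&](std::string_view prefix) {
        return path.substr(0, prefix.size()) == prefix;
    };
    // Map the "not configured" empty string to nullopt in one place, so
    // callers never have to compare against "".
    auto configured = [](const std::string & method) -> std::optional<std::string> {
        if (method.empty())
            return std::nullopt;
        return method;
    };

    if (endsWith(".narinfo"))
        return configured(s.narinfo);
    if (endsWith(".ls"))
        return configured(s.ls);
    if (startsWith("log/"))
        return configured(s.log);
    return std::nullopt;
}
```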
Fix POST requests with data to use the correct curl option for specifying
body size. Previously used CURLOPT_INFILESIZE_LARGE for both POST and PUT,
but POST requires CURLOPT_POSTFIELDSIZE_LARGE.
This caused POST request bodies to not be sent correctly, manifesting as
S3 multipart CompleteMultipartUpload requests failing with "You must
specify at least one part" even though the XML body contained valid parts.
When Nix's SQLite narinfo cache indicates a NAR exists, but the NAR
has been garbage collected from the binary cache, Nix displays error
messages even though the operation succeeds via fallback. This is
misleading because the cached narinfo is simply outdated.
This changes SubstituteGone exceptions to produce warnings instead of
errors, accurately reflecting that this is an expected cache coherency
issue, not an actual failure.
Fixes #11411
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
At least one user has probably used `file+git://` when they mean `git+file://`, maybe thinking of it as "a file-based git repository". This adds a specific error message to hint at the correct URL scheme format and may save some users from resorting to `path:///` and copying an entire repo.
Adds a comprehensive test to verify that `nix-prefetch-url` correctly
handles S3 URLs with query parameters (e.g., custom endpoints and regions).
Previously, nix-prefetch-url would fail with "invalid store
path" errors when given S3 URLs with query parameters like
`?endpoint=http://server:9000&region=eu-west-1`, because it incorrectly
extracted the filename from the query parameters instead of the path.
Previously, `prefetchFile()` used `baseNameOf()` directly on the URL string
to extract the filename. This caused issues with URLs containing query
parameters that include slashes, such as S3 URLs with custom endpoints:
```
s3://bucket/file.txt?endpoint=http://server:9000
```
The `baseNameOf()` function naively searches for the rightmost `/` in the
entire string, which would find the `/` in `http://server:9000` and extract
`server:9000&region=...` as the filename. This resulted in invalid store
path names containing illegal characters like `:`.
This commit fixes the issue by:
1. Adding a `VerbatimURL::lastPathSegment()` method that extracts the last
non-empty path segment from a URL, using `pathSegments(true)` to filter
empty segments
2. Changing `prefetchFile()` to accept `const VerbatimURL &` and use the new
`lastPathSegment()` method instead of manual path parsing
3. Adding early validation with `checkName()` to fail quickly on invalid
filenames
4. Maintaining backward compatibility by falling back to `baseNameOf()` for
unparsable `VerbatimURL`s
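The difference between the two approaches, on plain strings: `naiveBaseName` mimics the "rightmost slash" behaviour described above, and this free-standing `lastPathSegment` is a simplified sketch written for illustration, not the `VerbatimURL` method.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Naive: everything after the rightmost '/', query string included.
inline std::string naiveBaseName(std::string_view url)
{
    auto slash = url.rfind('/');
    return std::string(slash == std::string_view::npos ? url : url.substr(slash + 1));
}

// Sketch of the fix: drop query and fragment, skip scheme and authority,
// then take the last non-empty path segment.
inline std::string lastPathSegment(std::string_view url)
{
    constexpr auto npos = std::string_view::npos;
    url = url.substr(0, url.find_first_of("?#"));
    if (auto scheme = url.find("://"); scheme != npos) {
        url.remove_prefix(scheme + 3);
        auto pathStart = url.find('/');
        url = pathStart == npos ? std::string_view{} : url.substr(pathStart);
    }
    // Ignore empty trailing segments such as in ".../dir/".
    while (!url.empty() && url.back() == '/')
        url.remove_suffix(1);
    auto slash = url.rfind('/');
    return std::string(slash == npos ? url : url.substr(slash + 1));
}
```

For `s3://bucket/file.txt?endpoint=http://server:9000&region=eu-west-1` the naive version yields `server:9000&region=eu-west-1`, while the sketch yields `file.txt`.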
Old code would do very much incorrect reentrancy crimes (trying to do an
erase inside the emplace callback). This would fail miserably with an assertion
in Boost:
terminating due to unexpected unrecoverable internal error: Assertion '(!find(px))&&("reentrancy not allowed")' failed in boost::unordered::detail::foa::entry_trace::entry_trace(const void *) at include/boost/unordered/detail/foa/reentrancy_check.hpp:33
This is trivially reproduced by using any S3 URL with a non-empty profile:
nix-prefetch-url "s3://happy/crash?profile=default"
The previous message was vague about what "deprecated" meant and why
unlocked inputs with NAR hashes "may not be reproducible". It also
used "verifiable" which was confusing.
The new message makes it clear that the NAR hash provides verification
(is checked by NAR hash) and explicitly states the failure modes:
garbage collection and sharing.
Add `test_public_bucket_operations` to validate that store operations
work correctly on public S3 buckets without requiring credentials.
Tests nix store info and nix copy operations.
Add cleanup of client store in the finally block of setup_s3 decorator.
Uses `nix store delete --ignore-liveness` to properly handle GC roots
and only attempts deletion if the path exists.
This slightly improves the logs situation by including the region/profile/endpoint
in the logs when S3 store references get printed. Instead of:
copying path '/nix/store/lxnp9cs4cfh2g9r2bs4z7gwwz9kdj2r9-test-package-c' to 's3://bucketname'...
This now includes:
copying path '/nix/store/lxnp9cs4cfh2g9r2bs4z7gwwz9kdj2r9-test-package-c' to 's3://bucketname?endpoint=http://server:9000&region=eu-west-1'...
Nix attempts to set the stack size to 64 MB during initialization, which is
required for the repl tests to run successfully. Skip the tests on systems
where the hard stack limit is less than this value rather than failing.
We now unconditionally compile support for s3:// URLs and stores
without authentication. The whole curl version check can be greatly
simplified by the previous commit, which bumps the minimum required curl
version.
This version was released a long time ago, in 2021, and it's doubtful
that anybody actually still uses it, since it's full of vulnerabilities [^].
[^]: https://curl.se/docs/vuln-7.75.0.html
I realized that we can actually do this thing, even though it is not
what nlohmann expects at all, because the extra parameter has a default
argument so nlohmann doesn't need to care. Sneaky!
Since 3c610df550 this resulted in `getting status of`
errors on paths inside the chroot if a path was already valid. Careful inspection
of the logic shows that if buildMode != bmCheck actualPath gets reassigned to
store.toRealPath(finalDestPath). The only branch that cares about actualPath is
the buildMode == bmCheck case, which doesn't lead to optimisePath anyway.
Instead of the cryptic:
> error: Failed to resolve AWS credentials: error code 6153
We now get more legible:
> error: AWS authentication error: 'Valid credentials could not be sourced by the IMDS provider' (6153)
This makes it so we don't need to rely on global variables and hacky destructors to
clean up another global variable. Just putting it in the correct order in the class
is more than enough.
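The ordering argument rests on a C++ rule: non-static data members are
destroyed in reverse order of declaration. A minimal sketch of that rule,
using hypothetical `Tracked` and `Owner` types that are not taken from the
Nix sources:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Records destruction order so the effect is observable.
inline std::vector<std::string> & destructionLog()
{
    static std::vector<std::string> log;
    return log;
}

// Hypothetical resource that logs its own destruction.
struct Tracked
{
    std::string name;
    explicit Tracked(std::string n) : name(std::move(n)) {}
    ~Tracked() { destructionLog().push_back(name); }
};

// Members are destroyed in reverse declaration order, so `user`
// (declared last) goes away before `resource`, which it depends on.
struct Owner
{
    Tracked resource{"resource"};
    Tracked user{"user"};
};
```

Declaring the dependent member after the thing it depends on is enough to
get the cleanup order right, with no global state involved.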
This partially reverts commit 5e46df973f,
partially reversing changes made to
8c789db05b.
We do this because Hydra, while using the newer version of the protocol,
still uses this command, even though Nix (as a client) doesn't use it.
On that basis, we don't want to remove it (or consider it only part of
the older versions of the protocol) until Hydra no longer uses the
Legacy SSH Protocol.
This is necessary to fix nix-everything-llvm.
The problem here is that nix-cli is taken from the previous
stage that is built with libstdc++, but this derivation builds
plugins with libc++ and the plugin load fails miserably.
Realisations are conceptually key-value pairs, mapping `DrvOutputs` (the
key) to information about that derivation output.
This separates the value type, which will be useful in maps, etc., where
we don't want to denormalize by including the key twice.
This matches similar changes for existing types:
| keyed | unkeyed |
|--------------------|------------------------|
| `ValidPathInfo` | `UnkeyedValidPathInfo` |
| `KeyedBuildResult` | `BuildResult` |
| `Realisation` | `UnkeyedRealisation` |
Co-authored-by: Sergei Zimmerman <sergei@zimmerman.foo>
Turns out there's a much better API for this that doesn't have the
footguns of the previous method.
isLegalRefName is somewhat of a misnomer, since it's mainly used to
validate user inputs that can be either references, branch names,
pseudorefs or tags.
The macro now accurately reflects its purpose: gating only AWS
authentication code, not all S3 functionality. S3 URL parsing, store
configuration, and public bucket access work regardless of this flag.
This rename clarifies that:
- S3 support is always available (URL parsing, store registration)
- Only AWS credential resolution requires the flag
- The flag controls AWS CRT SDK dependency, not S3 protocol support
Move S3 URL parsing, store configuration, and public bucket support
outside of NIX_WITH_S3_SUPPORT guards. Only AWS credential resolution
remains gated, allowing builds with withAWS = false to:
- Parse s3:// URLs
- Register S3 store types
- Access public S3 buckets (via HTTPS conversion)
- Use S3-compatible services without authentication
The setupForS3() function now always performs URL conversion, with
authentication code conditionally compiled based on NIX_WITH_S3_SUPPORT.
The aws-creds.cc file (only code using AWS CRT SDK) is now conditionally
compiled by meson.
This commit replaces the AWS C++ SDK with a lighter curl-based approach
for S3 binary cache operations.
- Removed dependency on the heavy aws-cpp-sdk-s3 and aws-cpp-sdk-transfer
- Added lightweight aws-crt-cpp for credential resolution only
- Leverages curl's native AWS SigV4 authentication (requires curl >= 7.75.0)
- S3BinaryCacheStore now delegates to HttpBinaryCacheStore
- Function s3ToHttpsUrl converts ParsedS3URL to ParsedURL
- Multipart uploads are no longer supported (may be reimplemented later)
- Build now requires curl >= 7.75.0 for AWS SigV4 support
Fixes: #13084, #12671, #11748, #12403, #5947
This forces the code to go through proper abstractions instead of the raw filesystem
API.
This issue is evident from this reproducer:
nix eval --expr 'builtins.fetchurl { url = "https://example.com"; sha256 = ""; }' --json --eval-store "dummy://?read-only=false"
error:
… while calling the 'fetchurl' builtin
at «string»:1:1:
1| builtins.fetchurl { url = "https://example.com"; sha256 = ""; }
| ^
error: opening file '/nix/store/r4f87yrl98f2m6v9z8ai2rbg4qwlcakq-example.com': No such file or directory
We only care about the accessor for a single store object anyway, but
the validity gets ignored. Also `pathExists(store.printStorePath(path))`
is definitely incorrect since it confuses the logical location vs physical
location in case of a chroot store.
This is a simple wrapper around getFSAccessor that throws an InvalidPath
error. This simplifies usage in callsites that only care about getting
a non-null accessor.
Wrap fmt() calls in lambdas to defer string formatting until the
feature check fails. This avoids unnecessary string formatting in
the common case where the feature is enabled.
Addresses performance concern raised by xokdvium in PR review.
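The deferred-formatting idea can be sketched as follows. `requireFeature`
and `expensiveMessage` are hypothetical stand-ins, not the actual Nix
functions; the point is only that the callable runs on the failure path:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical feature check. `makeMessage` is only invoked when the
// check fails, so no string is built when the feature is enabled.
template<typename MakeMessage>
void requireFeature(bool enabled, MakeMessage && makeMessage)
{
    if (!enabled)
        throw std::runtime_error(makeMessage());
}

// Counts how often the message is actually built.
inline int & formatCalls()
{
    static int calls = 0;
    return calls;
}

// Stand-in for a comparatively expensive fmt() call.
inline std::string expensiveMessage(const std::string & feature)
{
    ++formatCalls();
    return "experimental feature '" + feature + "' is disabled";
}
```

A caller would write
`requireFeature(enabled, [] { return expensiveMessage("flakes"); });`
and pay for the formatting only when the exception is thrown.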
This, alongside the other invariants of the CanonPath, is important
to uphold. std::filesystem happily crashes on NUL bytes in the constructor,
as we've seen with `path:%00` prior to c436b7a32a.
Best to stay clear of NUL bytes when we're talking about syscalls, especially
on Unix where strings are null terminated.
Very nice to have if we decide to switch over to pascal-style strings.
The refactor in the last commit fixed the bug it was supposed to fix,
but introduced a new bug in that sometimes we tried to write a resolved
derivation to a store before all its `inputSrcs` were in that store.
The solution is to defer writing the derivation until inside
`DerivationBuildingGoal`, just before we do an actual build. At this
point, we are sure that all inputs are in the store.
This does have the side effect of meaning we don't write down the
resolved derivation in the substituting case, only the building case,
but I think that is actually fine. The store that actually does the
building should make a record of what it built by storing the resolved
derivation. Other stores that just substitute from that store don't
necessarily want that derivation, however. They can trust the substituter
to keep the record around, or barring that, they can attempt to
re-resolve everything, if they need to be audited.
(cherry picked from commit c97b050a6c)
Resolve the derivation before creating a building goal, in a context
where we know what output(s) we want. That way we have a chance just to
download the outputs we want.
Fix #13247
(cherry picked from commit 39f6fd9b46)
Store the reason string as a field in the exception class rather than
only embedding it in the error message. This supports better structured
error handling and future JSON error reporting.
Suggested by Ericson2314 in PR review.
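A minimal sketch of the pattern, assuming a hypothetical exception class
(`FeatureUnavailable` and its field names are illustrative, not the class
touched by this commit):

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical exception type. The reason is kept as its own field so
// that handlers can report it without parsing what().
class FeatureUnavailable : public std::runtime_error
{
public:
    std::string feature;
    std::string reason;

    FeatureUnavailable(std::string feature_, std::string reason_)
        // The base class is initialised first, before the arguments
        // are moved into the members below.
        : std::runtime_error("feature '" + feature_ + "' is unavailable: " + reason_)
        , feature(std::move(feature_))
        , reason(std::move(reason_))
    {
    }
};
```

A structured reporter (for example a JSON error writer) can then emit
`e.reason` directly while the human-readable message stays unchanged.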
std::regex is a really bad tool for parsing things, since
it tends to overflow the stack pretty badly. See the build failure
under ASan in [^].
[^]: https://hydra.nixos.org/build/310077167/nixlog/5
CURL is not very strict about validation of URLs passed to it. We
should reflect this in our handling of URLs that we get from the user
in <nix/fetchurl.nix> or builtins.fetchurl. ValidURL was an attempt to
rectify this, but it turned out to be too strict. The only good way to
resolve this is to pass (in some cases) the user-provided string verbatim
to CURL. Other usages in libfetchers still benefit from using structured
ParsedURL and validation though.
nix store prefetch-file --name foo 'https://cdn.skypack.dev/big.js@^5.2.2'
error: 'https://cdn.skypack.dev/big.js@^5.2.2' is not a valid URL: leftover
Add support for pre-resolving AWS credentials in the parent process
before forking for builtin:fetchurl. This avoids recreating credential
providers in the forked child process.
The previous implementation had a check-then-create race condition where
multiple threads could simultaneously:
1. Check the cache and find no provider (line 122)
2. Create their own providers (lines 126-145)
3. Insert into cache (line 161)
This resulted in multiple credential providers being created when
downloading multiple packages in parallel, as each .narinfo download
would trigger provider creation on its own thread.
Fix by using boost::concurrent_flat_map's try_emplace_and_cvisit, which
provides atomic get-or-create semantics:
- f1 callback: Called atomically during insertion, creates the provider
- f2 callback: Called if key exists, returns cached provider
- Other threads are blocked during f1, so no nullptr is ever visible
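The get-or-create semantics can be sketched without Boost. The version
below swaps `boost::concurrent_flat_map` for a plain mutex-guarded
`std::unordered_map`, so it illustrates the invariant rather than the
actual implementation; `Provider` and `ProviderCache` are hypothetical
names:

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a credential provider.
struct Provider
{
    std::string profile;
};

class ProviderCache
{
    std::mutex mutex;
    std::unordered_map<std::string, std::shared_ptr<Provider>> providers;

public:
    // Lookup and insertion happen under one lock, so `create` runs at
    // most once per key even when many threads ask at the same time.
    std::shared_ptr<Provider>
    getOrCreate(const std::string & key, const std::function<std::shared_ptr<Provider>()> & create)
    {
        std::lock_guard<std::mutex> lock(mutex);
        auto it = providers.find(key);
        if (it != providers.end())
            return it->second;
        auto provider = create();
        providers.emplace(key, provider);
        return provider;
    }
};
```

Unlike the separate check and insert steps described above, there is no
window between "not found" and "inserted" in which a second thread could
create its own provider.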
This will reduce the load on hydra. It doesn't make sense to
build 2 slightly different variations where the difference
is only in the nix-perl-bindings and additional sanitizers.
There are some unfortunate ODR violations that get diagnosed with GCC but not Clang
for static inline constexpr variables defined inside the class body:
template<typename T>
struct static_const
{
static JSON_INLINE_VARIABLE constexpr T value{};
};
This can be ignored pretty much. There is the same problem for std::piecewise_construct:
http://lists.boost.org/Archives/boost/2007/06/123353.php
==2455704==ERROR: AddressSanitizer: odr-violation (0x7efddc460e20):
[1] size=1 'value' /nix/store/235hvgzcbl06fxy53515q8sr6lljvf68-nlohmann_json-3.11.3/include/nlohmann/detail/meta/cpp_future.hpp:156:45 in /nix/store/pkmljfq97a83dbanr0n64zbm8cyhna33-nix-store-2.33.0pre/lib/libnixstore.so.2.33.0
[2] size=1 'value' /nix/store/235hvgzcbl06fxy53515q8sr6lljvf68-nlohmann_json-3.11.3/include/nlohmann/detail/meta/cpp_future.hpp:156:45 in /nix/store/gbjpkjj0g8vk20fzlyrwj491gwp6g1qw-nix-util-2.33.0pre/lib/libnixutil.so.2.33.0
Instead of specifying env variables all the time
we can instead embed the __asan_default_options symbol
in all executables / shared objects. This reduces code
duplication.
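For illustration, embedding the symbol looks roughly like this. The
sanitizer runtime calls `__asan_default_options()` if the program defines
it; the option string below is an assumption picked for the example, not
the set of options used by Nix:

```cpp
#include <cassert>
#include <cstring>

// AddressSanitizer queries this hook at startup for default options.
// The values returned here are illustrative only.
extern "C" const char * __asan_default_options()
{
    return "abort_on_error=1:print_summary=1";
}
```

Options set through the `ASAN_OPTIONS` environment variable still take
precedence over these defaults.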
This change overrides __assert_fail on glibc/musl
to instead call std::terminate that we have a custom
handler for. This ensures that we have more context
to diagnose issues encountered by users in the wild.
This commit adds two key fixes to http-binary-cache-store.cc to
properly support the new curl-based S3 implementation:
1. **Consistent cache key handling**: Use `getReference().render(withParams=false)`
for disk cache keys instead of `cacheUri.to_string()`. This ensures cache
keys are consistent with the S3 implementation and don't include query
parameters, which matches the behavior expected by Store::queryPathInfo()
lookups.
2. **S3 query parameter preservation**: When generating file transfer requests
for S3 URLs, preserve query parameters from the base URL (region, endpoint,
etc.) when the relative path doesn't have its own query parameters. This
ensures S3-specific configuration is propagated to all requests.
I want to separate "policy" from "mechanism".
Now the logic to decide how to build (a policy choice, though with some
hard constraints) is all in derivation building goal, and all in the
same spot. build hook, external builder, or local builder --- the choice
between all three is made in the same spot --- pure policy.
Now, if you want to use the external derivation builder, you simply
provide the `ExternalBuilder` you wish to use, and there is no
additional checking --- pure mechanism. It is the responsibility of the
caller to choose an external builder that works for the derivation in
question.
Also, `checkSystem()` was the only thing throwing `BuildError` from
`startBuilder`. Now that that is gone, we can now remove the
`try...catch` around that.
Add a new S3BinaryCacheStore implementation that inherits from
HttpBinaryCacheStore.
The implementation is activated with NIX_WITH_CURL_S3, keeping the
existing NIX_WITH_S3_SUPPORT (AWS SDK) implementation unchanged.
This code had several issues:
1. Not going through the SourceAccessor means that we can only work
with physical paths.
2. It did not actually check that the file exists. (std::ifstream does not check
it by default).
Most of the eval cache logic is flake-independent and libexpr,
but the loading part is not.
`nix-flake` is the right component for this, as the eval cache
isn't exactly specific to the command line.
We now have merge queues for maintenance branches. We still build it
for master so that our installer keeps being updated. In the future this
part could go into a new workflow instead.
This barfed with
error: [json.exception.type_error.302] type must be string, but is array
on `nix build github:malt3/bazel-env#bazel-env` because it has a `exportReferencesGraph` with a value like `["string",...["string"]]`.
Add a `UsernameAuth` struct and optional `usernameAuth` field to
`FileTransferRequest` to support programmatic username/password
authentication.
This uses curl's `CURLOPT_USERNAME`/`CURLOPT_PASSWORD` options, which
works with multiple protocols (HTTP, FTP, etc.) and is not specific to
any particular authentication scheme.
The primary motivation is to enable S3 authentication refactoring where
AWS credentials (access key ID and secret access key) can be passed
through this general-purpose mechanism, reducing the amount of
S3-specific code behind `#if NIX_WITH_CURL_S3` guards.
This breaks gdb pretty-printers inserted into .debug_gdb_scripts section,
because it implies --compress-debug-sections=zlib, -Wa,--compress-debug-sections.
This is very unfortunate, because then gdb can't use pretty printers for
Boost.Unordered (which are very useful, since boost::unordered_flat_map is
impossible to debug). This seems perfectly fine to disable in the dev-shell for
the time being.
See [1-3] for further references.
With this change I'm able to use boost's pretty-printers out-of-the box:
```
p *importResolutionCache
$2 = boost::concurrent_flat_map with 1 elements = {[{accessor = {p = std::shared_ptr<nix::SourceAccessor> (use count 5, weak count 1) = {
get() = 0x555555d830a8}}, path = {static root = {static root = <same as static member of an already seen type>, path = "/"},
path = "/derivation-internal.nix"}}] = {accessor = {p = std::shared_ptr<nix::SourceAccessor> (use count 5, weak count 1) = {
get() = 0x555555d830a8}}, path = {static root = {static root = <same as static member of an already seen type>, path = "/"},
path = "/derivation-internal.nix"}}}
```
When combined with a simple `add-auto-load-safe-path ~/code` in .gdbinit
[1]: https://gerrit.lix.systems/c/lix/+/3880
[2]: https://git.lix.systems/lix-project/lix/issues/1003
[3]: https://sourceware.org/pipermail/gdb-patches/2025-October/221398.html
Firstly, this is now available on darwin where the default is llvm 19.
Secondly, this leads to very weird segfaults when building with newer nixpkgs for some reason.
(It's UB after all).
This appears when building with the following:
mesonComponentOverrides = finalAttrs: prevAttrs: {
mesonBuildType = "debugoptimized";
dontStrip = true;
doCheck = false;
separateDebugInfo = false;
preConfigure = (prevAttrs.preConfigure or "") + ''
case "$mesonBuildType" in
release|minsize|debugoptimized) appendToVar mesonFlags "-Db_lto=true" ;;
*) appendToVar mesonFlags "-Db_lto=false" ;;
esac
'';
};
And with the following nixpkgs input:
nix build ".#nix-cli" -L --override-input nixpkgs "https://releases.nixos.org/nixos/unstable/nixos-25.11pre870157.7df7ff7d8e00/nixexprs.tar.xz"
Stacktrace:
#0 0x00000000006afdc0 in ?? ()
#1 0x00007ffff71cebb6 in _Unwind_ForcedUnwind_Phase2 () from /nix/store/41ym1jm1b7j3rhglk82gwg9jml26z1km-gcc-14.3.0-lib/lib/libgcc_s.so.1
#2 0x00007ffff71cf5b5 in _Unwind_Resume () from /nix/store/41ym1jm1b7j3rhglk82gwg9jml26z1km-gcc-14.3.0-lib/lib/libgcc_s.so.1
#3 0x00007ffff7eac7d8 in std::basic_ios<char, std::char_traits<char> >::~basic_ios (this=<optimized out>, this=<optimized out>)
at /nix/store/82kmz7r96navanrc2fgckh2bamiqrgsw-gcc-14.3.0/include/c++/14.3.0/bits/basic_ios.h:286
#4 std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >::basic_ostringstream (this=<optimized out>, this=<optimized out>)
at /nix/store/82kmz7r96navanrc2fgckh2bamiqrgsw-gcc-14.3.0/include/c++/14.3.0/sstream:806
#5 nix::SimpleLogger::logEI (this=<optimized out>, ei=...) at ../logging.cc:121
#6 0x00007ffff7515794 in nix::Logger::logEI (this=0x675450, lvl=nix::lvlError, ei=...) at /nix/store/bkshji3nnxmrmgwa4n2kaxadajkwvn65-nix-util-2.32.0pre-dev/include/nix/util/logging.hh:144
#7 nix::handleExceptions (programName=..., fun=...) at ../shared.cc:336
#8 0x000000000047b76b in main (argc=<optimized out>, argv=<optimized out>) at /nix/store/82kmz7r96navanrc2fgckh2bamiqrgsw-gcc-14.3.0/include/c++/14.3.0/bits/new_allocator.h:88
This broke invocations like:
NIX_SSHOPTS='-p2222 -oUserKnownHostsFile=/dev/null -oStrictHostKeyChecking=no' nix copy /nix/store/......-foo --to ssh-ng://root@localhost
In Nix 2.30.2, fakeSSH was enabled when the "thing I want to connect to"
was plain old "localhost". Previously, this check was written as:
, fakeSSH(host == "localhost")
Given the above invocation, `host` would have been `root@localhost`, and
thus `fakeSSH` would be `false` because `root@localhost` != `localhost`.
However, since 49ba06175e, `authority.host`
returned _just_ the host (`localhost`, no user) and erroneously enabled
`fakeSSH` in this case, causing `NIX_SSHOPTS` to be ignored (since,
when `fakeSSH` is `true`, `SSHMaster::startCommand` doesn't call
`addCommonSSHOpts`).
`authority.to_string()` accurately returns the expected `root@localhost`
format (given the above invocation), fixing this.
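The difference between the two checks can be sketched with a simplified
authority type. `Authority` here is a hypothetical reduction of the real
parsed-URL type, and the two helper functions exist only for the
comparison:

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical, simplified authority: optional user plus host.
struct Authority
{
    std::optional<std::string> user;
    std::string host;

    // Renders "user@host" when a user is present, otherwise "host".
    std::string to_string() const
    {
        if (user)
            return *user + "@" + host;
        return host;
    }
};

// Comparing only `host` treats root@localhost as plain localhost.
inline bool fakeSSHByHost(const Authority & a)
{
    return a.host == "localhost";
}

// Comparing the rendered authority restores the 2.30.2 behaviour.
inline bool fakeSSHByAuthority(const Authority & a)
{
    return a.to_string() == "localhost";
}
```

With `root@localhost`, the first check enables `fakeSSH` and the second
does not, which is the regression and the fix respectively.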
These are helper programs that execute derivations for specified
system types (e.g. using QEMU to emulate another system type).
To use, set `external-builders`:
external-builders = [{"systems": ["aarch64-linux"], "program": "/path/to/external-builder.py"}]
The external builder gets one command line argument, the path to a JSON file containing all necessary information about the derivation:
{
"args": [...],
"builder": "/nix/store/kwcyvgdg98n98hqapaz8sw92pc2s78x6-bash-5.2p37/bin/bash",
"env": {
"HOME": "/homeless-shelter",
...
},
"realStoreDir": "/tmp/nix/nix/store",
"storeDir": "/nix/store",
"tmpDir": "/tmp/nix-shell.dzQ2hE/nix-build-patchelf-0.14.3.drv-46/build",
"tmpDirInSandbox": "/build"
}
Co-authored-by: Cole Helbling <cole.helbling@determinate.systems>
Until these repos are potentially merged, this is good for dogfooding
alongside the experimental installer. It also uses the more official
`artifacts.nixos.org` endpoint to install stable releases now
More immediately though, we need a patch for the experimental installer
to really work in CI at all, and that hasn't landed in a tag yet. So,
this lets us use it right from `main`!
Introduce a new build option 'curl-s3-store' for the curl-based S3
implementation, separate from the existing AWS SDK-based 's3-store'.
The two options are mutually exclusive to avoid conflicts.
Users can enable the new implementation with:
-Dcurl-s3-store=enabled -Ds3-store=disabled
Add lightweight AWS credential resolution using AWS CRT (Common Runtime)
instead of the full AWS SDK. This provides credential management for the
upcoming curl-based S3 implementation.
Realisations are conceptually key-value pairs, mapping `DrvOutputs` (the
key) to information about that derivation output.
This separates the value type, which will be useful in maps, etc., where
we don't want to denormalize by including the key twice.
This matches similar changes for existing types:
| keyed | unkeyed |
|--------------------|------------------------|
| `ValidPathInfo` | `UnkeyedValidPathInfo` |
| `KeyedBuildResult` | `BuildResult` |
| `Realisation` | `UnkeyedRealisation` |
Best I can tell this was never supposed to be exposed to the user
and has been this way since 2.19.
2.18 did not expose this file to the user:
nix run nix/2.18-maintenance -- eval --expr "import <nix/derivation-internal.nix>"
error: getting status of '/__corepkgs__/derivation-internal.nix': No such file or directory
https://en.cppreference.com/w/cpp/thread.html
src/libstore/gc.cc:121:39: error: no member named 'sleep_for' in namespace 'std::this_thread'
121 | std::this_thread::sleep_for(std::chrono::milliseconds(100));
| ~~~~~~~~~~~~~~~~~~^
Move ParsedS3URL from s3.cc/.hh into dedicated s3-url.cc/.hh files.
This separates URL parsing utilities (which are protocol-agnostic) from
the AWS SDK-specific S3Helper implementation, making the code cleaner
and enabling reuse by future curl-based S3 implementation.
The refactor in the last commit fixed the bug it was supposed to fix,
but introduced a new bug in that sometimes we tried to write a resolved
derivation to a store before all its `inputSrcs` were in that store.
The solution is to defer writing the derivation until inside
`DerivationBuildingGoal`, just before we do an actual build. At this
point, we are sure that all inputs are in the store.
This does have the side effect of meaning we don't write down the
resolved derivation in the substituting case, only the building case,
but I think that is actually fine. The store that actually does the
building should make a record of what it built by storing the resolved
derivation. Other stores that just substitute from that store don't
necessarily want that derivation, however. They can trust the substituter
to keep the record around, or barring that, they can attempt to
re-resolve everything, if they need to be audited.
Resolve the derivation before creating a building goal, in a context
where we know what output(s) we want. That way we have a chance just to
download the outputs we want.
Fix #13247
A very unfortunate interaction of current filtering with pure eval is
that the following actually leads to `lib.a = {}`. This just adds a unit
test for this broken behavior. This is really good to be done as a unit test
via the in-memory store.
{
outputs =
{ ... }:
{
lib.a = builtins.readDir /.;
};
}
Whoever first calls `quit` now empties the queue, instead of waiting for
the worker thread to do it.
(Note that in the unwinding case, the worker thread is still the first
to call `quit`, though.)
This is my SNAFU. Accidentally broken in 02c9ac445f.
There's very dubious behavior for 'builtins.readDir /.':
{
outputs =
{ ... }:
{
lib.a = builtins.readDir /.;
};
}
nix eval /tmp/test-flake#lib.a
Starting from 2.27 this now returns an empty set. This really isn't supposed
to happen, but this change in the semantics of makeEmptySourceAccessor accidentally
changed the behavior of this.
The followLinksToStore() function could hang indefinitely when encountering
symlink cycles outside the Nix store, causing 100% CPU usage and blocking
any operations that use this function.
This affects multiple commands including nix-store --query, --delete,
--verify, nix-env, and nix-copy-closure when given paths with symlink cycles.
The fix adds a maximum limit of 1024 symlink follows (matching the limit
used by canonPath) and throws an error when exceeded, preventing the
infinite loop while preserving the original semantics of stopping at
the first path inside the store.
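The bounded loop can be sketched as below. To keep the example
deterministic, symlinks are modelled as a map from path to target instead
of being read from the filesystem; the function name, the map parameter
and the error text are assumptions made for the sketch:

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical model: `links` maps a symlink path to its target.
inline std::string followLinksToStoreSketch(
    std::string path,
    const std::map<std::string, std::string> & links,
    const std::string & storeDir = "/nix/store")
{
    constexpr unsigned maxFollow = 1024; // limit named in the text
    const std::string prefix = storeDir + "/";
    unsigned followed = 0;
    // Stop at the first path inside the store, or at a non-symlink.
    while (path.compare(0, prefix.size(), prefix) != 0) {
        auto it = links.find(path);
        if (it == links.end())
            break;
        if (++followed > maxFollow)
            throw std::runtime_error("too many symbolic links encountered while resolving '" + path + "'");
        path = it->second;
    }
    return path;
}
```

A cycle such as `/tmp/x -> /tmp/y -> /tmp/x` now ends in an exception
after 1024 follows instead of spinning forever, while a chain that
reaches the store still stops at the first store path.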
Replace non-thread-safe ptsname() calls with a new getPtsName() helper
function that:
- Uses thread-safe ptsname_r() on Linux/BSD platforms
- Uses mutex-protected ptsname() on macOS (which lacks ptsname_r())
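A rough sketch of such a helper follows. The name `getPtsNameSketch`, the
buffer size and the error messages are assumptions; only the split
between `ptsname_r()` and a mutex around `ptsname()` comes from the
description above:

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>
#include <string>
#include <stdlib.h>

// Returns the slave pseudo-terminal name for the master `fd`.
inline std::string getPtsNameSketch(int fd)
{
#if defined(__APPLE__)
    // ptsname() returns a pointer to static storage, so serialise the
    // call and copy the result while the lock is still held.
    static std::mutex ptsnameMutex;
    std::lock_guard<std::mutex> lock(ptsnameMutex);
    const char * name = ptsname(fd);
    if (!name)
        throw std::runtime_error("ptsname failed");
    return std::string(name);
#else
    // ptsname_r() writes into a caller-provided buffer and returns
    // non-zero on failure.
    char buf[256];
    if (ptsname_r(fd, buf, sizeof(buf)) != 0)
        throw std::runtime_error("ptsname_r failed");
    return std::string(buf);
#endif
}
```

The mutex only protects callers that go through this helper; any other
code calling `ptsname()` directly would still race with it.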
This turns out to be a big problem for performance of Bison
generated code, that for whatever reason cannot be made internal
to the shared library. This causes GCC to make a bunch of function
calls go through PLT. Ideally these hot functions (like move/copy ctor) could become
inline in upstream Bison. That will make sure that GCC can do interprocedural
optimizations without -fno-semantic-interposition [^]. Considering that
LLVM already does inlining and whatnot is a good motivation for this change.
I don't know of any case where Nix relies on LD_PRELOAD tricks for the shared
libraries in production use-cases.
[^]: https://maskray.me/blog/2021-05-09-fno-semantic-interposition
Since the parser is now LALR we can easily switch
over to a less ugly skeleton than the default C one.
This would allow us to switch from %union to %define api.value.type variant
in the future to avoid the need for trivial POD types.
1. Saves 24-32 bytes per string (size of std::string)
2. Saves additional bytes by not over-allocating strings (in total we
save ~1% memory)
3. Sets us up to perform a similar transformation on the other Expr
subclasses
4. Makes ExprString trivially moveable (before the string data might
move, causing the Value's pointer to become invalid). This is important
so we can put ExprStrings in an std::vector and refer to them by index
We have introduced a string copy in ParserState::stripIndentation().
This could be removed by pre-allocating the right sized string in the
arena, but this adds complexity and doesn't seem to improve performance,
so for now we've left the copy in.
This mirrors what OptionalPathSetting does. Otherwise we run into
an assertion failure for relative paths specified as the authority + path:
nix build nixpkgs#hello --store "local://a/b"
nix: ../posix-source-accessor.cc:13: nix::PosixSourceAccessor::PosixSourceAccessor(std::filesystem::__cxx11::path&&): Assertion `root.empty() || root.is_absolute()' failed.
This is now diagnosed properly:
error: not an absolute path: 'a/b'
Just as you'd specify the root via a query parameter:
nix build nixpkgs#hello --store "local?root=a/b"
Fewer macros is better!
Introduce a new `JsonCharacterizationTest` mixin class to help with this.
Also, avoid some needless copies with `GetParam`.
Part of my effort shoring up the JSON formats with #13570.
These stragglers have been accidentally left out when implementing the StoreConfig::getReference.
Also HttpBinaryCacheStore::getReference now returns the actual store parameters, not the cacheUri
parameters.
In the case where the store object doesn't exist, we do correctly move
(rather than copy) the scratch data into place. In this case, the
destination store object already exists, but we still want to clean up
after ourselves.
This avoids any complications that can arise from the environment
affecting evaluation of the help pages (which don't need to be calling
out to anything external anyways)
A recent example of one of these problems is
https://github.com/NixOS/nix/issues/14085, which would break help pages
by causing them to make invalid calls to the dummy store they're
evaluated with
Fixes: https://github.com/NixOS/nix/issues/14062
Co-authored-by: Sergei Zimmerman <sergei@zimmerman.foo>
fetchToStore() caching was broken because it uses the fingerprint of
the accessor, but now that the accessor (typically storeFS) is a
composite (like MountedSourceAccessor or AllowListSourceAccessor),
there was no fingerprint anymore. So fetchToStore now uses the new
getFingerprint() method to get the specific fingerprint for the
subpath.
This returns the fingerprint for a specific subpath. This is intended
for "composite" accessors like MountedSourceAccessor, where different
subdirectories can have different fingerprints.
Previously, Nix would not create a cache entry for substituted/cached
inputs
This led to severe slowdowns in some scenarios where a large input (like
Nixpkgs) had already been unpacked to the store but didn't exist in a
user's cache, as described in https://github.com/NixOS/nix/issues/11228
Using the same method as https://github.com/NixOS/nix/pull/12911, we can
create a cache entry for the fingerprint of substituted/cached inputs
and avoid this problem entirely
These counters are extremely expensive in a multi-threaded
program. For instance, disabling them speeds up evaluation of the
NixOS/nix/2.21.2 from 32.6s to 17.8s.
With this change, the store-wide `getFSAccessor` has only one usage left
--- the evaluator. If we get rid of that (as is planned), we can then
remove that method altogether, simplifying `Store`. Hurray!
I removed the store dir by mistake from the pretty-printed (for humans)
output in eb643d034f. That change was not
supposed to change output.
This is sometimes easier / more performant to implement, and
independently it is also a more convenient interface for many callers.
The existing store-wide `getFSAccessor` is only used for
- `nix why-depends`
- the evaluator
I hope we can get rid of it for those, too, and then we have the option
of getting rid of the store-wide method.
Co-authored-by: Sergei Zimmerman <sergei@zimmerman.foo>
This makes the CI fail fast and more explicitly in case the formatting
is incorrect and provides better error messages. This also ensures
that we don't burn CI on useless checks for code that wouldn't pass lints
anyway.
Old code is now just used for `nix build` --- there is no CLI breaking
change.
Test the new format, too.
The new format is not currently used, but will be used going forward,
for example in the C API.
Progress on #13570
This brings them in line with the other tests, and furthers my goals of
separating unit test data from code.
Doing this cleanup as part of my #13570 effort, but strictly-speaking,
this is separate as these data types' JSON never contained any store
paths or store dirs, just simple output name strings.
Tested by building with b_sanitize=thread and running:
nix flake prefetch-inputs --store "dummy://?read-only=false"
It might make sense to move this utility class out of dummy-store.cc,
but it seems fine for now.
No behavior is changed, just:
- Declare a canonical `nlohmann::adl_serializer`
- Use `json-utils.hh` to shorten code without getting worse error
messages.
Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
We should use proper abstractions for reading files from the store.
E.g. this caused errors when trying to download github flakes into
an in-memory store in #14023.
The docs weren't 100% clear about bounds checking, but suggested that
errors would be caught.
The bounds checks are cheap compared to the function calls they're in,
so we have no reason to omit them.
Enables builds with ASAN to catch memory corruption
bugs faster and in CI. This is an incredibly valuable
instrument that must be used as much as possible.
Somewhat based on jade's work from Lix, though there's a lot that
we have to do differently:
19ae87e5ce
Co-authored-by: Jade Lovelace <lix@jade.fyi>
This leads to ASAN errors:
==1137785==ERROR: AddressSanitizer: new-delete-type-mismatch on 0x523000001d00 in thread T0:
object passed to delete has wrong type:
size of the allocated type: 5968 bytes;
size of the deallocated type: 5968 bytes.
alignment of the allocated type: 8 bytes;
alignment of the deallocated type: default-aligned.
This has multiple dangling pointer issues that lead to segfaults in e.g.:
nix eval --expr '(builtins.getFlake "github:nixos/nixpkgs/25.05")' --impure
This reverts commit ad175727e4, reversing
changes made to d314750174.
See #13570 for details --- the idea is that including the store dir in
store paths makes systematic JSON parsing with e.g. Serde, Aeson,
nlohmann, or similar harder.
After talking to Eelco, we are changing the `Derivation` format right
away because not only is `nix derivation` technically experimental, we think it is
also less widely used in practice than, say, `nix path-info`.
Progress on #13570
Add `read-only` setting to `dummy://` store for back compat.
Test by changing an existing test to use this instead, fixing a TODO.
Co-Authored-By: HaeNoe <git@haenoe.party>
Co-authored-by: Eelco Dolstra <edolstra@gmail.com>
Since `nix flake check` doesn't produce a `result` symlink, it doesn't
actually need to build/substitute derivations that are already known
to have succeeded, i.e. that are substitutable.
This can speed up CI jobs in cases where the derivations have already
been built by other jobs. For instance, a command like
nix flake check github:NixOS/hydra/aa62c7f7db31753f0cde690f8654dd1907fc0ce2
should no longer build anything because the outputs are already in
cache.nixos.org.
Based-on: https://github.com/DeterminateSystems/nix-src/pull/134
Based-on: https://gerrit.lix.systems/c/lix/+/3841
Co-authored-by: Eelco Dolstra <edolstra@gmail.com>
- Use `const K`, not `K`, otherwise we don't get auto referencing of
rvalues.
- Generalized the deleted overloads, because we don't care what the key
type is --- we want to get rid of anything that has an rvalue map
type.
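The two points above can be illustrated with a minimal sketch. The helper name `getPtr` and its exact signature are assumptions made for this example, not the function this commit changes.

```cpp
#include <map>
#include <string>

// Hypothetical lookup helper: returns a pointer to the mapped value,
// or nullptr if the key is absent. Taking `const K &` lets callers pass
// temporaries (rvalues) as keys.
template<class Map, class K>
const typename Map::mapped_type * getPtr(const Map & map, const K & key)
{
    auto it = map.find(key);
    return it == map.end() ? nullptr : &it->second;
}

// Reject any rvalue map, whatever the key type is: the returned pointer
// would dangle as soon as the temporary map is destroyed.
template<class Map, class K>
const typename Map::mapped_type * getPtr(const Map && map, const K & key) = delete;
```

With this shape, `getPtr(someMap, "a")` compiles, while `getPtr(makeMap(), "a")` on a temporary map is rejected at compile time.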
A follow-up optimization will make it impossible to make a find function
that returns an iterator in an efficient manner. All consumer code can
easily use the `get` variant.
As evident from the number of tests that were holding this API completely
wrong (the end() iterator returned from find() is NEVER nullptr) we should
not have this footgun. A proper strong type guarantees that this confusion
will not happen again.
Also this will be helpful down the road when Bindings becomes something
smarter than an array of Attr.
This allows the weird network or DNS server fallback mechanism inside
glibc to work, and prevents a "Resolving timed out after 5000
milliseconds" error. Read on for details.
The DNS request stuff (dns-hosts) in glibc uses this fallback procedure
to minimize network RTT in the ideal case while dealing with
ill-behaving networks and DNS servers gracefully (see resolv.conf(5)):
- Use sendmmsg() to send UDP DNS requests for IPv4 and IPv6 in parallel
- If that times out (meaning that none or only one of the responses have
been received), send the requests one by one, waiting for the response
before sending the next request ("single-request")
- If that still times out, try to use a different socket (hence
different address) for each request ("single-request-reopen")
The default timeout inside glibc is 5 seconds. Therefore, setting
connect-timeout, and therefore CURLOPT_CONNECTTIMEOUT to 5 seconds
prevents the single-request fallback, and setting it to even 10 seconds
prevents the single-request-reopen fallback as well.
The fallback decision is saved by glibc, but only thread-locally, and
libcurl starts a new thread for getaddrinfo() for each connection.
Therefore for every connection the fallback starts from sendmmsg() all
over again. And since these are considered to have timed out by libcurl,
even though getaddrinfo() might return a successful result, it is not
cached in libcurl.
While a user could tweak these with resolv.conf(5) options (e.g. using
networking.resolvconf.extraOptions in NixOS), and indeed that is
probably needed to avoid annoying delays, it still means that the
default connect-timeout of 5 is too low. Raise it to give fallback a
chance.
../hash.cc: In function 'nix::{anonymous}::DecodeNamePair nix::baseExplicit(HashFormat)':
../hash.cc:114:1: warning: control reaches end of non-void function [-Wreturn-type]
114 | }
| ^
This has been dropped on unstable and nix no longer
compiled with an overridden nixpkgs input. On 25.05 these
overrides already do nothing.
Tested with:
nix build .#packages.x86_64-darwin.nix-cli -L --override-input nixpkgs https://releases.nixos.org/nixos/unstable/nixos-25.11pre859555.ab0f3607a6c7/nixexprs.tar.xz
Default deployment target on 25.05 is 11.3, so 10.13
sdk override doesn't have to be updated at all as evident
from the fact that we didn't observe any issues with it.
This is because we need it in declarations where we should not be
including the full `nlohmann/json.hpp`.
Already can clean up by moving the experimental feature "instance".
Also, make the `std::map` instance better by allowing for other
comparison functions.
This reverts commit bdbc739d6e.
Such a change needs more thought put into it. By versioning
shared libraries we'd make a false impression that libraries
themselves are actually versioned and have some sort of stable
ABI, which is not the case.
This will be useful when C bindings become stable, but as long
as they are experimental it does not make sense to set SONAME.
Also this change should not have been backported, since it's
severely breaking.
When doing multithreaded evaluation, we want to ensure that any Nix
file is parsed and evaluated only once. The easiest way to do this is
to rely on thunks, since those ensure locking in the multithreaded
evaluator. `fileEvalCache` is now a mapping from `SourcePath` to a
`Value *`. The value is initially a thunk (pointing to a
`ExprParseFile` helper object) that can be forced to parse and
evaluate the file. So a subsequent thread requesting the same file
will see a thunk that is possibly locked and wait for it.
The parser cache is gone since it's no longer needed. However, there
is a new `importResolutionCache` that maps `SourcePath`s to
`SourcePath`s (e.g. `/foo` to `/foo/default.nix`). Previously we put
multiple entries in `fileEvalCache`, which was ugly and could result
in work duplication.
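The evaluate-once idea can be sketched roughly as follows. `LazyValue` and `FileEvalCache` are hypothetical stand-ins written for this example (an `int` plays the role of a `Value`); they are not the evaluator's actual types.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a thunk: the computation runs at most once,
// and concurrent callers block on the entry's mutex until it is done.
struct LazyValue
{
    std::mutex mutex;
    bool done = false;
    int result = 0;
    std::function<int()> compute;

    int force()
    {
        std::lock_guard<std::mutex> lock(mutex);
        if (!done) {
            result = compute();
            done = true;
        }
        return result;
    }
};

// Hypothetical cache mapping a file path to its lazily evaluated value.
class FileEvalCache
{
    std::mutex mutex; // guards `entries` only, never held while evaluating
    std::map<std::string, std::shared_ptr<LazyValue>> entries;

public:
    int eval(const std::string & path, std::function<int()> parseAndEval)
    {
        std::shared_ptr<LazyValue> entry;
        {
            std::lock_guard<std::mutex> lock(mutex);
            auto & slot = entries[path];
            if (!slot) {
                slot = std::make_shared<LazyValue>();
                slot->compute = std::move(parseAndEval);
            }
            entry = slot;
        }
        return entry->force();
    }
};
```

The map lock is released before forcing, so threads evaluating different files do not serialize on each other.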
These constant Values have no business being in the EvalState in the
first place. The ultimate goal is to get rid of the ugly `getBuiltins`
and its reliance (in `createBaseEnv`) on these global constants is getting in the way.
Same idea as in f017f9ddd3.
Co-authored-by: eldritch horrors <pennae@lix.systems>
This object is always constant and will never get modified.
Having it as a global (constant) static is much easier and
unclutters the EvalState.
Same idea as in f017f9ddd3.
Co-authored-by: eldritch horrors <pennae@lix.systems>
This implements a special back-compat shim to specifically allow
unbracketed IPv6 addresses in store references. This is something
that is relied upon in the wild and the old parsing logic accepted
both ways (brackets were optional). This patch restores this behavior.
As always, we didn't have any tests for this.
Addresses #13937.
`perf c2c` shows a lot of cacheline conflicts between purely read-only
Store methods (like `parseStorePath()`) and the Sync classes. So
allocate pathInfoCache separately to avoid that.
Calling `drainFD()` will hang if another process has the write side
open, since then the child won't get an EOF. This can happen if we
have multiple threads doing a build, since in that case another thread
may fork a child process that inherits the write side of the first
thread.
We could set O_CLOEXEC on the write side (using pipe2()) but it won't
help here since we don't always do an exec() in the child, e.g. in the
case of builtin builders. (We need a "close-on-fork", not a
"close-on-exec".)
Since the only construction and push_back() calls
to Bindings happen through the `BindingsBuilder` [1] we don't
need to keep `capacity` around on the heap anymore. This saves 8 bytes
(because of the member alignment padding)
per one Bindings allocation. This isn't that much, but it does
save significant memory.
This also shows that the Bindings don't necessarily have to
be mutable, which opens up opportunities for doing small bindings
optimization and storing a 1-element Bindings directly in Value.
For the following scenario:
nix-env --query --available --out-path --file ../nixpkgs --eval-system x86_64-linux
(nixpkgs revision: ddcddd7b09a417ca9a88899f4bd43a8edb72308d)
This patch results in reduction of `sets.bytes` 13115104016 -> 12653087640,
which amounts to 462 MB fewer bytes allocated for Bindings.
[1]: Not actually, `getBuiltins` does mutate bindings, but this is pretty
inconsequential and doesn't lead to problems.
This is relied upon (specifically the `local` store) by existing
tooling [1] and we broke this in 3e7879e6df (which
was first released in 2.31).
To lessen the scope of the breakage we should not normalize "auto" references
and explicitly specified references like "local" or "daemon". It also makes
sense to canonicalize local://,daemon:// to be more compatible with prior
behavior.
[1]: 05e1b3cba2/lib/NOM/Builds.hs (L60-L64)
Exactly why this is correct is a little subtle, because sometimes the
worker is owned by the worker. But the commit message in
e437b08250 explained the situation well
enough: I made that commit message part of the ABI docs, and now it
should be understandable to the next person.
Do this with a new `useHook` boolean we carefully make sure is set in
all cases. This change isn't really worthwhile by itself, but it allows
us to make further refactors (see later commits) which are
well-motivated.
When useMaster is true, startMaster() acquires the state lock, then
calls isMasterRunning(), which calls addCommonSSHOpts(), which tries
to acquire the state lock again, causing a deadlock.
The solution is to move tmpDir out of the state. It doesn't need to be
there in the first place because it never changes.
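A simplified sketch of the resulting layout, assuming a plain non-recursive `std::mutex`. The class and member names here are hypothetical and only mirror the shape of the fix.

```cpp
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for the SSH master bookkeeping.
class SshMaster
{
    struct State { bool masterStarted = false; };

    std::mutex stateMutex;    // guards `state` only
    State state;
    const std::string tmpDir; // immutable after construction: no lock needed

    // Reads only `tmpDir`, so it is safe to call while `stateMutex` is held.
    std::vector<std::string> commonOpts() const
    {
        return {"-o", "ControlPath=" + tmpDir + "/ssh.sock"};
    }

    bool isMasterRunning() const
    {
        auto opts = commonOpts();
        (void) opts; // placeholder: real code would query the master with these options
        return false;
    }

public:
    explicit SshMaster(std::string dir) : tmpDir(std::move(dir)) {}

    std::string startMaster()
    {
        std::lock_guard<std::mutex> lock(stateMutex);
        // If `tmpDir` lived inside `state`, commonOpts() would have to take
        // `stateMutex` again here and the non-recursive mutex would deadlock.
        if (!state.masterStarted && !isMasterRunning())
            state.masterStarted = true;
        return tmpDir + "/ssh.sock";
    }
};
```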
On macOS, poll() is fundamentally broken for HUP detection. It loses event
subscriptions when EVFILT_READ fires without matching the requested events
in the pollfd. This causes daemon processes to linger after client disconnect.
This commit replaces poll() with kqueue on macOS, which is what poll()
uses internally but without the bugs. The kqueue implementation uses
EVFILT_READ which works for both sockets and pipes, avoiding EVFILT_SOCK
which only works for sockets.
On Linux and other platforms, we continue using poll() with the standard
POSIX behavior where POLLHUP is always reported regardless of requested events.
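For the non-macOS path, the POSIX behavior can be sketched as below. `hasHungUp` is a hypothetical helper written for this example, not the function touched by this commit.

```cpp
#include <poll.h>
#include <unistd.h>

// Returns true if the other end of `fd` has hung up. `events` is left at 0
// on purpose: POSIX reports POLLHUP in `revents` even when it was not requested.
bool hasHungUp(int fd)
{
    struct pollfd pfd = {};
    pfd.fd = fd;
    pfd.events = 0;
    if (poll(&pfd, 1, 0) == -1)
        return false; // this sketch treats errors as "not hung up"
    return (pfd.revents & POLLHUP) != 0;
}
```

On Linux, calling this on the read end of a pipe is expected to return true once the write end has been closed.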
Based on work from the Lix project (https://git.lix.systems/lix-project/lix)
commit 69ba3c92db3ecca468bcd5ff7849fa8e8e0fc6c0
Fixes: https://github.com/NixOS/nix/issues/13847
Related: https://git.lix.systems/lix-project/lix/issues/729
Apple bugs: rdar://37537852 (poll), FB17447257 (poll)
Co-authored-by: Jade Lovelace <jadel@mercury.com>
Now that Symbols are statically allocated at compile time with known IDs,
we can use switch statements instead of if-else chains for Symbol comparisons.
This provides better performance through compiler optimizations like jump tables.
Changes:
- Add public getId() method to Symbol class to access the internal ID
- Convert if-else chains comparing Symbol values to switch statements
in primops.cc's derivationStrictInternal function
- Simplify control flow by removing the 'handled' flag and moving the
default attribute handling into the switch's default case
The static and runtime Symbol IDs are guaranteed to match by the
copyIntoSymbolTable implementation which asserts this invariant.
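A miniature of the pattern, under the assumption that IDs are plain integers known at compile time. The `Symbol` class, the `sym` namespace and the ID values below are hypothetical stand-ins, not the real definitions.

```cpp
#include <cstdint>
#include <string_view>

// Hypothetical miniature of a symbol whose ID is known at compile time.
class Symbol
{
    std::uint32_t id;

public:
    constexpr explicit Symbol(std::uint32_t id) : id(id) {}
    constexpr std::uint32_t getId() const { return id; }
};

// Hypothetical statically allocated symbols.
namespace sym {
inline constexpr Symbol name{1};
inline constexpr Symbol builder{2};
inline constexpr Symbol args{3};
}

// Because getId() is constexpr, the IDs are valid `case` labels and the
// compiler is free to lower the switch to a jump table.
std::string_view classify(Symbol s)
{
    switch (s.getId()) {
    case sym::name.getId():
        return "name";
    case sym::builder.getId():
        return "builder";
    case sym::args.getId():
        return "args";
    default:
        return "other attribute";
    }
}
```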
Co-authored-by: John Ericson <git@JohnEricson.me>
In b70d22b `mkStringNoCopy()` was renamed to
`mkString()`, but this is a bit risky since in code like
vStringRegular.mkString("regular");
we want to be sure that the right overload is picked. (This is
especially problematic since the overload that takes an
`std::string_view` *does* allocate.) So let's be explicit.
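The overload hazard can be reproduced with a toy type. This `Value` is a hypothetical stand-in with two same-named overloads, not the evaluator's real class.

```cpp
#include <string>
#include <string_view>

// Hypothetical value type with two overloads sharing one name.
struct Value
{
    std::string owned;               // storage used by the copying overload
    const char * borrowed = nullptr; // pointer handed out to readers
    bool copied = false;             // records which overload ran

    // Borrows the pointer; no allocation.
    void mkString(const char * s)
    {
        borrowed = s;
        copied = false;
    }

    // Copies the bytes, which may allocate.
    void mkString(std::string_view s)
    {
        owned = std::string(s);
        borrowed = owned.c_str();
        copied = true;
    }
};
```

A string literal selects the `const char *` overload, but passing a `std::string` or `std::string_view` instead silently switches to the copying one, which is why a distinct name is safer.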
(Rebased from https://github.com/NixOS/nix/pull/11551)
I (@Ericson2314) messed up. We were supposed to test the status quo
before landing any new changes, and also there is one change that is not
quite right (relative paths).
I am reverting for now, and then backporting the test suite to the old
situation.
This reverts commit 04ad66af5f.
Git URI can also support scp style links similar to git itself.
This change augments the function fixGitURL to better handle the scp
style urls through a minimal parser rather than regex which has been
found to be brittle.
* Support for IPV6 added
* New test cases added for fixGitURL
* Clearer documentation on purpose and goal of function
* More `std::string_view` for performance
* A few more URL tests
Fixes #5958
The URL should not be normalized before handing it off to cURL, because
builtin fetchers like fetchTarball/fetchurl are expected to work with
arbitrary URLs, that might not be RFC3986 compliant. For those cases
Nix should not normalize URLs, though validation is fine. ParseURL and
cURL are supposed to match the set of acceptable URLs, since they implement
the same RFC.
This adds regression tests for fromTOML overflow/underflow behavior.
Previous versions of toml11 used to saturate, but this was never an
intended behavior (and Snix/Nix 2.3/toml11 >= 4.0 validate this).
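As an illustration of rejecting out-of-range integers instead of saturating, here is a sketch that uses `std::from_chars` as a stand-in. It does not show how toml11 or `fromTOML` implement the check, and `parseInt64` is a hypothetical helper.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Parses a decimal 64-bit integer, returning nullopt on overflow, underflow
// or trailing garbage rather than clamping to the nearest representable value.
std::optional<std::int64_t> parseInt64(std::string_view text)
{
    std::int64_t value = 0;
    const char * end = text.data() + text.size();
    auto [ptr, ec] = std::from_chars(text.data(), end, value);
    if (ec != std::errc() || ptr != end)
        return std::nullopt; // covers std::errc::result_out_of_range
    return value;
}
```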
(cherry picked from Lix [1,2])
[1]: 7ee442079d
[2]: 4de09b6b54
The motivation for this change is two-fold:
1. Commonly used Symbol values can be referred to
quite often and they can be assigned at compile-time
rather than runtime.
2. This also unclutters EvalState constructor, which was
getting very long and unreadable.
Spiritually similar to https://gerrit.lix.systems/c/lix/+/2218,
though that patch doesn't allocate the Symbol at compile time.
Co-authored-by: eldritch horrors <pennae@lix.systems>
Looking at perf:
0.21 │ push %rbp
0.99 │ mov %rsp,%rbp
│ push %r15
0.25 │ push %r14
│ push %r13
0.49 │ push %r12
0.66 │ push %rbx
1.23 │ lea -0x10000(%rsp),%r11
0.23 │ 15: sub $0x1000,%rsp
1.01 │ orq $0x0,(%rsp)
59.12 │ cmp %r11,%rsp
0.27 │ ↑ jne 15
Seems like 64K is too much to have on the stack for each invocation, considering
that only a minuscule number of allocations are actually larger than 4K.
There's actually no good reason this function should use so much stack space. Or
use small_string at all. Everything can be done in small chunks that don't require
any memory allocations and use up 2K bytes on the stack.
This patch also adds a microbenchmark for tracking the unparsing performance. Here
are the results for this change:
(Before)
BM_UnparseRealDerivationFile/hello 7275 ns 7247 ns 96093 bytes_per_second=232.136Mi/s
BM_UnparseRealDerivationFile/firefox 40538 ns 40376 ns 17327 bytes_per_second=378.534Mi/s
(After)
BM_UnparseRealDerivationFile/hello 3228 ns 3218 ns 215671 bytes_per_second=522.775Mi/s
BM_UnparseRealDerivationFile/firefox 39724 ns 39584 ns 17617 bytes_per_second=386.101Mi/s
This translates into nice evaluation performance improvements (compared to 18c3d2348f):
Benchmark 1: GC_INITIAL_HEAP_SIZE=8G old-nix/bin/nix-instantiate ../nixpkgs -A nixosTests.gnome --readonly-mode
Time (mean ± σ): 3.111 s ± 0.021 s [User: 2.513 s, System: 0.580 s]
Range (min … max): 3.083 s … 3.143 s 10 runs
Benchmark 2: GC_INITIAL_HEAP_SIZE=8G result/bin/nix-instantiate ../nixpkgs -A nixosTests.gnome --readonly-mode
Time (mean ± σ): 3.037 s ± 0.038 s [User: 2.461 s, System: 0.558 s]
Range (min … max): 2.960 s … 3.086 s 10 runs
Old versions of nix happily accepted a lot of weird flake references,
which we didn't have tests for, so this was accidentally broken in
c436b7a32a.
This patch restores previous behavior and adds a plethora of tests
to ensure we don't break this in the future.
These test cases are aligned with how 2.18/2.28 parsed flake references.
Starting from c436b7a32a
this used to lead to assertion failures like:
> std::string nix::ParsedURL::renderAuthorityAndPath() const: Assertion `path.empty() || path.front().empty()' failed.
This has the bugfix for the issue and regression tests
so that this gets properly tested in the future.
This would print erroneous and misleading diagnostics like:
> error (ignored): error: '--arg' and '--argstr' are incompatible with flakes
When run with --expr/--file. Since this installable is used to get the
bash package it doesn't make sense to check this.
It is only done in the `force = true` case, and the only
`cleanupBuild(true)` call is right after where it used to be, so this
has the exact same behavior as before.
Calling `reset` on this `std::optional` field of `DerivationBuilderImpl`
is also what the (automatically created) destructor of
`DerivationBuilderImpl` will do. We should be making sure that the
derivation builder is cleaned up by the goal anyways, and if we do that,
then this `Finally` is no longer needed.
Before, we had a very ugly `appendLogTailErrorMsg` callback. Now, we
instead have a `fixupBuilderFailureErrorMessage` that is just used by
`DerivationBuildingGoal`, and `DerivationBuilder` just returns the raw
data needed by this.
Now we have better separation of the core logic --- an integral part of
the store layer spec even --- from the goal mechanism and other
minutiae.
Co-authored-by: Jeremy Kolb <kjeremy@gmail.com>
See the new extensive doxygen in `url.hh`.
This fixes fetching gitlab: flakes.
Paths are now stored as a std::vector of individual path
segments, which can themselves contain path separators '/' (%2F).
This is necessary to make the Gitlab's /projects/ API work.
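A rough sketch of why per-segment storage helps: a '/' inside a segment can be percent-encoded when rendering, while the separators between segments stay literal. `encodeSegment` and `renderPath` are hypothetical helpers for this example, and the encoding is deliberately reduced to two characters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Percent-encode only '/' and '%' within one segment. This is a
// simplification: a full implementation would encode every character
// that RFC 3986 does not allow in a path segment.
std::string encodeSegment(const std::string & segment)
{
    std::string out;
    for (char c : segment) {
        if (c == '/')
            out += "%2F";
        else if (c == '%')
            out += "%25";
        else
            out += c;
    }
    return out;
}

// Join the encoded segments with literal '/' separators. A leading empty
// segment yields an absolute path.
std::string renderPath(const std::vector<std::string> & segments)
{
    std::string out;
    for (std::size_t i = 0; i < segments.size(); ++i) {
        if (i > 0)
            out += '/';
        out += encodeSegment(segments[i]);
    }
    return out;
}
```

For example, the segments `""`, `"api"`, `"v4"`, `"projects"`, `"group/project"` render as `/api/v4/projects/group%2Fproject`, keeping the project identifier in a single segment.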
Co-authored-by: John Ericson <John.Ericson@Obsidian.Systems>
Co-authored-by: Sergei Zimmerman <sergei@zimmerman.foo>
I think this should be fine for repairing. If anything, it is better,
because it would be weird to "mark an output good" only for it to then
fail output checks.
Sadly we cannot unexpose `DerivationBuilder::killChild` yet, because
`DerivationBuildingGoal` calls it elsewhere, but we can at least have a
better division of labor between the two destructors.
It's hard to tell if I changed any behavior, but if I did, I think I
made it better, because now we explicitly move stuff out of the chroot
(if we were going to) before trying to delete the chroot.
Do this to match `DerivationBuilder::deleteTmpDir`, which we'll want to
combine it with next.
Also change one caller from `deleteTmpDir(true)` to `cleanupBuild(true)`
now that this is done, because it will not make a difference.
This should be a pure refactor with no behavioral change.
After the previous simplifications, there is no reason to catch the
error and immediately return it with a `std::variant` --- just let the
caller catch it instead.
Instead of that funny business, the fixed output checks are now put in
`checkOutputs`, with the other (newer) output checks, where they also
better belong. The control flow is reworked (with comments!) so that
`checkOutputs` also runs in the `bmCheck` case.
Not only does this preserve existing behavior of `bmCheck`
double-checking fixed output hashes with less tricky code, it also makes
`bmCheck` better by also double-checking the other output checks, rather
than just assuming they pass if the derivation is deterministic.
It's fine to set these worker flags a little later in the control flow,
since we'll be sure to reach those points in the error cases. And doing
that is much nicer than having these tangled callbacks.
I originally made the callbacks to meticulously recreate the exact
behavior which I didn't quite understand. Now, thanks to cleaning up the
error handling, I do understand what is going on, so I can be confident
that this change is safe to make.
Instead of passing them around separately, or doing finicky logic in a
try-catch block to recover them, just make `BuildError` always contain a
status, and make it the thrower's responsibility to set it. This is much
more simple and explicit.
Once that change is done, split the `done` functions of `DerivationGoal`
and `DerivationBuildingGoal` into separate success and failure
functions, which ends up being easier to understand with hardly any
duplication.
Also, change the handling of failures in resolved cases to use
`BuildResult::DependencyFailed` and a new message. This is because the
underlying derivation will also get its message printed --- which is
good, because in general the resolved derivation is not unique. One dyn
drv test had to be updated, but CA (and dyn drv) is experimental, so I
do not mind.
Finally, delete `SubstError` because it is unused.
The commit says it was added for CA testing --- manual I assume, since
there is no use of this in the test suite. I don't think we need it any
more, and I am not sure whether it was ever supposed to have made it to
`master` either.
This reverts commit 2eec2f765a.
We currently just use this during the build of a derivation, but there is no
reason we wouldn't want to use it elsewhere, e.g. to check the outputs
of someone else's build after the fact.
Moreover, I like pulling things out of `DerivationBuilder` that are
simple and don't need access to all that state. While
`DerivationBuilder` is unix-only, this refactor also makes the code more
portable "for free".
The header is private, at Eelco's request.
With the migration to /nix/var/nix/builds we now have failing builds
when the derivation name is too long.
This change removes the derivation name from the temporary build to have
a predictable prefix length:
Also see: https://github.com/NixOS/infra/pull/764
for context.
This allows us to replace some very hacky and not correct string
concatenation in `HttpBinaryCacheStore`. It will especially be useful
with #13752, when today's hacks start to cause problems in practice,
not just theory.
Also make `fixGitURL` return a `ParsedURL`.
• Updated input 'nixpkgs':
'github:NixOS/nixpkgs/cd32a774ac52caaa03bcfc9e7591ac8c18617ced?narHash=sha256-VtMQg02B3kt1oejwwrGn50U9Xbjgzfbb5TV5Wtx8dKI%3D' (2025-08-17)
→ 'github:NixOS/nixpkgs/d98ce345cdab58477ca61855540999c86577d19d?narHash=sha256-O2CIn7HjZwEGqBrwu9EU76zlmA5dbmna7jL1XUmAId8%3D' (2025-08-26)
This update contains d1266642a8722f2a05e311fa151c1413d2b9653c, which
is necessary for the TOML timestamps to get tested via nixpkgsLibTests job.
I need this for some `ParseURL` improvements, but I figure this is
better to send as its own PR.
I changed the tests willy-nilly to sometimes use
`std::list<std::string_view>` instead of `Strings` (which is
`std::list<std::string>`).
Co-Authored-By: Sergei Zimmerman <sergei@zimmerman.foo>
It is supposed to be "post build" not "during the build" after all. Its
location now matches that for the hook case (see elsewhere in
`DerivationBuildingGoal`).
It was in a try-catch before, and now it isn't, but I believe that it is
impossible for it to throw `BuildError`, which is sufficient for this
code motion to be correct.
Update src/libutil/windows/current-process.cc
Prefer `nullptr` over `NULL`
Co-authored-by: Sergei Zimmerman <sergei@zimmerman.foo>
Update src/libutil/unix/current-process.cc
Prefer C++ type casts
Co-authored-by: Sergei Zimmerman <sergei@zimmerman.foo>
Update src/libutil/windows/current-process.cc
Prefer C++ type casts
Co-authored-by: Sergei Zimmerman <sergei@zimmerman.foo>
Update src/libutil/unix/current-process.cc
Don't allocate exception
Co-authored-by: Sergei Zimmerman <sergei@zimmerman.foo>
synopsis: Channel URLs migrated to channels.nixos.org subdomain
prs: [14518]
issues: [14517]
---
Channel URLs have been updated from `https://nixos.org/channels/` to `https://channels.nixos.org/` throughout Nix.
The subdomain provides better reliability with IPv6 support and improved CDN distribution. The old domain apex (`nixos.org/channels/`) currently redirects to the new location but may be deprecated in the future.
synopsis: "S3 binary cache stores now support storage class configuration"
prs: [14464]
issues: [7015]
---
S3 binary cache stores now support configuring the storage class for uploaded objects via the `storage-class` parameter. This allows users to optimize costs by selecting appropriate storage tiers based on access patterns.
The storage class applies to both regular uploads and multipart uploads. When not specified, objects use the bucket's default storage class.
See the [S3 storage classes documentation](https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-class-intro.html) for available storage classes and their characteristics.
Channels are a mechanism for referencing remote Nix expressions and conveniently retrieving their latest version.
The moving parts of channels are:
-- The official channels listed at <https://nixos.org/channels>
+- The official channels listed at <https://channels.nixos.org>
- The user-specific list of [subscribed channels](#subscribed-channels)
- The [downloaded channel contents](#channels)
-- The [Nix expression search path](@docroot@/command-ref/conf-file.md#conf-nix-path), set with the [`-I` option](#opt-i) or the [`NIX_PATH` environment variable](#env-NIX_PATH)
+- The [Nix expression search path](@docroot@/command-ref/conf-file.md#conf-nix-path), set with the [`-I` option](#opt-I) or the [`NIX_PATH` environment variable](#env-NIX_PATH)
> **Note**
>
@@ -88,9 +88,9 @@ This command has the following operations:
Subscribe to the Nixpkgs channel and run `hello` from the GNU Hello package:
To build all dependencies and start a shell in which all environment variables are set up so that those dependencies can be found:
@@ -256,7 +256,7 @@ You can use any of the other supported environments in place of `nix-cli-ccacheS
## Editor integration
The `clangd` LSP server is installed by default on the `clang`-based `devShell`s.
-See [supported compilation environments](#compilation-environments) and instructions how to set up a shell [with flakes](#nix-with-flakes) or in [classic Nix](#classic-nix).
+See [supported compilation environments](#compilation-environments) and instructions how to set up a shell [with flakes](#building-nix-with-flakes) or in [classic Nix](#building-nix).
To use the LSP with your editor, you will want a `compile_commands.json` file telling `clangd` how we are compiling the code.
Meson's configure always produces this inside the build directory.
-[An experimental feature](#@docroot@/development/experimental-features.md#xp-feature-impure-derivations) that allows derivations to be explicitly marked as impure,
+[An experimental feature](@docroot@/development/experimental-features.md#xp-feature-impure-derivations) that allows derivations to be explicitly marked as impure,
so that they are always rebuilt, and their outputs not reused by subsequent calls to realise them.
- [Nix database]{#gloss-nix-database}
@@ -279,7 +279,7 @@
See [References](@docroot@/store/store-object.md#references) for details.
-- [referrer]{#gloss-reference}
+- [referrer]{#gloss-referrer}
A reversed edge from one [store object] to another.
@@ -367,8 +367,8 @@
Nix represents files as [file system objects][file system object], and how they belong together is encoded as [references][reference] between [store objects][store object] that contain these file system objects.
-The [Nix language] allows denoting packages in terms of [attribute sets](@docroot@/language/types.md#attribute-set) containing:
-- attributes that refer to the files of a package, typically in the form of [derivation outputs](#output),
+The [Nix language] allows denoting packages in terms of [attribute sets](@docroot@/language/types.md#type-attrs) containing:
+- attributes that refer to the files of a package, typically in the form of [derivation outputs](#gloss-output),
- attributes with metadata, such as information about how the package is supposed to be used.
The exact shape of these attribute sets is up to convention.
@@ -333,7 +333,7 @@ Here is more information on the `output*` attributes, and what values they may b
`outputHashAlgo` can only be `null` when `outputHash` follows the SRI format, because in that case the choice of hash algorithm is determined by `outputHash`.
-When defining an [attribute set](./types.md#attribute-set) or in a [let-expression](#let-expressions) it is often convenient to copy variables from the surrounding lexical scope (e.g., when you want to propagate attributes).
+When defining an [attribute set](./types.md#type-attrs) or in a [let-expression](#let-expressions) it is often convenient to copy variables from the surrounding lexical scope (e.g., when you want to propagate attributes).
This can be shortened using the `inherit` keyword.
-For historical reasons, [store derivations][store derivation] are stored on-disk in [ATerm](https://homepages.cwi.nl/~daybuild/daily-books/technology/aterm-guide/aterm-guide.html) format.
+For historical reasons, [store derivations][store derivation] are stored on-disk in "Annotated Term" (ATerm) format
This is used when calculating the store paths of the derivation's outputs.
* `outputs`:
Information about the output paths of the derivation.
This is a JSON object with one member per output, where the key is the output name and the value is a JSON object with these fields:
* `path`:
The output path, if it is known in advance.
Otherwise, `null`.
* `method`:
For an output which will be [content addressed], a string representing the [method](@docroot@/store/store-object/content-address.md) of content addressing that is chosen.
This schema describes the JSON representation of Nix's `BuildResult` type, which represents the result of building a derivation or substituting store paths.
Build results can represent either successful builds (with built outputs) or various types of failures.
oneOf:
  - "$ref": "#/$defs/success"
  - "$ref": "#/$defs/failure"
type: object
required:
  - success
  - status
properties:
  timesBuilt:
    type: integer
    minimum: 0
    title: Times built
    description: |
      How many times this build was performed.
  startTime:
    type: integer
    minimum: 0
    title: Start time
    description: |
      The start time of the build (or one of the rounds, if it was repeated), as a Unix timestamp.
  stopTime:
    type: integer
    minimum: 0
    title: Stop time
    description: |
      The stop time of the build (or one of the rounds, if it was repeated), as a Unix timestamp.
  cpuUser:
    type: integer
    minimum: 0
    title: User CPU time
    description: |
      User CPU time the build took, in microseconds.
  cpuSystem:
    type: integer
    minimum: 0
    title: System CPU time
    description: |
      System CPU time the build took, in microseconds.
"$defs":
  success:
    type: object
    title: Successful Build Result
    description: |
      Represents a successful build with built outputs.
    required:
      - success
      - status
      - builtOutputs
    properties:
      success:
        const: true
        title: Success indicator
        description: |
          Always true for successful build results.
      status:
        type: string
        title: Success status
        description: |
          Status string for successful builds.
        enum:
          - "Built"
          - "Substituted"
          - "AlreadyValid"
          - "ResolvesToAlreadyValid"
      builtOutputs:
        type: object
        title: Built outputs
        description: |
          A mapping from output names to their build trace entries.
        additionalProperties:
          "$ref": "build-trace-entry-v1.yaml"
  failure:
    type: object
    title: Failed Build Result
    description: |
      Represents a failed build with error information.
    required:
      - success
      - status
      - errorMsg
    properties:
      success:
        const: false
        title: Success indicator
        description: |
          Always false for failed build results.
      status:
        type: string
        title: Failure status
        description: |
          Status string for failed builds.
        enum:
          - "PermanentFailure"
          - "InputRejected"
          - "OutputRejected"
          - "TransientFailure"
          - "CachedFailure"
          - "TimedOut"
          - "MiscFailure"
          - "DependencyFailed"
          - "LogLimitExceeded"
          - "NotDeterministic"
          - "NoSubstituters"
          - "HashMismatch"
      errorMsg:
        type: string
        title: Error message
        description: |
          Information about the error if the build failed.
      isNonDeterministic:
        type: boolean
        title: Non-deterministic flag
        description: |
          If timesBuilt > 1, whether some builds did not produce the same result.
          Note that 'isNonDeterministic = false' does not mean the build is deterministic,
          just that we don't have evidence of non-determinism.
The path to the store object that resulted from building this derivation for the given output name.
  dependentRealisations:
    type: object
    title: Underlying Base Build Trace
    description: |
      This is for [*derived*](@docroot@/store/build-trace.md#derived) build trace entries to ensure coherence.
      Keys are derivation output IDs (same format as the main `id` field).
      Values are the store paths that those dependencies resolved to.
      As described in the linked section on derived build traces, derived build trace entries must be kept in addition to, and not instead of, the underlying base build entries.
      This is the set of base build trace entries that this derived build trace is derived from.
      (The set is also a map since this miniature base build trace must be coherent, mapping each key to a single value.)
    patternProperties:
      "^sha256:[0-9a-f]{64}![a-zA-Z_][a-zA-Z0-9_-]*$":
        $ref: "store-path-v1.yaml"
        title: Dependent Store Path
        description: Store path that this dependency resolved to during the build
    additionalProperties: false
  signatures:
    type: array
    title: Build Signatures
    description: |
      A set of cryptographic signatures attesting to the authenticity of this build trace entry.
This schema describes the JSON representation of Nix's `ContentAddress` type, which conveys information about [content-addressing store objects](@docroot@/store/store-object/content-address.md).
> **Note**
>
> For current methods of content addressing, this data type is a bit suspicious, because it is neither simply a content address of a file system object (the `method` is richer), nor simply a content address of a store object (the `hash` doesn't account for the references).
> It should thus only be used in contexts where the references are also known / otherwise made tamper-resistant.
<!--
TODO currently `ContentAddress` is used in both of these, and so same rationale applies, but actually in both cases the JSON is currently ad-hoc.
That will be fixed, and as each is fixed, the example (along with a more precise link to the field in question) should become part of the above note, so that what it is saying is clearer.
> For example:
> - Fixed outputs of derivations are not allowed to have any references, so an empty reference set is statically known by assumption.
> - [Store object info](./store-object-info.md) includes the set of references alongside the (optional) content address.
> This data type is thus safely used in both of these contexts.
-->
type:object
properties:
method:
"$ref": "#/$defs/method"
hash:
title:Content Address
description:|
This would be the content-address itself.
For all current methods, this is just a content address of the file system object of the store object, [as described in the store chapter](@docroot@/store/file-system-object/content-address.md), and not of the store object as a whole.
In particular, the references of the store object are *not* taken into account with this hash (and currently-supported methods).
"$ref": "./hash-v1.yaml"
required:
- method
- hash
additionalProperties:false
"$defs":
method:
type:string
enum:[flat, nar, text, git]
title:Content-Addressing Method
description:|
A string representing the [method](@docroot@/store/store-object/content-address.md) of content addressing that is chosen.
Valid method strings are:
- [`flat`](@docroot@/store/store-object/content-address.md#method-flat) (provided the contents are a single file)
[Structured Attributes](@docroot@/store/derivation/index.md#structured-attrs), only defined if the derivation contains them.
Structured attributes are JSON, and thus embedded as-is.
type:object
additionalProperties:true
"$defs":
output:
overall:
title:Derivation Output
description:|
A single output of a derivation, with different variants for different output types.
oneOf:
- "$ref": "#/$defs/output/inputAddressed"
- "$ref": "#/$defs/output/caFixed"
- "$ref": "#/$defs/output/caFloating"
- "$ref": "#/$defs/output/deferred"
- "$ref": "#/$defs/output/impure"
inputAddressed:
title:Input-Addressed Output
description:|
The traditional non-fixed-output derivation type.
The output path is determined from the derivation itself.
See [Input-addressing derivation outputs](@docroot@/store/derivation/outputs/input-address.md) for more details.
type:object
required:
- path
properties:
path:
$ref:"store-path-v1.yaml"
title:Output path
description:|
The output path determined from the derivation itself.
additionalProperties:false
caFixed:
title:Fixed Content-Addressed Output
description:|
The output is content-addressed, and the content-address is fixed in advance.
See [Fixed-output content-addressing](@docroot@/store/derivation/outputs/content-address.md#fixed) for more details.
"$ref": "./content-address-v1.yaml"
required:
- method
- hash
properties:
method:
description:|
Method of content addressing used for this output.
hash:
title:Expected hash value
description:|
The expected content hash.
additionalProperties:false
caFloating:
title:Floating Content-Addressed Output
description:|
Floating-output derivations, whose outputs are content
addressed, but not fixed, and so the output paths are dynamically calculated from
whatever the output ends up being.
See [Floating Content-Addressing](@docroot@/store/derivation/outputs/content-address.md#floating) for more details.
type:object
required:
- method
- hashAlgo
properties:
method:
"$ref": "./content-address-v1.yaml#/$defs/method"
description:|
Method of content addressing used for this output.
hashAlgo:
title:Hash algorithm
"$ref": "./hash-v1.yaml#/$defs/algorithm"
description:|
What hash algorithm to use for the given method of content-addressing.
additionalProperties:false
deferred:
title:Deferred Output
description:|
Input-addressed output which depends on a (CA) derivation whose outputs (and thus their content addresses)
are not yet known.
type:object
properties:{}
additionalProperties:false
impure:
title:Impure Output
description:|
Impure output which is just like a floating content-addressed output, but this derivation runs without sandboxing.
As such, we don't record it in the build trace, under the assumption that if we need it again, we should rebuild it, as it might produce something different.
required:
- impure
- method
- hashAlgo
properties:
impure:
const:true
method:
"$ref": "./content-address-v1.yaml#/$defs/method"
description:|
How the file system objects will be serialized for hashing.
hashAlgo:
title:Hash algorithm
"$ref": "./hash-v1.yaml#/$defs/algorithm"
description:|
How the serialization will be hashed.
additionalProperties:false
outputName:
type:string
title:Output name
description:Name of the derivation output to depend on
outputNames:
type:array
title:Output Names
description:Set of names of derivation outputs to depend on
A cryptographic hash value used throughout Nix for content addressing and integrity verification.
This schema describes the JSON representation of Nix's `Hash` type.
type:object
properties:
algorithm:
"$ref": "#/$defs/algorithm"
format:
type:string
enum:
- base64
- nix32
- base16
- sri
title:Hash format
description:|
The encoding format of the hash value.
- `base64` uses standard Base64 encoding [RFC 4648, section 4](https://datatracker.ietf.org/doc/html/rfc4648#section-4)
- `nix32` is Nix-specific base-32 encoding
- `base16` is lowercase hexadecimal
- `sri` is the [Subresource Integrity format](https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity).
hash:
type:string
title:Hash
description:|
The encoded hash value, itself.
It is specified in the format specified by the `format` field.
It must be the right length for the hash algorithm specified in the `algorithm` field, also.
The hash value does not include any algorithm prefix.
required:
- algorithm
- format
- hash
additionalProperties:false
"$defs":
algorithm:
type:string
enum:
- blake3
- md5
- sha1
- sha256
- sha512
title:Hash algorithm
description:|
The hash algorithm used to compute the hash value.
`blake3` is currently experimental and requires the [`blake3-hashes`](@docroot@/development/experimental-features.md#xp-feature-blake3-hashes) experimental feature.
Information about a [store object](@docroot@/store/store-object.md).
This schema describes the JSON representation of store object metadata as returned by commands like [`nix path-info --json`](@docroot@/command-ref/new-cli/nix3-path-info.md).
Store object information can come in a few different variations.
Firstly, "impure" fields, which contain non-intrinsic information about the store object, may or may not be included.
Secondly, binary cache stores have extra non-intrinsic information about the store objects they contain.
Thirdly, [`nix path-info --json --closure-size`](@docroot@/command-ref/new-cli/nix3-path-info.html#opt-closure-size) can compute some extra information about not just the single store object in question, but the store object and its [closure](@docroot@/glossary.md#gloss-closure).
The impure and NAR fields are grouped into separate variants below.
See their descriptions for additional information.
The closure fields, however, are just included as optional fields, to avoid a combinatorial explosion of variants.
oneOf:
- $ref:"#/$defs/base"
- $ref:"#/$defs/impure"
- $ref:"#/$defs/narInfo"
$defs:
base:
title:Store Object Info
description:|
Basic store object metadata containing only intrinsic properties.
This is the minimal set of fields that describe what a store object contains.
type:object
required:
- version
- narHash
- narSize
- references
- ca
properties:
version:
type:integer
const:2
title:Format version (must be 2)
description:|
Must be `2`.
This is a guard that allows us to continue evolving this format.
Here is the rough version history:
- Version 0: `.narinfo` line-oriented format
- Version 1: Original JSON format, with ugly `"r:sha256"` inherited from `.narinfo` format.
- Version 2: Use structured JSON type for `ca`
path:
type:string
title:Store Path
description:|
[Store path](@docroot@/store/store-path.md) to the given store object.
Note: This field may not be present in all contexts, such as when the path is used as the key and the store object info is the value in a map.
narHash:
"$ref": "./hash-v1.yaml"
title:NAR Hash
description:|
Hash of the [file system object](@docroot@/store/file-system-object.md) part of the store object when serialized as a [Nix Archive](@docroot@/store/file-system-object/content-address.md#serial-nix-archive).
narSize:
type:integer
minimum:0
title:NAR Size
description:|
Size of the [file system object](@docroot@/store/file-system-object.md) part of the store object when serialized as a [Nix Archive](@docroot@/store/file-system-object/content-address.md#serial-nix-archive).
references:
type:array
title:References
description:|
An array of [store paths](@docroot@/store/store-path.md), possibly including this one.
items:
type:string
ca:
oneOf:
- type:"null"
const:null
- "$ref": "./content-address-v1.yaml"
title:Content Address
description:|
If the store object is [content-addressed](@docroot@/store/store-object/content-address.md),
this is the content address of this store object's file system object, used to compute its store path.
Otherwise (i.e. if it is [input-addressed](@docroot@/glossary.md#gloss-input-addressed-store-object)), this is `null`.
additionalProperties:false
impure:
title:Store Object Info with Impure Fields
description:|
Store object metadata including impure fields that are not *intrinsic* properties.
In other words, the same store object in different stores could have different values for these impure fields.
If known, the path to the [store derivation](@docroot@/glossary.md#gloss-store-derivation) from which this store object was produced.
Otherwise `null`.
> This is an "impure" field that may not be included in certain contexts.
registrationTime:
type:["integer","null"]
title:Registration Time
description:|
If known, when this store object was added to the store (Unix timestamp).
Otherwise `null`.
> This is an "impure" field that may not be included in certain contexts.
ultimate:
type:boolean
title:Ultimate
description:|
Whether this store object is trusted because we built it ourselves, rather than substituted a build product from elsewhere.
> This is an "impure" field that may not be included in certain contexts.
signatures:
type:array
title:Signatures
description:|
Signatures claiming that this store object is what it claims to be.
Not relevant for [content-addressed](@docroot@/store/store-object/content-address.md) store objects,
but useful for [input-addressed](@docroot@/glossary.md#gloss-input-addressed-store-object) store objects.
> This is an "impure" field that may not be included in certain contexts.
items:
type:string
# Computed closure fields
closureSize:
type:integer
minimum:0
title:Closure Size
description:|
The total size of this store object and every other object in its [closure](@docroot@/glossary.md#gloss-closure).
> This field is not stored at all, but computed by traversing the other fields across all the store objects in a closure.
additionalProperties:false
narInfo:
title:Store Object Info with Impure fields and NAR Info
description:|
The store object info in the "binary cache" family of Nix store types contains extra information pertaining to *downloads* of the store object in question.
(This store info is called "NAR info", since the downloads take the form of [Nix Archives](@docroot@/store/file-system-object/content-address.md#serial-nix-archive), and the metadata is served in a file with a `.narinfo` extension.)
This download information, being specific to how the store object happens to be stored and transferred, is also considered to be non-intrinsic / impure.
Where to download a compressed archive of the file system objects of this store object.
> This is an impure "`.narinfo`" field that may not be included in certain contexts.
compression:
type:string
title:Compression
description:|
The compression format that the archive is in.
> This is an impure "`.narinfo`" field that may not be included in certain contexts.
downloadHash:
"$ref": "./hash-v1.yaml"
title:Download Hash
description:|
A digest for the compressed archive itself, as opposed to the data contained within.
> This is an impure "`.narinfo`" field that may not be included in certain contexts.
downloadSize:
type:integer
minimum:0
title:Download Size
description:|
The size of the compressed archive itself.
> This is an impure "`.narinfo`" field that may not be included in certain contexts.
closureDownloadSize:
type:integer
minimum:0
title:Closure Download Size
description:|
The total size of the compressed archive itself for this object, and the compressed archive of every object in this object's [closure](@docroot@/glossary.md#gloss-closure).
> This is an impure "`.narinfo`" field that may not be included in certain contexts.
> This field is not stored at all, but computed by traversing the other fields across all the store objects in a closure.
[file system object]: @docroot@/store/file-system-object.md
The format of this specification is close to [Extended Backus–Naur form](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form), with the exception of the `str(..)` function / parameterized rule, which length-prefixes and pads strings.
@@ -41,3 +41,15 @@ The `str` function / parameterized rule is defined as follows:
- `int(n)` = the 64-bit little endian representation of the number `n`
- `pad(s)` = the byte sequence `s`, padded with 0s to a multiple of 8 bytes
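
As a rough illustration of the `str(..)` rule, here is a minimal Python sketch, assuming the usual NAR definition of `str(s)` as `int(|s|)` followed by `pad(s)`. The function names `nar_int`, `nar_pad`, and `nar_str` are hypothetical and not part of any Nix API.

```python
import struct


def nar_int(n: int) -> bytes:
    """int(n): the 64-bit little-endian representation of n."""
    return struct.pack("<Q", n)


def nar_pad(s: bytes) -> bytes:
    """pad(s): s followed by zero bytes up to a multiple of 8 bytes."""
    return s + b"\0" * (-len(s) % 8)


def nar_str(s: bytes) -> bytes:
    """str(s): the length prefix followed by the padded bytes (assumed definition)."""
    return nar_int(len(s)) + nar_pad(s)


# A 13-byte string takes 8 bytes of length plus 16 bytes of padded data.
encoded = nar_str(b"nix-archive-1")
```

Because the length prefix records the unpadded size, a reader can tell padding bytes apart from trailing zero bytes that belong to the string itself.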
## Kaitai Struct Specification
The Nix Archive (NAR) format is also formally described using [Kaitai Struct](https://kaitai.io/), an Interface Description Language (IDL) for defining binary data structures.
> Kaitai Struct provides a language-agnostic, machine-readable specification that can be compiled into parsers for various programming languages (e.g., C++, Python, Java, Rust).
```yaml
{{#include nar.ksy}}
```
The source of the spec can be found [here](https://github.com/nixos/nix/blob/master/src/nix-manual/source/protocols/nix-archive/nar.ksy). Contributions and improvements to the spec are welcomed.
-`nix-shell` shebang lines now support single-quoted arguments.
-`builtins.fetchTree` is now its own experimental feature, [`fetch-tree`](@docroot@/development/experimental-features.md#xp-fetch-tree).
This allows stabilising it independently of the rest of what is encompassed by [`flakes`](@docroot@/development/experimental-features.md#xp-fetch-tree).
-`builtins.fetchTree` is now its own experimental feature, [`fetch-tree`](@docroot@/development/experimental-features.md#xp-feature-fetch-tree).
This allows stabilising it independently of the rest of what is encompassed by [`flakes`](@docroot@/development/experimental-features.md#xp-feature-flakes).
- The interface for creating and updating lock files has been overhauled:
- Modify `nix derivation {add,show}` JSON format [#9866](https://github.com/NixOS/nix/issues/9866) [#10722](https://github.com/NixOS/nix/pull/10722)
The JSON format for derivations has been slightly revised to better conform to our [JSON guidelines](@docroot@/development/cli-guideline.md#returning-future-proof-json).
The JSON format for derivations has been slightly revised to better conform to our [JSON guidelines](@docroot@/development/json-guideline.md).
In particular, the hash algorithm and content addressing method of content-addressed derivation outputs are now separated into two fields `hashAlgo` and `method`,
rather than one field with an arcane `:`-separated format.
- Support unit prefixes in configuration settings [#10668](https://github.com/NixOS/nix/pull/10668)
Configuration settings in Nix now support unit prefixes, allowing for more intuitive and readable configurations. For example, you can now specify [`--min-free 1G`](@docroot@/command-ref/opt-common.md#opt-min-free) to set the minimum free space to 1 gigabyte.
Configuration settings in Nix now support unit prefixes, allowing for more intuitive and readable configurations. For example, you can now specify [`--min-free 1G`](@docroot@/command-ref/conf-file.md#conf-min-free) to set the minimum free space to 1 gigabyte.
This enhancement was extracted from [#7851](https://github.com/NixOS/nix/pull/7851) and is also useful for PR [#10661](https://github.com/NixOS/nix/pull/10661).
- Removed support for daemons and clients older than Nix 2.0 [#13951](https://github.com/NixOS/nix/pull/13951)
We have dropped support in the daemon worker protocol for daemons and clients that don't speak at least version 18 of the protocol. The first Nix release that supports this version is Nix 2.0, released in February 2018.
- Derivation JSON format now uses store path basenames only [#13570](https://github.com/NixOS/nix/issues/13570) [#13980](https://github.com/NixOS/nix/pull/13980)
Experience with many JSON frameworks (e.g. nlohmann/json in C++, Serde in Rust, and Aeson in Haskell) has shown that the use of the store directory in JSON formats is an impediment to systematic JSON formats, because it requires the serializer/deserializer to take an extra parameter (the store directory).
We ultimately want to rectify this issue with all JSON formats to the extent allowed by our stability promises. To start with, we are changing the JSON format for derivations because the `nix derivation` commands are — in addition to being formally unstable — less widely used than other unstable commands.
See the documentation on the [JSON format for derivations](@docroot@/protocols/json/derivation.md) for further details.
- C API: `nix_get_attr_name_byidx`, `nix_get_attr_byidx` take a `nix_value *` instead of `const nix_value *` [#13987](https://github.com/NixOS/nix/pull/13987)
In order to accommodate a more optimized internal representation of attribute set merges these functions require
a mutable `nix_value *` that might be modified on access. This does *not* break the ABI of these functions.
## New features
- C API: Add lazy attribute and list item accessors [#14030](https://github.com/NixOS/nix/pull/14030)
The C API now includes lazy accessor functions for retrieving values from lists and attribute sets without forcing evaluation:
-`nix_get_list_byidx_lazy()` - Get a list element without forcing its evaluation
-`nix_get_attr_byname_lazy()` - Get an attribute value by name without forcing evaluation
-`nix_get_attr_byidx_lazy()` - Get an attribute by index without forcing evaluation
These functions are useful when forwarding unevaluated sub-values to other lists, attribute sets, or function calls. They allow more efficient handling of Nix values by deferring evaluation until actually needed.
Additionally, bounds checking has been improved for all `_byidx` functions to properly validate indices before access, preventing potential out-of-bounds errors.
The documentation for `NIX_ERR_KEY` error handling has also been clarified to specify when this error code is returned.
- HTTP binary caches now support transparent compression for metadata
HTTP binary cache stores can now compress `.narinfo`, `.ls`, and build log files before uploading them,
reducing bandwidth usage and storage requirements. The compression is applied transparently using the
`Content-Encoding` header, allowing compatible clients to automatically decompress the files.
Three new configuration options control this behavior:
-`narinfo-compression`: Compression method for `.narinfo` files
-`ls-compression`: Compression method for `.ls` files
-`log-compression`: Compression method for build logs in `log/` directory
nix store copy-log --to 'http://cache.example.com?log-compression=br' /nix/store/...
```
- Temporary build directories no longer include derivation names [#13839](https://github.com/NixOS/nix/pull/13839)
Temporary build directories created during derivation builds no longer include the derivation name in their path to avoid build failures when the derivation name is too long. This change ensures predictable prefix lengths for build directories under `/nix/var/nix/builds`.
These are helper programs that Nix calls to perform derivations for specified system types, e.g. by using QEMU to emulate a different type of platform. For more information, see the [`external-builders` setting](../command-ref/conf-file.md#conf-external-builders).
This is currently an experimental feature.
## Performance improvements
- Optimize memory usage of attribute set merges [#13987](https://github.com/NixOS/nix/pull/13987)
[Attribute set update operations](@docroot@/language/operators.md#update) have been optimized to
reduce reallocations in cases when the second operand is small.
For typical evaluations of nixpkgs this optimization leads to ~20% less memory allocated in total
without significantly affecting evaluation performance.
See [eval-attrset-update-layer-rhs-threshold](@docroot@/command-ref/conf-file.md#conf-eval-attrset-update-layer-rhs-threshold)
- Substituted flake inputs are no longer re-copied to the store [#14041](https://github.com/NixOS/nix/pull/14041)
Since 2.25, Nix would fail to store a cache entry for substituted flake inputs, which in turn would cause them to be re-copied to the store on initial evaluation. Caching these inputs results in a near doubling of performance in some cases — especially on I/O-bound machines and when using commands that fetch many inputs, like `nix flake [archive|prefetch-inputs]`.
- `nix flake check` now skips derivations that can be substituted [#13574](https://github.com/NixOS/nix/pull/13574)
Previously, `nix flake check` would evaluate and build/substitute all
derivations. Now, it will skip downloading derivations that can be substituted.
This can drastically decrease the time invocations take in environments where
checks may already be cached (like in CI).
- `fetchTarball` and `fetchurl` now correctly substitute (#14138)
At some point we stopped substituting calls to `fetchTarball` and `fetchurl` with a set `narHash` to avoid incorrectly substituting things in `fetchTree`, even though it would be safe to substitute when calling the legacy `fetch{Tarball,url}`. This fixes that regression where it is safe.
- Started moving AST allocations into a bump allocator [#14088](https://github.com/NixOS/nix/issues/14088)
This leaves smaller, immutable structures in the AST. So far this saves about 2% memory on a NixOS config evaluation.
## Contributors
This release was made possible by the following 32 contributors:
The *build trace* is a [memoization table](https://en.wikipedia.org/wiki/Memoization) for builds.
It maps the inputs of builds to the outputs of builds.
Concretely, that means it maps [derivations][derivation] to maps of [output] names to [store objects][store object].
In general the derivations used as a key should be [*resolved*](./resolution.md).
A build trace with all-resolved-derivation keys is also called a *base build trace* for extra clarity.
If all the resolved inputs of a derivation are content-addressed, that means the inputs will be fully determined, leaving no ambiguity for what build was performed.
(Input-addressed inputs however are still ambiguous. They too should be locked down, but this is left as future work.)
Accordingly, to look up an unresolved derivation, one must first resolve it to get a resolved derivation.
Resolving itself involves looking up entries in the build trace, so this is a mutually recursive process that will end up inspecting possibly many entries.
Except for the issue with input-addressed paths called out above, base build traces are trivially *coherent* -- incoherence is not possible.
That means that the claims that each key-value base build trace entry makes are independent, and no mapping invalidates another mapping.
Whether the mappings are *true*, i.e. the faithful recording of actual builds performed, is another matter.
Coherence is about the multiple claims of the build trace being mutually consistent, not about whether the claims are individually true or false.
In general, there is no way to audit a build trace entry except by performing the build again from scratch.
And even in that case, a different result doesn't mean the original entry was a "lie", because the derivation being built may be non-deterministic.
As such, the decision of whether to trust a counterparty's build trace is a fundamentally subjective policy choice.
Build trace entries are typically *signed* in order to enable arbitrary public-key-based trust policies.
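
The mutually recursive lookup described above can be sketched in Python. This is a toy model under stated assumptions, not Nix's implementation: the `Unresolved` and `Resolved` classes, the `resolve` and `lookup` functions, and the store paths are all hypothetical, and a resolved derivation is reduced to a name plus the set of input store paths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unresolved:
    name: str
    inputs: tuple  # pairs of (Unresolved derivation, output name)


@dataclass(frozen=True)
class Resolved:
    name: str
    input_paths: frozenset  # store paths the inputs resolved to


def resolve(drv, trace):
    """Replace each input by the store path recorded in the base build trace."""
    paths = set()
    for input_drv, output in drv.inputs:
        outputs = lookup(input_drv, trace)  # resolving needs lookups ...
        if outputs is None or output not in outputs:
            return None
        paths.add(outputs[output])
    return Resolved(drv.name, frozenset(paths))


def lookup(drv, trace):
    """Look up an unresolved derivation: resolve it first, then consult the table."""
    resolved = resolve(drv, trace)  # ... and lookups need resolving
    return None if resolved is None else trace.get(resolved)


a = Unresolved("a", ())
b = Unresolved("b", ((a, "out"),))

# Base build trace: resolved derivation -> output name -> store path.
trace = {
    Resolved("a", frozenset()): {"out": "/nix/store/aaa-a"},
    Resolved("b", frozenset({"/nix/store/aaa-a"})): {"out": "/nix/store/bbb-b"},
}

result = lookup(b, trace)
```

Looking up `b` inspects two entries: the one for `a` (to resolve `b`) and the one for the resolved `b` itself.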
## Derived build traces {#derived}
Implementations that wish to memoize the above may also keep additional *derived* build trace entries that do map unresolved derivations.
But if they do so, they *must* also keep the underlying base entries with resolved derivation keys around.
Firstly, this ensures that the derived entries are merely a cache, which could be recomputed from scratch.
Secondly, this ensures the coherence of the derived build trace.
Unlike with base build traces, incoherence with derived build traces is possible.
The key ingredient is that derivation resolution is only deterministic with respect to a fixed base build trace.
Without fixing the base build trace, it inherits the subjectivity of base build traces themselves.
Concretely, suppose there are three derivations \\(a\\), \\(b\\), and \\(c\\).
Let \\(a\\) be a resolved derivation, but let \\(b\\) and \\(c\\) be unresolved and both take as an input an output of \\(a\\).
Now suppose that derived entries are made for \\(b\\) and \\(c\\) based on two different entries of \\(a\\).
(This could happen if \\(a\\) is non-deterministic, \\(a\\) and \\(b\\) are built in one store, \\(a\\) and \\(c\\) are built in another store, and then a third store substitutes from both of the first two stores.)
If trusting the derived build trace entries for \\(b\\) and \\(c\\) requires that each's underlying entry for \\(a\\) be also trusted, the two different mappings for \\(a\\) will be caught.
However, if \\(b\\) and \\(c\\)'s entries can be combined in isolation, there will be nothing to catch the contradiction in their hidden assumptions about \\(a\\)'s output.
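
The check that catches the contradiction can be sketched as follows. This is a hypothetical Python model, assuming each derived entry carries the base entries it was derived from as a map; the `merge_base_entries` function, the `a!out` key format, and the store paths are made up for illustration.

```python
def merge_base_entries(*base_traces):
    """Union several small base build traces, rejecting conflicting mappings."""
    merged = {}
    for base in base_traces:
        for key, path in base.items():
            # setdefault returns the previously stored path when the key exists.
            if merged.setdefault(key, path) != path:
                raise ValueError(
                    f"incoherent base entries for {key}: {merged[key]} vs {path}"
                )
    return merged


# Derived entries for b and c, built in two stores that disagree about a's output.
derived_b = {"output": "/nix/store/bbb-b", "base": {"a!out": "/nix/store/111-a"}}
derived_c = {"output": "/nix/store/ccc-c", "base": {"a!out": "/nix/store/222-a"}}

try:
    merge_base_entries(derived_b["base"], derived_c["base"])
    coherent = True
except ValueError:
    coherent = False
```

If the `base` maps were dropped and only the `output` fields combined, nothing in this model would reveal that the two entries assume different outputs for `a`.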
- Once this is done, the derivation is *normalized*, replacing each input deriving path with its store path, which we now know from realising the input.
## Builder Execution
## Builder Execution {#builder-execution}
The [`builder`](./derivation/index.md#builder) is executed as follows:
@@ -102,11 +102,11 @@ But rather than somehow scanning all the other fields for inputs, Nix requires t
### System {#system}
The system type on which the [`builder`](#attr-builder) executable is meant to be run.
The system type on which the [`builder`](#builder) executable is meant to be run.
A necessary condition for Nix to schedule a given derivation on some [Nix instance] is for the "system" of that derivation to match that instance's [`system` configuration option] or [`extra-platforms` configuration option].
By putting the `system` in each derivation, Nix allows *heterogenous* build plans, where not all steps can be run on the same machine or same sort of machine.
By putting the `system` in each derivation, Nix allows *heterogeneous* build plans, where not all steps can be run on the same machine or same sort of machine.
Nix can schedule builds such that it automatically builds on other platforms by [forwarding build requests](@docroot@/advanced-topics/distributed-builds.md) to other Nix instances.
@@ -167,10 +167,10 @@ It is only in the potential for that check to fail that they are different.
>
> In a future world where floating content-addressing is also stable, we in principle no longer need separate [fixed](#fixed) content-addressing.
> Instead, we could always use floating content-addressing, and separately assert the precise value content address of a given store object to be used as an input (of another derivation).
> A stand-alone assertion object of this sort is not yet implemented, but its possible creation is tracked in [Issue #11955](https://github.com/NixOS/nix/issues/11955).
> A stand-alone assertion object of this sort is not yet implemented, but its possible creation is tracked in [issue #11955](https://github.com/NixOS/nix/issues/11955).
>
> In the current version of Nix, fixed outputs which fail their hash check are still registered as valid store objects, just not registered as outputs of the derivation which produced them.
> This is an optimization that means if the wrong output hash is specified in a derivation, and then the derivation is recreated with the right output hash, derivation does not need to be rebuilt --- avoiding downloading potentially large amounts of data twice.
> This is an optimization that means if the wrong output hash is specified in a derivation, and then the derivation is recreated with the right output hash, derivation does not need to be rebuilt — avoiding downloading potentially large amounts of data twice.
> This optimisation prefigures the design above:
> If the output hash assertion was removed outside the derivation itself, Nix could additionally not only register that outputted store object like today, but could also make note that derivation did in fact successfully download some data.
For example, for the "fetch URL" example above, making such a note is tantamount to recording what data is available at the time of download at the given URL.
@@ -43,7 +43,7 @@ In particular, the specification decides:
- if the content is content-addressed, how is it content addressed
- if the content is content-addressed, [what is its content address](./content-address.md#fixed-content-addressing) (and thus what is its [store path])
- if the content is content-addressed, [what is its content address](./content-address.md#fixed) (and thus what is its [store path])
That is to say, an input-addressed output's store path is a function not of the output itself, but of the derivation that produced it.
Even if two store paths have the same contents, if they are produced in different ways, and one is input-addressed, then they will have different store paths, and are thus guaranteed to not be the same store object.
A naive implementation of an output hash computation for input-addressed outputs would be to hash the derivation hash and output together.
This clearly has the uniqueness properties we want for input-addressed outputs, but suffers from an inefficiency.
Specifically, new builds would be required whenever a change is made to a fixed-output derivation, despite having provably no differences in the inputs to the new derivation compared to what it used to be.
Concretely, this would cause a "mass rebuild" whenever any fetching detail changes, including mirror lists, certificate authority certificates, etc.
**TODO hash derivation modulo.**
To solve this problem, we compute output hashes differently, so that certain output hashes become identical.
We call this concept quotient hashing, in reference to quotient types or sets.
So how do we compute the hash part of the output path of a derivation?
This is done by the function `hashDrv`, shown in Figure 5.10.
It distinguishes between two cases.
If the derivation is a fixed-output derivation, then it computes a hash over just the `outputHash` attributes.
So how do we compute the hash part of the output paths of an input-addressed derivation?
This is done by the function `hashQuotientDerivation`, shown below.
If the derivation is not a fixed-output derivation, we replace each element in the derivation’s inputDrvs with the result of a call to `hashDrv` for that element.
(The derivation at each store path in `inputDrvs` is converted from its on-disk ATerm representation back to a `StoreDrv` by the function `parseDrv`.) In essence, `hashDrv` partitions store derivations into equivalence classes, and for hashing purposes it replaces each store path in a derivation graph with its equivalence class.
First, a word on inputs.
`hashQuotientDerivation` is only defined on derivations whose [inputs](@docroot@/store/derivation/index.md#inputs) take the first-order form:
```typescript
type ConstantPath = {
  path: StorePath;
};
```
The recursion in Figure 5.10 is inefficient:
it will call itself once for each path by which a subderivation can be reached, i.e., `O(V^k)` times for a derivation graph with `V` derivations and with out-degree of at most `k`.
In the actual implementation, memoisation is used to reduce this to `O(V + E)` complexity for a graph with `E` edges.
```typescript
// ...
  inputDrvOutputs: Set<FirstOrderOutputPath>; // new instead
  // ...other fields...
};
```
In the [currently-experimental][xp-feature-dynamic-derivations] higher-order case where outputs of outputs are allowed as [deriving paths][deriving-path] and thus derivation inputs, derivations using that generalization are not valid arguments to this function.
Those derivations must be (partially) [resolved](@docroot@/store/resolution.md) enough first, to the point where no such higher-order inputs remain.
Then, and only then, can input addresses be assigned.
```
function hashQuotientDerivation(drv) -> Hash:
    assert(drv.outputs are input-addressed)
    drv′ ← drv with {
        inputDrvOutputs = ⋃(
            assert(drvPath is store path)
            case hashOutputsOrQuotientDerivation(readDrv(drvPath)) of
                drvHash : Hash →
                    (drvHash.toBase16(), output)
                outputHashes : Map[String, Hash] →
                    (outputHashes[output].toBase16(), "out")
            | (drvPath, output) ∈ drv.inputDrvOutputs
        )
    }
    return hashSHA256(printDrv(drv′))

function hashOutputsOrQuotientDerivation(drv) -> Map[String, Hash] | Hash:
    if drv.outputs are content-addressed:
        return {
            ...
            , ca = output.contentAddress // or get from build trace if floating
        }
    else: // drv.outputs are input-addressed
        return hashQuotientDerivation(drv)
```
### `hashQuotientDerivation`
We replace each element in the derivation's `inputDrvOutputs` using data from a call to `hashOutputsOrQuotientDerivation` on the `drvPath` of that element.
When `hashOutputsOrQuotientDerivation` returns a single drv hash (because the input derivation in question is input-addressing), we simply swap out the `drvPath` for that hash, and keep the same output name.
When `hashOutputsOrQuotientDerivation` returns a map of content addresses per-output, we look up the output in question, and pair it with the output name `out`.
The resulting pseudo-derivation (with hashes instead of store paths in `inputDrvs`) is then printed (in the ["ATerm" format](@docroot@/protocols/derivation-aterm.md)) and hashed, and this becomes the hash of the "quotient derivation".
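The mutual recursion described above can be sketched in TypeScript. This is a minimal, hypothetical model and not Nix's implementation: `toyHash` stands in for SHA-256, `JSON.stringify` stands in for ATerm printing, only the fixed content-addressing case is handled, and the `Derivation` and `Store` shapes are simplifying assumptions.

```typescript
// Hypothetical, simplified model; none of these names are Nix's real API.
type Hash = string;

type Output =
  | { kind: "inputAddressed" }
  | { kind: "fixed"; contentHash: Hash }; // stand-in for a fixed content address

type Derivation = {
  name: string;
  outputs: Record<string, Output>;
  inputSrcs: string[];
  inputDrvOutputs: Array<[string, string]>; // (drvPath, output name) pairs
};

type Store = Record<string, Derivation>; // stand-in for readDrv

// Toy polynomial hash standing in for SHA-256 (assumption; not cryptographic).
function toyHash(s: string): Hash {
  let h = 7;
  for (let i = 0; i < s.length; i++) {
    h = (h * 31 + s.charCodeAt(i)) % 1000000007;
  }
  return h.toString(16);
}

function hashOutputsOrQuotientDerivation(
  store: Store,
  drv: Derivation
): Record<string, Hash> | Hash {
  const names = Object.keys(drv.outputs);
  const contentAddressed = names.some((n) => {
    const o = drv.outputs[n];
    return o !== undefined && o.kind === "fixed";
  });
  if (!contentAddressed) {
    // Input-addressed case: a single derivation hash.
    return hashQuotientDerivation(store, drv);
  }
  // Content-addressed case: one hash per output, derived from its content address.
  const hashes: Record<string, Hash> = {};
  for (const n of names) {
    const out = drv.outputs[n];
    if (out !== undefined && out.kind === "fixed") {
      hashes[n] = toyHash("fixed:out:" + out.contentHash);
    }
  }
  return hashes;
}

function hashQuotientDerivation(store: Store, drv: Derivation): Hash {
  const rewritten = drv.inputDrvOutputs.map((pair): [string, string] => {
    const drvPath = pair[0];
    const output = pair[1];
    const input = store[drvPath];
    if (input === undefined) throw new Error("unknown derivation " + drvPath);
    const r = hashOutputsOrQuotientDerivation(store, input);
    if (typeof r === "string") return [r, output]; // input-addressed input
    const h = r[output];
    if (h === undefined) throw new Error("unknown output " + output);
    return [h, "out"]; // content-addressed input
  });
  const quotient = {
    name: drv.name,
    outputs: drv.outputs,
    inputSrcs: drv.inputSrcs,
    inputDrvOutputs: rewritten,
  };
  return toyHash(JSON.stringify(quotient));
}

// Two fetchers that differ only in fetching details (hypothetical mirror lists).
const store: Store = {
  "/store/a-src.drv": {
    name: "src",
    outputs: { out: { kind: "fixed", contentHash: "abc" } },
    inputSrcs: ["/store/mirrors-v1"],
    inputDrvOutputs: [],
  },
  "/store/b-src.drv": {
    name: "src",
    outputs: { out: { kind: "fixed", contentHash: "abc" } },
    inputSrcs: ["/store/mirrors-v2"],
    inputDrvOutputs: [],
  },
};
const pkgA: Derivation = {
  name: "hello",
  outputs: { out: { kind: "inputAddressed" } },
  inputSrcs: [],
  inputDrvOutputs: [["/store/a-src.drv", "out"]],
};
const pkgB: Derivation = { ...pkgA, inputDrvOutputs: [["/store/b-src.drv", "out"]] };
```

In this sketch `pkgA` and `pkgB` receive the same quotient hash, because the two fetchers are replaced by the hash of their (identical) fixed output rather than by their differing store paths.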
When calculating output hashes, `hashQuotientDerivation` is called on an almost-complete input-addressing derivation, which is just missing its input-addressed output paths.
The derivation hash is then used to calculate output paths for each output.
<!-- TODO describe how this is done. -->
Those output paths can then be substituted into the almost-complete input-addressed derivation to complete it.
> **Note**
>
> There may be an unintentional deviation from specification currently implemented in the `(outputHashes[output].toBase16(), "out")` case.
> This is not fatal because the deviation would only apply for content-addressing derivations with more than one output, and that only occurs in the floating case, which is [experimental][xp-feature-ca-derivations].
> Once this bug is fixed, this note will be removed.
### `hashOutputsOrQuotientDerivation`
How does `hashOutputsOrQuotientDerivation` in turn work?
It consists of two main cases, based on whether the outputs of the derivation are to be input-addressed or content-addressed.
#### Input-addressed outputs case
In the input-addressed case, it just calls `hashQuotientDerivation`, and returns that derivation hash.
This makes `hashQuotientDerivation` and `hashOutputsOrQuotientDerivation` mutually-recursive.
> **Note**
>
> In this case, `hashQuotientDerivation` is being called on a *complete* input-addressing derivation that already has its output paths calculated.
> The `inputDrvs` substitution takes place anyway.
#### Content-addressed outputs case
If the outputs are [content-addressed](./content-address.md), then it computes a hash for each output derived from the content-address of that output.
> **Note**
>
> In the [fixed](./content-address.md#fixed) content-addressing case, the outputs' content addresses are statically specified in advance, so this always just works.
> (The fixed case is what the pseudo-code shows.)
>
> In the [floating](./content-address.md#floating) case, the content addresses are not specified in advance.
> This is what the "or get from [build trace](@docroot@/store/build-trace.md) if floating" comment refers to.
> In this case, the algorithm is *stuck* until the input in question is built, and we know what the actual contents of the output in question is.
>
> That is OK however, because there is no problem with delaying the assigning of input addresses (which, remember, is what `hashQuotientDerivation` is ultimately for) until all inputs are known.
### Performance
The recursion in the algorithm is potentially inefficient:
it could call itself once for each path by which a subderivation can be reached, i.e., `O(V^k)` times for a derivation graph with `V` derivations and with out-degree of at most `k`.
In the actual implementation, [memoisation](https://en.wikipedia.org/wiki/Memoization) is used to reduce this cost to be proportional to the total number of `inputDrvOutputs` encountered.
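To illustrate the effect of memoisation, here is a small sketch on a hypothetical "ladder" graph in which every derivation on one level depends on both derivations of the next level. The graph shape, `toyHash`, and the `stats` counter are illustrative assumptions; the recursion is a simplified stand-in for the algorithm above.

```typescript
// Hypothetical dependency graph: node name -> names of its inputs.
type Graph = Record<string, string[]>;

// Build a ladder: root -> {a0, b0}, and each of a_i, b_i -> {a_(i+1), b_(i+1)}.
function buildLadder(levels: number): Graph {
  const g: Graph = { root: ["a0", "b0"] };
  for (let i = 0; i < levels; i++) {
    const next = i + 1 < levels ? ["a" + (i + 1), "b" + (i + 1)] : [];
    g["a" + i] = next;
    g["b" + i] = next;
  }
  return g;
}

// Toy hash (assumption; stands in for the real hash function).
function toyHash(s: string): number {
  let h = 7;
  for (let i = 0; i < s.length; i++) {
    h = (h * 31 + s.charCodeAt(i)) % 1000000007;
  }
  return h;
}

// Hash a node from the hashes of its inputs.
// Pass a cache object to memoise, or null to recompute every time.
function hashNode(
  g: Graph,
  node: string,
  cache: Record<string, number> | null,
  stats: { computed: number }
): number {
  const hit = cache !== null ? cache[node] : undefined;
  if (hit !== undefined) return hit;
  stats.computed++;
  const deps = g[node] || [];
  const parts = deps.map((dep) => hashNode(g, dep, cache, stats));
  const h = toyHash(node + ":" + parts.join(","));
  if (cache !== null) cache[node] = h;
  return h;
}

const ladder = buildLadder(10);
const plainStats = { computed: 0 };
const plainHash = hashNode(ladder, "root", null, plainStats);
const memoStats = { computed: 0 };
const memoHash = hashNode(ladder, "root", {}, memoStats);
```

For this ten-level ladder, the unmemoised run performs 2047 hash computations (one per path from the root), while the memoised run performs 21 (one per derivation), and both produce the same hash.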
### Semantic properties
*See [this chapter's appendix](@docroot@/store/math-notation.md) on grammar and metavariable conventions.*
In essence, `hashQuotientDerivation` partitions input-addressing derivations into equivalence classes: every derivation in that equivalence class is mapped to the same derivation hash.
We can characterize this equivalence relation directly, by working bottom up.
We start by defining an equivalence relation on first-order output deriving paths that refer to content-addressed derivation outputs. Two such paths are equivalent if they refer to the same store object:
where \\({}^*(s, o)\\) denotes the store object that the output deriving path refers to.
We will also need the following construction to lift any equivalence relation on \\(X\\) to an equivalence relation on (finite) sets of \\(X\\) (in short, \\(\\mathcal{P}(X)\\)):
\\[
\\begin{prooftree}
\\AxiomC{$\\forall a \\in A. \\exists b \\in B. a \\,\\sim\_X\\, b$}
\\AxiomC{$\\forall b \\in B. \\exists a \\in A. b \\,\\sim\_X\\, a$}
\\BinaryInfC{$A \\,\\sim_{\\mathcal{P}(X)}\\, B$}
\\end{prooftree}
\\]
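The lifting rule above can be read as a higher-order function. The following is a minimal sketch, assuming finite sets are represented as arrays; `Equiv` and `liftToSets` are hypothetical names introduced only for illustration.

```typescript
// An equivalence relation on X, given as a predicate (assumed reflexive,
// symmetric and transitive by the caller).
type Equiv<X> = (a: X, b: X) => boolean;

// Lift an equivalence on X to finite sets of X, represented here as arrays:
// every element of each side must have an equivalent element on the other side.
function liftToSets<X>(eq: Equiv<X>): Equiv<X[]> {
  return (xs, ys) =>
    xs.every((a) => ys.some((b) => eq(a, b))) &&
    ys.every((b) => xs.some((a) => eq(b, a)));
}

// Example: numbers are equivalent when congruent modulo 3.
const mod3: Equiv<number> = (a, b) => a % 3 === b % 3;
const mod3Sets = liftToSets(mod3);
```

With this definition, `mod3Sets([1, 4, 2], [2, 1])` holds, since every element on either side has a congruent partner on the other, while `mod3Sets([1, 2], [1])` does not.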
Now we can define the equivalence relation \\(\\sim_\\mathrm{IA}\\) on input-addressed derivation outputs. Two input-addressed outputs are equivalent if their derivations are equivalent (via the yet-to-be-defined \\(\\sim_{\\mathrm{IADrv}}\\) relation) and their output names are the same:
And now we can define \\(\\sim_{\\mathrm{IADrv}}\\).
Two input-addressed derivations are equivalent if their content-addressed inputs are equivalent, their input-addressed inputs are also equivalent, and they are otherwise equal:
<!-- cheating a bit with the semantics to get a good layout that fits on the page -->
where \\(\\mathrm{caInputs}(d)\\) returns the content-addressed inputs of \\(d\\) and \\(\\mathrm{iaInputs}(d)\\) returns the input-addressed inputs.
> **Note**
>
> An astute reader might notice that nowhere does `inputSrcs` enter into these definitions.
> That means that replacing an input derivation with its outputs directly added to `inputSrcs` always results in a derivation in a different equivalence class, despite the resulting input closure (as would be mounted in the store at build time) being the same.
> [Issue #9259](https://github.com/NixOS/nix/issues/9259) is about creating a coarser equivalence relation to address this.
>
> \\(\\sim_\mathrm{Drv}\\) from [derivation resolution](@docroot@/store/resolution.md) is such an equivalence relation.
> It is coarser than this one: any two derivations which are "'hash quotient derivation'-equivalent" (\\(\\sim_\mathrm{IADrv}\\)) are also "resolution-equivalent" (\\(\\sim_\mathrm{Drv}\\)).
> It also relates derivations whose `inputDrvOutputs` have been rewritten into `inputSrcs`.
A few times in this manual, formal "proof trees" are used for [natural deduction](https://en.wikipedia.org/wiki/Natural_deduction)-style definition of various [relations](https://en.wikipedia.org/wiki/Relation_(mathematics)).
The following grammar and assignment of metavariables to syntactic categories is used in these sections.
*See [this chapter's appendix](@docroot@/store/math-notation.md) on grammar and metavariable conventions.*
To *resolve* a derivation is to replace its [inputs] with the simplest inputs — plain store paths — that denote the same store objects.
Derivations that only have store paths as inputs are likewise called *resolved derivations*.
(They are called that whether they are in fact the output of derivation resolution, or just made that way without non-store-path inputs to begin with.)
## Input Content Equivalence of Derivations
[Deriving paths][deriving-path] intentionally make it possible to refer to the same [store object] in multiple ways.
This is a consequence of content-addressing, since different derivations can produce the same outputs, and the same data can also be manually added to the store.
This is also a consequence even of input-addressing, as an output can be referred to by derivation and output name, or directly by its [computed](./derivation/outputs/input-address.md) store path.
Since dereferencing deriving paths is thus not injective, it induces an equivalence relation on deriving paths.
Let's call this equivalence relation \\(\\sim_\\mathrm{DP}\\), where \\(p_1 \\sim_\\mathrm{DP} p_2\\) means that deriving paths \\(p_1\\) and \\(p_2\\) refer to the same store object.
**Content Equivalence**: Two deriving paths are equivalent if they refer to the same store object:
\\[
\\begin{prooftree}
\\AxiomC{${}^*p_1 = {}^*p_2$}
\\UnaryInfC{$p_1 \\,\\sim_\\mathrm{DP}\\, p_2$}
\\end{prooftree}
\\]
where \\({}^\*p\\) denotes the store object that deriving path \\(p\\) refers to.
This also induces an equivalence relation on sets of deriving paths:
\\[
\\begin{prooftree}
\\AxiomC{$\\{ {}^*p | p \\in P_1 \\} = \\{ {}^*p | p \\in P_2 \\}$}
\\UnaryInfC{$P_1 \\,\\sim_{\\mathcal{P}(\\mathrm{DP})}\\, P_2$}
\\end{prooftree}
\\]
**Input Content Equivalence**: This, in turn, induces an equivalence relation on derivations: two derivations are equivalent if their inputs are equivalent, and they are otherwise equal:
Derivation resolution always maps derivations to input-content-equivalent derivations.
## Resolution relation
Dereferencing a deriving path (\\({}^\*p\\) above) was just introduced as a black box.
But actually it is a multi-step process of looking up build results in the [build trace] that itself depends on resolving the lookup keys.
Resolution is thus a recursive multi-step process that is worth diagramming formally.
We can do this with a small-step binary transition relation; let's call it \\(\rightsquigarrow\\).
We can then conclude dereferenced equality like this:
\\[
\\begin{prooftree}
\\AxiomC{$p\_1 \\rightsquigarrow^* p$}
\\AxiomC{$p\_2 \\rightsquigarrow^* p$}
\\BinaryInfC{${}^*p\_1 = {}^*p\_2$}
\\end{prooftree}
\\]
That is, we show that both original items resolve (over zero or more small steps, hence the \\({}^*\\)) to exactly the same item.
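The idea of concluding equality from a common reduct can be sketched as follows. This is a hypothetical model, not Nix's implementation: a deriving path is either a plain store path (a string) or an output of a derivation, and the build trace is simplified to a lookup table keyed by `"<drvPath>!<output>"`. For simplicity the check only succeeds when both paths reach the same plain store path.

```typescript
// Hypothetical model of deriving paths and a simplified build trace.
type DerivingPath = string | { drvPath: DerivingPath; output: string };
type BuildTrace = Record<string, string>;

// One small step. Returns null when no rule applies, i.e. the path is
// either resolved (a plain store path) or stuck (missing build trace entry).
function step(trace: BuildTrace, p: DerivingPath): DerivingPath | null {
  if (typeof p === "string") return null;
  const inner = p.drvPath;
  if (typeof inner !== "string") {
    // Resolve the nested deriving path first.
    const next = step(trace, inner);
    return next === null ? null : { drvPath: next, output: p.output };
  }
  const hit = trace[inner + "!" + p.output];
  return hit === undefined ? null : hit;
}

// Zero or more small steps, until no rule applies.
function resolveStar(trace: BuildTrace, p: DerivingPath): DerivingPath {
  let cur = p;
  for (;;) {
    const next = step(trace, cur);
    if (next === null) return cur;
    cur = next;
  }
}

// Both paths denote the same store object if they resolve to the same store path.
function sameStoreObject(trace: BuildTrace, p1: DerivingPath, p2: DerivingPath): boolean {
  const r1 = resolveStar(trace, p1);
  const r2 = resolveStar(trace, p2);
  return typeof r1 === "string" && r1 === r2;
}

// Hypothetical store paths, for illustration only.
const trace: BuildTrace = { "/store/foo.drv!out": "/store/foo-out" };
const viaDrv: DerivingPath = { drvPath: "/store/foo.drv", output: "out" };
const direct: DerivingPath = "/store/foo-out";
const unbuilt: DerivingPath = { drvPath: "/store/bar.drv", output: "out" };
```

Here `viaDrv` and `direct` resolve to the same store path, whereas `unbuilt` is stuck because the build trace has no entry for it.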
With this motivation, let's now formalize a [small-step](https://en.wikipedia.org/wiki/Operational_semantics#Small-step_semantics) system of reduction rules for resolution.
### Formal rules
### \\(\text{resolved}\\) unary relation
\\[
\\begin{prooftree}
\\AxiomC{$s \in \text{store-path}$}
\\UnaryInfC{$s$ resolved}
\\end{prooftree}
\\]
\\[
\\begin{prooftree}
\\AxiomC{$\forall i \in \mathrm{inputs}(d). i \text{ resolved}$}
\\UnaryInfC{$d$ resolved}
\\end{prooftree}
\\]
### \\(\rightsquigarrow\\) binary relation
> **Remark**
>
> Actually, to be completely formal we would need to keep track of the build trace we are choosing to resolve against.
>
> We could do that by making \\(\rightsquigarrow\\) a ternary relation, which would pass the build trace to itself until it finally uses it in that one rule.
> This would add clutter more than insight, so we didn't bother to write it.
>
> There are other options too, like saying the whole reduction rule system is parameterized on the build trace, essentially [currying](https://en.wikipedia.org/wiki/Currying) the ternary \\(\rightsquigarrow\\) into a function from build traces to the binary relation written above.
Like all well-behaved evaluation relations, partial resolution is [*confluent*](https://en.wikipedia.org/wiki/Confluence_(abstract_rewriting)).
Also, if we take the symmetric closure of \\(\\rightsquigarrow^\*\\), we end up with the equivalence relations of the previous section.
Resolution respects content equivalence for deriving paths, and input content equivalence for derivations.
> **Remark**
>
> We chose to define a "resolved" unary relation explicitly, from scratch, above.
> But it can also be defined as the normal forms of the \\(\\rightsquigarrow^\*\\) relation:
>
> \\[ a \text{ resolved} \Leftrightarrow \forall b. a \rightsquigarrow^* b \Rightarrow b = a\\]
>
> In prose, resolved terms are terms which \\(\\rightsquigarrow^\*\\) only relates on the left side to the same term on the right side; they are the terms which can be resolved no further.
## Partial versus Complete Resolution
Similar to evaluation, we can also speak of *partial* versus *complete* derivation resolution.
Partial derivation resolution is what we've actually formalized above with \\(\\rightsquigarrow^\*\\).
Complete resolution is resolution ending in a resolved term (deriving path or derivation).
(Which is a normal form of the relation, per the remark above.)
With partial resolution, a derivation is related to equivalent derivations with the same or simpler inputs, but not all those inputs will be plain store paths.
This is useful when the input refers to a floating content-addressed output we have not yet built: we don't know what (content-address) store path will be used for that derivation, so we are "stuck" trying to resolve the deriving path in question.
(In the above formalization, this happens when the build trace is missing the keys we wish to look up in it.)
Complete resolution is a *functional* relation, i.e. values on the left are uniquely related with values on the right.
It is not, however, a *total* relation (in general, assuming arbitrary build traces).
This is discussed in the next section.
## Termination
For static derivation graphs, complete resolution is indeed total, because it always terminates for all inputs.
(A relation that is both total and functional is a function.)
For [dynamic][xp-feature-dynamic-derivations] derivation graphs, however, this is not the case — resolution is not guaranteed to terminate.
The issue isn't rewriting deriving paths themselves:
a single rewrite to normalize an output deriving path to a constant one always exists, and always proceeds in one step.
The issue is that dynamic derivations (i.e. those that are filled into the graph by a previous resolution) may have more transitive dependencies than the original derivation.
> **Example**
>
> Suppose we have this deriving path:
> ```json
> {
> "drvPath": {
> "drvPath": "...-foo.drv",
> "output": "bar.drv"
> },
> "output": "baz"
> }
> ```
> and derivation `foo` is already resolved.
> When we resolve this deriving path, we'll end up with something like:
> ```json
> {
> "drvPath": "...-foo-bar.drv",
> "output": "baz"
> }
> ```
> So far this is just an atomic single rewrite, with no termination issues.
> But the derivation `foo-bar` may have its *own* dynamic derivation inputs.
> Resolution must resolve that derivation first before the above deriving path can finally be normalized to a plain `...-foo-bar-baz` store path.
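The rewrite sequence in this example can be traced with a small sketch. It is a hypothetical model: the store paths are made up, the build trace is simplified to a table keyed by `"<drvPath>!<output>"`, and the sketch ignores the fact that the derivation `foo-bar` itself may need resolving before its build trace entry can be looked up.

```typescript
// Hypothetical model of (possibly nested) deriving paths and a build trace.
type DerivingPath = string | { drvPath: DerivingPath; output: string };
type BuildTrace = Record<string, string>;

// One rewrite step; null means resolved or stuck.
function stepOnce(trace: BuildTrace, p: DerivingPath): DerivingPath | null {
  if (typeof p === "string") return null;
  const inner = p.drvPath;
  if (typeof inner !== "string") {
    const next = stepOnce(trace, inner);
    return next === null ? null : { drvPath: next, output: p.output };
  }
  const hit = trace[inner + "!" + p.output];
  return hit === undefined ? null : hit;
}

// Record every intermediate form, serialized as JSON, until no step applies.
function resolutionSteps(trace: BuildTrace, start: DerivingPath): string[] {
  const steps: string[] = [];
  let cur: DerivingPath | null = start;
  while (cur !== null) {
    steps.push(JSON.stringify(cur));
    cur = stepOnce(trace, cur);
  }
  return steps;
}

const nested: DerivingPath = {
  drvPath: { drvPath: "/store/foo.drv", output: "bar.drv" },
  output: "baz",
};

// Build trace once both builds are known (hypothetical entries).
const fullTrace: BuildTrace = {
  "/store/foo.drv!bar.drv": "/store/foo-bar.drv",
  "/store/foo-bar.drv!baz": "/store/foo-bar-baz",
};
// Build trace before the dynamic derivation has been built.
const partialTrace: BuildTrace = {
  "/store/foo.drv!bar.drv": "/store/foo-bar.drv",
};

const fullSteps = resolutionSteps(fullTrace, nested);
const partialSteps = resolutionSteps(partialTrace, nested);
```

With `fullTrace` the path passes through three forms and ends at a plain store path; with `partialTrace` it stops after the first rewrite, which corresponds to partial resolution being "stuck".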
The important thing to notice is that while "build trace" *keys* must be resolved, the *values* those keys are mapped to have no such constraint.
An arbitrary store object has no notion of being resolved or not.
But an arbitrary store object can be read back as a derivation (as will in fact be done in the case of dynamic derivations / nested output deriving paths).
And those derivations need *not* be resolved.
It is those dynamic non-resolved derivations which are the source of non-termination.
By the same token, they are also the reason why dynamic derivations offer greater expressive power.
We can take the [transitive closure] of the references graph, in which any pair of store objects have an edge if a *path* of one or more references exists from the first to the second object.
(A single reference always forms a path which is one reference long, but longer paths may connect objects which have no direct reference between them.)
The *requisites* of a store object are all store objects reachable by paths of references which start with the given store object's references.
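As a minimal sketch of this definition, the requisites can be computed by a graph traversal that starts from the object's references. The `References` table and the store object names below are hypothetical, chosen only for illustration.

```typescript
// Hypothetical references graph: store object -> store objects it references.
type References = Record<string, string[]>;

// All store objects reachable by a path of one or more references from `start`.
function requisites(refs: References, start: string): string[] {
  const seen: Record<string, boolean> = {};
  const stack = (refs[start] || []).slice();
  while (stack.length > 0) {
    const cur = stack.pop() as string;
    if (seen[cur]) continue;
    seen[cur] = true;
    for (const next of refs[cur] || []) stack.push(next);
  }
  return Object.keys(seen).sort();
}

const refs: References = {
  a: ["b", "c"],
  b: ["c"],
  c: [],
  d: ["d"], // self-reference
};
```

Under this definition `a` is not among its own requisites, because no path of references leads back to it, whereas the self-referencing `d` is.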
// console.log(`TeX error in "${jax.latex}": ${error.message}`);
// return jax.formatError(error);
//}
}
};
</script>
<!-- Load a newer version of MathJax than mdbook does by default, and which in particular has working relative paths for the "bussproofs" extension. -->
@@ -46,7 +46,7 @@ The team meets twice a week (times are denoted in the [Europe/Amsterdam](https:/
- mark it as draft if it is blocked on the contributor
- escalate it back to the team by moving it to To discuss, and leaving a comment as to why the issue needs to be discussed again.
- Work meeting: Mondays 14:00-16:00 Europe/Amsterdam see [calendar](https://calendar.google.com/calendar/u/0/embed?src=b9o52fobqjak8oq8lfkhg3t0qg@group.calendar.google.com).
- Work meeting: Mondays 18:00-20:00 Europe/Amsterdam; see [calendar](https://calendar.google.com/calendar/u/0/embed?src=b9o52fobqjak8oq8lfkhg3t0qg@group.calendar.google.com).
1. Code review on pull requests from [In review](#in-review).