Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

regex_matches baseline check fails due to rev-parse changes in git 2.47.0 #1622

Open
EliahKagan opened this issue Oct 13, 2024 · 13 comments
Open
Labels
acknowledged an issue is accepted as shortcoming to be fixed

Comments

@EliahKagan
Copy link
Member

EliahKagan commented Oct 13, 2024

Update: See #1622 (comment) on the cause of this bug.

Current behavior 😯

When git is git 2.47.0 and tests are run with GIX_TEST_IGNORE_ARCHIVES=1 to force fixture scripts--and in particular the make_rev_spec_parse_repos.sh script--to run, the find_youngest_matching_commit::regex_matches rev-spec test fails:

     Summary [  72.831s] 2560 tests run: 2559 passed (1 leaky), 1 failed, 3 skipped
        FAIL [   0.024s] gix::gix revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches
error: test run failed

More specifically:

        FAIL [   0.024s] gix::gix revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches

--- STDOUT:              gix::gix revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches ---

running 1 test
test revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches ... FAILED

failures:

failures:
    revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 297 filtered out; finished in 0.01s


--- STDERR:              gix::gix revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches ---
thread 'revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches' panicked at gix/tests/gix/revision/spec/from_bytes/util.rs:181:13:
assertion `left == right` failed: :/mes.age: left (ours) should match right (git): Ok(Spec { inner: Include(Sha1(ef80b4b77b167f326351c93284dc0eb00dd54ff4)), path: None, first_ref: None, second_ref: None, repo: Repository { kind: WorkTree { is_linked: false }, git_dir: "tests/fixtures/generated-do-not-edit/make_rev_spec_parse_repos/3848495901-unix/complex_graph/.git", work_dir: Some("tests/fixtures/generated-do-not-edit/make_rev_spec_parse_repos/3848495901-unix/complex_graph") } })
  left: Some(Include(Sha1(ef80b4b77b167f326351c93284dc0eb00dd54ff4)))
 right: Some(Include(Sha1(9f9eac6bd1cd4b4cc6a494f044b28c985a22972b)))
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

As detailed below, the assertion that fails is not actually the assertion present directly in the test case function, but is instead an assertion present in a helper that checks against a baseline generated by git when running a fixture script.

I have observed the failures in Ubuntu 24.04 LTS with git 2.47.0 installed from the git-core PPA, in Arch Linux with git 2.47.0-1 (which is a downstream build of git 2.47.0 that is essentially the same), and in Arch Linux with manual builds of the upstream git source code at the 2.47.0 tag. The latter is the most valuable, because I also built the 2.46.2 tag to verify that the failure does not occur. Full details of most runs are in this gist. As shown there, I verified that there are no other failures, that there are no failures when GIX_TEST_IGNORE_ARCHIVES=1 is not used, and that the failures are not triggered or otherwise affected by the changes in #1620.

The cause

The failure is triggered by a change in behavior from git 2.46.2 to git 2.47.0, where the computed baseline values placed in gix/tests/fixtures/generated-do-not-edit/make_rev_spec_parse_repos/3848495901-unix/complex_graph/baseline.git differ. The behavior of gitoxide agrees with the behavior of git 2.46.2 but disagrees with the behavior of git 2.47.0. Running the test suite in two worktrees, both at the tip of gitoxide's main branch, and diffing the generated baseline.git file in each of the generated fixture directories, reveals:

ek in 🌐 catenary in ~/repos
❯ git --no-pager diff --no-index gitoxide-with-git-{2.46.2,2.47.0}/gix/tests/fixtures/generated-do-not-edit/make_rev_spec_parse_repos/3848495901-unix/complex_graph/baseline.git
diff --git a/gitoxide-with-git-2.46.2/gix/tests/fixtures/generated-do-not-edit/make_rev_spec_parse_repos/3848495901-unix/complex_graph/baseline.git b/gitoxide-with-git-2.47.0/gix/tests/fixtures/generated-do-not-edit/make_rev_spec_parse_repos/3848495901-unix/complex_graph/baseline.git
index 50eaff2..cf023f9 100644
--- a/gitoxide-with-git-2.46.2/gix/tests/fixtures/generated-do-not-edit/make_rev_spec_parse_repos/3848495901-unix/complex_graph/baseline.git
+++ b/gitoxide-with-git-2.47.0/gix/tests/fixtures/generated-do-not-edit/make_rev_spec_parse_repos/3848495901-unix/complex_graph/baseline.git
@@ -1,11 +1,11 @@
 :/message
-ef80b4b77b167f326351c93284dc0eb00dd54ff4
+9f9eac6bd1cd4b4cc6a494f044b28c985a22972b
 :/!-message
-55e825ebe8fd2ff78cad3826afb696b96b576a7e
+0270e757e023fedde198947489215e34bdaf1502
 :/mes.age
-ef80b4b77b167f326351c93284dc0eb00dd54ff4
+9f9eac6bd1cd4b4cc6a494f044b28c985a22972b
 :/!-mes.age
-55e825ebe8fd2ff78cad3826afb696b96b576a7e
+0270e757e023fedde198947489215e34bdaf1502
 :/not there
 1
 @^{/!-B}

This is from:

baseline ":/message" # finds 'message recent' instead of 'initial message'
baseline ":/!-message" # above, negated
baseline ":/mes.age" # regexes work too
baseline ":/!-mes.age" # negated above

Where baseline is defined as:

function baseline() {
local spec=${1:?first argument is the spec to test}
{
echo "$spec"
git rev-parse -q --verify "$spec" 2>/dev/null || echo $?
}>> baseline.git
}

The change is due to different behavior of git rev-parse from git 2.46.2 to git 2.47.0, causing the expectations represented and commented there about which matching item it selected no longer to hold. The change does not seem to relate to any differences in how the different versions of git actually create the fixture repositories. This is shown by running it on both fixture repositories, after running tests in the gitoxide-with-git-2.46.2 and gitoxide-with-git-2.47.0 worktrees in the manner implied by those worktrees' names.

ek in 🌐 catenary in complex_graph on  main [?]
❯ pwd
/home/ek/repos/gitoxide-with-git-2.46.2/gix/tests/fixtures/generated-do-not-edit/make_rev_spec_parse_repos/3848495901-unix/complex_graph

ek in 🌐 catenary in complex_graph on  main [?]
❯ PATH="$HOME/git-2.46.2/bin:$PATH" git rev-parse ':/mes.age'
ef80b4b77b167f326351c93284dc0eb00dd54ff4

ek in 🌐 catenary in complex_graph on  main [?]
❯ PATH="$HOME/git-2.47.0/bin:$PATH" git rev-parse ':/mes.age'
9f9eac6bd1cd4b4cc6a494f044b28c985a22972b
ek in 🌐 catenary in complex_graph on  main [?]
❯ pwd
/home/ek/repos/gitoxide-with-git-2.47.0/gix/tests/fixtures/generated-do-not-edit/make_rev_spec_parse_repos/3848495901-unix/complex_graph

ek in 🌐 catenary in complex_graph on  main [?]
❯ PATH="$HOME/git-2.46.2/bin:$PATH" git rev-parse ':/mes.age'
ef80b4b77b167f326351c93284dc0eb00dd54ff4

ek in 🌐 catenary in complex_graph on  main [?]
❯ PATH="$HOME/git-2.47.0/bin:$PATH" git rev-parse ':/mes.age'
9f9eac6bd1cd4b4cc6a494f044b28c985a22972b

Thus, whichever version of git is used to produce the fixture repositories, the effect of git rev-parse varies solely by which version of git is used to run git rev-parse itself.

To show that the test failure is really happening in the baseline comparison and not elsewhere, consider the test failure output with a backtrace:

        FAIL [   0.587s] gix::gix revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches

--- STDOUT:              gix::gix revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches ---

running 1 test
test revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches ... FAILED

failures:

failures:
    revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 244 filtered out; finished in 0.58s


--- STDERR:              gix::gix revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches ---
failed to extract 'tests/fixtures/generated-archives/make_rev_spec_parse_repos.tar': Ignoring archive at 'tests/fixtures/generated-archives/make_rev_spec_parse_repos.tar' as GIX_TEST_IGNORE_ARCHIVES is set.
thread 'revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches' panicked at gix/tests/gix/revision/spec/from_bytes/util.rs:181:13:
assertion `left == right` failed: :/mes.age: left (ours) should match right (git): Ok(Spec { inner: Include(Sha1(ef80b4b77b167f326351c93284dc0eb00dd54ff4)), path: None, first_ref: None, second_ref: None, repo: Repository { kind: WorkTree { is_linked: false }, git_dir: "tests/fixtures/generated-do-not-edit/make_rev_spec_parse_repos/3848495901-unix/complex_graph/.git", work_dir: Some("tests/fixtures/generated-do-not-edit/make_rev_spec_parse_repos/3848495901-unix/complex_graph") } })
  left: Some(Include(Sha1(ef80b4b77b167f326351c93284dc0eb00dd54ff4)))
 right: Some(Include(Sha1(9f9eac6bd1cd4b4cc6a494f044b28c985a22972b)))
stack backtrace:
   0: rust_begin_unwind
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/std/src/panicking.rs:665:5
   1: core::panicking::panic_fmt
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/panicking.rs:74:14
   2: core::panicking::assert_failed_inner
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/panicking.rs:405:23
   3: core::panicking::assert_failed
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/panicking.rs:365:5
   4: gix::revision::spec::from_bytes::util::compare_with_baseline
             at ./tests/gix/revision/spec/from_bytes/util.rs:181:13
   5: gix::revision::spec::from_bytes::util::parse_spec_opts
             at ./tests/gix/revision/spec/from_bytes/util.rs:153:5
   6: gix::revision::spec::from_bytes::util::parse_spec
             at ./tests/gix/revision/spec/from_bytes/util.rs:196:5
   7: gix::revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches
             at ./tests/gix/revision/spec/from_bytes/regex.rs:99:13
   8: gix::revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches::{{closure}}
             at ./tests/gix/revision/spec/from_bytes/regex.rs:95:23
   9: core::ops::function::FnOnce::call_once
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/ops/function.rs:250:5
  10: core::ops::function::FnOnce::call_once
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

The regex_matches test has multiple assertions, but the failure happens when evaluating an expression to be used in the first assertion:

#[test]
#[cfg(feature = "revparse-regex")]
fn regex_matches() {
let repo = repo("complex_graph").unwrap();
assert_eq!(
parse_spec(":/mes.age", &repo).unwrap(),
Spec::from_id(hex_to_id("ef80b4b77b167f326351c93284dc0eb00dd54ff4").attach(&repo))
);

That assertion is not really what is failing. Rather, failure occurs in an assertion in the implementation of parse_spec in util.rs. parse_spec calls parse_spec_opts, which calls compare_with_baseline. Because parse_spec_opts passes BaselineExpectation::Same as the expectation parameter of compare_with_baseline, this assertion runs:

match expectation {
BaselineExpectation::Same => {
assert_eq!(
actual, expected,
"{spec}: left (ours) should match right (git): {res:?}"
);
}

That assertion is what is actually failing.

Expected behavior 🤔

All tests should pass even when run with GIX_TEST_IGNORE_ARCHIVES=1, but I am not sure whether the old or new rev-parse behavior should be preferred. If the new behavior is preferred then I think the behavior of gitoxide will also have to change, and in this case, something should probably be done so that GIX_TEST_IGNORE_ARCHIVES=1 can be used even with older versions of git.

I don't know if the new git behavior is intentional, and if not then this might be considered a bug in git rather than in gitoxide or in gitoxide's test suite. See below regarding the specific git behavior involved here.

To the best of my knowledge, this is the only case where baseline assertions checked in the gitoxide test suite fail outside of Windows. On Windows, we do have such failures, comprising at least some of the failures reported in #1358; see also the comment discussion in #1345, especially at and below #1345 (comment). But as far as I know there is no relationship, conceptual or otherwise, between the failure here and any of the failures in #1358, other than both being related to baseline assertions. Furthermore, the failures there are probably lower priority, because we do not currently ever commit archives generated by running the test suite on Windows with GIX_TEST_IGNORE_ARCHIVES=1, and because CI does not run tests that way.

In contrast, this issue will eventually cause the test job on CI to fail, once the GitHub-hosted ubuntu-latest runner has git 2.47.0. Going by the details there, some macOS images already have git 2.47.0, but no others, yet.

Please note that this is not related to #1623, which fixes an unrelated failure in the CI test job that is already occurring.

Git behavior

The underlying cause here is a change in git behavior. In the hope that it will help identify the specific change, this section examines the specific differences corresponding to the baseline.git change shown above:

 :/message
-ef80b4b77b167f326351c93284dc0eb00dd54ff4
+9f9eac6bd1cd4b4cc6a494f044b28c985a22972b
 :/!-message
-55e825ebe8fd2ff78cad3826afb696b96b576a7e
+0270e757e023fedde198947489215e34bdaf1502
 :/mes.age
-ef80b4b77b167f326351c93284dc0eb00dd54ff4
+9f9eac6bd1cd4b4cc6a494f044b28c985a22972b
 :/!-mes.age
-55e825ebe8fd2ff78cad3826afb696b96b576a7e
+0270e757e023fedde198947489215e34bdaf1502

To investigate, I ran:

ek in 🌐 catenary in complex_graph on  main [?]
❯ pwd
/home/ek/repos/gitoxide-with-git-2.46.2/gix/tests/fixtures/generated-do-not-edit/make_rev_spec_parse_repos/3848495901-unix/complex_graph

ek in 🌐 catenary in complex_graph on  main [?]
❯ (PATH="$HOME/git-2.46.2/bin:$PATH"; git version; for x in ':/message' ':/!-message' ':/mes.age' ':/!-mes.age'; do (echo; set -x; git show "$(git rev-parse "$x")"); done) &>~/2.46.2

ek in 🌐 catenary in complex_graph on  main [?]
❯ (PATH="$HOME/git-2.47.0/bin:$PATH"; git version; for x in ':/message' ':/!-message' ':/mes.age' ':/!-mes.age'; do (echo; set -x; git show "$(git rev-parse "$x")"); done) &>~/2.47.0-cross
ek in 🌐 catenary in complex_graph on  main [?]
❯ pwd
/home/ek/repos/gitoxide-with-git-2.47.0/gix/tests/fixtures/generated-do-not-edit/make_rev_spec_parse_repos/3848495901-unix/complex_graph

ek in 🌐 catenary in complex_graph on  main [?]
❯ (PATH="$HOME/git-2.47.0/bin:$PATH"; git version; for x in ':/message' ':/!-message' ':/mes.age' ':/!-mes.age'; do (ech
o; set -x; git show "$(git rev-parse "$x")"); done) &>~/2.47.0

ek in 🌐 catenary in complex_graph on  main [?]
❯ (PATH="$HOME/git-2.46.2/bin:$PATH"; git version; for x in ':/message' ':/!-message' ':/mes.age' ':/!-mes.age'; do (echo; set -x; git show "$(git rev-parse "$x")"); done) &>~/2.46.2-cross

As expected, the logs 2.46.2 and 2.46.2-cross have the same contents, as do the logs 2.47.0 and 2.47.0-cross. The reason I expect this is that, as noted above, I think the only behavioral difference is in the git rev-parse commands and that the generated fixture repositories are (aside from their baseline.git files shown rev-parse results) equivalent in all relevant respects.

Here is the content of the 2.46.2 log:

git version 2.46.2

++ git rev-parse :/message
+ git show ef80b4b77b167f326351c93284dc0eb00dd54ff4
commit ef80b4b77b167f326351c93284dc0eb00dd54ff4
Author: A U Thor <author@example.com>
Date:   Thu Apr 7 15:21:13 2005 -0700

    C

     message recent

diff --git a/file b/file
index edc79ae..419b800 100644
--- a/file
+++ b/file
@@ -1,3 +1,4 @@
 i
 j
 f
+c

++ git rev-parse ':/!-message'
+ git show 55e825ebe8fd2ff78cad3826afb696b96b576a7e
commit 55e825ebe8fd2ff78cad3826afb696b96b576a7e
Merge: 5b3f9e2 ef80b4b
Author: A U Thor <author@example.com>
Date:   Thu Apr 7 15:22:13 2005 -0700

    A

diff --cc file
index a4e7183,419b800..fe27474
--- a/file
+++ b/file
@@@ -1,8 -1,4 +1,10 @@@
 +g
 +h
  i
  j
 +d
 +e
  f
 +b
+ c
++a

++ git rev-parse :/mes.age
+ git show ef80b4b77b167f326351c93284dc0eb00dd54ff4
commit ef80b4b77b167f326351c93284dc0eb00dd54ff4
Author: A U Thor <author@example.com>
Date:   Thu Apr 7 15:21:13 2005 -0700

    C

     message recent

diff --git a/file b/file
index edc79ae..419b800 100644
--- a/file
+++ b/file
@@ -1,3 +1,4 @@
 i
 j
 f
+c

++ git rev-parse ':/!-mes.age'
+ git show 55e825ebe8fd2ff78cad3826afb696b96b576a7e
commit 55e825ebe8fd2ff78cad3826afb696b96b576a7e
Merge: 5b3f9e2 ef80b4b
Author: A U Thor <author@example.com>
Date:   Thu Apr 7 15:22:13 2005 -0700

    A

diff --cc file
index a4e7183,419b800..fe27474
--- a/file
+++ b/file
@@@ -1,8 -1,4 +1,10 @@@
 +g
 +h
  i
  j
 +d
 +e
  f
 +b
+ c
++a

In contrast, here is the content of the 2.47.0 log:

git version 2.47.0

++ git rev-parse :/message
+ git show 9f9eac6bd1cd4b4cc6a494f044b28c985a22972b
commit 9f9eac6bd1cd4b4cc6a494f044b28c985a22972b
Author: A U Thor <author@example.com>
Date:   Thu Apr 7 15:13:13 2005 -0700

    G

     initial message

diff --git a/file b/file
new file mode 100644
index 0000000..01058d8
--- /dev/null
+++ b/file
@@ -0,0 +1 @@
+g

++ git rev-parse ':/!-message'
+ git show 0270e757e023fedde198947489215e34bdaf1502
commit 0270e757e023fedde198947489215e34bdaf1502
Author: A U Thor <author@example.com>
Date:   Thu Apr 7 15:14:13 2005 -0700

    H

diff --git a/file b/file
new file mode 100644
index 0000000..6e9f0da
--- /dev/null
+++ b/file
@@ -0,0 +1 @@
+h

++ git rev-parse :/mes.age
+ git show 9f9eac6bd1cd4b4cc6a494f044b28c985a22972b
commit 9f9eac6bd1cd4b4cc6a494f044b28c985a22972b
Author: A U Thor <author@example.com>
Date:   Thu Apr 7 15:13:13 2005 -0700

    G

     initial message

diff --git a/file b/file
new file mode 100644
index 0000000..01058d8
--- /dev/null
+++ b/file
@@ -0,0 +1 @@
+g

++ git rev-parse ':/!-mes.age'
+ git show 0270e757e023fedde198947489215e34bdaf1502
commit 0270e757e023fedde198947489215e34bdaf1502
Author: A U Thor <author@example.com>
Date:   Thu Apr 7 15:14:13 2005 -0700

    H

diff --git a/file b/file
new file mode 100644
index 0000000..6e9f0da
--- /dev/null
+++ b/file
@@ -0,0 +1 @@
+h

Steps to reproduce 🕹

Most of the details are covered above, and the exact commands run for all tests can be seen in the gist.

How I built Git

I built, installed, and tested git 2.46.2 and git 2.47.0 on Arch Linux by running:

gh repo clone git/git
cd git
git switch -d v2.46.2
make prefix=~/git-2.46.2 -j10
make prefix=~/git-2.46.2 install
make prefix=~/git-2.46.2 test
gix clean -xde
git switch -d v2.47.0
make prefix=~/git-2.47.0 -j10
make prefix=~/git-2.47.0 install
make prefix=~/git-2.47.0 test

I could have done test and install in the other order (and test presumably didn't use the prefix value in any way). No unexpected failures occurred (i.e., as expected, some tests were reported as broken, but no tests had a status reported as failing).

In hindsight, I should have done gh repo clone git/git -- --recurse-submodules to get the sha1collisiondetection submodule. But I believe that makes no difference here: the tests do not seem to relate to the distinction between SHA1 and Hardened SHA1. Furthermore the failures only occur with 2.47.0, not with 2.46.2. Finally, I observe the failure with 2.47.0 when installed from the Arch Linux repositories on this system, as well as when installed from the git-core PPA on a separate Ubuntu 24.04.1 LTS system; I believe both of those builds do have hardened SHA-1.

How I tested gitoxide - full runs

Aside from some ad-hoc experiments running the full test suite on other systems to ensure that what I was observing was not due to the system being Arch Linux or to details of how I things up there, I ran the full test suite and varied three things, testing twelve combinations:

  • current tip of main (6487269) vs. before #1620 was merged (37c1e4c, called old-main in the gist)
  • git 2.46.2 vs. git 2.47.0 vs. downstream Arch Linux git 2.47.0-1 (2.47.0 and 2.47.0-1 behaved the same)
  • undefined GIX_TEST_IGNORE_ARCHIVES vs. defining GIX_TEST_IGNORE_ARCHIVES=1

In the gist, logs described as individual runs are actually runs both without, and then with, GIX_TEST_IGNORE_ARCHIVES=1.

I always ran gix clean -xde between runs (including between not defining GIX_TEST_IGNORE_ARCHIVES and defining it as 1).

I ran the test suite with cargo nextest run --all --no-fail-fast, varying the git command used by prefixing PATH="$HOME/git-2.46.2/bin:$PATH", PATH="$HOME/git-2.47.0/bin:$PATH", or (for the downstream git 2.47.0-1) no such prefix. For example:

PATH="$HOME/git-2.47.0/bin:$PATH" GIX_TEST_IGNORE_ARCHIVES=1 cargo nextest run --all --no-fail-fast

How I tested gitoxide - the specific failing test

To verify that the effect was not due to the interaction of multiple tests, I ran the specific failing test as well, including after running gix clean -xde. This is further detailed in the gist and it is also how I produced the output showing the backtrace. Specifically, the backtrace shown above was produced with:

cd gix/tests/gix/revision/spec/from_bytes
GIX_TEST_IGNORE_ARCHIVES=1 RUST_BACKTRACE=1 cargo nextest run regex_matches

(That used the downstream 2.47.0-1--which behaves the same as 2.47.0, at least with respect to what this issue investigates--and thus omits a PATH=... prefix.)

EliahKagan added a commit to EliahKagan/gitoxide that referenced this issue Oct 13, 2024
Per actions/runner-images#10636, the
`ubuntu-latest` GitHub Actions hosted runner is being migrated from
`ubuntu-22.04` to `ubuntu-24.04`. The `ci.yml` workflow, including
the full `test` job, use `ubuntu-latest`, and recently has switched
over to using Ubuntu 24.04 runner.

It makes sense to use this newer version, but that runner
apparently does not preinstall the headers required to build with
`-llzma`. That causes `just ci-test` to fail at
`cargo nextest run -p gix-testtools --features xz` with
`/usr/bin/ld: cannot find -llzma: No such file or directory`.

This installs the `liblzma-dev` package that provides the required
headers, so we can use Ubuntu 24.04 LTS while continuing to test
the `xz` feature of `gix-testtools`.

The failure this addresses was first observed in GitoxideLabs#1618 at 64e0f78:
https://github.com/GitoxideLabs/gitoxide/actions/runs/11304216949/job/31442302018?pr=1618

But it is not caused by any of the changes there, and it now occurs
whenever the `test` job is run, including if it is re-run at the
tip of the main branch (or any other branch), such as in:
https://github.com/EliahKagan/gitoxide/actions/runs/11317510844/job/31471040361

In both cases, expanding Set up job > Operating system shows
Ubuntu 24.04.1 LTS.

(The failure this addresses is also not related to GitoxideLabs#1622, which
documents an unrelated failure that is possible to observe locally
and that, if no change is made related to it, can be expected to
occur on CI in the `test` starting sometime soon, but that is not
yet seen on CI.)
EliahKagan added a commit to EliahKagan/gitoxide that referenced this issue Oct 14, 2024
Per actions/runner-images#10636, the
`ubuntu-latest` GitHub Actions hosted runner label is being
migrated from aliasing `ubuntu-22.04` to aliasing `ubuntu-24.04`.
Our `ci.yml` workflow, which includes the full `test` job, uses
`ubuntu-latest`, and recently has switched automatically to using
an Ubuntu 24.04 runner.

It makes sense to use this newer version, but that runner
apparently does not preinstall the headers required to build with
`-llzma`. That causes `just ci-test` to fail at
`cargo nextest run -p gix-testtools --features xz` with
`/usr/bin/ld: cannot find -llzma: No such file or directory`.

This commit installs the `liblzma-dev` package that provides the
required headers, so we can use Ubuntu 24.04 LTS while continuing
to test the `xz` feature of `gix-testtools`.

The failure this addresses was first observed in GitoxideLabs#1618 at 64e0f78:
https://github.com/GitoxideLabs/gitoxide/actions/runs/11304216949/job/31442302018?pr=1618

But it is not caused by any of the changes there, and it now occurs
whenever the `test` job is run, including if it is re-run at the
tip of the main branch (or any other branch), such as in:
https://github.com/EliahKagan/gitoxide/actions/runs/11317510844/job/31471040361

In both cases, expanding *Set up job* > *Operating system* shows
Ubuntu 24.04.1 LTS.

(The failure this addresses is also not related to GitoxideLabs#1622, which
documents an unrelated failure that is possible to observe locally
and that, if no change is made related to it, can be expected to
occur on CI in the `test` starting sometime soon, but that is not
yet seen on CI.)
@Byron Byron added the acknowledged an issue is accepted as shortcoming to be fixed label Oct 14, 2024
@Byron
Copy link
Member

Byron commented Oct 14, 2024

Thanks a lot for the detailed report!

It is my hope that once all the Windows tests are working reliably, the final step to mostly deterministic CI runs is to proactively test for changes on CI with GIX_TEST_IGNORE_ARCHIVES=1 set.

Looking at the graph itself, it's notable that Git v2.46.2 find [c], indeed the first matching commit…

Screenshot 2024-10-14 at 07 09 17

…but v2.47.0 finds [g], the first matching commit if one would follow first first parents first.

This probably has to do with a change in the underlying traversal order, and I vaguely remember that traversing the first parents first can be done as performance optimisation. I think I have seen that in the commit-graph traversal employed by merge-base searches.
Merge-bases are probably unrelated here, but some performance optimization might not be.

On the other hand, I also wouldn't be surprised if this is an unintended side-effect of another change to the way Git performs traversals.

Regarding the failing tests, it's obviously very dependent on the traversal order of the commit-graph even though that isn't what's under test. So I'd think that could be fixed by adjusting the commit messages to be unique enough to only allow one possible match.

@EliahKagan
Copy link
Member Author

So I'd think that could be fixed by adjusting the commit messages to be unique enough to only allow one possible match.

Isn't part of the purpose of these particular tests to check these kind of situations where traversal order matters? That is my impression because it seems like the comment here is documenting intent:

baseline ":/message" # finds 'message recent' instead of 'initial message'

Or is the idea that we should no longer attempt to test that?

@Byron
Copy link
Member

Byron commented Oct 14, 2024

That's indeed great evidence that this was intended all along. Thinking about it more, it does make sense as rev-parse needs to do a traversal so gitoxide has to try and match it.

It's a bit strange to see this was changed so fundamentally in v2.47, especially since the release notes don't seem to mention any of it.

Then I don't know what to do, as I do find the previous behaviour more intuitive. Maybe we just keep this open and wait, maybe it disappears as sudden as it appeared.

@EliahKagan
Copy link
Member Author

EliahKagan commented Oct 14, 2024

It is my hope that once #1358 are working reliably, the final step to mostly deterministic CI runs is to proactively test for changes on CI with GIX_TEST_IGNORE_ARCHIVES=1 set.

Do you also want macOS runs on CI with GIX_TEST_IGNORE_ARCHIVES=1? If so, then that's an opportunity to prepare for when this issue breaks the test job once ubuntu-latest has git 2.47.0, since I believe macOS runners already (since recently) have that version of git. This relates to...

Then I don't know what to do, as I do find the previous behaviour more intuitive. Maybe we just keep this open and wait, maybe it disappears as sudden as it appeared.

I agree that we should not remove or weaken the tests and that waiting might lead to a further update to Git that reverses the behavior.

It may be useful to prepare for CI breakage by making it so that the test suite automatically skips the affected test case when running on CI with git >=2.47.0. How best to go about that--and also when--may depend on the above macOS question.

(I encountered this issue when working on something else in gitoxide that is conceptually unrelated, and I'm in no hurry to do anything potentially unnecessary on this issue, so if the best way forward right now is just to wait, that is fine by me.)

@Byron
Copy link
Member

Byron commented Oct 15, 2024

Do you also want macOS runs on CI with GIX_TEST_IGNORE_ARCHIVES=1? If so, then that's an opportunity to prepare for when this issue breaks the test job once ubuntu-latest has git 2.47.0, since I believe macOS runners already (since recently) have that version of git.

Doing something depending on the version of Git could be interesting to gitoxide users as well. Unless you are interested in adding a version-dependent skip of the test, I think it's OK to wait until it breaks. It's well-known to two people now so I think the issue can easily be 'fixed' by ignoring the test on Linux as quick-fix, and to selectively skip it based on the parsed version in a follow-up.

It's a bit sad that once it's skipped, there is no way to easily learn if it still works - so maybe the skip should be implemented so that it will also test for the opposite case - "it's skipped, but the test actually succeeds, fix it".

@EliahKagan
Copy link
Member Author

EliahKagan commented Oct 15, 2024

Doing something depending on the version of Git could be interesting to gitoxide users as well.

If you mean having gitoxide's own search behavior vary based on what version of Git is installed, I recommend against doing that by default, because:

  1. It is hard to avoid relying on behavior one observes consistently during development, especially if one observes it both in interactive local use and on CI, and especially when one observes related software--in this case, Git--also to have that behavior. I also think developers of some Git-related software may be more likely to use more recent versions of Git than their users use (even if their users are also software developers).

    So if gitoxide behaves like the git command it finds on the system where it runs, for functionality where gitoxide will behave differently elsewhere, this would make it very easy for developers of applications that use gitoxide to write bugs based on implicit inaccurate assumptions that deployment or end-user environments match development and CI environments.

  2. Except in the narrow range of situations where both gitoxide is in a direct or indirect subprocess of git and git has itself prepended its git-core directory to PATH, there need not be any single value that should be understood as "the version of Git." Git may be absent or installed in multiple places at different versions. The git that gitoxide finds may even be a build that is bundled with a (third) application or otherwise serves a specific purpose.

    I think this is less of a problem when taking configuration from git--because the configuration comes from a file, gix config reveals its origin, and it can be overridden in a lower scope--than in inferring behavior from the version.

  3. This does not seem to be an area where gitoxide needs to behave the same as Git for reasons of compatibility. This is in contrast to something like GIT_CONFIG_PARAMETERS, where supporting it would be almost entirely for the purpose of interoperability with a locally available git, and where it would be necessary either to support or otherwise account for how its values use two different syntaxes depending on the git version.

These points relate indirectly (and in the case of the GIT_CONFIG_PARAMETERS example, directly) to some of my thoughts discussed in #1585 (review). Accordingly, I acknowledge that when I open the issues mentioned in #1585 (comment) (which I have not yet done), putting the broader ideas in more coherent form may end up revealing that I am mistaken about some of these things.

For the question of whether the gitoxide behavior related to this issue should match that of git 2.46.2 or git 2.47.0, I think that, unless this change goes away in a git 2.47.* patch release, we should eventually figure out what has caused the change, and whether it is intentional:

  • If it is unintentional and a bug, then it should be changed back in Git. In that case, I can report it to the Git mailing list, and gitoxide should not change to match the new behavior.
  • If it is unintentional but not a bug, and should not be changed back in Git, then gitoxide should be changed to match the new behavior, unless there are disadvantages to gitoxide doing that.
  • If it is intentional, then gitoxide should be changed to match the new behavior, unless there are serious disadvantages to doing so or it is infeasible to do so.
  • If it turns out that gitoxide ought to behave according to the older rule some of the time and according to the newer rule other times, then a gitoxide.* configuration variable could be added for it. This variable could recognize a value that causes gitoxide's behavior to be based on what Git version, if any, is found. But for the reasons above, I think that would still not be the best default value of this configuration variable.

I can eventually look further into how this change has arisen in Git. If I still cannot figure it out, then I could inquire on the Git mailing list. This specific research is not my immediate main priority for gitoxide-related research, but it is something I am willing to do eventually, if the old behavior is not restored in the mean time by a patch release.

Unless you are interested in adding a version-dependent skip of the test, I think it's OK to wait until it breaks.

Yes, I was thinking of doing something like that soon. I think the subexpression that fails in this way--and the others that would fail if it didn't--can be extracted and modified so that they do the check without a panic. If not, then the baseline check implementation--which is also part of the test suite--could be modified to allow this.

Then if the baseline checks triggered by the regex_matches test fail, check if both we are on CI and git reports its version as 2.47.0 or higher, and avoid failing the test in this case, or at least skip the assertions that would fail. If necessary, checking for CI could be omitted and this skipping logic could also be run locally.

However:

  • I would probably wait to do this until it is failing on CI (as you have suggested).

    The exception is if we are going to want to start running tests with GIX_TEST_IGNORE_ARCHIVES=1 on macOS on CI. The reason that would motivate me to make the change sooner is that, unlike on Windows, on macOS we do not have a long-standing situation where tests fail on macOS when run that way, and GitHub-hosted macOS runners seem already to have git 2.47.0 installed. So this would easily let me start with a situation where all but the affected test pass on CI.

  • There may be a better way to do it, that better keeps track of which tests are being skipped under what conditions and that provides utility beyond this specific test.

    If GIX_TEST_IGNORE_ARCHIVES or one or more other environment variables were to accept a syntax allowing specific fixtures to be overridden as having their archives ignored or not ignored, then CI steps could specify GIX_TEST_IGNORE_ARCHIVES=1 or something like it while also explicitly preventing specific broken fixtures from running.

    I plan to consider this idea further and open a PR to implement or an issue to discuss it if it still seems useful. But maybe you have thoughts on it now.

@Byron
Copy link
Member

Byron commented Oct 16, 2024

If you mean having gitoxide's own search behavior vary based on what version of Git is installed, I recommend against doing that by default, because:

No, that's (fortunately) not what I meant. The idea was to use this to selectively alter the expected outcome of the failing test, while providing this feature to gitoxide users. Some may rely on calling out to git for pushing/fetching, and maybe there are differences depending on the git version as well.

I can eventually look further into how this change has arisen in Git. If I still cannot figure it out, then I could inquire on the Git mailing list. This specific research is not my immediate main priority for gitoxide-related research, but it is something I am willing to do eventually, if the old behavior is not restored in the mean time by a patch release.

Thank you! I agree that there are more important things to do and that this regression can be taken care of later if it doesn't resolve itself.

I would probably wait to do this until it is failing on CI (as you have suggested).

I think so, too.

I plan to consider this idea further and open a PR to implement or an issue to discuss it if it still seems useful. But maybe you have thoughts on it now.

Let's wait, I think when this is implemented, if at all, decent solutions will be straightforward enough, and they can take different shapes independently of the first ideas shared here even.

@EliahKagan
Copy link
Member Author

Sounds good. If I notice that CI is immediately about to fail for this reason, then I'll make note of that. Otherwise, I'll wait until I observe it does actually fail. (I suggest keeping this issue open until either there is a lasting change in gitoxide that fixes it or the changed Git behavior has gone away.)

EliahKagan added a commit to EliahKagan/gitoxide that referenced this issue Oct 19, 2024
This temporarily suppresses baseline comparisons for two patterns
where the baseline checks fail if the `complex_graph` fixture
script was run in Git 2.47.0.

For now, they are unconditionally suppressed, regardless of the Git
version or whether generated archives are used. This change is
specific to the `find_youngest_matching_commit::regex_matches`
test.

This works around, but does not fix, issue GitoxideLabs#1622. The affected test
is still run, but the two directly affected baseline checks are
skipped.
@EliahKagan EliahKagan mentioned this issue Oct 19, 2024
Byron added a commit that referenced this issue Oct 19, 2024
On the CI linux workflow, archived baseline aren't used - instead fixtures will be re-evaluated.
As of Git 2.47, its behaviour changed which makes the following assertion fail.
We decided to just ignore it until it's clear that this isn't a bug - obviously the traversal order changed.

----

This temporarily suppresses baseline comparisons for two patterns
where the baseline checks fail if the `complex_graph` fixture
script was run in Git 2.47.0.

For now, they are unconditionally suppressed, regardless of the Git
version or whether generated archives are used. This change is
specific to the `find_youngest_matching_commit::regex_matches`
test.

This works around, but does not fix, issue #1622. The affected test
is still run, but the two directly affected baseline checks are
skipped.

Co-authored-by: Eliah Kagan <degeneracypressure@gmail.com>
Byron added a commit that referenced this issue Oct 19, 2024
On the CI linux workflow, archived baseline aren't used - instead fixtures will be re-evaluated.
As of Git 2.47, its behaviour changed which makes the following assertion fail.
We decided to just ignore it until it's clear that this isn't a bug - obviously the traversal order changed.

----

This temporarily suppresses baseline comparisons for two patterns
where the baseline checks fail if the `complex_graph` fixture
script was run in Git 2.47.0.

For now, they are unconditionally suppressed, regardless of the Git
version or whether generated archives are used. This change is
specific to the `find_youngest_matching_commit::regex_matches`
test.

This works around, but does not fix, issue #1622. The affected test
is still run, but the two directly affected baseline checks are
skipped.

Co-authored-by: Eliah Kagan <degeneracypressure@gmail.com>
Byron added a commit that referenced this issue Oct 19, 2024
On the CI linux workflow, archived baseline aren't used - instead fixtures will be re-evaluated.
As of Git 2.47, its behaviour changed which makes the following assertion fail.
We decided to just ignore it until it's clear that this isn't a bug - obviously the traversal order changed.

----

This temporarily suppresses baseline comparisons for two patterns
where the baseline checks fail if the `complex_graph` fixture
script was run in Git 2.47.0.

For now, they are unconditionally suppressed, regardless of the Git
version or whether generated archives are used. This change is
specific to the `find_youngest_matching_commit::regex_matches`
test.

This works around, but does not fix, issue #1622. The affected test
is still run, but the two directly affected baseline checks are
skipped.

Co-authored-by: Eliah Kagan <degeneracypressure@gmail.com>
EliahKagan added a commit to EliahKagan/gitoxide that referenced this issue Nov 4, 2024
In the preceding two commits, the make_rev_spec_parse_repos fixture
was modified to avoid giving extra executable permissions to a
loose object file where they are not needed, and the affected
fixture archive was regenerated. Though the permissions change is
itself good and causes no problems, the overall change caused two
problems, which are corrected here:

1. I had taken the opportunity to follow better practices when
   running commands in a shell script whose arguments are formed by
   parameter expansion: adding quoting where splitting and globbing
   is not intended but could in principle also be indicated; and
   preceding the argument formed this way with a `--` to designate
   it clearly as a non-option argument, since `chmod` follows the
   XBD Utility Syntax Guidelines, which include `--` recognition.

   While adding quoting was a good change (in this case, just for
   clarity that no expansions are intended), the way I added `--`
   created a new problem where none had existed. This is because I
   wrongly thought of it as separating non-filename arguments from
   filename arguments, which is incorrect: in `chmod`, a mode
   argument is neither an option or an operand to an option.
   Accordingly, only some implementations of `chmod` allow it to be
   placed after the mode.

   This commit corrects that by placing it before the mode argument
   instead, which is portable while still achieving the goal of
   establishing the argument after it as as never being meant to be
   interpreted as an option (regardless of whether the system's
   `chmod` recognizes options after non-option arguments).

2. Due to GitoxideLabs#1622, regenerating `make_rev_spec_parse_repos.tar` with
   Git 2.47.0 causes

       revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches

   to to fail on all systems with all versions of Git, whenever
   `GIX_TEST_IGNORE_ARCHIVES` is *not* set. This differs from the
   usual situation where it fails only when that *is* set and only
   when the available Git is >= 2.47.0. This causes the test to
   fail in the `test-fast` CI job, since the mitigation in GitoxideLabs#1635
   for when the tests are detected to be running on CI deliberately
   covers only the `GIX_TEST_IGNORE_ARCHIVES` case.

   In the previous commit, I had regenerated that archive on an
   Ubuntu 24.04 LTS system with Git 2.47.0 installed from the
   git-core PPA, causing this problem.

   This commit regenerates the archive again on a macOS 15.0.1
   system with Git 2.39.5 (Apple Git-154), using the command:

       TZ=UTC cargo nextest run --all --no-fail-fast

   All tests passed and the archive was successfully remade. I used
   `TZ=UTC` since I usually regenerate archives on a system whose
   time zone is configured to be UTC rather than local time, and
   more specifically because there is an unrelated bug (to be
   separately reported) causing an unrelated test to fail in some
   time zones in the two weeks that follow daylight saving time
   adjustments.
@EliahKagan
Copy link
Member Author

The cause

Aarni Koskela has found and reported a bug in Git that I am pretty sure is the cause of this issue:

The behavior of the :/ notation to select commits seems to have changed to no longer prioritize younger commits on the current HEAD. Instead, if an older commit is reachable from a named ref (e.g. a tag), it will be selected instead of the younger commit. [...]

Patrick Steinhardt has identified the cause of that bug and submitted a patch, noting:

It was reported that magic pathspecs started to return results in reverse recency order starting with Git v2.47.0. This behaviour bisects to 57fb139b5e (object-name: fix leaking commit list items, 2024-08-01), which has refactored get_oid_oneline() to plug a couple of memory leaks. [...]

The patch by Patrick Steinhardt is currently at v3.

Verification

Procedure

To verify that the baseline test failure described in this gitoxide issue is due to that Git bug, I:

  1. Checked that my clone of git is up to date with the master branch at git/git@e66fd72 (including the submodule).

  2. Applied the patch on a new branch by running:

    git switch -c object-name
    b4 am 20241206-pks-rev-parse-fix-reversed-list-v3-1-d934c17db168@pks.im
    git am v3_20241206_ps_object_name_fix_reversed_ordering_with_pattern_revisions.mbx
  3. Installed git both at master and the feature branch (running the test suite, so as to catch anything unexpected):

    make prefix=~/git-unpatched -j10
    make prefix=~/git-unpatched install
    make prefix=~/git-unpatched -j10 test
    git worktree add ../git-object-name object-name
    cd ../git-object-name
    make prefix=~/git-patched -j10
    make prefix=~/git-patched install
    make prefix=~/git-patched -j10 test
  4. Checked that my clone of gitoxide was, at that time, up to date. I did this with main at 67f20b1. Since then, #1717 and #1718 have been merged, but I don't expect them to affect this.

  5. Ran the gitoxide regex_matches test that has been failing with the system Git (this is an Arch Linux system with git 2.47.1-1), the unpatched git build from master, and the patched git build described above where v3 of the patch by Patrick Steinhardt is applied to master:

    GIX_TEST_IGNORE_ARCHIVES=1 cargo nextest run -p gix regex_matches
    git restore .
    gix clean -xd -m '*generated*' -e
    PATH="$HOME/git-unpatched/bin:$PATH" GIX_TEST_IGNORE_ARCHIVES=1 cargo nextest run -p gix regex_matches
    git restore .
    gix clean -xd -m '*generated*' -e
    PATH="$HOME/git-patched/bin:$PATH" GIX_TEST_IGNORE_ARCHIVES=1 cargo nextest run -p gix regex_matches

    As expected, this gitoixde test failed when PATH was set to find the system git 2.47.1-1, failed in the same way with an unpatched git build from master, and no longer failed with the patched git.

Detailed results

Full output can be seen in this gist. The interesting portion, where the only difference is whether git has the patch, is:

ek in 🌐 catenary in gitoxide on  main is 📦 v0.39.0 via 🦀 v1.83.0
❯ PATH="$HOME/git-unpatched/bin:$PATH" GIX_TEST_IGNORE_ARCHIVES=1 cargo nextest run -p gix regex_matches
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.34s
------------
 Nextest run ID 26331f82-4b82-4643-adb6-9a11c4769d7a with nextest profile: default
    Starting 1 test across 4 binaries (256 tests skipped)
        FAIL [   0.702s] gix::gix revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches

--- STDOUT:              gix::gix revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches ---

running 1 test
test revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches ... FAILED

failures:

failures:
    revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 252 filtered out; finished in 0.70s


--- STDERR:              gix::gix revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches ---
failed to extract 'tests/fixtures/generated-archives/make_rev_spec_parse_repos.tar': Ignoring archive at 'tests/fixtures/generated-archives/make_rev_spec_parse_repos.tar' as GIX_TEST_IGNORE_ARCHIVES is set.
thread 'revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches' panicked at gix/tests/gix/revision/spec/from_bytes/util.rs:181:13:
assertion `left == right` failed: :/mes.age: left (ours) should match right (git): Ok(Spec { inner: Include(Sha1(ef80b4b77b167f326351c93284dc0eb00dd54ff4)), path: None, first_ref: None, second_ref: None, repo: Repository { kind: WorkTree { is_linked: false }, git_dir: "tests/fixtures/generated-do-not-edit/make_rev_spec_parse_repos/1921312734-unix/complex_graph/.git", work_dir: Some("tests/fixtures/generated-do-not-edit/make_rev_spec_parse_repos/1921312734-unix/complex_graph") } })
  left: Some(Include(Sha1(ef80b4b77b167f326351c93284dc0eb00dd54ff4)))
 right: Some(Include(Sha1(9f9eac6bd1cd4b4cc6a494f044b28c985a22972b)))
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

  Cancelling due to test failure
------------
     Summary [   0.703s] 1 test run: 0 passed, 1 failed, 256 skipped
        FAIL [   0.702s] gix::gix revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches
error: test run failed

ek in 🌐 catenary in gitoxide on  main [!] is 📦 v0.39.0 via 🦀 v1.83.0
❯ git restore .

ek in 🌐 catenary in gitoxide on  main is 📦 v0.39.0 via 🦀 v1.83.0
❯ gix clean -xd -m '*generated*' -e
removing gix/tests/fixtures/generated-do-not-edit/ (🗑️)

ek in 🌐 catenary in gitoxide on  main is 📦 v0.39.0 via 🦀 v1.83.0
❯ PATH="$HOME/git-patched/bin:$PATH" GIX_TEST_IGNORE_ARCHIVES=1 cargo nextest run -p gix regex_matches
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.19s
------------
 Nextest run ID 42694ea4-6b5a-4ea7-8217-2bd8d7f183d2 with nextest profile: default
    Starting 1 test across 4 binaries (256 tests skipped)
        PASS [   0.606s] gix::gix revision::spec::from_bytes::regex::find_youngest_matching_commit::regex_matches
------------
     Summary [   0.606s] 1 test run: 1 passed, 256 skipped

Proposed plan

My guess is that the Git bugfix may make it into 2.47.2. Once a fixed version of Git makes it into the GitHub Actions runners, the workaround in #1635 can be reverted. Most versions of Git will have been unaffected and this only affects the baseline tests (the behavior of gitoxide itself is not implicated), so nothing should need to remain in place to try to support affected Git versions.

When the time comes, I will do a cursory check to see if it looks like any major operating systems have downstream versions of Git 2.47.0 or 2.47.1 (or others, if the fix comes in later than I expect) that have the bug and that may not be patched to fix it (for non-rolling releases, such a fix would not necessarily be integrated, since the bug is not severe or a vulnerability). If so, then maybe having a comment in the test will be worthwhile. Otherwise, even that should be unnecessary.

@Byron
Copy link
Member

Byron commented Dec 10, 2024

Thanks again for staying on the pulse of the Git project to learn about the pending fix, and of course for validating ahead of time that it will indeed fix the issue.

When looking at #1635 I was also reminded that you also predicted that this issue would happen in the first place - truly proactive!

Now it would be good if one could monitor the 'bill of software' of all images to know when they update, but I also think it's enough to return to this on a timer.

If so, then maybe having a comment in the test will be worthwhile.

I think a note to your comment here or similar hints that a patch is inbound would be useful. That way, even if reverting the changes is forgotten, there is a chance one will find the comment upon review. Please do feel free to submit a PR in any way you see fit.

EliahKagan added a commit to EliahKagan/gitoxide that referenced this issue Dec 10, 2024
This updates the comment about GitoxideLabs#1622, including by adding links to
Git mailing list posts by Aarni Koskela, who discovered the bug
that turns out to be the cause of this, and Patrick Steinhardt, who
analyzed the bug and wrote a fix (currently in testing), as well as
GitoxideLabs#1622 (comment),
which summarizes that and reports on its connection to GitoxideLabs#1622.

This also narrows partial suppression of the failing test code
(which consists of conditionally using `parse_spec_no_baseline`
instead of `parse_spec` in some assertions) so that it is only done
if Git is at one of the versions that is known to be affected.

If any future Git versions are affected, such as by the currently
cooking patch not being merged as soon as I expect, then this could
potentially fail on CI again. But that is something we would
probably want to find out about.
EliahKagan added a commit to EliahKagan/gitoxide that referenced this issue Dec 10, 2024
This builds on GitoxideLabs#1635 by:

- Updating the comment about GitoxideLabs#1622, including by adding links to
  Git mailing list posts by Aarni Koskela, who discovered the bug
  that turns out to be the cause of this, and Patrick Steinhardt,
  who analyzed the bug and wrote a fix (currently in testing); and
  GitoxideLabs#1622 (comment),
  which summarizes that and reports on its connection to GitoxideLabs#1622.

- Narrowing the partial suppression of the failing test code (which
  consists of conditionally using `parse_spec_no_baseline` instead
  of `parse_spec` in some assertions) so that it is only done if
  Git is at one of the versions that is known to be affected.

  If any future Git versions are affected, such as by the currently
  cooking patch not being merged as soon as I expect, then this
  could potentially fail on CI again. But that is something we
  would probably want to find out about.
EliahKagan added a commit to EliahKagan/gitoxide that referenced this issue Dec 10, 2024
This builds on GitoxideLabs#1635 by:

- Updating the comment about GitoxideLabs#1622, including by adding links to
  Git mailing list posts by Aarni Koskela, who discovered the bug
  that turns out to be the cause of this, and Patrick Steinhardt,
  who analyzed the bug and wrote a fix (currently in testing); and
  GitoxideLabs#1622 (comment),
  which summarizes that and reports on its connection to GitoxideLabs#1622.

- Narrowing the partial suppression of the failing test code (which
  consists of conditionally using `parse_spec_no_baseline` instead
  of `parse_spec` in some assertions) so that it is only done if
  Git is at one of the versions that is known to be affected.

  If any future Git versions are affected, such as by the currently
  cooking patch not being merged as soon as I expect, then this
  could potentially fail on CI again. But that is something we
  would probably want to find out about.
@EliahKagan
Copy link
Member Author

EliahKagan commented Dec 10, 2024

Now it would be good if one could monitor the 'bill of software' of all images to know when they update, but I also think it's enough to return to this on a timer.

I'm not clear on why, but the failure doesn't seem to occur on Windows. Currently the only systems with Git versions in the affected range that we test on CI with GIX_TEST_IGNORE_ARCHIVES=1 are Ubuntu and Windows. So really it is only the Ubuntu runner images' versions that are relevant.

It shouldn't be too hard to set something up to notify when a stable ubuntu-24.04 image is released with Git at version 2.47.2 or higher. But since there's only that one image to check, I think most likely no automation is needed.

I think a note to your comment here or similar hints that a patch is inbound would be useful. That way, even if reverting the changes is forgotten, there is a chance one will find the comment upon review. Please do feel free to submit a PR in any way you see fit.

I've opened #1719 for this.

@Byron
Copy link
Member

Byron commented Dec 10, 2024

Thanks again - I was thinking in way too complicated ways there 😅!

@EliahKagan
Copy link
Member Author

EliahKagan commented Jan 17, 2025

The fix is in Git 2.48.0 and higher, but it is not backported to newer Git 2.47.* point releases, of which there is now one, Git 2.47.2 (containing backported fixes for CVE-2024-50349 and CVE-2024-52006 but not for the unrelated non-security bug that results in this issue).

As shown in this gist, the problem does occur with Git 2.47.2. Although #1719 refined #1635, it seems to me that more changes will be in order:

  1. At any time, update the range of versions to skip some baseline checks on under specific (further) conditions from (2, 47, 0)..(2, 47, 2) to (2, 47, 0)..(2, 48, 0).
  2. When version 20250113 of the ubuntu-24.04 GitHub Actions runner image, which is currently in pre-release, has a production release, don't skip the baseline checks on CI anymore. That pre-release has Git 2.48.0, and my guess is that the production release will have Git 2.48.1 or higher; these are free of the bug, which only affects 2.47.*. Once the intended CI environment's git has the fix, we would want the checks to occur, since using an affected version of git on CI would itself be an unexpected condition we would want to know about, and also for the more general reason that the full test job should run all the fixture scripts and fully check their effects.
  3. Decide whether to preserve the special-case logic and enable it when running locally with a 2.47.* version--instead of removing it due to the availability of stable versions that don't have the problem, as we (or at least I) had originally planned to do. The argument for this is that it would make developing gitoxide locally easier for users of 2.47.*, which presumably will continue to receive security updates for some time and therefore may be reasonable to use. I recommend that this be done only if at least one major distribution has at least one release that packages Git 2.47.* (and does so without backporting their own downstream fix for the issue)--after all, this only affects gitoxide's test suite itself. That is something I have not yet researched.

Depending on the choices made, the outcome of this might simply be to remove the special-case logic entirely, which has the benefit of simplicity. We shouldn't do that right away, though, because the CI images aren't ready for (2). I might open a PR just for (1), since even if the logic is removed fairly soon thereafter, it will still be correct in the history of someone needs to experiment with it or bring it back.

Edit: I've opened #1774 for (1).

EliahKagan added a commit to EliahKagan/gitoxide that referenced this issue Jan 17, 2025
The bug in Git that causes GitoxideLabs#1622 has been fixed since 2.48.0 and
does not affect any versions prior to 2.47, but the fix is not
backported to subsequent 2.47.* point releases. In particular,
2.47.2 has been released, with backported security fixes, but it
does not have backported fixes for this non-security bug.

This builds on GitoxideLabs#1635 and GitoxideLabs#1719 by updating the range of Git
versions where we skip the affected baseline checks on CI (for
platforms this affects on our CI) from `(2, 47, 0)..(2, 47, 2)` to
`(2, 47, 0)..(2, 48, 0)`, i.e., with the exclusive upper bound
changed from 2.47.2 to the correct value of 2.48.0. This also
revises the comments accordingly.

For further details, see:
GitoxideLabs#1622 (comment)
(This change is item (1) there.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
acknowledged an issue is accepted as shortcoming to be fixed
Projects
None yet
Development

No branches or pull requests

2 participants