Message ID | xmqqk0sevqlh.fsf@gitster.c.googlers.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 5a11de010b85261407d306768c119269d9297481 |
Headers | show |
Series | ci/install-depends: attempt to fix "brew cask" stuff | expand |
Junio C Hamano <gitster@pobox.com> writes: > It seems that homebrew suddenly started giving us trouble, like this: > > https://github.com/git/git/runs/1705953982?check_suite_focus=true#step:3:70 > > Here is my attempt to work it around by blindly following the > suggested course of action in the error message, without knowing > what I am doing X-<. I am not a Mac person. > > What is frustrating is that every time we hit a minor snag like this > to break one of the jobs, all other unrelated jobs are also taken > down. > > Help by those who know what they are doing on macOS would greatly be > appreciated. Thanks. After seeing 'seen' with this patch at its tip pass the tests [*1*], I prepared a merge of this change into the tip of 'next' and tentatively updated 'seen' with it. The test is still running [*2*], but the problematic part in the macOS build has already passed, so I am planning to fast-track this change down to 'next', 'master' and eventually down to 'maint' to keep the CI going to help other platforms catch more interesting problems. Help from those who are more familiar with macOS and homebrew is still appreciated, though. [References] *1* https://github.com/git/git/actions/runs/486978562 *2* https://github.com/git/git/runs/1706704233?check_suite_focus=true#step:3:81 > ----- >8 ----- >8 ----- >8 ----- >8 ----- >8 ----- >8 ----- > We run "git pull" against "$cask_repo"; clarify that we are > expecting not to have any of our own modifications and running "git > pull" to merely update, by passing "--ff-only" on the command line. > > Also, the "brew cask install" command line triggers an error message > that says: > > Error: Calling brew cask install is disabled! Use brew install > [--cask] instead. > > In addition, "brew install caskroom/cask/perforce" step triggers an > error that says: > > Error: caskroom/cask was moved. Tap homebrew/cask instead. > > Attempt to see if blindly following the suggestion in these error > messages gets us into a better shape. > > Signed-off-by: Junio C Hamano <gitster@pobox.com> > --- > ci/install-dependencies.sh | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh > index 0229a77f7d..0b1184e04a 100755 > --- a/ci/install-dependencies.sh > +++ b/ci/install-dependencies.sh > @@ -44,13 +44,13 @@ osx-clang|osx-gcc) > test -z "$BREW_INSTALL_PACKAGES" || > brew install $BREW_INSTALL_PACKAGES > brew link --force gettext > - brew cask install --no-quarantine perforce || { > + brew install --cask --no-quarantine perforce || { > # Update the definitions and try again > cask_repo="$(brew --repository)"/Library/Taps/homebrew/homebrew-cask && > - git -C "$cask_repo" pull --no-stat && > - brew cask install --no-quarantine perforce > + git -C "$cask_repo" pull --no-stat --ff-only && > + brew install --cask --no-quarantine perforce > } || > - brew install caskroom/cask/perforce > + brew install homebrew/cask/perforce > case "$jobname" in > osx-gcc) > brew install gcc@9
On Fri, Jan 15, 2021 at 12:50 AM Junio C Hamano <gitster@pobox.com> wrote: > Junio C Hamano <gitster@pobox.com> writes: > > It seems that homebrew suddenly started giving us trouble, like this: > > https://github.com/git/git/runs/1705953982?check_suite_focus=true#step:3:70 > > > After seeing 'seen' with this patch at its tip pass the tests [*1*], > I prepared a merge of this change into the tip of 'next' and > tentatively updated 'seen' with it. > > The test is still running [*2*], but the problematic part in the > macOS build has already passed, so I am planning to fast-track this > change down to 'next', 'master' and eventually down to 'maint' to > keep the CI going to help other platforms catch more interesting > problems. > > Help from those who are more familiar with macOS and homebrew is > still appreciated, though. I looked over your Homebrew-related changes, and they seem fine. The `brew install homebrew/cask/perforce` command could be shortened to `brew install perforce`, at least at the present, but being explicit with the long identifier is probably better anyhow.
On 1/15/2021 12:48 AM, Junio C Hamano wrote: > Junio C Hamano <gitster@pobox.com> writes: > >> It seems that homebrew suddenly started giving us trouble, like this: >> >> https://github.com/git/git/runs/1705953982?check_suite_focus=true#step:3:70 >> >> Here is my attempt to work it around by blindly following the >> suggested course of action in the error message, without knowing >> what I am doing X-<. I am not a Mac person. >> >> What is frustrating is that every time we hit a minor snag like this >> to break one of the jobs, all other unrelated jobs are also taken >> down. >> >> Help by those who know what they are doing on macOS would greatly be >> appreciated. Thanks. > > After seeing 'seen' with this patch at its tip pass the tests [*1*], > I prepared a merge of this change into the tip of 'next' and > tentatively updated 'seen' with it. > > The test is still running [*2*], but the problematic part in the > macOS build has already passed, so I am planning to fast-track this > change down to 'next', 'master' and eventually down to 'maint' to > keep the CI going to help other platforms catch more interesting > problems. > > Help from those who are more familiar with macOS and homebrew is > still appreciated, though. > > > [References] > > *1* https://github.com/git/git/actions/runs/486978562 > *2* https://github.com/git/git/runs/1706704233?check_suite_focus=true#step:3:81 We recently hit this same issue with our macOS builds for Scalar and GCM core. Your solution looks very similar to how we fixed the problem. Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Derrick Stolee <stolee@gmail.com> writes: > On 1/15/2021 12:48 AM, Junio C Hamano wrote: >> Junio C Hamano <gitster@pobox.com> writes: >> >>> It seems that homebrew suddenly started giving us trouble, like this: >>> >>> https://github.com/git/git/runs/1705953982?check_suite_focus=true#step:3:70 >>> >>> Here is my attempt to work it around by blindly following the >>> suggested course of action in the error message, without knowing >>> what I am doing X-<. I am not a Mac person. >>> >>> What is frustrating is that every time we hit a minor snag like this >>> to break one of the jobs, all other unrelated jobs are also taken >>> down. >>> >>> Help by those who know what they are doing on macOS would greatly be >>> appreciated. Thanks. >> >> After seeing 'seen' with this patch at its tip pass the tests [*1*], >> I prepared a merge of this change into the tip of 'next' and >> tentatively updated 'seen' with it. >> >> The test is still running [*2*], but the problematic part in the >> macOS build has already passed, so I am planning to fast-track this >> change down to 'next', 'master' and eventually down to 'maint' to >> keep the CI going to help other platforms catch more interesting >> problems. >> >> Help from those who are more familiar with macOS and homebrew is >> still appreciated, though. >> >> >> [References] >> >> *1* https://github.com/git/git/actions/runs/486978562 >> *2* https://github.com/git/git/runs/1706704233?check_suite_focus=true#step:3:81 > > We recently hit this same issue with our macOS builds for Scalar > and GCM core. Your solution looks very similar to how we fixed > the problem. > > Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Thanks.
On Thu, Jan 14, 2021 at 07:14:34PM -0800, Junio C Hamano wrote: > It seems that homebrew suddenly started giving us trouble, like this: > > https://github.com/git/git/runs/1705953982?check_suite_focus=true#step:3:70 > > Here is my attempt to work it around by blindly following the > suggested course of action in the error message, without knowing > what I am doing X-<. I am not a Mac person. > > What is frustrating is that every time we hit a minor snag like this > to break one of the jobs, all other unrelated jobs are also taken > down. I think that has to do with the grouping in the workflow file (the Windows builds, for example, will cancel the _other_ Windows tests if they fail, but not the Linux ones). So we could split the macos test out. It would probably involve duplicating a little bit of the content, but we do something similar for the dockerized builds. It might be that there is an option we can set to say "keep building the others even if one of these fails", which would give us the best of both. > brew install $BREW_INSTALL_PACKAGES > brew link --force gettext > - brew cask install --no-quarantine perforce || { > + brew install --cask --no-quarantine perforce || { On a related note, it feels like perforce is a frequent offender for triggering spurious failures (both for homebrew setup, but I have definitely seen racy/flaky failures from it as well). I am tempted to say it is not worth the trouble, but then I do not care at all about git-p4 myself in the first place, so I may be biased. -Peff
On Fri, Jan 15, 2021 at 3:00 PM Jeff King <peff@peff.net> wrote: > > - brew cask install --no-quarantine perforce || { > > + brew install --cask --no-quarantine perforce || { > > On a related note, it feels like perforce is a frequent offender for > triggering spurious failures (both for homebrew setup, but I have > definitely seen racy/flaky failures from it as well). I am tempted to > say it is not worth the trouble, but then I do not care at all about > git-p4 myself in the first place, so I may be biased. To be fair to 'perforce', though, the fault of this particular problem is Homebrew, which doesn't seem to be all that concerned about backward compatibility, at least in my experience. The single Homebrew-related automation script I wrote for personal use has been broken by arbitrary Homebrew changes frustratingly often over the last three years.
On Fri, Jan 15, 2021 at 03:07:38PM -0500, Eric Sunshine wrote: > On Fri, Jan 15, 2021 at 3:00 PM Jeff King <peff@peff.net> wrote: > > > - brew cask install --no-quarantine perforce || { > > > + brew install --cask --no-quarantine perforce || { > > > > On a related note, it feels like perforce is a frequent offender for > > triggering spurious failures (both for homebrew setup, but I have > > definitely seen racy/flaky failures from it as well). I am tempted to > > say it is not worth the trouble, but then I do not care at all about > > git-p4 myself in the first place, so I may be biased. > > To be fair to 'perforce', though, the fault of this particular problem > is Homebrew, which doesn't seem to be all that concerned about > backward compatibility, at least in my experience. The single > Homebrew-related automation script I wrote for personal use has been > broken by arbitrary Homebrew changes frustratingly often over the last > three years. Yeah, sorry, I should have been more precise in my language. None of this is perforce's fault at all. It is homebrew in this case, and in the racy cases it is probably our tests. But I do not feel like trying to debug those races for a tool I don't care much about. I tried to dig up some failing logs as an example, but it's actually a bit hard to search for, and it looks like logs get expired after a few months anyway. -Peff
Hi everyone, Le 2021-01-15 à 14:59, Jeff King a écrit : > On Thu, Jan 14, 2021 at 07:14:34PM -0800, Junio C Hamano wrote: > >> It seems that homebrew suddenly started giving us trouble, like this: >> >> https://github.com/git/git/runs/1705953982?check_suite_focus=true#step:3:70 >> >> Here is my attempt to work it around by blindly following the >> suggested course of action in the error message, without knowing >> what I am doing X-<. I am not a Mac person. >> >> What is frustrating is that every time we hit a minor snag like this >> to break one of the jobs, all other unrelated jobs are also taken >> down. > > I think that has to do with the grouping in the workflow file (the > Windows builds, for example, will cancel the _other_ Windows tests if > they fail, but not the Linux ones). > > So we could split the macos test out. It would probably involve > duplicating a little bit of the content, but we do something similar for > the dockerized builds. It might be that there is an option we can set to > say "keep building the others even if one of these fails", which would > give us the best of both. Yes, a quick Google search pointed me to a blog post [1] that mentions using 'fail-fast: false' in the test matrix so that one failing job does not automatically cancel the rest of the jobs in the matrix (the default is 'true') [2]. If we apply that to all four matrices in the workflow file, (windows-test, vs-test, regular and dockerized), it would be something like this: ~~~ diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index aef6643648..6c31f978bc 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -125,6 +125,7 @@ jobs: strategy: matrix: nr: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] + fail-fast: false steps: - uses: actions/checkout@v1 - name: download build artifacts @@ -229,6 +230,7 @@ jobs: strategy: matrix: nr: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] + fail-fast: false steps: - uses: actions/checkout@v1 - name: download git-sdk-64-minimal @@ -289,6 +291,7 @@ jobs: - jobname: GETTEXT_POISON cc: gcc pool: ubuntu-latest + fail-fast: false env: CC: ${{matrix.vector.cc}} jobname: ${{matrix.vector.jobname}} @@ -315,6 +318,7 @@ jobs: image: alpine - jobname: Linux32 image: daald/ubuntu32:xenial + fail-fast: false env: jobname: ${{matrix.vector.jobname}} runs-on: ubuntu-latest ~~~ I've CC-ed Dscho regarding if we also want this for the Windows tests, (I don't see why not) and if we feel it's a good idea I can send a proper patch. Cheers, Philippe. [1] https://www.edwardthomson.com/blog/github_actions_6_fail_fast_matrix_workflows.html [2] https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#jobsjob_idstrategyfail-fast
Philippe Blain <levraiphilippeblain@gmail.com> writes: > Yes, a quick Google search pointed me to a blog post [1] > that mentions using 'fail-fast: false' in the test matrix so that > one failing job does not automatically cancel the rest of the jobs > in the matrix (the default is 'true') [2]. > > If we apply that to all four matrices in the workflow file, > (windows-test, vs-test, regular and dockerized), it would be > something like this: I am assuming that "failure of build stops tests from running", which is done on the windows/vs-build side, is still kept with this change. If that is the case, it sounds quite sensible. > I've CC-ed Dscho regarding if we also want this for the Windows tests, > (I don't see why not) and if we feel it's a good idea I can > send a proper patch. I often see windows/vs-test #4 fail first and take other 9 with it. We may see some interesting patterns to emerge, when we run all of them (and let all of them fail X-<). Thanks for working on this.
On Sat, Jan 23, 2021 at 09:41:29PM -0500, Philippe Blain wrote: > I've CC-ed Dscho regarding if we also want this for the Windows tests, > (I don't see why not) and if we feel it's a good idea I can > send a proper patch. :-). The first thought I had when I read your email was: "shouldn't we turn this on in those sharded Windows tests, too?" So, yeah, I think that this is a good idea for the windows- and vs-build tests, too. Thanks, Taylor
On Sat, Jan 23, 2021 at 09:41:29PM -0500, Philippe Blain wrote: > > So we could split the macos test out. It would probably involve > > duplicating a little bit of the content, but we do something similar for > > the dockerized builds. It might be that there is an option we can set to > > say "keep building the others even if one of these fails", which would > > give us the best of both. > > Yes, a quick Google search pointed me to a blog post [1] > that mentions using 'fail-fast: false' in the test matrix so that > one failing job does not automatically cancel the rest of the jobs > in the matrix (the default is 'true') [2]. > > If we apply that to all four matrices in the workflow file, > (windows-test, vs-test, regular and dockerized), it would be > something like this: Thanks, that's exactly what I think we'd want. The downside, of course, is that a failure that will happen on every platform will mean wasted CPU to trigger the same failure over and over. But: - I rarely see that myself, because I wouldn't bother pushing up to CI until "make test" passed locally. So usually I'm finding portability issues via CI. Other people might be different, though. - we already have the Windows tests in a separate matrix anyway, so a failure on Linux would run the whole Windows suite (which is an order of magnitude more expensive) - even within the Windows matrix, I think running the rest of the tests after a failure is still valuable. If there's a second failure, you save a round-trip to CI (so it doesn't reduce CPU, but it may help latency to reach a passing state). -Peff
Jeff King <peff@peff.net> writes: > The downside, of course, is that a failure that will happen on every > platform will mean wasted CPU to trigger the same failure over and over. > But: > > - I rarely see that myself, because I wouldn't bother pushing up to CI > until "make test" passed locally. So usually I'm finding portability > issues via CI. Other people might be different, though. This is the same for me. > - we already have the Windows tests in a separate matrix anyway, so a > failure on Linux would run the whole Windows suite (which is an > order of magnitude more expensive) But they tend to finish earlier than OSX and Ubuntu jobs; their sharding of the tests into 10 jobs may probably have something to do with this. > - even within the Windows matrix, I think running the rest of the > tests after a failure is still valuable. If there's a second > failure, you save a round-trip to CI (so it doesn't reduce CPU, but > it may help latency to reach a passing state).
On Fri, Jan 15, 2021 at 03:46:40PM -0500, Jeff King wrote: > On Fri, Jan 15, 2021 at 03:07:38PM -0500, Eric Sunshine wrote: > > > On Fri, Jan 15, 2021 at 3:00 PM Jeff King <peff@peff.net> wrote: > > > > - brew cask install --no-quarantine perforce || { > > > > + brew install --cask --no-quarantine perforce || { > > > > > > On a related note, it feels like perforce is a frequent offender for > > > triggering spurious failures (both for homebrew setup, but I have > > > definitely seen racy/flaky failures from it as well). I am tempted to > > > say it is not worth the trouble, but then I do not care at all about > > > git-p4 myself in the first place, so I may be biased. > > > > To be fair to 'perforce', though, the fault of this particular problem > > is Homebrew, which doesn't seem to be all that concerned about > > backward compatibility, at least in my experience. The single > > Homebrew-related automation script I wrote for personal use has been > > broken by arbitrary Homebrew changes frustratingly often over the last > > three years. > > Yeah, sorry, I should have been more precise in my language. None of > this is perforce's fault at all. It is homebrew in this case, and in the > racy cases it is probably our tests. But I do not feel like trying to > debug those races for a tool I don't care much about. > > I tried to dig up some failing logs as an example, but it's actually a > bit hard to search for, I keep an eye out for flaky test failures, because they are much more entertaining to hunt down than the "usual" bugs, but can't recall any involving Perforce in the last year, since 6026aff5bb (git-p4: cleanup better on error exit, 2020-01-29) fixed the last one I knew about. > and it looks like logs get expired after a few > months anyway. Travis CI can still show our builds from mid-2016: https://travis-ci.org/github/git/git/builds/146233889 though I couldn't dig out our very first build.
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh index 0229a77f7d..0b1184e04a 100755 --- a/ci/install-dependencies.sh +++ b/ci/install-dependencies.sh @@ -44,13 +44,13 @@ osx-clang|osx-gcc) test -z "$BREW_INSTALL_PACKAGES" || brew install $BREW_INSTALL_PACKAGES brew link --force gettext - brew cask install --no-quarantine perforce || { + brew install --cask --no-quarantine perforce || { # Update the definitions and try again cask_repo="$(brew --repository)"/Library/Taps/homebrew/homebrew-cask && - git -C "$cask_repo" pull --no-stat && - brew cask install --no-quarantine perforce + git -C "$cask_repo" pull --no-stat --ff-only && + brew install --cask --no-quarantine perforce } || - brew install caskroom/cask/perforce + brew install homebrew/cask/perforce case "$jobname" in osx-gcc) brew install gcc@9
It seems that homebrew suddenly started giving us trouble, like this: https://github.com/git/git/runs/1705953982?check_suite_focus=true#step:3:70 Here is my attempt to work it around by blindly following the suggested course of action in the error message, without knowing what I am doing X-<. I am not a Mac person. What is frustrating is that every time we hit a minor snag like this to break one of the jobs, all other unrelated jobs are also taken down. Help by those who know what they are doing on macOS would greatly be appreciated. Thanks. ----- >8 ----- >8 ----- >8 ----- >8 ----- >8 ----- >8 ----- We run "git pull" against "$cask_repo"; clarify that we are expecting not to have any of our own modifications and running "git pull" to merely update, by passing "--ff-only" on the command line. Also, the "brew cask install" command line triggers an error message that says: Error: Calling brew cask install is disabled! Use brew install [--cask] instead. In addition, "brew install caskroom/cask/perforce" step triggers an error that says: Error: caskroom/cask was moved. Tap homebrew/cask instead. Attempt to see if blindly following the suggestion in these error messages gets us into a better shape. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- ci/install-dependencies.sh | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)