Message ID | pull.1404.git.1667482458622.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ci: avoid unnecessary builds | expand |
On Thu, Nov 03 2022, Johannes Schindelin via GitGitGadget wrote: > From: Johannes Schindelin <johannes.schindelin@gmx.de> > > Whenever a branch is pushed to a repository which has GitHub Actions > enabled, a bunch of new workflow runs are started. > > We sometimes see contributors push multiple branch updates in rapid > succession, which in conjunction with the impressive time swallowed by > even just a single CI build frequently leads to many queued-up runs. > > This is particularly problematic in the case of Pull Requests where a > single contributor can easily (inadvertently) prevent timely builds for > other contributors. The "timely" being an issue in git/git and/or gitgitgadget where CI time is a shared resource, but not in a <someuser>/git running CI just for <someuser>? > To help with this situation, let's use the `concurrency` feature of > GitHub workflows, essentially canceling GitHub workflow runs that are > obsoleted by more recent runs: > https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#concurrency In my own fork I very much use this concurrency not-cancel-in-progress intentionally. I.e. I'll run local tests, but also occasionally push to CI (particularly when I know I have bad local coverage). But that's very much an async process, sometimes I'll look at the (hopefully finished by then) CI in the morning. E.g. now I have a re-pushed branch where the last tip it was at is still running CI, but it's waiting just the ASAN job. Having that breadcrumb trail of "green" CI is very helpful, i.e. push every hour or so with something WIP, and be able to eventually see when things went wrong, if they went wrong. We have CI config for other such stuff, but I think it's probably hard to use in this case, as this involves cancelling the *other* job, not cancelling ourselves. So by the time we're past the config stage is when we might want to cancel. But doesn't this support the "if" syntax to narrow this down to the relevant repos where this would help, while leaving those forks where it isn't an issue to enact their own policy?
On Thu, Nov 03, 2022 at 01:34:18PM +0000, Johannes Schindelin via GitGitGadget wrote: > .github/workflows/check-whitespace.yml | 5 +++++ > .github/workflows/l10n.yml | 5 +++++ > .github/workflows/main.yml | 5 +++++ > 3 files changed, 15 insertions(+) Very nice. For my own workflow, I'll often do something like: - work on a topic - decide that the topic is ready for submission, so push it to GitHub - while running `git rebase master -x 'make -j40 DEVELOPER=1 test'`, I notice that an earlier patch is broken, so fix it - push the (updated) topic out ...and before preparing the rest of the series for submission, I'll cancel any in-progress runs, the results of which I no longer care about. It sounds like this does exactly that automatically, making older runs obsolete by newer ones. Thanks, Taylor
On Fri, Nov 04, 2022 at 02:46:23AM +0100, Ævar Arnfjörð Bjarmason wrote: > > This is particularly problematic in the case of Pull Requests where a > > single contributor can easily (inadvertently) prevent timely builds for > > other contributors. > > The "timely" being an issue in git/git and/or gitgitgadget where CI time > is a shared resource, but not in a <someuser>/git running CI just for > <someuser>? Yup, agreed. > > To help with this situation, let's use the `concurrency` feature of > > GitHub workflows, essentially canceling GitHub workflow runs that are > > obsoleted by more recent runs: > > https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#concurrency > > In my own fork I very much use this concurrency not-cancel-in-progress > intentionally. Interesting. I noted basically the opposite in my earlier reply[1] to Johannes, where the behavior I want is that newer pushes of the same topic supersede older ones that are currently in progress. But I think you make a compelling point (which doesn't match my own workflow, but I can see the utility of nonetheless). I was thinking that we could rely on something similar to the ci-config ref stuff from back in e76eec35540 (ci: allow per-branch config for GitHub Actions, 2020-05-07), but it looks like it'll be a little trickier than that, maybe impossible. We need to know about the state of the ci-config branch before we set the concurrency bits. So I think you *could* do something like: --- >8 --- diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 4fdf4d3552..f1ca364f96 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -2,11 +2,6 @@ name: CI on: [push, pull_request] -# Avoid unnecessary builds -concurrency: - group: ${{ github.workflow }}-${{ github.ref }} - cancel-in-progress: true - env: DEVELOPER: 1 @@ -39,7 +34,14 @@ jobs: then enabled=no fi + skip_concurrent=yes + if test -x config-repo/ci/config/skip-concurrent && + ! config-repo/ci/config/skip-concurrent '${{ github.ref }}' + then + skip_concurrent=no + fi echo "::set-output name=enabled::$enabled" + echo "::set-output name=skip_concurrent::$skip_concurrent" - name: skip if the commit or tree was already tested id: skip-if-redundant uses: actions/github-script@v3 @@ -86,6 +88,9 @@ jobs: name: win build needs: ci-config if: needs.ci-config.outputs.enabled == 'yes' + concurrency: + group: ${{ github.workflow }}-${{ github.ref }} + cancel-in-progress: needs.ci-config.outputs.skip_concurrent = 'yes' runs-on: windows-latest steps: - uses: actions/checkout@v2 --- 8< --- ...and similar "concurrency" blocks in each of the other jobs to define the settings at the job level instead of at the workflow level. So, it's doable, but a little gross. At the very least, it would satisfy Ævar's workflow requirements, too, since he could write a script that exits with non-zero status to avoid the new behavior. Thanks, Taylor [1]: https://lore.kernel.org/git/Y2R0YrQzKaUZzaPB@nand.local/
On Thu, Nov 03, 2022 at 10:23:56PM -0400, Taylor Blau wrote: > But I think you make a compelling point (which doesn't match my own > workflow, but I can see the utility of nonetheless). > > I was thinking that we could rely on something similar to the ci-config > ref stuff from back in e76eec35540 (ci: allow per-branch config for > GitHub Actions, 2020-05-07), but it looks like it'll be a little > trickier than that, maybe impossible. > > We need to know about the state of the ci-config branch before we set > the concurrency bits. So I think you *could* do something like: As an aside, I wish there was a way to interpret per-repo environment variables in the actual action config. The current ci-config stuff works, but it's pretty horrible because (if I understand correctly) it spins up a VM just to evaluate a glob and say "nope, no CI needed on this branch". So: 1. It's wasteful of resources, compared to a system where the Actions parser can evaluate a variable. 2. It makes the Actions results page for a repo ugly, because skipped branches clutter the output with "yes, I passed CI" even though all they passed was a trivial job to say "don't bother running more CI". 3. The problem you mention: it happens too late to affect the overall Actions flow, and instead individual jobs have to take it into account. When I wrote the original ci-config stuff I looked for an alternative. You _can_ set per-repo variables in the form of secrets, but I couldn't find a way to evaluate them at the top-level of the yaml file. But maybe that has improved in the meantime. I had looked against as recently as a month or two ago and didn't find anything, but I'm far from an expert on Actions. -Peff
On 11/3/22 9:34 AM, Johannes Schindelin via GitGitGadget wrote: > From: Johannes Schindelin <johannes.schindelin@gmx.de> > > Whenever a branch is pushed to a repository which has GitHub Actions > enabled, a bunch of new workflow runs are started. > > We sometimes see contributors push multiple branch updates in rapid > succession, which in conjunction with the impressive time swallowed by > even just a single CI build frequently leads to many queued-up runs. > > This is particularly problematic in the case of Pull Requests where a > single contributor can easily (inadvertently) prevent timely builds for > other contributors. As someone who is both the cause and the victim of this, I thank you for finding a way to reduce wasted CPU time. This patch looks good to me, though I'll need to trust the docs and your testing to be sure it will work. We will definitely see it in place as it merges into 'next' and 'main'. Thanks, -Stolee
On Mon, Nov 07, 2022 at 02:45:24PM -0500, Derrick Stolee wrote: > On 11/3/22 9:34 AM, Johannes Schindelin via GitGitGadget wrote: > > From: Johannes Schindelin <johannes.schindelin@gmx.de> > > > > Whenever a branch is pushed to a repository which has GitHub Actions > > enabled, a bunch of new workflow runs are started. > > > > We sometimes see contributors push multiple branch updates in rapid > > succession, which in conjunction with the impressive time swallowed by > > even just a single CI build frequently leads to many queued-up runs. > > > > This is particularly problematic in the case of Pull Requests where a > > single contributor can easily (inadvertently) prevent timely builds for > > other contributors. > > As someone who is both the cause and the victim of this, I > thank you for finding a way to reduce wasted CPU time. This > patch looks good to me, though I'll need to trust the docs > and your testing to be sure it will work. We will definitely > see it in place as it merges into 'next' and 'main'. I wonder how we should treat Ævar's concerns in this thread. I suspect that the vast majority of workflows wouldn't be affected, but I don't want to completely break Ævar's workflow, either ;-). Some kind of configuration mechanism like I proposed might be nice. Thoughts? Thanks, Taylor
On 11/7/22 2:53 PM, Taylor Blau wrote: > On Mon, Nov 07, 2022 at 02:45:24PM -0500, Derrick Stolee wrote: >> On 11/3/22 9:34 AM, Johannes Schindelin via GitGitGadget wrote: >>> From: Johannes Schindelin <johannes.schindelin@gmx.de> >>> >>> Whenever a branch is pushed to a repository which has GitHub Actions >>> enabled, a bunch of new workflow runs are started. >>> >>> We sometimes see contributors push multiple branch updates in rapid >>> succession, which in conjunction with the impressive time swallowed by >>> even just a single CI build frequently leads to many queued-up runs. >>> >>> This is particularly problematic in the case of Pull Requests where a >>> single contributor can easily (inadvertently) prevent timely builds for >>> other contributors. >> >> As someone who is both the cause and the victim of this, I >> thank you for finding a way to reduce wasted CPU time. This >> patch looks good to me, though I'll need to trust the docs >> and your testing to be sure it will work. We will definitely >> see it in place as it merges into 'next' and 'main'. > > I wonder how we should treat Ævar's concerns in this thread. I suspect > that the vast majority of workflows wouldn't be affected, but I don't > want to completely break Ævar's workflow, either ;-). > > Some kind of configuration mechanism like I proposed might be nice. > Thoughts? Taking a look at that sub-thread, I have two thoughts: 1. I don't think supporting a "multiple pushes of WIP work" scenario is a good use of "free" resources. If you want to test multiple versions of something, then use multiple branches (and I think Johannes's patch allows concurrent builds for distinct branch names). 2. The change you recommend requires running the job and deciding at runtime whether to do the actual build (unless I'm mistaken). It is better to let the workflow coordinator decide on concurrency before the stage where worker VMs are engaged. Either of these points may have an incorrect assumption, so I'm prepared to be wrong. Thanks, -Stolee
On Mon, Nov 07 2022, Derrick Stolee wrote: > On 11/7/22 2:53 PM, Taylor Blau wrote: >> On Mon, Nov 07, 2022 at 02:45:24PM -0500, Derrick Stolee wrote: >>> On 11/3/22 9:34 AM, Johannes Schindelin via GitGitGadget wrote: >>>> From: Johannes Schindelin <johannes.schindelin@gmx.de> >>>> >>>> Whenever a branch is pushed to a repository which has GitHub Actions >>>> enabled, a bunch of new workflow runs are started. >>>> >>>> We sometimes see contributors push multiple branch updates in rapid >>>> succession, which in conjunction with the impressive time swallowed by >>>> even just a single CI build frequently leads to many queued-up runs. >>>> >>>> This is particularly problematic in the case of Pull Requests where a >>>> single contributor can easily (inadvertently) prevent timely builds for >>>> other contributors. >>> >>> As someone who is both the cause and the victim of this, I >>> thank you for finding a way to reduce wasted CPU time. This >>> patch looks good to me, though I'll need to trust the docs >>> and your testing to be sure it will work. We will definitely >>> see it in place as it merges into 'next' and 'main'. >> >> I wonder how we should treat Ævar's concerns in this thread. I suspect >> that the vast majority of workflows wouldn't be affected, but I don't >> want to completely break Ævar's workflow, either ;-). >> >> Some kind of configuration mechanism like I proposed might be nice. >> Thoughts? > > Taking a look at that sub-thread, I have two thoughts: > > 1. I don't think supporting a "multiple pushes of WIP work" > scenario is a good use of "free" resources. If you want to > test multiple versions of something, then use multiple > branches (and I think Johannes's patch allows concurrent > builds for distinct branch names). The setting Taylor proposed in https://lore.kernel.org/git/Y2R3vJf1A2KOZwA7@nand.local/ is off by default, i.e. it would behave the same way as what Johannes is proposing, just give you (well, me) an opt-out from the default, without patching main.yml on every branch. So it seems like a win-win, why force others to change their workflow? Sure, I could push multiple branches, but you could also manually cancel your outstanding jobs before re-pushing... I agree that cancelling the outstanding job is a better default, and if we had to pick one or the other I'd say "sure", but if we can have both... > 2. The change you recommend requires running the job and > deciding at runtime whether to do the actual build > (unless I'm mistaken). It is better to let the workflow > coordinator decide on concurrency before the stage where > worker VMs are engaged. Doesn't the "concurrency" setting say "I am a job thaht's prepared to have its run cancelled if another job shows up with this <key>". I.e. we'd already be running the VM by definition if we're going to benefit from it (very tight races aside), and in any case getting to the "configure" stage doesn't require much in the way of resources, a few seconds of VM time... Whereas the "this tree already ran" *would* benefit from not starting a VM, but that's "I'm a new push, and maybe I shouldn't run at all", whereas this feature is "I always want to run, but someone might cancel me later". > Either of these points may have an incorrect assumption, so > I'm prepared to be wrong. I *think* you're wrong about #2, but I'm not sure either. I don't think you can be wrong about #1, "others should change their workflow to fit a new worldview" is more of a value-judgment :)
On 11/7/22 4:03 PM, Ævar Arnfjörð Bjarmason wrote: > > On Mon, Nov 07 2022, Derrick Stolee wrote: > >> On 11/7/22 2:53 PM, Taylor Blau wrote: >>> I wonder how we should treat Ævar's concerns in this thread. I suspect >>> that the vast majority of workflows wouldn't be affected, but I don't >>> want to completely break Ævar's workflow, either ;-). >>> >>> Some kind of configuration mechanism like I proposed might be nice. >>> Thoughts? >> >> Taking a look at that sub-thread, I have two thoughts: >> >> 1. I don't think supporting a "multiple pushes of WIP work" >> scenario is a good use of "free" resources. If you want to >> test multiple versions of something, then use multiple >> branches (and I think Johannes's patch allows concurrent >> builds for distinct branch names). > > The setting Taylor proposed in > https://lore.kernel.org/git/Y2R3vJf1A2KOZwA7@nand.local/ is off by > default, i.e. it would behave the same way as what Johannes is > proposing, just give you (well, me) an opt-out from the default, without > patching main.yml on every branch. > > So it seems like a win-win, why force others to change their workflow? > Sure, I could push multiple branches, but you could also manually cancel > your outstanding jobs before re-pushing... > > I agree that cancelling the outstanding job is a better default, and if > we had to pick one or the other I'd say "sure", but if we can have > both... >> Either of these points may have an incorrect assumption, so >> I'm prepared to be wrong. > > I *think* you're wrong about #2, but I'm not sure either. At the very least, the configurable option requires fetching the repo and checking out at least one file. I don't know how much it actually saves one way or another. > I don't think you can be wrong about #1, "others should change their > workflow to fit a new worldview" is more of a value-judgment :) True, but I think that the workflow you are trying to keep valid is also indistinguishable to the typical flow of force-pushing during incremental rewrites, so preserving your workflow will continue to add costs to that behavior. My value judgement is that experts can adapt their workflows as situations change for the better of the group. If the cost of doing the config option version is minimal over the global concurrency issue, then I say we should go that route. I just expect it to take up more resources than the strategy proposed in the initial patch. I wonder how we could determine this. Should we run a few CI jobs with some force-pushes in either approach (config turned off) so we know that cost? Thanks, -Stolee
On Mon, Nov 07, 2022 at 04:59:12PM -0500, Derrick Stolee wrote: > On 11/7/22 4:03 PM, Ævar Arnfjörð Bjarmason wrote: > > > > On Mon, Nov 07 2022, Derrick Stolee wrote: > > > >> On 11/7/22 2:53 PM, Taylor Blau wrote: > > >>> I wonder how we should treat Ævar's concerns in this thread. I suspect > >>> that the vast majority of workflows wouldn't be affected, but I don't > >>> want to completely break Ævar's workflow, either ;-). > >>> > >>> Some kind of configuration mechanism like I proposed might be nice. > >>> Thoughts? > >> > >> Taking a look at that sub-thread, I have two thoughts: > >> > >> 1. I don't think supporting a "multiple pushes of WIP work" > >> scenario is a good use of "free" resources. If you want to > >> test multiple versions of something, then use multiple > >> branches (and I think Johannes's patch allows concurrent > >> builds for distinct branch names). > > > > The setting Taylor proposed in > > https://lore.kernel.org/git/Y2R3vJf1A2KOZwA7@nand.local/ is off by > > default, i.e. it would behave the same way as what Johannes is > > proposing, just give you (well, me) an opt-out from the default, without > > patching main.yml on every branch. > > > > So it seems like a win-win, why force others to change their workflow? > > Sure, I could push multiple branches, but you could also manually cancel > > your outstanding jobs before re-pushing... > > > > I agree that cancelling the outstanding job is a better default, and if > > we had to pick one or the other I'd say "sure", but if we can have > > both... > > >> Either of these points may have an incorrect assumption, so > >> I'm prepared to be wrong. > > > > I *think* you're wrong about #2, but I'm not sure either. > > At the very least, the configurable option requires fetching the > repo and checking out at least one file. I don't know how much it > actually saves one way or another. To be clear, I think the savings here are minimal from a pure CPU-usage perspective. I'm more concerned with the expense of running a job which counts double-digit minutes against your available Actions runtime. I played around with the following, but I can't quite get Actions to like it. The error message I get (ref[1]) is something like: The workflow is not valid. .github/workflows/main.yml (Line: 96, Col: 27): Unexpected value 'needs.ci-config.outputs.skip_concurrent == 'yes'' .github/workflows/main.yml (Line: 123, Col: 27): Unexpected value 'needs.ci-config.outputs.skip_concurrent == 'yes'' But I think the patch below should more-or-less be what we're looking for: --- >8 --- Subject: [PATCH] ci: avoid unnecessary builds Whenever a branch is pushed to a repository which has GitHub Actions enabled, a bunch of new workflow runs are started. We sometimes see contributors push multiple branch updates in rapid succession, which in conjunction with the impressive time swallowed by even just a single CI build frequently leads to many queued-up runs. This is particularly problematic in the case of Pull Requests where a single contributor can easily (inadvertently) prevent timely builds for other contributors when using a shared repository. To help with this situation, let's use the `concurrency` feature of GitHub workflows, essentially canceling GitHub workflow runs that are obsoleted by more recent runs: https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#concurrency For workflows that *do* want the behavior in the pre-image of this patch, they can use the ci-config feature to disable the new behavior by adding an executable script on the ci-config branch called 'skip-concurrent' which terminates with a non-zero exit code. Original-patch-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- .github/workflows/check-whitespace.yml | 6 ++++ .github/workflows/l10n.yml | 6 ++++ .github/workflows/main.yml | 40 ++++++++++++++++++++++++-- 3 files changed, 50 insertions(+), 2 deletions(-) diff --git a/.github/workflows/check-whitespace.yml b/.github/workflows/check-whitespace.yml index ad3466ad16..0bcc9cffbd 100644 --- a/.github/workflows/check-whitespace.yml +++ b/.github/workflows/check-whitespace.yml @@ -9,6 +9,12 @@ on: pull_request: types: [opened, synchronize] +# Avoid unnecessary builds. Unlike the main CI jobs, these are not +# ci-configurable (but could be). +concurrency: + group: ${{ github.workflow }}-${{ github.ref }} + cancel-in-progress: true + jobs: check-whitespace: runs-on: ubuntu-latest diff --git a/.github/workflows/l10n.yml b/.github/workflows/l10n.yml index 27f72f0ff3..51fd46e6af 100644 --- a/.github/workflows/l10n.yml +++ b/.github/workflows/l10n.yml @@ -2,6 +2,12 @@ name: git-l10n on: [push, pull_request_target] +# Avoid unnecessary builds. Unlike the main CI jobs, these are not +# ci-configurable (but could be). +concurrency: + group: ${{ github.workflow }}-${{ github.ref }} + cancel-in-progress: true + jobs: git-po-helper: if: >- diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index bd6f75b8e0..87b5b369e1 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -11,6 +11,7 @@ jobs: runs-on: ubuntu-latest outputs: enabled: ${{ steps.check-ref.outputs.enabled }}${{ steps.skip-if-redundant.outputs.enabled }} + skip_concurrent: ${{ steps.check-ref.outputs.skip_concurrent }} steps: - name: try to clone ci-config branch run: | @@ -34,7 +35,15 @@ jobs: then enabled=no fi + + skip_concurrent=yes + if test -x config-repo/ci/config/skip-concurrent && + ! config-repo/ci/config/skip-concurrent '${{ github.ref }}' + then + skip_concurrent=no + fi echo "::set-output name=enabled::$enabled" + echo "::set-output name=skip_concurrent::$skip_concurrent" - name: skip if the commit or tree was already tested id: skip-if-redundant uses: actions/github-script@v3 @@ -82,6 +91,9 @@ jobs: needs: ci-config if: needs.ci-config.outputs.enabled == 'yes' runs-on: windows-latest + concurrency: + group: ${{ github.job.name }}-${{ github.ref }} + cancel-in-progress: needs.ci-config.outputs.skip_concurrent == 'yes' steps: - uses: actions/checkout@v2 - uses: git-for-windows/setup-git-for-windows-sdk@v1 @@ -101,11 +113,14 @@ jobs: windows-test: name: win test runs-on: windows-latest - needs: [windows-build] + needs: [ci-config, windows-build] strategy: fail-fast: false matrix: nr: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] + concurrency: + group: ${{ github.job.name }}-${{ github.ref }} + cancel-in-progress: needs.ci-config.outputs.skip_concurrent == 'yes' steps: - name: download tracked files and build artifacts uses: actions/download-artifact@v2 @@ -137,6 +152,9 @@ jobs: NO_PERL: 1 GIT_CONFIG_PARAMETERS: "'user.name=CI' 'user.email=ci@git'" runs-on: windows-latest + concurrency: + group: ${{ github.job.name }}-${{ github.ref }} + cancel-in-progress: needs.ci-config.outputs.skip_concurrent == 'yes' steps: - uses: actions/checkout@v2 - uses: git-for-windows/setup-git-for-windows-sdk@v1 @@ -184,11 +202,14 @@ jobs: vs-test: name: win+VS test runs-on: windows-latest - needs: vs-build + needs: [ci-config, vs-build] strategy: fail-fast: false matrix: nr: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] + concurrency: + group: ${{ github.job.name }}-${{ github.ref }} + cancel-in-progress: needs.ci-config.outputs.skip_concurrent == 'yes' steps: - uses: git-for-windows/setup-git-for-windows-sdk@v1 - name: download tracked files and build artifacts @@ -218,6 +239,9 @@ jobs: name: ${{matrix.vector.jobname}} (${{matrix.vector.pool}}) needs: ci-config if: needs.ci-config.outputs.enabled == 'yes' + concurrency: + group: ${{ github.job.name }}-${{ github.ref }} + cancel-in-progress: needs.ci-config.outputs.skip_concurrent == 'yes' strategy: fail-fast: false matrix: @@ -281,6 +305,9 @@ jobs: name: ${{matrix.vector.jobname}} (${{matrix.vector.image}}) needs: ci-config if: needs.ci-config.outputs.enabled == 'yes' + concurrency: + group: ${{ github.job.name }}-${{ github.ref }} + cancel-in-progress: needs.ci-config.outputs.skip_concurrent == 'yes' strategy: fail-fast: false matrix: @@ -316,6 +343,9 @@ jobs: env: jobname: StaticAnalysis runs-on: ubuntu-22.04 + concurrency: + group: ${{ github.job.name }}-${{ github.ref }} + cancel-in-progress: needs.ci-config.outputs.skip_concurrent == 'yes' steps: - uses: actions/checkout@v2 - run: ci/install-dependencies.sh @@ -327,6 +357,9 @@ jobs: env: jobname: sparse runs-on: ubuntu-20.04 + concurrency: + group: ${{ github.job.name }}-${{ github.ref }} + cancel-in-progress: needs.ci-config.outputs.skip_concurrent == 'yes' steps: - name: Download a current `sparse` package # Ubuntu's `sparse` version is too old for us @@ -345,6 +378,9 @@ jobs: name: documentation needs: ci-config if: needs.ci-config.outputs.enabled == 'yes' + concurrency: + group: ${{ github.job.name }}-${{ github.ref }} + cancel-in-progress: needs.ci-config.outputs.skip_concurrent == 'yes' env: jobname: Documentation runs-on: ubuntu-latest -- 2.38.0.16.g393fd4c6db --- 8< --- Thanks, Taylor [1]: https://github.com/ttaylorr/git/actions/runs/3414594555
On Mon, Nov 07 2022, Derrick Stolee wrote: > On 11/7/22 4:03 PM, Ævar Arnfjörð Bjarmason wrote: >> >> On Mon, Nov 07 2022, Derrick Stolee wrote: >> >>> On 11/7/22 2:53 PM, Taylor Blau wrote: > >>>> I wonder how we should treat Ævar's concerns in this thread. I suspect >>>> that the vast majority of workflows wouldn't be affected, but I don't >>>> want to completely break Ævar's workflow, either ;-). >>>> >>>> Some kind of configuration mechanism like I proposed might be nice. >>>> Thoughts? >>> >>> Taking a look at that sub-thread, I have two thoughts: >>> >>> 1. I don't think supporting a "multiple pushes of WIP work" >>> scenario is a good use of "free" resources. If you want to >>> test multiple versions of something, then use multiple >>> branches (and I think Johannes's patch allows concurrent >>> builds for distinct branch names). >> >> The setting Taylor proposed in >> https://lore.kernel.org/git/Y2R3vJf1A2KOZwA7@nand.local/ is off by >> default, i.e. it would behave the same way as what Johannes is >> proposing, just give you (well, me) an opt-out from the default, without >> patching main.yml on every branch. >> >> So it seems like a win-win, why force others to change their workflow? >> Sure, I could push multiple branches, but you could also manually cancel >> your outstanding jobs before re-pushing... >> >> I agree that cancelling the outstanding job is a better default, and if >> we had to pick one or the other I'd say "sure", but if we can have >> both... > >>> Either of these points may have an incorrect assumption, so >>> I'm prepared to be wrong. >> >> I *think* you're wrong about #2, but I'm not sure either. > > At the very least, the configurable option requires fetching the > repo and checking out at least one file. I don't know how much it > actually saves one way or another. It's already fetching the ci-config repo, so we're talking about the marginal cost of running the bit of shellscript to check if config-repo/ci/config/skip-concurrent is executable, and if not keeping the default config. >> I don't think you can be wrong about #1, "others should change their >> workflow to fit a new worldview" is more of a value-judgment :) > > True, but I think that the workflow you are trying to keep valid > is also indistinguishable to the typical flow of force-pushing > during incremental rewrites, so preserving your workflow will > continue to add costs to that behavior. I don't think it will, per the above. I mean, pedantically yes: But the cost of that "test -x and variable setting" is so trivial that it's not worth worrying about. > My value judgement is that experts can adapt their workflows as > situations change for the better of the group. Sure, I agree with that in zero-sum scenarios, or where it's a hassle to provide two things, and we need to pick one etc. I just don't see that being the case here. > If the cost of doing the config option version is minimal over > the global concurrency issue, then I say we should go that route. > I just expect it to take up more resources than the strategy > proposed in the initial patch. Based on what? That you read it as us cloning the ci-config repo just for this new proposed config, and missed that we're doing it already, or ...? > I wonder how we could determine this. Should we run a few CI > jobs with some force-pushes in either approach (config turned > off) so we know that cost? The incremental cost of that "test -x", or...? I'm not sure what you mean here.
On 11/7/2022 5:56 PM, Ævar Arnfjörð Bjarmason wrote: > > On Mon, Nov 07 2022, Derrick Stolee wrote: > >> On 11/7/22 4:03 PM, Ævar Arnfjörð Bjarmason wrote: >>> >>> On Mon, Nov 07 2022, Derrick Stolee wrote: >>>> Either of these points may have an incorrect assumption, so >>>> I'm prepared to be wrong. >>> >>> I *think* you're wrong about #2, but I'm not sure either. >> >> At the very least, the configurable option requires fetching the >> repo and checking out at least one file. I don't know how much it >> actually saves one way or another. > > It's already fetching the ci-config repo, so we're talking about the > marginal cost of running the bit of shellscript to check if > config-repo/ci/config/skip-concurrent is executable, and if not keeping > the default config. >> I wonder how we could determine this. Should we run a few CI >> jobs with some force-pushes in either approach (config turned >> off) so we know that cost? > > The incremental cost of that "test -x", or...? I'm not sure what you > mean here. The difference is that setting the concurrency globally allows Actions to decide the concurrency from the workflow file alone, before any jobs are run. This is done without a clone. The method introduced in e76eec35540 (ci: allow per-branch config for GitHub Actions, 2020-05-07) requires running the ci-config job at minimum to set the concurrency value, which involves doing a (very small, shallow) clone of the ci-config branch to determine if that file exists. Since this is the first job to run in the workflow, the global concurrency will only reduces the amount of compute consumed when pushes happen close together, but that is included for "oops I forgot to --amend" and other common cases. At the very least, the difference between the two mechanisms is greater than a 'test -x'. Thanks, -Stolee
Derrick Stolee <derrickstolee@github.com> writes: > On 11/3/22 9:34 AM, Johannes Schindelin via GitGitGadget wrote: >> From: Johannes Schindelin <johannes.schindelin@gmx.de> >> >> Whenever a branch is pushed to a repository which has GitHub Actions >> enabled, a bunch of new workflow runs are started. >> >> We sometimes see contributors push multiple branch updates in rapid >> succession, which in conjunction with the impressive time swallowed by >> even just a single CI build frequently leads to many queued-up runs. >> >> This is particularly problematic in the case of Pull Requests where a >> single contributor can easily (inadvertently) prevent timely builds for >> other contributors. > > As someone who is both the cause and the victim of this, I > thank you for finding a way to reduce wasted CPU time. This > patch looks good to me, though I'll need to trust the docs > and your testing to be sure it will work. We will definitely > see it in place as it merges into 'next' and 'main'. When I see breakages of 'seen' only at the CI, perhaps because it manifests only on macOS, I manually "bisected" by pushing various combinations of topics merged on top of 'master' and pushing the result out as 'seen' only to the GitHub repository, and not having to wait one to finish before pushing out another was a really nice feature. Of course, I could wait before pushing another out, but after seeing the last one close to successful completion in a few minutes and being able to push out the next one was a great timesaver, not only for the "few minutes", but for the countless minutes because I will have to concentrate on more than just a "few minutes" on another task if I have to switch to another task in order to wait for just a "few minutes" before pushing the trial merge out. So, that is the only concern I have with this change, but in general, not running jobs whose results are clearly not needed is a good idea. It just is "clearly" is hard to determine automatically.
Hi Taylor, On Mon, 7 Nov 2022, Taylor Blau wrote: > I'm more concerned with the expense of running a job which counts > double-digit minutes against your available Actions runtime. Me, too. The fact that we used up 50.000 build minutes before we were able to finalize v2.38.1 is quite concerning. > I played around with the following, but I can't quite get Actions to > like it. The error message I get (ref[1]) is something like: > > The workflow is not valid. .github/workflows/main.yml (Line: 96, Col: > 27): Unexpected value 'needs.ci-config.outputs.skip_concurrent == 'yes'' > .github/workflows/main.yml (Line: 123, Col: 27): Unexpected value > 'needs.ci-config.outputs.skip_concurrent == 'yes'' The reason is that what you are trying to do simply cannot work. The `concurrency` setting is evaluated _before_ running the build. Yet you want to make it contingent on the output of a job that only runs _as part_ of the build. Catch-22. Reading through the thread, my assessment is that the original patch is still the best we can do. I also want to point out that the strategy described by Ævar to push every hour and see a "breadcrumb trail" of green builds is quite wasteful. Just because somebody else pays for it does not mean that it is free of cost. It very much looks like wasting resources unnecessarily to me, and I do not condone this, and I believe that the Git project should neither encourage nor support this. If you _must_ have concurrent builds of your branch (i.e. if you want to use that many resources despite the planet literally burning already from wasteful use of resources), you can always start your branch with a patch that comments out the `cancel-in-progress` line, without forcing the complexity of a more generic support for this behavior into Git's code base. Ciao, Dscho
Hi Peff, On Thu, 3 Nov 2022, Jeff King wrote: > On Thu, Nov 03, 2022 at 10:23:56PM -0400, Taylor Blau wrote: > > > But I think you make a compelling point (which doesn't match my own > > workflow, but I can see the utility of nonetheless). > > > > I was thinking that we could rely on something similar to the ci-config > > ref stuff from back in e76eec35540 (ci: allow per-branch config for > > GitHub Actions, 2020-05-07), but it looks like it'll be a little > > trickier than that, maybe impossible. > > > > We need to know about the state of the ci-config branch before we set > > the concurrency bits. So I think you *could* do something like: > > As an aside, I wish there was a way to interpret per-repo environment > variables in the actual action config. There kind of is. "Kind of" because it is not _really_ a per-repo variable (those do not exist on GitHub), but there are topics you can set. These are relatively free-form labels you can attach to _your_ fork, and these labels show up below the "About" section and the link to the home page (if any) on the right side of your respository. AFAICT these topics are not inherited automatically when forking a repository, which is precisely what we want. See https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/classifying-your-repository-with-topics for more details on that. A GitHub workflow can be made conditional on the presence of such a topic. I use this, for example, for an opt-in build of the InnoSetup installer: https://github.com/jrsoftware/issrc/blob/is-6_2_1/.github/workflows/build.yml#L11-L12 The build is opt-in because it requires a non-free build environment, configured via two repository secrets containing a URL pointing to a `.zip` file and a password to extract said `.zip`. As you say, I cannot make the build opt-in based on the presence of secrets, but I can use the presence a repository topic as the knob to enable the workflow. > The current ci-config stuff works, but it's pretty horrible because (if > I understand correctly) it spins up a VM just to evaluate a glob and say > "nope, no CI needed on this branch". So: > > 1. It's wasteful of resources, compared to a system where the Actions > parser can evaluate a variable. Indeed. It might look like the job only takes a few seconds (at least that was the argument that got the `ci-config` patch accepted), but that misses the queue time, which turns this more into several dozens of seconds, and the recorded total duration is much longer than that. In https://github.com/gitgitgadget/git/actions/runs/3412982102 for example, the `ci-config` job only took 6 seconds, according to the page, but the total duration of the build was 6 minutes and 56 seconds. > 2. It makes the Actions results page for a repo ugly, because skipped > branches clutter the output with "yes, I passed CI" even though all > they passed was a trivial job to say "don't bother running more > CI". > > 3. The problem you mention: it happens too late to affect the overall > Actions flow, and instead individual jobs have to take it into > account. And 4. The way `ci-config` is configured is sufficiently "magic" that it only benefits very, very few users, at the price of adding to everybody's build time. To see what I mean, look at https://github.com/gitgitgadget/git/actions/runs/3416084640/jobs/5685867765#step:1:1 and at https://github.com/gitgitgadget/git/actions/runs/3416084640/jobs/5685879914#step:1:1 turning on the timestamps (i.e. click on the sprocket wheel on the upper right side of the log and select "Show timestamps"). You will see that the `ci-config` job started at 3:22:05 UTC and the next job, `win-build`, started only at 4:16:03 UTC. 5. There is official support for the desired behavior that comes without any magic branch with special content that users somehow need to learn about: If you push a branch with commit messages that contain `[skip ci]`, the build will be skipped, and no time is wasted on running any job. For full details, see https://github.blog/changelog/2021-02-08-github-actions-skip-pull-request-and-push-workflows-with-skip-ci/ Having said that, the `ci-config` job is used for something I find much more useful than that magic `ci-config` branch handling: to skip builds when there are already successful builds of the same commit or at least tree. Sadly, that logic fails sometimes because of an unresilient REST API call: in case of network errors, we fall back to not skipping the build _even if_ there has been a previously-successful build of that tree. If it was up to me, I'd simply rip out the `ci-config` branch (`ci/config` file) handling and tell users to mark up branches that need to be skipped with `[skip ci]`. That has a further advantage of being actually portable across a wider range of CI systems (for example, CircleCI supports it, too: https://circleci.com/docs/skip-build/). But then, it would not save us a whole lot because the `ci-config` job _still_ does something useful (i.e. checking for previously successful builds of the same tree), so that time is still spent. But how knows, maybe there will be out-of-the-box support for that at some stage.
Hi Junio, On Mon, 7 Nov 2022, Junio C Hamano wrote: > Derrick Stolee <derrickstolee@github.com> writes: > > > On 11/3/22 9:34 AM, Johannes Schindelin via GitGitGadget wrote: > >> From: Johannes Schindelin <johannes.schindelin@gmx.de> > >> > >> Whenever a branch is pushed to a repository which has GitHub Actions > >> enabled, a bunch of new workflow runs are started. > >> > >> We sometimes see contributors push multiple branch updates in rapid > >> succession, which in conjunction with the impressive time swallowed by > >> even just a single CI build frequently leads to many queued-up runs. > >> > >> This is particularly problematic in the case of Pull Requests where a > >> single contributor can easily (inadvertently) prevent timely builds for > >> other contributors. > > > > As someone who is both the cause and the victim of this, I > > thank you for finding a way to reduce wasted CPU time. This > > patch looks good to me, though I'll need to trust the docs > > and your testing to be sure it will work. We will definitely > > see it in place as it merges into 'next' and 'main'. > > When I see breakages of 'seen' only at the CI, perhaps because it > manifests only on macOS, I manually "bisected" by pushing various > combinations of topics merged on top of 'master' and pushing the > result out as 'seen' only to the GitHub repository, and not having > to wait one to finish before pushing out another was a really nice > feature. Of course, I could wait before pushing another out, but > after seeing the last one close to successful completion in a few > minutes and being able to push out the next one was a great > timesaver, not only for the "few minutes", but for the countless > minutes because I will have to concentrate on more than just a "few > minutes" on another task if I have to switch to another task in > order to wait for just a "few minutes" before pushing the trial > merge out. I often find myself with similar problems where I have to test a couple of revisions in order to pinpoint regressions that do not reproduce locally. One of my power tools to figure these things out is https://github.com/mxschmitt/action-tmate, allowing me to SSH into the build agent where the failure happens (note: do use with prudence lest your account gets flagged for potential mining). I find that much more surgical a tool than having multiple builds run concurrently, not the least reason being that keeping track of which builds correspond to which step of the bisection can be very, very confusing. Also, one of my core values is to use up only the resources that I need to. And using up a lot of build minutes is contrary to that value. So what I end up doing in similar situations is: - first of all, switch to a temporary branch - then, add a TO-DROP commit that rips out _all_ of the unnecessary builds. Typically the first to go are check-whitespace and git-l10n, and then I liberally delete jobs (including the `ci-config` job that's totally useless in this use case) and restrict the test suite to running just the failing script by editing the `T = [...]` line in `t/Makefile`, often even adding a `--run=[... only the minimal amount of test cases...]` (after figuring out which ones are needed which is unfortunately quite hard due to the abundance of side effects our test suite relies on) to the `GIT_TEST_OPTS` in `ci/`. - then, if I need to test multiple revisions, I _create_ new branches for each bisection point (typically encoding information in the branch name that will help me with book-keeping), cherry-picking the just-created TO-DROP commit before pushing. Since I want to minimize my footprint when it comes to using resources, I typically am very judicious about what revisions I test, and how many I run concurrently. It is a bit sad that doing this is currently very much involved and that it is so much easier to just go ahead and run the entire test suite on every available platform even if the test failures one wishes to diagnose happen on but one platform in but one test script. I.e. in the current shape, our code base encourages wasting of resources. That is a situation I would like to see improved. If I was not committed to other work streams, I would work on it because I find it that important. > So, that is the only concern I have with this change, but in > general, not running jobs whose results are clearly not needed is a > good idea. It just is "clearly" is hard to determine automatically. Right. It is also very subjective what "clearly" should mean in this context. For example, I find it clearly undesirable just how long our test suite takes (and not for any good reason, really) and how inflexible it is when it comes to things like Test Impact Analysis (i.e. running only tests covering the code modified in a given PR, which would speed up everything _tremendously_). At the same time I am fully aware that I find myself pretty alone in this assessment, otherwise other contributors would clearly be more interested in fixing these issues than they are. Ciao, Dscho
On Tue, Nov 08, 2022 at 09:18:21AM +0100, Johannes Schindelin wrote: > > I played around with the following, but I can't quite get Actions to > > like it. The error message I get (ref[1]) is something like: > > > > The workflow is not valid. .github/workflows/main.yml (Line: 96, Col: > > 27): Unexpected value 'needs.ci-config.outputs.skip_concurrent == 'yes'' > > .github/workflows/main.yml (Line: 123, Col: 27): Unexpected value > > 'needs.ci-config.outputs.skip_concurrent == 'yes'' > > The reason is that what you are trying to do simply cannot work. I was surprised that I couldn't get this to work, because to me it seemed like the sort of thing that *should* be possible to do. Indeed, it is, and I made a couple of mistakes in writing the workflow file: - The expression for 'skip-in-progress' needed to be enclosed in '${{}}' markers. - It also needed to take into account the job name (and matrix information!) where relevant. And here we can't just use ${{github.job}}, since that is only available inside of the job steps. To the last bullet point there, we unfortunately have to copy and paste the job name, which seems like a limitation of the Actions workflow parser to me. I posted an alternative approach to this patch in [1], and I would be very curious to hear your thoughts, if you have time! Thanks, Taylor [1]: https://lore.kernel.org/git/cover.1667931937.git.me@ttaylorr.com/T/#t
On Tue, Nov 08, 2022 at 10:16:15AM +0100, Johannes Schindelin wrote: > > As an aside, I wish there was a way to interpret per-repo environment > > variables in the actual action config. > > There kind of is. "Kind of" because it is not _really_ a per-repo variable > (those do not exist on GitHub), but there are topics you can set. These > are relatively free-form labels you can attach to _your_ fork, and these > labels show up below the "About" section and the link to the home page (if > any) on the right side of your respository. AFAICT these topics are not > inherited automatically when forking a repository, which is precisely what > we want. See > https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/classifying-your-repository-with-topics > for more details on that. Ah, that's very clever, thank you! For the original problem that motivated me adding ci-config in the first place, branch selection, I think we could do this: if: | !contains(join(github.event.repository.topics), 'ci-only-') || contains(github.event.repository.topics, format('ci-only-{0}', github.ref_name)) and then by default we'd continue to build for all pushes, but if you add ci-only-foo as a repo topic, then we'd build only refs/heads/foo. I may see if I can work this into our workflow file. Even if we can't get rid of ci-config for the skip-successful-build feature, this would still save CPU by dropping even ci-config when the branch should be skipped entirely. > > The current ci-config stuff works, but it's pretty horrible because (if > > I understand correctly) it spins up a VM just to evaluate a glob and say > > "nope, no CI needed on this branch". So: > > > > 1. It's wasteful of resources, compared to a system where the Actions > > parser can evaluate a variable. > > Indeed. It might look like the job only takes a few seconds (at least that > was the argument that got the `ci-config` patch accepted), but that misses > the queue time, which turns this more into several dozens of seconds, and > the recorded total duration is much longer than that. In > https://github.com/gitgitgadget/git/actions/runs/3412982102 for example, > the `ci-config` job only took 6 seconds, according to the page, but the > total duration of the build was 6 minutes and 56 seconds. Yes, but I don't think that's wasting 6 minutes of resources if we're just sitting in a queue. It may increase the end-to-end latency of getting the CI result, of course. > 4. The way `ci-config` is configured is sufficiently "magic" that it > only benefits very, very few users, at the price of adding to > everybody's build time. To see what I mean, look at > https://github.com/gitgitgadget/git/actions/runs/3416084640/jobs/5685867765#step:1:1 > and at > https://github.com/gitgitgadget/git/actions/runs/3416084640/jobs/5685879914#step:1:1 > turning on the timestamps (i.e. click on the sprocket wheel on the > upper right side of the log and select "Show timestamps"). You will > see that the `ci-config` job started at 3:22:05 UTC and the next > job, `win-build`, started only at 4:16:03 UTC. Ouch. Though I wonder in practice how fast that would have gone without ci-config. It is just asking for a generic ubuntu vm, which many of other jobs would. Was there a shortage of those vms at 3:22, and by the time it finally ran that shortage was over? Or is there constant contention, and it is increasing the end-to-end latency by asking for a queue slot, running, and then asking for more queue slots? > 5. There is official support for the desired behavior that comes > without any magic branch with special content that users somehow > need to learn about: If you push a branch with commit messages that > contain `[skip ci]`, the build will be skipped, and no time is > wasted on running any job. For full details, see > https://github.blog/changelog/2021-02-08-github-actions-skip-pull-request-and-push-workflows-with-skip-ci/ This existed back when I added ci-config originally, and I tried it, but the results are quite painful to use. Because "skip ci" is really a property of a branch that a commit is pushed to, and not a branch. So you have to sprinkle these "skip ci" markers all over the branch tips, but then remember to remove them when you actually want to do useful things with the branches, like merge them or send them out. -Peff
On Wed, Nov 09, 2022 at 09:00:33AM -0500, Jeff King wrote: > On Tue, Nov 08, 2022 at 10:16:15AM +0100, Johannes Schindelin wrote: > > > > As an aside, I wish there was a way to interpret per-repo environment > > > variables in the actual action config. > > > > There kind of is. "Kind of" because it is not _really_ a per-repo variable > > (those do not exist on GitHub), but there are topics you can set. These > > are relatively free-form labels you can attach to _your_ fork, and these > > labels show up below the "About" section and the link to the home page (if > > any) on the right side of your respository. AFAICT these topics are not > > inherited automatically when forking a repository, which is precisely what > > we want. See > > https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/classifying-your-repository-with-topics > > for more details on that. > > Ah, that's very clever, thank you! Very cute. > For the original problem that motivated me adding ci-config in the first > place, branch selection, I think we could do this: > > if: | > !contains(join(github.event.repository.topics), 'ci-only-') || > contains(github.event.repository.topics, format('ci-only-{0}', github.ref_name)) > > and then by default we'd continue to build for all pushes, but if you > add ci-only-foo as a repo topic, then we'd build only refs/heads/foo. > > I may see if I can work this into our workflow file. Even if we can't > get rid of ci-config for the skip-successful-build feature, this would > still save CPU by dropping even ci-config when the branch should be > skipped entirely. Yeah. I'd be hesitant to see a bunch of CI configuration knobs get folded into the repository topics feature, but a clear "save CI minutes if we don't care about building these branch(es)" is useful. Thanks, Taylor
diff --git a/.github/workflows/check-whitespace.yml b/.github/workflows/check-whitespace.yml index ad3466ad16e..8a4c4bfbb93 100644 --- a/.github/workflows/check-whitespace.yml +++ b/.github/workflows/check-whitespace.yml @@ -9,6 +9,11 @@ on: pull_request: types: [opened, synchronize] +# Avoid unnecessary builds +concurrency: + group: ${{ github.workflow }}-${{ github.ref }} + cancel-in-progress: true + jobs: check-whitespace: runs-on: ubuntu-latest diff --git a/.github/workflows/l10n.yml b/.github/workflows/l10n.yml index 27f72f0ff34..77d9a416289 100644 --- a/.github/workflows/l10n.yml +++ b/.github/workflows/l10n.yml @@ -2,6 +2,11 @@ name: git-l10n on: [push, pull_request_target] +# Avoid unnecessary builds +concurrency: + group: ${{ github.workflow }}-${{ github.ref }} + cancel-in-progress: true + jobs: git-po-helper: if: >- diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 831f4df56c5..cf47f0ccfed 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -2,6 +2,11 @@ name: CI on: [push, pull_request] +# Avoid unnecessary builds +concurrency: + group: ${{ github.workflow }}-${{ github.ref }} + cancel-in-progress: true + env: DEVELOPER: 1