ci: avoid unnecessary builds

Message ID	pull.1404.git.1667482458622.gitgitgadget@gmail.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <git-owner@kernel.org> Message-Id: <pull.1404.git.1667482458622.gitgitgadget@gmail.com> Date: Thu, 03 Nov 2022 13:34:18 +0000 Subject: [PATCH] ci: avoid unnecessary builds Fcc: Sent Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes Schindelin <johannes.schindelin@gmx.de>, Johannes Schindelin <johannes.schindelin@gmx.de> Precedence: bulk From: Johannes Schindelin <johannes.schindelin@gmx.de>
Series	ci: avoid unnecessary builds \| expand ci: avoid unnecessary builds

Johannes Schindelin Nov. 3, 2022, 1:34 p.m. UTC

From: Johannes Schindelin <johannes.schindelin@gmx.de>

Whenever a branch is pushed to a repository which has GitHub Actions
enabled, a bunch of new workflow runs are started.

We sometimes see contributors push multiple branch updates in rapid
succession, which in conjunction with the impressive time swallowed by
even just a single CI build frequently leads to many queued-up runs.

This is particularly problematic in the case of Pull Requests where a
single contributor can easily (inadvertently) prevent timely builds for
other contributors.

To help with this situation, let's use the `concurrency` feature of
GitHub workflows, essentially canceling GitHub workflow runs that are
obsoleted by more recent runs:
https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#concurrency

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
    ci: avoid unnecessary builds
    
    Just something I noticed recently and only got around to implement
    today.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1404%2Fdscho%2Favoid-unnecessary-ci-builds-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1404/dscho/avoid-unnecessary-ci-builds-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1404

 .github/workflows/check-whitespace.yml | 5 +++++
 .github/workflows/l10n.yml             | 5 +++++
 .github/workflows/main.yml             | 5 +++++
 3 files changed, 15 insertions(+)


base-commit: e7e5c6f715b2de7bea0d39c7d2ba887335b40aa0

Ævar Arnfjörð Bjarmason Nov. 4, 2022, 1:46 a.m. UTC | #1

On Thu, Nov 03 2022, Johannes Schindelin via GitGitGadget wrote:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> Whenever a branch is pushed to a repository which has GitHub Actions
> enabled, a bunch of new workflow runs are started.
>
> We sometimes see contributors push multiple branch updates in rapid
> succession, which in conjunction with the impressive time swallowed by
> even just a single CI build frequently leads to many queued-up runs.
>
> This is particularly problematic in the case of Pull Requests where a
> single contributor can easily (inadvertently) prevent timely builds for
> other contributors.

The "timely" being an issue in git/git and/or gitgitgadget where CI time
is a shared resource, but not in a <someuser>/git running CI just for
<someuser>?

> To help with this situation, let's use the `concurrency` feature of
> GitHub workflows, essentially canceling GitHub workflow runs that are
> obsoleted by more recent runs:
> https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#concurrency

In my own fork I very much use this concurrency not-cancel-in-progress
intentionally.

I.e. I'll run local tests, but also occasionally push to CI
(particularly when I know I have bad local coverage).

But that's very much an async process, sometimes I'll look at the
(hopefully finished by then) CI in the morning.

E.g. now I have a re-pushed branch where the last tip it was at is still
running CI, but it's waiting just the ASAN job.

Having that breadcrumb trail of "green" CI is very helpful, i.e. push
every hour or so with something WIP, and be able to eventually see when
things went wrong, if they went wrong.

We have CI config for other such stuff, but I think it's probably hard
to use in this case, as this involves cancelling the *other* job, not
cancelling ourselves. So by the time we're past the config stage is when
we might want to cancel.

But doesn't this support the "if" syntax to narrow this down to the
relevant repos where this would help, while leaving those forks where it
isn't an issue to enact their own policy?

Taylor Blau Nov. 4, 2022, 2:09 a.m. UTC | #2

On Thu, Nov 03, 2022 at 01:34:18PM +0000, Johannes Schindelin via GitGitGadget wrote:
>  .github/workflows/check-whitespace.yml | 5 +++++
>  .github/workflows/l10n.yml             | 5 +++++
>  .github/workflows/main.yml             | 5 +++++
>  3 files changed, 15 insertions(+)

Very nice. For my own workflow, I'll often do something like:

  - work on a topic
  - decide that the topic is ready for submission, so push it to GitHub
  - while running `git rebase master -x 'make -j40 DEVELOPER=1 test'`, I
    notice that an earlier patch is broken, so fix it
  - push the (updated) topic out

...and before preparing the rest of the series for submission, I'll
cancel any in-progress runs, the results of which I no longer care
about.

It sounds like this does exactly that automatically, making older runs
obsolete by newer ones.

Thanks,
Taylor

Taylor Blau Nov. 4, 2022, 2:23 a.m. UTC | #3

On Fri, Nov 04, 2022 at 02:46:23AM +0100, Ævar Arnfjörð Bjarmason wrote:
> > This is particularly problematic in the case of Pull Requests where a
> > single contributor can easily (inadvertently) prevent timely builds for
> > other contributors.
>
> The "timely" being an issue in git/git and/or gitgitgadget where CI time
> is a shared resource, but not in a <someuser>/git running CI just for
> <someuser>?

Yup, agreed.

> > To help with this situation, let's use the `concurrency` feature of
> > GitHub workflows, essentially canceling GitHub workflow runs that are
> > obsoleted by more recent runs:
> > https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#concurrency
>
> In my own fork I very much use this concurrency not-cancel-in-progress
> intentionally.

Interesting. I noted basically the opposite in my earlier reply[1] to
Johannes, where the behavior I want is that newer pushes of the same
topic supersede older ones that are currently in progress.

But I think you make a compelling point (which doesn't match my own
workflow, but I can see the utility of nonetheless).

I was thinking that we could rely on something similar to the ci-config
ref stuff from back in e76eec35540 (ci: allow per-branch config for
GitHub Actions, 2020-05-07), but it looks like it'll be a little
trickier than that, maybe impossible.

We need to know about the state of the ci-config branch before we set
the concurrency bits. So I think you *could* do something like:

--- >8 ---
diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 4fdf4d3552..f1ca364f96 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -2,11 +2,6 @@ name: CI

 on: [push, pull_request]

-# Avoid unnecessary builds
-concurrency:
-  group: ${{ github.workflow }}-${{ github.ref }}
-  cancel-in-progress: true
-
 env:
   DEVELOPER: 1

@@ -39,7 +34,14 @@ jobs:
           then
             enabled=no
           fi
+          skip_concurrent=yes
+          if test -x config-repo/ci/config/skip-concurrent &&
+             ! config-repo/ci/config/skip-concurrent '${{ github.ref }}'
+          then
+            skip_concurrent=no
+          fi
           echo "::set-output name=enabled::$enabled"
+          echo "::set-output name=skip_concurrent::$skip_concurrent"
       - name: skip if the commit or tree was already tested
         id: skip-if-redundant
         uses: actions/github-script@v3
@@ -86,6 +88,9 @@ jobs:
     name: win build
     needs: ci-config
     if: needs.ci-config.outputs.enabled == 'yes'
+    concurrency:
+      group: ${{ github.workflow }}-${{ github.ref }}
+      cancel-in-progress: needs.ci-config.outputs.skip_concurrent = 'yes'
     runs-on: windows-latest
     steps:
     - uses: actions/checkout@v2
--- 8< ---

...and similar "concurrency" blocks in each of the other jobs to define
the settings at the job level instead of at the workflow level.

So, it's doable, but a little gross. At the very least, it would satisfy
Ævar's workflow requirements, too, since he could write a script that
exits with non-zero status to avoid the new behavior.

Thanks,
Taylor

[1]: https://lore.kernel.org/git/Y2R0YrQzKaUZzaPB@nand.local/

Jeff King Nov. 4, 2022, 3:20 a.m. UTC | #4

On Thu, Nov 03, 2022 at 10:23:56PM -0400, Taylor Blau wrote:

> But I think you make a compelling point (which doesn't match my own
> workflow, but I can see the utility of nonetheless).
> 
> I was thinking that we could rely on something similar to the ci-config
> ref stuff from back in e76eec35540 (ci: allow per-branch config for
> GitHub Actions, 2020-05-07), but it looks like it'll be a little
> trickier than that, maybe impossible.
> 
> We need to know about the state of the ci-config branch before we set
> the concurrency bits. So I think you *could* do something like:

As an aside, I wish there was a way to interpret per-repo environment
variables in the actual action config. The current ci-config stuff
works, but it's pretty horrible because (if I understand correctly) it
spins up a VM just to evaluate a glob and say "nope, no CI needed on
this branch". So:

  1. It's wasteful of resources, compared to a system where the Actions
     parser can evaluate a variable.

  2. It makes the Actions results page for a repo ugly, because skipped
     branches clutter the output with "yes, I passed CI" even though all
     they passed was a trivial job to say "don't bother running more
     CI".

  3. The problem you mention: it happens too late to affect the overall
     Actions flow, and instead individual jobs have to take it into
     account.

When I wrote the original ci-config stuff I looked for an alternative.
You _can_ set per-repo variables in the form of secrets, but I couldn't
find a way to evaluate them at the top-level of the yaml file. But maybe
that has improved in the meantime. I had looked against as recently as a
month or two ago and didn't find anything, but I'm far from an expert on
Actions.

-Peff

Derrick Stolee Nov. 7, 2022, 7:45 p.m. UTC | #5

On 11/3/22 9:34 AM, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> Whenever a branch is pushed to a repository which has GitHub Actions
> enabled, a bunch of new workflow runs are started.
> 
> We sometimes see contributors push multiple branch updates in rapid
> succession, which in conjunction with the impressive time swallowed by
> even just a single CI build frequently leads to many queued-up runs.
> 
> This is particularly problematic in the case of Pull Requests where a
> single contributor can easily (inadvertently) prevent timely builds for
> other contributors.

As someone who is both the cause and the victim of this, I
thank you for finding a way to reduce wasted CPU time. This
patch looks good to me, though I'll need to trust the docs
and your testing to be sure it will work. We will definitely
see it in place as it merges into 'next' and 'main'.

Thanks,
-Stolee

Taylor Blau Nov. 7, 2022, 7:53 p.m. UTC | #6

On Mon, Nov 07, 2022 at 02:45:24PM -0500, Derrick Stolee wrote:
> On 11/3/22 9:34 AM, Johannes Schindelin via GitGitGadget wrote:
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > Whenever a branch is pushed to a repository which has GitHub Actions
> > enabled, a bunch of new workflow runs are started.
> >
> > We sometimes see contributors push multiple branch updates in rapid
> > succession, which in conjunction with the impressive time swallowed by
> > even just a single CI build frequently leads to many queued-up runs.
> >
> > This is particularly problematic in the case of Pull Requests where a
> > single contributor can easily (inadvertently) prevent timely builds for
> > other contributors.
>
> As someone who is both the cause and the victim of this, I
> thank you for finding a way to reduce wasted CPU time. This
> patch looks good to me, though I'll need to trust the docs
> and your testing to be sure it will work. We will definitely
> see it in place as it merges into 'next' and 'main'.

I wonder how we should treat Ævar's concerns in this thread. I suspect
that the vast majority of workflows wouldn't be affected, but I don't
want to completely break Ævar's workflow, either ;-).

Some kind of configuration mechanism like I proposed might be nice.
Thoughts?

Thanks,
Taylor

Derrick Stolee Nov. 7, 2022, 8:08 p.m. UTC | #7

On 11/7/22 2:53 PM, Taylor Blau wrote:
> On Mon, Nov 07, 2022 at 02:45:24PM -0500, Derrick Stolee wrote:
>> On 11/3/22 9:34 AM, Johannes Schindelin via GitGitGadget wrote:
>>> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>>>
>>> Whenever a branch is pushed to a repository which has GitHub Actions
>>> enabled, a bunch of new workflow runs are started.
>>>
>>> We sometimes see contributors push multiple branch updates in rapid
>>> succession, which in conjunction with the impressive time swallowed by
>>> even just a single CI build frequently leads to many queued-up runs.
>>>
>>> This is particularly problematic in the case of Pull Requests where a
>>> single contributor can easily (inadvertently) prevent timely builds for
>>> other contributors.
>>
>> As someone who is both the cause and the victim of this, I
>> thank you for finding a way to reduce wasted CPU time. This
>> patch looks good to me, though I'll need to trust the docs
>> and your testing to be sure it will work. We will definitely
>> see it in place as it merges into 'next' and 'main'.
> 
> I wonder how we should treat Ævar's concerns in this thread. I suspect
> that the vast majority of workflows wouldn't be affected, but I don't
> want to completely break Ævar's workflow, either ;-).
> 
> Some kind of configuration mechanism like I proposed might be nice.
> Thoughts?

Taking a look at that sub-thread, I have two thoughts:

1. I don't think supporting a "multiple pushes of WIP work"
   scenario is a good use of "free" resources. If you want to
   test multiple versions of something, then use multiple
   branches (and I think Johannes's patch allows concurrent
   builds for distinct branch names).

2. The change you recommend requires running the job and
   deciding at runtime whether to do the actual build
   (unless I'm mistaken). It is better to let the workflow
   coordinator decide on concurrency before the stage where
   worker VMs are engaged.

Either of these points may have an incorrect assumption, so
I'm prepared to be wrong.

Thanks,
-Stolee

Ævar Arnfjörð Bjarmason Nov. 7, 2022, 9:03 p.m. UTC | #8

On Mon, Nov 07 2022, Derrick Stolee wrote:

> On 11/7/22 2:53 PM, Taylor Blau wrote:
>> On Mon, Nov 07, 2022 at 02:45:24PM -0500, Derrick Stolee wrote:
>>> On 11/3/22 9:34 AM, Johannes Schindelin via GitGitGadget wrote:
>>>> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>>>>
>>>> Whenever a branch is pushed to a repository which has GitHub Actions
>>>> enabled, a bunch of new workflow runs are started.
>>>>
>>>> We sometimes see contributors push multiple branch updates in rapid
>>>> succession, which in conjunction with the impressive time swallowed by
>>>> even just a single CI build frequently leads to many queued-up runs.
>>>>
>>>> This is particularly problematic in the case of Pull Requests where a
>>>> single contributor can easily (inadvertently) prevent timely builds for
>>>> other contributors.
>>>
>>> As someone who is both the cause and the victim of this, I
>>> thank you for finding a way to reduce wasted CPU time. This
>>> patch looks good to me, though I'll need to trust the docs
>>> and your testing to be sure it will work. We will definitely
>>> see it in place as it merges into 'next' and 'main'.
>> 
>> I wonder how we should treat Ævar's concerns in this thread. I suspect
>> that the vast majority of workflows wouldn't be affected, but I don't
>> want to completely break Ævar's workflow, either ;-).
>> 
>> Some kind of configuration mechanism like I proposed might be nice.
>> Thoughts?
>
> Taking a look at that sub-thread, I have two thoughts:
>
> 1. I don't think supporting a "multiple pushes of WIP work"
>    scenario is a good use of "free" resources. If you want to
>    test multiple versions of something, then use multiple
>    branches (and I think Johannes's patch allows concurrent
>    builds for distinct branch names).

The setting Taylor proposed in
https://lore.kernel.org/git/Y2R3vJf1A2KOZwA7@nand.local/ is off by
default, i.e. it would behave the same way as what Johannes is
proposing, just give you (well, me) an opt-out from the default, without
patching main.yml on every branch.

So it seems like a win-win, why force others to change their workflow?
Sure, I could push multiple branches, but you could also manually cancel
your outstanding jobs before re-pushing...

I agree that cancelling the outstanding job is a better default, and if
we had to pick one or the other I'd say "sure", but if we can have
both...

> 2. The change you recommend requires running the job and
>    deciding at runtime whether to do the actual build
>    (unless I'm mistaken). It is better to let the workflow
>    coordinator decide on concurrency before the stage where
>    worker VMs are engaged.

Doesn't the "concurrency" setting say "I am a job thaht's prepared to
have its run cancelled if another job shows up with this <key>".

I.e. we'd already be running the VM by definition if we're going to
benefit from it (very tight races aside), and in any case getting to the
"configure" stage doesn't require much in the way of resources, a few
seconds of VM time...

Whereas the "this tree already ran" *would* benefit from not starting a
VM, but that's "I'm a new push, and maybe I shouldn't run at all",
whereas this feature is "I always want to run, but someone might cancel
me later".

> Either of these points may have an incorrect assumption, so
> I'm prepared to be wrong.

I *think* you're wrong about #2, but I'm not sure either.

I don't think you can be wrong about #1, "others should change their
workflow to fit a new worldview" is more of a value-judgment :)

Derrick Stolee Nov. 7, 2022, 9:59 p.m. UTC | #9

On 11/7/22 4:03 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Mon, Nov 07 2022, Derrick Stolee wrote:
> 
>> On 11/7/22 2:53 PM, Taylor Blau wrote:

>>> I wonder how we should treat Ævar's concerns in this thread. I suspect
>>> that the vast majority of workflows wouldn't be affected, but I don't
>>> want to completely break Ævar's workflow, either ;-).
>>>
>>> Some kind of configuration mechanism like I proposed might be nice.
>>> Thoughts?
>>
>> Taking a look at that sub-thread, I have two thoughts:
>>
>> 1. I don't think supporting a "multiple pushes of WIP work"
>>    scenario is a good use of "free" resources. If you want to
>>    test multiple versions of something, then use multiple
>>    branches (and I think Johannes's patch allows concurrent
>>    builds for distinct branch names).
> 
> The setting Taylor proposed in
> https://lore.kernel.org/git/Y2R3vJf1A2KOZwA7@nand.local/ is off by
> default, i.e. it would behave the same way as what Johannes is
> proposing, just give you (well, me) an opt-out from the default, without
> patching main.yml on every branch.
> 
> So it seems like a win-win, why force others to change their workflow?
> Sure, I could push multiple branches, but you could also manually cancel
> your outstanding jobs before re-pushing...
> 
> I agree that cancelling the outstanding job is a better default, and if
> we had to pick one or the other I'd say "sure", but if we can have
> both...

>> Either of these points may have an incorrect assumption, so
>> I'm prepared to be wrong.
> 
> I *think* you're wrong about #2, but I'm not sure either.

At the very least, the configurable option requires fetching the
repo and checking out at least one file. I don't know how much it
actually saves one way or another.

> I don't think you can be wrong about #1, "others should change their
> workflow to fit a new worldview" is more of a value-judgment :)

True, but I think that the workflow you are trying to keep valid
is also indistinguishable to the typical flow of force-pushing
during incremental rewrites, so preserving your workflow will
continue to add costs to that behavior.

My value judgement is that experts can adapt their workflows as
situations change for the better of the group.

If the cost of doing the config option version is minimal over
the global concurrency issue, then I say we should go that route.
I just expect it to take up more resources than the strategy
proposed in the initial patch.

I wonder how we could determine this. Should we run a few CI
jobs with some force-pushes in either approach (config turned
off) so we know that cost?

Thanks,
-Stolee

Taylor Blau Nov. 7, 2022, 10:44 p.m. UTC | #10

On Mon, Nov 07, 2022 at 04:59:12PM -0500, Derrick Stolee wrote:
> On 11/7/22 4:03 PM, Ævar Arnfjörð Bjarmason wrote:
> >
> > On Mon, Nov 07 2022, Derrick Stolee wrote:
> >
> >> On 11/7/22 2:53 PM, Taylor Blau wrote:
>
> >>> I wonder how we should treat Ævar's concerns in this thread. I suspect
> >>> that the vast majority of workflows wouldn't be affected, but I don't
> >>> want to completely break Ævar's workflow, either ;-).
> >>>
> >>> Some kind of configuration mechanism like I proposed might be nice.
> >>> Thoughts?
> >>
> >> Taking a look at that sub-thread, I have two thoughts:
> >>
> >> 1. I don't think supporting a "multiple pushes of WIP work"
> >>    scenario is a good use of "free" resources. If you want to
> >>    test multiple versions of something, then use multiple
> >>    branches (and I think Johannes's patch allows concurrent
> >>    builds for distinct branch names).
> >
> > The setting Taylor proposed in
> > https://lore.kernel.org/git/Y2R3vJf1A2KOZwA7@nand.local/ is off by
> > default, i.e. it would behave the same way as what Johannes is
> > proposing, just give you (well, me) an opt-out from the default, without
> > patching main.yml on every branch.
> >
> > So it seems like a win-win, why force others to change their workflow?
> > Sure, I could push multiple branches, but you could also manually cancel
> > your outstanding jobs before re-pushing...
> >
> > I agree that cancelling the outstanding job is a better default, and if
> > we had to pick one or the other I'd say "sure", but if we can have
> > both...
>
> >> Either of these points may have an incorrect assumption, so
> >> I'm prepared to be wrong.
> >
> > I *think* you're wrong about #2, but I'm not sure either.
>
> At the very least, the configurable option requires fetching the
> repo and checking out at least one file. I don't know how much it
> actually saves one way or another.

To be clear, I think the savings here are minimal from a pure CPU-usage
perspective. I'm more concerned with the expense of running a job which
counts double-digit minutes against your available Actions runtime.

I played around with the following, but I can't quite get Actions to
like it. The error message I get (ref[1]) is something like:

    The workflow is not valid. .github/workflows/main.yml (Line: 96, Col:
    27): Unexpected value 'needs.ci-config.outputs.skip_concurrent == 'yes''
    .github/workflows/main.yml (Line: 123, Col: 27): Unexpected value
    'needs.ci-config.outputs.skip_concurrent == 'yes''

But I think the patch below should more-or-less be what we're looking
for:

--- >8 ---
Subject: [PATCH] ci: avoid unnecessary builds

Whenever a branch is pushed to a repository which has GitHub Actions
enabled, a bunch of new workflow runs are started.

We sometimes see contributors push multiple branch updates in rapid
succession, which in conjunction with the impressive time swallowed by
even just a single CI build frequently leads to many queued-up runs.

This is particularly problematic in the case of Pull Requests where a
single contributor can easily (inadvertently) prevent timely builds for
other contributors when using a shared repository.

To help with this situation, let's use the `concurrency` feature of
GitHub workflows, essentially canceling GitHub workflow runs that are
obsoleted by more recent runs:

  https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#concurrency

For workflows that *do* want the behavior in the pre-image of this
patch, they can use the ci-config feature to disable the new behavior by
adding an executable script on the ci-config branch called
'skip-concurrent' which terminates with a non-zero exit code.

Original-patch-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 .github/workflows/check-whitespace.yml |  6 ++++
 .github/workflows/l10n.yml             |  6 ++++
 .github/workflows/main.yml             | 40 ++++++++++++++++++++++++--
 3 files changed, 50 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/check-whitespace.yml b/.github/workflows/check-whitespace.yml
index ad3466ad16..0bcc9cffbd 100644
--- a/.github/workflows/check-whitespace.yml
+++ b/.github/workflows/check-whitespace.yml
@@ -9,6 +9,12 @@ on:
   pull_request:
     types: [opened, synchronize]

+# Avoid unnecessary builds. Unlike the main CI jobs, these are not
+# ci-configurable (but could be).
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+
 jobs:
   check-whitespace:
     runs-on: ubuntu-latest
diff --git a/.github/workflows/l10n.yml b/.github/workflows/l10n.yml
index 27f72f0ff3..51fd46e6af 100644
--- a/.github/workflows/l10n.yml
+++ b/.github/workflows/l10n.yml
@@ -2,6 +2,12 @@ name: git-l10n

 on: [push, pull_request_target]

+# Avoid unnecessary builds. Unlike the main CI jobs, these are not
+# ci-configurable (but could be).
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+
 jobs:
   git-po-helper:
     if: >-
diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index bd6f75b8e0..87b5b369e1 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -11,6 +11,7 @@ jobs:
     runs-on: ubuntu-latest
     outputs:
       enabled: ${{ steps.check-ref.outputs.enabled }}${{ steps.skip-if-redundant.outputs.enabled }}
+      skip_concurrent: ${{ steps.check-ref.outputs.skip_concurrent }}
     steps:
       - name: try to clone ci-config branch
         run: |
@@ -34,7 +35,15 @@ jobs:
           then
             enabled=no
           fi
+
+          skip_concurrent=yes
+          if test -x config-repo/ci/config/skip-concurrent &&
+             ! config-repo/ci/config/skip-concurrent '${{ github.ref }}'
+          then
+            skip_concurrent=no
+          fi
           echo "::set-output name=enabled::$enabled"
+          echo "::set-output name=skip_concurrent::$skip_concurrent"
       - name: skip if the commit or tree was already tested
         id: skip-if-redundant
         uses: actions/github-script@v3
@@ -82,6 +91,9 @@ jobs:
     needs: ci-config
     if: needs.ci-config.outputs.enabled == 'yes'
     runs-on: windows-latest
+    concurrency:
+      group: ${{ github.job.name }}-${{ github.ref }}
+      cancel-in-progress: needs.ci-config.outputs.skip_concurrent == 'yes'
     steps:
     - uses: actions/checkout@v2
     - uses: git-for-windows/setup-git-for-windows-sdk@v1
@@ -101,11 +113,14 @@ jobs:
   windows-test:
     name: win test
     runs-on: windows-latest
-    needs: [windows-build]
+    needs: [ci-config, windows-build]
     strategy:
       fail-fast: false
       matrix:
         nr: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
+    concurrency:
+      group: ${{ github.job.name }}-${{ github.ref }}
+      cancel-in-progress: needs.ci-config.outputs.skip_concurrent == 'yes'
     steps:
     - name: download tracked files and build artifacts
       uses: actions/download-artifact@v2
@@ -137,6 +152,9 @@ jobs:
       NO_PERL: 1
       GIT_CONFIG_PARAMETERS: "'user.name=CI' 'user.email=ci@git'"
     runs-on: windows-latest
+    concurrency:
+      group: ${{ github.job.name }}-${{ github.ref }}
+      cancel-in-progress: needs.ci-config.outputs.skip_concurrent == 'yes'
     steps:
     - uses: actions/checkout@v2
     - uses: git-for-windows/setup-git-for-windows-sdk@v1
@@ -184,11 +202,14 @@ jobs:
   vs-test:
     name: win+VS test
     runs-on: windows-latest
-    needs: vs-build
+    needs: [ci-config, vs-build]
     strategy:
       fail-fast: false
       matrix:
         nr: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
+    concurrency:
+      group: ${{ github.job.name }}-${{ github.ref }}
+      cancel-in-progress: needs.ci-config.outputs.skip_concurrent == 'yes'
     steps:
     - uses: git-for-windows/setup-git-for-windows-sdk@v1
     - name: download tracked files and build artifacts
@@ -218,6 +239,9 @@ jobs:
     name: ${{matrix.vector.jobname}} (${{matrix.vector.pool}})
     needs: ci-config
     if: needs.ci-config.outputs.enabled == 'yes'
+    concurrency:
+      group: ${{ github.job.name }}-${{ github.ref }}
+      cancel-in-progress: needs.ci-config.outputs.skip_concurrent == 'yes'
     strategy:
       fail-fast: false
       matrix:
@@ -281,6 +305,9 @@ jobs:
     name: ${{matrix.vector.jobname}} (${{matrix.vector.image}})
     needs: ci-config
     if: needs.ci-config.outputs.enabled == 'yes'
+    concurrency:
+      group: ${{ github.job.name }}-${{ github.ref }}
+      cancel-in-progress: needs.ci-config.outputs.skip_concurrent == 'yes'
     strategy:
       fail-fast: false
       matrix:
@@ -316,6 +343,9 @@ jobs:
     env:
       jobname: StaticAnalysis
     runs-on: ubuntu-22.04
+    concurrency:
+      group: ${{ github.job.name }}-${{ github.ref }}
+      cancel-in-progress: needs.ci-config.outputs.skip_concurrent == 'yes'
     steps:
     - uses: actions/checkout@v2
     - run: ci/install-dependencies.sh
@@ -327,6 +357,9 @@ jobs:
     env:
       jobname: sparse
     runs-on: ubuntu-20.04
+    concurrency:
+      group: ${{ github.job.name }}-${{ github.ref }}
+      cancel-in-progress: needs.ci-config.outputs.skip_concurrent == 'yes'
     steps:
     - name: Download a current `sparse` package
       # Ubuntu's `sparse` version is too old for us
@@ -345,6 +378,9 @@ jobs:
     name: documentation
     needs: ci-config
     if: needs.ci-config.outputs.enabled == 'yes'
+    concurrency:
+      group: ${{ github.job.name }}-${{ github.ref }}
+      cancel-in-progress: needs.ci-config.outputs.skip_concurrent == 'yes'
     env:
       jobname: Documentation
     runs-on: ubuntu-latest
--
2.38.0.16.g393fd4c6db

--- 8< ---

Thanks,
Taylor

[1]: https://github.com/ttaylorr/git/actions/runs/3414594555

Ævar Arnfjörð Bjarmason Nov. 7, 2022, 10:56 p.m. UTC | #11

On Mon, Nov 07 2022, Derrick Stolee wrote:

> On 11/7/22 4:03 PM, Ævar Arnfjörð Bjarmason wrote:
>> 
>> On Mon, Nov 07 2022, Derrick Stolee wrote:
>> 
>>> On 11/7/22 2:53 PM, Taylor Blau wrote:
>
>>>> I wonder how we should treat Ævar's concerns in this thread. I suspect
>>>> that the vast majority of workflows wouldn't be affected, but I don't
>>>> want to completely break Ævar's workflow, either ;-).
>>>>
>>>> Some kind of configuration mechanism like I proposed might be nice.
>>>> Thoughts?
>>>
>>> Taking a look at that sub-thread, I have two thoughts:
>>>
>>> 1. I don't think supporting a "multiple pushes of WIP work"
>>>    scenario is a good use of "free" resources. If you want to
>>>    test multiple versions of something, then use multiple
>>>    branches (and I think Johannes's patch allows concurrent
>>>    builds for distinct branch names).
>> 
>> The setting Taylor proposed in
>> https://lore.kernel.org/git/Y2R3vJf1A2KOZwA7@nand.local/ is off by
>> default, i.e. it would behave the same way as what Johannes is
>> proposing, just give you (well, me) an opt-out from the default, without
>> patching main.yml on every branch.
>> 
>> So it seems like a win-win, why force others to change their workflow?
>> Sure, I could push multiple branches, but you could also manually cancel
>> your outstanding jobs before re-pushing...
>> 
>> I agree that cancelling the outstanding job is a better default, and if
>> we had to pick one or the other I'd say "sure", but if we can have
>> both...
>
>>> Either of these points may have an incorrect assumption, so
>>> I'm prepared to be wrong.
>> 
>> I *think* you're wrong about #2, but I'm not sure either.
>
> At the very least, the configurable option requires fetching the
> repo and checking out at least one file. I don't know how much it
> actually saves one way or another.

It's already fetching the ci-config repo, so we're talking about the
marginal cost of running the bit of shellscript to check if
config-repo/ci/config/skip-concurrent is executable, and if not keeping
the default config.

>> I don't think you can be wrong about #1, "others should change their
>> workflow to fit a new worldview" is more of a value-judgment :)
>
> True, but I think that the workflow you are trying to keep valid
> is also indistinguishable to the typical flow of force-pushing
> during incremental rewrites, so preserving your workflow will
> continue to add costs to that behavior.

I don't think it will, per the above. I mean, pedantically yes: But the
cost of that "test -x and variable setting" is so trivial that it's not
worth worrying about.

> My value judgement is that experts can adapt their workflows as
> situations change for the better of the group.

Sure, I agree with that in zero-sum scenarios, or where it's a hassle to
provide two things, and we need to pick one etc. I just don't see that
being the case here.

> If the cost of doing the config option version is minimal over
> the global concurrency issue, then I say we should go that route.
> I just expect it to take up more resources than the strategy
> proposed in the initial patch.

Based on what? That you read it as us cloning the ci-config repo just
for this new proposed config, and missed that we're doing it already, or
...?

> I wonder how we could determine this. Should we run a few CI
> jobs with some force-pushes in either approach (config turned
> off) so we know that cost?

The incremental cost of that "test -x", or...? I'm not sure what you
mean here.

Derrick Stolee Nov. 8, 2022, 12:02 a.m. UTC | #12

On 11/7/2022 5:56 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Mon, Nov 07 2022, Derrick Stolee wrote:
> 
>> On 11/7/22 4:03 PM, Ævar Arnfjörð Bjarmason wrote:
>>>
>>> On Mon, Nov 07 2022, Derrick Stolee wrote:

>>>> Either of these points may have an incorrect assumption, so
>>>> I'm prepared to be wrong.
>>>
>>> I *think* you're wrong about #2, but I'm not sure either.
>>
>> At the very least, the configurable option requires fetching the
>> repo and checking out at least one file. I don't know how much it
>> actually saves one way or another.
> 
> It's already fetching the ci-config repo, so we're talking about the
> marginal cost of running the bit of shellscript to check if
> config-repo/ci/config/skip-concurrent is executable, and if not keeping
> the default config.

>> I wonder how we could determine this. Should we run a few CI
>> jobs with some force-pushes in either approach (config turned
>> off) so we know that cost?
> 
> The incremental cost of that "test -x", or...? I'm not sure what you
> mean here.

The difference is that setting the concurrency globally allows
Actions to decide the concurrency from the workflow file alone,
before any jobs are run. This is done without a clone.

The method introduced in e76eec35540 (ci: allow per-branch config
for GitHub Actions, 2020-05-07) requires running the ci-config
job at minimum to set the concurrency value, which involves doing
a (very small, shallow) clone of the ci-config branch to determine
if that file exists.

Since this is the first job to run in the workflow, the global
concurrency will only reduces the amount of compute consumed
when pushes happen close together, but that is included for "oops
I forgot to --amend" and other common cases.

At the very least, the difference between the two mechanisms is
greater than a 'test -x'.

Thanks,
-Stolee

Junio C Hamano Nov. 8, 2022, 12:31 a.m. UTC | #13

Derrick Stolee <derrickstolee@github.com> writes:

> On 11/3/22 9:34 AM, Johannes Schindelin via GitGitGadget wrote:
>> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>> 
>> Whenever a branch is pushed to a repository which has GitHub Actions
>> enabled, a bunch of new workflow runs are started.
>> 
>> We sometimes see contributors push multiple branch updates in rapid
>> succession, which in conjunction with the impressive time swallowed by
>> even just a single CI build frequently leads to many queued-up runs.
>> 
>> This is particularly problematic in the case of Pull Requests where a
>> single contributor can easily (inadvertently) prevent timely builds for
>> other contributors.
>
> As someone who is both the cause and the victim of this, I
> thank you for finding a way to reduce wasted CPU time. This
> patch looks good to me, though I'll need to trust the docs
> and your testing to be sure it will work. We will definitely
> see it in place as it merges into 'next' and 'main'.

When I see breakages of 'seen' only at the CI, perhaps because it
manifests only on macOS, I manually "bisected" by pushing various
combinations of topics merged on top of 'master' and pushing the
result out as 'seen' only to the GitHub repository, and not having
to wait one to finish before pushing out another was a really nice
feature.  Of course, I could wait before pushing another out, but
after seeing the last one close to successful completion in a few
minutes and being able to push out the next one was a great
timesaver, not only for the "few minutes", but for the countless
minutes because I will have to concentrate on more than just a "few
minutes" on another task if I have to switch to another task in
order to wait for just a "few minutes" before pushing the trial
merge out.

So, that is the only concern I have with this change, but in
general, not running jobs whose results are clearly not needed is a
good idea.  It just is "clearly" is hard to determine automatically.

Johannes Schindelin Nov. 8, 2022, 8:18 a.m. UTC | #14

Hi Taylor,

On Mon, 7 Nov 2022, Taylor Blau wrote:

> I'm more concerned with the expense of running a job which counts
> double-digit minutes against your available Actions runtime.

Me, too. The fact that we used up 50.000 build minutes before we were able
to finalize v2.38.1 is quite concerning.

> I played around with the following, but I can't quite get Actions to
> like it. The error message I get (ref[1]) is something like:
>
>     The workflow is not valid. .github/workflows/main.yml (Line: 96, Col:
>     27): Unexpected value 'needs.ci-config.outputs.skip_concurrent == 'yes''
>     .github/workflows/main.yml (Line: 123, Col: 27): Unexpected value
>     'needs.ci-config.outputs.skip_concurrent == 'yes''

The reason is that what you are trying to do simply cannot work.

The `concurrency` setting is evaluated _before_ running the build. Yet you
want to make it contingent on the output of a job that only runs _as part_
of the build. Catch-22.

Reading through the thread, my assessment is that the original patch is
still the best we can do.

I also want to point out that the strategy described by Ævar to push every
hour and see a "breadcrumb trail" of green builds is quite wasteful. Just
because somebody else pays for it does not mean that it is free of cost.
It very much looks like wasting resources unnecessarily to me, and I do
not condone this, and I believe that the Git project should neither
encourage nor support this.

If you _must_ have concurrent builds of your branch (i.e. if you want to
use that many resources despite the planet literally burning already from
wasteful use of resources), you can always start your branch with a patch
that comments out the `cancel-in-progress` line, without forcing the
complexity of a more generic support for this behavior into Git's code
base.

Ciao,
Dscho

Johannes Schindelin Nov. 8, 2022, 9:16 a.m. UTC | #15

Hi Peff,

On Thu, 3 Nov 2022, Jeff King wrote:

> On Thu, Nov 03, 2022 at 10:23:56PM -0400, Taylor Blau wrote:
>
> > But I think you make a compelling point (which doesn't match my own
> > workflow, but I can see the utility of nonetheless).
> >
> > I was thinking that we could rely on something similar to the ci-config
> > ref stuff from back in e76eec35540 (ci: allow per-branch config for
> > GitHub Actions, 2020-05-07), but it looks like it'll be a little
> > trickier than that, maybe impossible.
> >
> > We need to know about the state of the ci-config branch before we set
> > the concurrency bits. So I think you *could* do something like:
>
> As an aside, I wish there was a way to interpret per-repo environment
> variables in the actual action config.

There kind of is. "Kind of" because it is not _really_ a per-repo variable
(those do not exist on GitHub), but there are topics you can set. These
are relatively free-form labels you can attach to _your_ fork, and these
labels show up below the "About" section and the link to the home page (if
any) on the right side of your respository. AFAICT these topics are not
inherited automatically when forking a repository, which is precisely what
we want. See
https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/classifying-your-repository-with-topics
for more details on that.

A GitHub workflow can be made conditional on the presence of such a topic.
I use this, for example, for an opt-in build of the InnoSetup installer:
https://github.com/jrsoftware/issrc/blob/is-6_2_1/.github/workflows/build.yml#L11-L12
The build is opt-in because it requires a non-free build environment,
configured via two repository secrets containing a URL pointing to a
`.zip` file and a password to extract said `.zip`. As you say, I cannot
make the build opt-in based on the presence of secrets, but I can use the
presence a repository topic as the knob to enable the workflow.

> The current ci-config stuff works, but it's pretty horrible because (if
> I understand correctly) it spins up a VM just to evaluate a glob and say
> "nope, no CI needed on this branch". So:
>
>   1. It's wasteful of resources, compared to a system where the Actions
>      parser can evaluate a variable.

Indeed. It might look like the job only takes a few seconds (at least that
was the argument that got the `ci-config` patch accepted), but that misses
the queue time, which turns this more into several dozens of seconds, and
the recorded total duration is much longer than that. In
https://github.com/gitgitgadget/git/actions/runs/3412982102 for example,
the `ci-config` job only took 6 seconds, according to the page, but the
total duration of the build was 6 minutes and 56 seconds.

>   2. It makes the Actions results page for a repo ugly, because skipped
>      branches clutter the output with "yes, I passed CI" even though all
>      they passed was a trivial job to say "don't bother running more
>      CI".
>
>   3. The problem you mention: it happens too late to affect the overall
>      Actions flow, and instead individual jobs have to take it into
>      account.

And

    4. The way `ci-config` is configured is sufficiently "magic" that it
       only benefits very, very few users, at the price of adding to
       everybody's build time. To see what I mean, look at
       https://github.com/gitgitgadget/git/actions/runs/3416084640/jobs/5685867765#step:1:1
       and at
       https://github.com/gitgitgadget/git/actions/runs/3416084640/jobs/5685879914#step:1:1
       turning on the timestamps (i.e. click on the sprocket wheel on the
       upper right side of the log and select "Show timestamps"). You will
       see that the `ci-config` job started at 3:22:05 UTC and the next
       job, `win-build`, started only at 4:16:03 UTC.

    5. There is official support for the desired behavior that comes
       without any magic branch with special content that users somehow
       need to learn about: If you push a branch with commit messages that
       contain `[skip ci]`, the build will be skipped, and no time is
       wasted on running any job. For full details, see
       https://github.blog/changelog/2021-02-08-github-actions-skip-pull-request-and-push-workflows-with-skip-ci/

Having said that, the `ci-config` job is used for something I find much
more useful than that magic `ci-config` branch handling: to skip builds
when there are already successful builds of the same commit or at least
tree. Sadly, that logic fails sometimes because of an unresilient REST API
call: in case of network errors, we fall back to not skipping the build
_even if_ there has been a previously-successful build of that tree.

If it was up to me, I'd simply rip out the `ci-config` branch (`ci/config`
file) handling and tell users to mark up branches that need to be skipped
with `[skip ci]`. That has a further advantage of being actually portable
across a wider range of CI systems (for example, CircleCI supports it,
too: https://circleci.com/docs/skip-build/).

But then, it would not save us a whole lot because the `ci-config` job
_still_ does something useful (i.e. checking for previously successful
builds of the same tree), so that time is still spent. But how knows,
maybe there will be out-of-the-box support for that at some stage.

Johannes Schindelin Nov. 8, 2022, 9:51 a.m. UTC | #16

Hi Junio,

On Mon, 7 Nov 2022, Junio C Hamano wrote:

> Derrick Stolee <derrickstolee@github.com> writes:
>
> > On 11/3/22 9:34 AM, Johannes Schindelin via GitGitGadget wrote:
> >> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >>
> >> Whenever a branch is pushed to a repository which has GitHub Actions
> >> enabled, a bunch of new workflow runs are started.
> >>
> >> We sometimes see contributors push multiple branch updates in rapid
> >> succession, which in conjunction with the impressive time swallowed by
> >> even just a single CI build frequently leads to many queued-up runs.
> >>
> >> This is particularly problematic in the case of Pull Requests where a
> >> single contributor can easily (inadvertently) prevent timely builds for
> >> other contributors.
> >
> > As someone who is both the cause and the victim of this, I
> > thank you for finding a way to reduce wasted CPU time. This
> > patch looks good to me, though I'll need to trust the docs
> > and your testing to be sure it will work. We will definitely
> > see it in place as it merges into 'next' and 'main'.
>
> When I see breakages of 'seen' only at the CI, perhaps because it
> manifests only on macOS, I manually "bisected" by pushing various
> combinations of topics merged on top of 'master' and pushing the
> result out as 'seen' only to the GitHub repository, and not having
> to wait one to finish before pushing out another was a really nice
> feature.  Of course, I could wait before pushing another out, but
> after seeing the last one close to successful completion in a few
> minutes and being able to push out the next one was a great
> timesaver, not only for the "few minutes", but for the countless
> minutes because I will have to concentrate on more than just a "few
> minutes" on another task if I have to switch to another task in
> order to wait for just a "few minutes" before pushing the trial
> merge out.

I often find myself with similar problems where I have to test a couple of
revisions in order to pinpoint regressions that do not reproduce locally.

One of my power tools to figure these things out is
https://github.com/mxschmitt/action-tmate, allowing me to SSH into the
build agent where the failure happens (note: do use with prudence lest your
account gets flagged for potential mining).

I find that much more surgical a tool than having multiple builds run
concurrently, not the least reason being that keeping track of which
builds correspond to which step of the bisection can be very, very
confusing.

Also, one of my core values is to use up only the resources that I need
to. And using up a lot of build minutes is contrary to that value.

So what I end up doing in similar situations is:

- first of all, switch to a temporary branch

- then, add a TO-DROP commit that rips out _all_ of the unnecessary
  builds. Typically the first to go are check-whitespace and git-l10n, and
  then I liberally delete jobs (including the `ci-config` job that's
  totally useless in this use case) and restrict the test suite to running
  just the failing script by editing the `T = [...]` line in `t/Makefile`,
  often even adding a `--run=[... only the minimal amount of test
  cases...]` (after figuring out which ones are needed which is
  unfortunately quite hard due to the abundance of side effects our test
  suite relies on) to the `GIT_TEST_OPTS` in `ci/`.

- then, if I need to test multiple revisions, I _create_ new branches for
  each bisection point (typically encoding information in the branch name
  that will help me with book-keeping), cherry-picking the just-created
  TO-DROP commit before pushing.

Since I want to minimize my footprint when it comes to using resources, I
typically am very judicious about what revisions I test, and how many I
run concurrently.

It is a bit sad that doing this is currently very much involved and that
it is so much easier to just go ahead and run the entire test suite on
every available platform even if the test failures one wishes to diagnose
happen on but one platform in but one test script. I.e. in the current
shape, our code base encourages wasting of resources.

That is a situation I would like to see improved. If I was not committed
to other work streams, I would work on it because I find it that
important.

> So, that is the only concern I have with this change, but in
> general, not running jobs whose results are clearly not needed is a
> good idea.  It just is "clearly" is hard to determine automatically.

Right. It is also very subjective what "clearly" should mean in this
context. For example, I find it clearly undesirable just how long our test
suite takes (and not for any good reason, really) and how inflexible it is
when it comes to things like Test Impact Analysis (i.e. running only tests
covering the code modified in a given PR, which would speed up everything
_tremendously_). At the same time I am fully aware that I find myself
pretty alone in this assessment, otherwise other contributors would
clearly be more interested in fixing these issues than they are.

Ciao,
Dscho

Taylor Blau Nov. 8, 2022, 6:30 p.m. UTC | #17

On Tue, Nov 08, 2022 at 09:18:21AM +0100, Johannes Schindelin wrote:
> > I played around with the following, but I can't quite get Actions to
> > like it. The error message I get (ref[1]) is something like:
> >
> >     The workflow is not valid. .github/workflows/main.yml (Line: 96, Col:
> >     27): Unexpected value 'needs.ci-config.outputs.skip_concurrent == 'yes''
> >     .github/workflows/main.yml (Line: 123, Col: 27): Unexpected value
> >     'needs.ci-config.outputs.skip_concurrent == 'yes''
>
> The reason is that what you are trying to do simply cannot work.

I was surprised that I couldn't get this to work, because to me it
seemed like the sort of thing that *should* be possible to do.

Indeed, it is, and I made a couple of mistakes in writing the workflow
file:

  - The expression for 'skip-in-progress' needed to be enclosed in
    '${{}}' markers.

  - It also needed to take into account the job name (and matrix
    information!) where relevant. And here we can't just use
    ${{github.job}}, since that is only available inside of the job
    steps.

To the last bullet point there, we unfortunately have to copy and paste
the job name, which seems like a limitation of the Actions workflow
parser to me.

I posted an alternative approach to this patch in [1], and I would be
very curious to hear your thoughts, if you have time!

Thanks,
Taylor

[1]: https://lore.kernel.org/git/cover.1667931937.git.me@ttaylorr.com/T/#t

Jeff King Nov. 9, 2022, 2 p.m. UTC | #18

On Tue, Nov 08, 2022 at 10:16:15AM +0100, Johannes Schindelin wrote:

> > As an aside, I wish there was a way to interpret per-repo environment
> > variables in the actual action config.
> 
> There kind of is. "Kind of" because it is not _really_ a per-repo variable
> (those do not exist on GitHub), but there are topics you can set. These
> are relatively free-form labels you can attach to _your_ fork, and these
> labels show up below the "About" section and the link to the home page (if
> any) on the right side of your respository. AFAICT these topics are not
> inherited automatically when forking a repository, which is precisely what
> we want. See
> https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/classifying-your-repository-with-topics
> for more details on that.

Ah, that's very clever, thank you!

For the original problem that motivated me adding ci-config in the first
place, branch selection, I think we could do this:

  if: |
    !contains(join(github.event.repository.topics), 'ci-only-') ||
    contains(github.event.repository.topics, format('ci-only-{0}', github.ref_name))

and then by default we'd continue to build for all pushes, but if you
add ci-only-foo as a repo topic, then we'd build only refs/heads/foo.

I may see if I can work this into our workflow file. Even if we can't
get rid of ci-config for the skip-successful-build feature, this would
still save CPU by dropping even ci-config when the branch should be
skipped entirely.

> > The current ci-config stuff works, but it's pretty horrible because (if
> > I understand correctly) it spins up a VM just to evaluate a glob and say
> > "nope, no CI needed on this branch". So:
> >
> >   1. It's wasteful of resources, compared to a system where the Actions
> >      parser can evaluate a variable.
> 
> Indeed. It might look like the job only takes a few seconds (at least that
> was the argument that got the `ci-config` patch accepted), but that misses
> the queue time, which turns this more into several dozens of seconds, and
> the recorded total duration is much longer than that. In
> https://github.com/gitgitgadget/git/actions/runs/3412982102 for example,
> the `ci-config` job only took 6 seconds, according to the page, but the
> total duration of the build was 6 minutes and 56 seconds.

Yes, but I don't think that's wasting 6 minutes of resources if we're
just sitting in a queue. It may increase the end-to-end latency of
getting the CI result, of course.

>     4. The way `ci-config` is configured is sufficiently "magic" that it
>        only benefits very, very few users, at the price of adding to
>        everybody's build time. To see what I mean, look at
>        https://github.com/gitgitgadget/git/actions/runs/3416084640/jobs/5685867765#step:1:1
>        and at
>        https://github.com/gitgitgadget/git/actions/runs/3416084640/jobs/5685879914#step:1:1
>        turning on the timestamps (i.e. click on the sprocket wheel on the
>        upper right side of the log and select "Show timestamps"). You will
>        see that the `ci-config` job started at 3:22:05 UTC and the next
>        job, `win-build`, started only at 4:16:03 UTC.

Ouch. Though I wonder in practice how fast that would have gone without
ci-config. It is just asking for a generic ubuntu vm, which many of
other jobs would. Was there a shortage of those vms at 3:22, and by the
time it finally ran that shortage was over? Or is there constant
contention, and it is increasing the end-to-end latency by asking for a
queue slot, running, and then asking for more queue slots?

>     5. There is official support for the desired behavior that comes
>        without any magic branch with special content that users somehow
>        need to learn about: If you push a branch with commit messages that
>        contain `[skip ci]`, the build will be skipped, and no time is
>        wasted on running any job. For full details, see
>        https://github.blog/changelog/2021-02-08-github-actions-skip-pull-request-and-push-workflows-with-skip-ci/

This existed back when I added ci-config originally, and I tried it, but
the results are quite painful to use. Because "skip ci" is really a
property of a branch that a commit is pushed to, and not a branch. So
you have to sprinkle these "skip ci" markers all over the branch tips,
but then remember to remove them when you actually want to do useful
things with the branches, like merge them or send them out.

-Peff

Taylor Blau Nov. 10, 2022, 2:40 a.m. UTC | #19

On Wed, Nov 09, 2022 at 09:00:33AM -0500, Jeff King wrote:
> On Tue, Nov 08, 2022 at 10:16:15AM +0100, Johannes Schindelin wrote:
>
> > > As an aside, I wish there was a way to interpret per-repo environment
> > > variables in the actual action config.
> >
> > There kind of is. "Kind of" because it is not _really_ a per-repo variable
> > (those do not exist on GitHub), but there are topics you can set. These
> > are relatively free-form labels you can attach to _your_ fork, and these
> > labels show up below the "About" section and the link to the home page (if
> > any) on the right side of your respository. AFAICT these topics are not
> > inherited automatically when forking a repository, which is precisely what
> > we want. See
> > https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/classifying-your-repository-with-topics
> > for more details on that.
>
> Ah, that's very clever, thank you!

Very cute.

> For the original problem that motivated me adding ci-config in the first
> place, branch selection, I think we could do this:
>
>   if: |
>     !contains(join(github.event.repository.topics), 'ci-only-') ||
>     contains(github.event.repository.topics, format('ci-only-{0}', github.ref_name))
>
> and then by default we'd continue to build for all pushes, but if you
> add ci-only-foo as a repo topic, then we'd build only refs/heads/foo.
>
> I may see if I can work this into our workflow file. Even if we can't
> get rid of ci-config for the skip-successful-build feature, this would
> still save CPU by dropping even ci-config when the branch should be
> skipped entirely.

Yeah. I'd be hesitant to see a bunch of CI configuration knobs get
folded into the repository topics feature, but a clear "save CI minutes
if we don't care about building these branch(es)" is useful.

Thanks,
Taylor

ci: avoid unnecessary builds

Commit Message

Comments

Patch