diff mbox series

[v2] ci: allow per-branch config for GitHub Actions

Message ID 20200507162011.GA3638906@coredump.intra.peff.net (mailing list archive)
State New, archived
Headers show
Series [v2] ci: allow per-branch config for GitHub Actions | expand

Commit Message

Jeff King May 7, 2020, 4:20 p.m. UTC
On Tue, May 05, 2020 at 05:04:51PM -0400, Jeff King wrote:

> Subject: [PATCH] ci: allow per-branch config for GitHub Actions

Here's a "v2" of that patch based on the discussion.

I think it smooths some of the rough edges of the orphan-branch
approach, while still having a cost on par with other suggestions (or at
least ones that truly allow any config; we can check for "for-ci/**" or
something very cheaply, but that implies hard-coding it for everybody).
I think the cost here is acceptable, and it gives us room to add more
features in the future.

If Actions eventually adds per-repo variable storage that can be used in
"if:" conditionals, then we could eventually switch to that. :)

The documentation here should be enough to let people work with it. But
we'd probably want to take Danh's patch to mention Actions in
SubmittingPatches on top (and it possibly could be modified to point to
the ci/config directory).

-- >8 --
Subject: [PATCH] ci: allow per-branch config for GitHub Actions

Depending on the workflows of individual developers, it can either be
convenient or annoying that our GitHub Actions CI jobs are run on every
branch. As an example of annoying: if you carry many half-finished
work-in-progress branches and rebase them frequently against master,
you'd get tons of failure reports that aren't interesting (not to
mention the wasted CPU).

This commit adds a new job which checks a special branch within the
repository for CI config, and then runs a shell script it finds there to
decide whether to skip the rest of the tests. The default will continue
to run tests for all refs if that branch or script is missing.

There have been a few alternatives discussed:

One option is to carry information in the commit itself about whether it
should be tested, either in the tree itself (changing the workflow YAML
file) or in the commit message (a "[skip ci]" flag or similar). But
these are frustrating and error-prone to use:

  - you have to manually apply them to each branch that you want to mark

  - it's easy for them to leak into other workflows, like emailing patches

We could likewise try to get some information from the branch name. But
that leads to debates about whether the default should be "off" or "on",
and overriding still ends up somewhat awkward. If we default to "on",
you have to remember to name your branches appropriately to skip CI. And
if "off", you end up having to contort your branch names or duplicate
your pushes with an extra refspec.

By comparison, this commit's solution lets you specify your config once
and forget about it, and all of the data is off in its own ref, where it
can be changed by individual forks without touching the main tree.

There were a few design decisions that came out of on-list discussion.
I'll summarize here:

 - we could use GitHub's API to retrieve the config ref, rather than a
   real checkout (and then just operate on it via some javascript). We
   still have to spin up a VM and contact GitHub over the network from
   it either way, so it ends up not being much faster. I opted to go
   with shell to keep things similar to our other tools (and really
   could implement allow-refs in any language you want). This also makes
   it easy to test your script locally, and to modify it within the
   context of a normal git.git tree.

 - we could keep the well-known refname out of refs/heads/ to avoid
   cluttering the branch namespace. But that makes it awkward to
   manipulate. By contrast, you can just "git checkout ci-config" to
   make changes.

 - we could assume the ci-config ref has nothing in it except config
   (i.e., a branch unrelated to the rest of git.git). But dealing with
   orphan branches is awkward. Instead, we'll do our best to efficiently
   check out only the ci/config directory using a shallow partial clone,
   which allows your ci-config branch to be just a normal branch, with
   your config changes on top.

 - we could provide a simpler interface, like a static list of ref
   patterns. But we can't get out of spinning up a whole VM anyway, so
   we might as well use that feature to make the config as flexible as
   possible. If we add more config, we should be able to reuse our
   partial-clone to set more outputs.

Signed-off-by: Jeff King <peff@peff.net>
---
 .github/workflows/main.yml  | 42 +++++++++++++++++++++++++++++++++++++
 ci/config/allow-refs.sample | 26 +++++++++++++++++++++++
 2 files changed, 68 insertions(+)
 create mode 100755 ci/config/allow-refs.sample

Comments

Taylor Blau May 7, 2020, 5 p.m. UTC | #1
Hi Peff,

On Thu, May 07, 2020 at 12:20:11PM -0400, Jeff King wrote:
> On Tue, May 05, 2020 at 05:04:51PM -0400, Jeff King wrote:
>
> > Subject: [PATCH] ci: allow per-branch config for GitHub Actions
>
> Here's a "v2" of that patch based on the discussion.

I really like this direction. I think that it's a good mix of
flexibility and convenience. I'm happy to push a one-time 'ci-config'
branch to 'ttaylorr/git' and forget about it.

> I think it smooths some of the rough edges of the orphan-branch
> approach, while still having a cost on par with other suggestions (or at
> least ones that truly allow any config; we can check for "for-ci/**" or
> something very cheaply, but that implies hard-coding it for everybody).
> I think the cost here is acceptable, and it gives us room to add more
> features in the future.
>
> If Actions eventually adds per-repo variable storage that can be used in
> "if:" conditionals, then we could eventually switch to that. :)
>
> The documentation here should be enough to let people work with it. But
> we'd probably want to take Danh's patch to mention Actions in
> SubmittingPatches on top (and it possibly could be modified to point to
> the ci/config directory).
>
> -- >8 --
> Subject: [PATCH] ci: allow per-branch config for GitHub Actions
>
> Depending on the workflows of individual developers, it can either be
> convenient or annoying that our GitHub Actions CI jobs are run on every
> branch. As an example of annoying: if you carry many half-finished
> work-in-progress branches and rebase them frequently against master,
> you'd get tons of failure reports that aren't interesting (not to
> mention the wasted CPU).
>
> This commit adds a new job which checks a special branch within the
> repository for CI config, and then runs a shell script it finds there to
> decide whether to skip the rest of the tests. The default will continue
> to run tests for all refs if that branch or script is missing.
>
> There have been a few alternatives discussed:
>
> One option is to carry information in the commit itself about whether it
> should be tested, either in the tree itself (changing the workflow YAML
> file) or in the commit message (a "[skip ci]" flag or similar). But
> these are frustrating and error-prone to use:
>
>   - you have to manually apply them to each branch that you want to mark
>
>   - it's easy for them to leak into other workflows, like emailing patches
>
> We could likewise try to get some information from the branch name. But
> that leads to debates about whether the default should be "off" or "on",
> and overriding still ends up somewhat awkward. If we default to "on",
> you have to remember to name your branches appropriately to skip CI. And
> if "off", you end up having to contort your branch names or duplicate
> your pushes with an extra refspec.
>
> By comparison, this commit's solution lets you specify your config once
> and forget about it, and all of the data is off in its own ref, where it
> can be changed by individual forks without touching the main tree.
>
> There were a few design decisions that came out of on-list discussion.
> I'll summarize here:
>
>  - we could use GitHub's API to retrieve the config ref, rather than a
>    real checkout (and then just operate on it via some javascript). We
>    still have to spin up a VM and contact GitHub over the network from
>    it either way, so it ends up not being much faster. I opted to go
>    with shell to keep things similar to our other tools (and really
>    could implement allow-refs in any language you want). This also makes
>    it easy to test your script locally, and to modify it within the
>    context of a normal git.git tree.
>
>  - we could keep the well-known refname out of refs/heads/ to avoid
>    cluttering the branch namespace. But that makes it awkward to
>    manipulate. By contrast, you can just "git checkout ci-config" to
>    make changes.
>
>  - we could assume the ci-config ref has nothing in it except config
>    (i.e., a branch unrelated to the rest of git.git). But dealing with
>    orphan branches is awkward. Instead, we'll do our best to efficiently
>    check out only the ci/config directory using a shallow partial clone,
>    which allows your ci-config branch to be just a normal branch, with
>    your config changes on top.
>
>  - we could provide a simpler interface, like a static list of ref
>    patterns. But we can't get out of spinning up a whole VM anyway, so
>    we might as well use that feature to make the config as flexible as
>    possible. If we add more config, we should be able to reuse our
>    partial-clone to set more outputs.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  .github/workflows/main.yml  | 42 +++++++++++++++++++++++++++++++++++++
>  ci/config/allow-refs.sample | 26 +++++++++++++++++++++++
>  2 files changed, 68 insertions(+)
>  create mode 100755 ci/config/allow-refs.sample
>
> diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
> index fd4df939b5..802a4bf7cd 100644
> --- a/.github/workflows/main.yml
> +++ b/.github/workflows/main.yml
> @@ -6,7 +6,39 @@ env:
>    DEVELOPER: 1
>
>  jobs:
> +  ci-config:
> +      runs-on: ubuntu-latest
> +      outputs:
> +        enabled: ${{ steps.check-ref.outputs.enabled }}
> +      steps:
> +        - name: try to clone ci-config branch
> +          continue-on-error: true
> +          run: |
> +            git -c protocol.version=2 clone \
> +              --no-tags \
> +              --single-branch \
> +              -b ci-config \
> +              --depth 1 \
> +              --no-checkout \
> +              --filter=blob:none \
> +              https://github.com/${{ github.repository }} \
> +              config-repo &&
> +              cd config-repo &&
> +              git checkout HEAD -- ci/config
> +        - id: check-ref
> +          name: check whether CI is enabled for ref
> +          run: |
> +            enabled=yes
> +            if test -x config-repo/ci/config/allow-ref &&
> +               ! config-repo/ci/config/allow-ref '${{ github.ref }}'
> +            then
> +              enabled=no
> +            fi
> +            echo "::set-output name=enabled::$enabled"
> +
>    windows-build:
> +    needs: ci-config
> +    if: needs.ci-config.outputs.enabled == 'yes'

One thing I wonder is whether the downstream 'windows-test' partitions.
I think that it should be fine, since we won't run the dependent
'windows-build', and then 'windows-test' won't have all of its
prerequisites filled.

>      runs-on: windows-latest
>      steps:
>      - uses: actions/checkout@v1
> @@ -70,6 +102,8 @@ jobs:
>          name: failed-tests-windows
>          path: ${{env.FAILED_TEST_ARTIFACTS}}
>    vs-build:
> +    needs: ci-config
> +    if: needs.ci-config.outputs.enabled == 'yes'
>      env:
>        MSYSTEM: MINGW64
>        NO_PERL: 1
> @@ -154,6 +188,8 @@ jobs:
>                            ${{matrix.nr}} 10 t[0-9]*.sh)
>          "@
>    regular:
> +    needs: ci-config
> +    if: needs.ci-config.outputs.enabled == 'yes'
>      strategy:
>        matrix:
>          vector:
> @@ -189,6 +225,8 @@ jobs:
>          name: failed-tests-${{matrix.vector.jobname}}
>          path: ${{env.FAILED_TEST_ARTIFACTS}}
>    dockerized:
> +    needs: ci-config
> +    if: needs.ci-config.outputs.enabled == 'yes'
>      strategy:
>        matrix:
>          vector:
> @@ -213,6 +251,8 @@ jobs:
>          name: failed-tests-${{matrix.vector.jobname}}
>          path: ${{env.FAILED_TEST_ARTIFACTS}}
>    static-analysis:
> +    needs: ci-config
> +    if: needs.ci-config.outputs.enabled == 'yes'
>      env:
>        jobname: StaticAnalysis
>      runs-on: ubuntu-latest
> @@ -221,6 +261,8 @@ jobs:
>      - run: ci/install-dependencies.sh
>      - run: ci/run-static-analysis.sh
>    documentation:
> +    needs: ci-config
> +    if: needs.ci-config.outputs.enabled == 'yes'
>      env:
>        jobname: Documentation
>      runs-on: ubuntu-latest
> diff --git a/ci/config/allow-refs.sample b/ci/config/allow-refs.sample
> new file mode 100755
> index 0000000000..f157f1945a
> --- /dev/null
> +++ b/ci/config/allow-refs.sample
> @@ -0,0 +1,26 @@
> +#!/bin/sh
> +#
> +# Sample script for enabling/disabling GitHub Actions CI runs on
> +# particular refs. By default, CI is run for all branches pushed to
> +# GitHub. You can override this by dropping the ".sample" from the script,
> +# editing it, committing, and pushing the result to the "ci-config" branch of
> +# your repository:
> +#
> +#   git checkout -b ci-config

Should we be recommending '--orphan' instead of '-b' here? It looks
like when you clone this branch down that you try to get as few bytes as
possible, so I figure it may be easier to have this be a orphaned
branch.

> +#   cp allow-refs.sample allow-refs
> +#   $EDITOR allow-refs
> +#   git commit -am "implement my ci preferences"
> +#   git push
> +#
> +# This script will then be run when any refs are pushed to that repository. It
> +# gets the fully qualified refname as the first argument, and should exit with
> +# success only for refs for which you want to run CI.
> +
> +case "$1" in
> +# allow one-off tests by pushing to "for-ci" or "for-ci/mybranch"
> +refs/heads/for-ci*) true ;;
> +# always build your integration branch
> +refs/heads/my-integration-branch) true ;;
> +# don't build any other branches or tags
> +*) false ;;
> +esac
> --
> 2.26.2.1005.ge383644752
>

Thanks,
Taylor
Jeff King May 7, 2020, 5:18 p.m. UTC | #2
On Thu, May 07, 2020 at 11:00:42AM -0600, Taylor Blau wrote:

> >    windows-build:
> > +    needs: ci-config
> > +    if: needs.ci-config.outputs.enabled == 'yes'
> 
> One thing I wonder is whether the downstream 'windows-test' partitions.
> I think that it should be fine, since we won't run the dependent
> 'windows-build', and then 'windows-test' won't have all of its
> prerequisites filled.

Yes, I intentionally left them out for that reason. It seemed simpler to
just let the skip percolate down the dependency tree.

> > +# Sample script for enabling/disabling GitHub Actions CI runs on
> > +# particular refs. By default, CI is run for all branches pushed to
> > +# GitHub. You can override this by dropping the ".sample" from the script,
> > +# editing it, committing, and pushing the result to the "ci-config" branch of
> > +# your repository:
> > +#
> > +#   git checkout -b ci-config
> 
> Should we be recommending '--orphan' instead of '-b' here? It looks
> like when you clone this branch down that you try to get as few bytes as
> possible, so I figure it may be easier to have this be a orphaned
> branch.

No, the whole point of doing the partial clone (rather than using
actions/checkout, which doesn't support that) was so people didn't have
to deal with the orphan-branch thing.

-Peff
Junio C Hamano May 7, 2020, 7:53 p.m. UTC | #3
Jeff King <peff@peff.net> writes:

> +        - id: check-ref
> +          name: check whether CI is enabled for ref
> +          run: |
> +            enabled=yes
> +            if test -x config-repo/ci/config/allow-ref &&
> +               ! config-repo/ci/config/allow-ref '${{ github.ref }}'

Is it deliberate that the output from the script is not redirected
to >/dev/null, which would mean they are allowed to do something
that looks like:

		echo "::set-output name=enabled::frotz"

or emit other random ::string-that-affects-github-actions to its
standard output stream?

> +            then
> +              enabled=no
> +            fi
> +            echo "::set-output name=enabled::$enabled"
Jeff King May 7, 2020, 8:46 p.m. UTC | #4
On Thu, May 07, 2020 at 12:53:09PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > +        - id: check-ref
> > +          name: check whether CI is enabled for ref
> > +          run: |
> > +            enabled=yes
> > +            if test -x config-repo/ci/config/allow-ref &&
> > +               ! config-repo/ci/config/allow-ref '${{ github.ref }}'
> 
> Is it deliberate that the output from the script is not redirected
> to >/dev/null, which would mean they are allowed to do something
> that looks like:
> 
> 		echo "::set-output name=enabled::frotz"
> 
> or emit other random ::string-that-affects-github-actions to its
> standard output stream?

It was deliberate in the sense that I would allow them to write useful
messages to the Actions log. If they want to do nonsense like
"::set-output", then it's their foot and their gun.

I don't know if Actions distinguishes between stdout and stderr here
(i.e., if we redirected the script's stdout to stderr, would that
prevent this case or not?).

-Peff
Junio C Hamano May 7, 2020, 9:58 p.m. UTC | #5
Jeff King <peff@peff.net> writes:

> It was deliberate in the sense that I would allow them to write useful
> messages to the Actions log. If they want to do nonsense like
> "::set-output", then it's their foot and their gun.

It's not like fooling the framework you laid out here is a
potentially useful attack vector.  We can assume that it is unlikely
for the custom allow-ref to be writing a string that happens to
begin with double-colon by mistake and making it harder to debug.

> I don't know if Actions distinguishes between stdout and stderr here
> (i.e., if we redirected the script's stdout to stderr, would that
> prevent this case or not?).

Perhaps we can experiment with "echo >&2 we are getting called" in
the allow-ref script itself ;-).

In any case, I'll queue it on 'pu'.  Thanks.
Jeff King May 8, 2020, 6 p.m. UTC | #6
On Thu, May 07, 2020 at 02:58:16PM -0700, Junio C Hamano wrote:

> In any case, I'll queue it on 'pu'.  Thanks.

I just noticed this needs a small fix to the sample script, which I gave
the wrong name:

diff --git a/ci/config/allow-refs.sample b/ci/config/allow-ref.sample
similarity index 93%
rename from ci/config/allow-refs.sample
rename to ci/config/allow-ref.sample
index f157f1945a..c9c9aea9ff 100755
--- a/ci/config/allow-refs.sample
+++ b/ci/config/allow-ref.sample
@@ -7,8 +7,8 @@
 # your repository:
 #
 #   git checkout -b ci-config
-#   cp allow-refs.sample allow-refs
-#   $EDITOR allow-refs
+#   cp allow-refs.sample allow-ref
+#   $EDITOR allow-ref
 #   git commit -am "implement my ci preferences"
 #   git push
 #

> Perhaps we can experiment with "echo >&2 we are getting called" in
> the allow-ref script itself ;-).

That definitely ends up in the log. But doing:

  echo     "::error file=foo.sh,line=1,col=2::this is the stdout msg"
  echo >&2 "::error file=foo.sh,line=1,col=2::this is the stderr msg"

shows both versions formatted in red, which implies to me that stdout
and stderr are treated the same.

-Peff
Đoàn Trần Công Danh May 9, 2020, 1:23 a.m. UTC | #7
On 2020-05-08 14:00:47-0400, Jeff King <peff@peff.net> wrote:
> I just noticed this needs a small fix to the sample script, which I gave
> the wrong name:
> 
> diff --git a/ci/config/allow-refs.sample b/ci/config/allow-ref.sample
> similarity index 93%
> rename from ci/config/allow-refs.sample
> rename to ci/config/allow-ref.sample
> index f157f1945a..c9c9aea9ff 100755
> --- a/ci/config/allow-refs.sample
> +++ b/ci/config/allow-ref.sample
> @@ -7,8 +7,8 @@
>  # your repository:
>  #
>  #   git checkout -b ci-config
> -#   cp allow-refs.sample allow-refs
> -#   $EDITOR allow-refs
> +#   cp allow-refs.sample allow-ref

Hi Peff,

The source needs to be changed, too:
---------------8<-------------
diff --git a/ci/config/allow-ref.sample b/ci/config/allow-ref.sample
index c9c9aea9ff..249872425f 100755
--- a/ci/config/allow-ref.sample
+++ b/ci/config/allow-ref.sample
@@ -7,7 +7,7 @@
 # your repository:
 #
 #   git checkout -b ci-config
-#   cp allow-refs.sample allow-ref
+#   cp allow-ref.sample allow-ref
 #   $EDITOR allow-ref
 #   git commit -am "implement my ci preferences"
 #   git push
--------------------8<------------------
diff mbox series

Patch

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index fd4df939b5..802a4bf7cd 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -6,7 +6,39 @@  env:
   DEVELOPER: 1
 
 jobs:
+  ci-config:
+      runs-on: ubuntu-latest
+      outputs:
+        enabled: ${{ steps.check-ref.outputs.enabled }}
+      steps:
+        - name: try to clone ci-config branch
+          continue-on-error: true
+          run: |
+            git -c protocol.version=2 clone \
+              --no-tags \
+              --single-branch \
+              -b ci-config \
+              --depth 1 \
+              --no-checkout \
+              --filter=blob:none \
+              https://github.com/${{ github.repository }} \
+              config-repo &&
+              cd config-repo &&
+              git checkout HEAD -- ci/config
+        - id: check-ref
+          name: check whether CI is enabled for ref
+          run: |
+            enabled=yes
+            if test -x config-repo/ci/config/allow-ref &&
+               ! config-repo/ci/config/allow-ref '${{ github.ref }}'
+            then
+              enabled=no
+            fi
+            echo "::set-output name=enabled::$enabled"
+
   windows-build:
+    needs: ci-config
+    if: needs.ci-config.outputs.enabled == 'yes'
     runs-on: windows-latest
     steps:
     - uses: actions/checkout@v1
@@ -70,6 +102,8 @@  jobs:
         name: failed-tests-windows
         path: ${{env.FAILED_TEST_ARTIFACTS}}
   vs-build:
+    needs: ci-config
+    if: needs.ci-config.outputs.enabled == 'yes'
     env:
       MSYSTEM: MINGW64
       NO_PERL: 1
@@ -154,6 +188,8 @@  jobs:
                           ${{matrix.nr}} 10 t[0-9]*.sh)
         "@
   regular:
+    needs: ci-config
+    if: needs.ci-config.outputs.enabled == 'yes'
     strategy:
       matrix:
         vector:
@@ -189,6 +225,8 @@  jobs:
         name: failed-tests-${{matrix.vector.jobname}}
         path: ${{env.FAILED_TEST_ARTIFACTS}}
   dockerized:
+    needs: ci-config
+    if: needs.ci-config.outputs.enabled == 'yes'
     strategy:
       matrix:
         vector:
@@ -213,6 +251,8 @@  jobs:
         name: failed-tests-${{matrix.vector.jobname}}
         path: ${{env.FAILED_TEST_ARTIFACTS}}
   static-analysis:
+    needs: ci-config
+    if: needs.ci-config.outputs.enabled == 'yes'
     env:
       jobname: StaticAnalysis
     runs-on: ubuntu-latest
@@ -221,6 +261,8 @@  jobs:
     - run: ci/install-dependencies.sh
     - run: ci/run-static-analysis.sh
   documentation:
+    needs: ci-config
+    if: needs.ci-config.outputs.enabled == 'yes'
     env:
       jobname: Documentation
     runs-on: ubuntu-latest
diff --git a/ci/config/allow-refs.sample b/ci/config/allow-refs.sample
new file mode 100755
index 0000000000..f157f1945a
--- /dev/null
+++ b/ci/config/allow-refs.sample
@@ -0,0 +1,26 @@ 
+#!/bin/sh
+#
+# Sample script for enabling/disabling GitHub Actions CI runs on
+# particular refs. By default, CI is run for all branches pushed to
+# GitHub. You can override this by dropping the ".sample" from the script,
+# editing it, committing, and pushing the result to the "ci-config" branch of
+# your repository:
+#
+#   git checkout -b ci-config
+#   cp allow-refs.sample allow-refs
+#   $EDITOR allow-refs
+#   git commit -am "implement my ci preferences"
+#   git push
+#
+# This script will then be run when any refs are pushed to that repository. It
+# gets the fully qualified refname as the first argument, and should exit with
+# success only for refs for which you want to run CI.
+
+case "$1" in
+# allow one-off tests by pushing to "for-ci" or "for-ci/mybranch"
+refs/heads/for-ci*) true ;;
+# always build your integration branch
+refs/heads/my-integration-branch) true ;;
+# don't build any other branches or tags
+*) false ;;
+esac