Message ID | xmqq1qfvor35.fsf_-_@gitster.g (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ci: avoid building from the same commit in parallel | expand |
Junio C Hamano <gitster@pobox.com> writes: > * There are tons of concurrency groups defined, but as a trial > change, here is to cover the "regular" matrix that consumes the > most resources (linux-asan-ubsan is the worst culprit, it seems). Unfortunately, this did not work. https://github.com/git/git/actions/runs/5933451874 https://github.com/git/git/actions/runs/5933451805 are trial runs that had the same commit with this patch pushed to 'seen' and 'pu'. While one of them started the "regular" matrix, the other one indeed went into paused state and waited. But that is way too late. What happened was that their "config" (which everything else depends on) started in parallel before the serialization at the "regular" matrix kicked in. So, one did wait before doing the "regular" matrix, until the other one finished everything, and then kept going and did its own "regular" matrix for the same commit. It is because the avoidance of "redundant build" was done at the "config" phase, which both of them had already done X-<. If we wanted to do this, I suspect that we need to serialize the entire thing, not at the individual level where we currently define the "concurrency" thing.
Hi Junio, On Mon, 21 Aug 2023, Junio C Hamano wrote: > If we wanted to do this, I suspect that we need to serialize the > entire thing, not at the individual level where we currently define > the "concurrency" thing. Right, we'd need that `concurrency: ${{ github.sha }}` attribute on the `config` job. BTW there is another caveat. According to the documentation, if a job is queued while another job is already queued, that other job is canceled in favor of the latest one. I have not verified this by testing it out, but have no reason to doubt it. If that's the case, pushing, say, `master`, `main` and `next` to the same SHA will see one of them canceled. Which might be okay, if the cancelation message on the canceled one is indicative enough. This all depends on reducing the number of flakes (in particular the p4 tests in the ASAN job), but that's a story for another day. Ciao, Johannes
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > Right, we'd need that `concurrency: ${{ github.sha }}` attribute on the > `config` job. That was my first thought, but I am not sure how it would work. Doesn't skip-if-redundant grab the workflow runs that have succeeded and then see if one for the same commit already exists? If you used concurrency on the 'config', what gets serialized between two jobs for the same commit is only the 'config' phase, so 'master' may wait starting (because 'config' is what everybody else 'needs' it) while 'config' phase of 'main' runs, and then when it gets to the turn of 'config' phase of 'master', it would not find the run for the same commit being done for 'main' completed yet, would it? > BTW there is another caveat. According to the documentation, if a job is > queued while another job is already queued, that other job is canceled in > favor of the latest one. Yes, that was the impression I got; your second one will wait (so you need a working skip-if-redundant to turn it into noop), but the third and subsequent ones are discarded without starting, which unfortunately is what we may want to see happen. Hmph, from that point of view, would the best and simplest we can do be to use the commit object name as the concurrency key for the 'config' phase, and use something similar to cancel-in-progress (which kills the other one when the new one starts, but what we want is what stops the new one to start when it notices there is already one running)?
Hi Junio, On Tue, 22 Aug 2023, Junio C Hamano wrote: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > > > Right, we'd need that `concurrency: ${{ github.sha }}` attribute on > > the `config` job. > > That was my first thought, but I am not sure how it would work. > > Doesn't skip-if-redundant grab the workflow runs that have succeeded > and then see if one for the same commit already exists? If you used > concurrency on the 'config', what gets serialized between two jobs > for the same commit is only the 'config' phase, so 'master' may wait > starting (because 'config' is what everybody else 'needs' it) while > 'config' phase of 'main' runs, and then when it gets to the turn of > 'config' phase of 'master', it would not find the run for the same > commit being done for 'main' completed yet, would it? Yes, that's true. But there is a silver lining: the `concurrency` can not only be specified on the job level, but also on the workflow run level. I tested this, and present the corresponding patch at the end of this mail. > > BTW there is another caveat. According to the documentation, if a job > > is queued while another job is already queued, that other job is > > canceled in favor of the latest one. > > Yes, that was the impression I got; your second one will wait (so > you need a working skip-if-redundant to turn it into noop), but the > third and subsequent ones are discarded without starting, which > unfortunately is what we may want to see happen. It's actually the last one that is still pending, while the intermediate ones will be canceled before they are started (see also the attached screenshot). The message that is shown in the web UI reads like this:
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > Hi Junio, > > On Tue, 22 Aug 2023, Junio C Hamano wrote: > >> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: >> >> > Right, we'd need that `concurrency: ${{ github.sha }}` attribute on >> > the `config` job. >> >> That was my first thought, but I am not sure how it would work. >> >> Doesn't skip-if-redundant grab the workflow runs that have succeeded >> and then see if one for the same commit already exists? If you used >> concurrency on the 'config', what gets serialized between two jobs >> for the same commit is only the 'config' phase, so 'master' may wait >> starting (because 'config' is what everybody else 'needs' it) while >> 'config' phase of 'main' runs, and then when it gets to the turn of >> 'config' phase of 'master', it would not find the run for the same >> commit being done for 'main' completed yet, would it? > > Yes, that's true. > > But there is a silver lining: the `concurrency` can not only be specified > on the job level, but also on the workflow run level. > I tested this, and present the corresponding patch at the end of this > mail. Yeah, serializing the whole thing was the only way I thought that would work, and I am glad you already tested that it works. Thanks.
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > Here is the patch: > > -- snipsnap -- > From: Junio C Hamano <gitster@pobox.com> > Date: Mon, 21 Aug 2023 17:31:26 -0700 > Subject: [PATCH] ci: avoid building from the same commit in parallel I forgot to say that I do not think I deserve the credit in the end result, as you've done all the hard part. Mind taking the authorship, while demoting me to "Helped-by" status? Thanks.
Hi Junio, On Wed, 23 Aug 2023, Junio C Hamano wrote: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > > > Here is the patch: > > > > -- snipsnap -- > > From: Junio C Hamano <gitster@pobox.com> > > Date: Mon, 21 Aug 2023 17:31:26 -0700 > > Subject: [PATCH] ci: avoid building from the same commit in parallel > > I forgot to say that I do not think I deserve the credit in the end > result, as you've done all the hard part. > > Mind taking the authorship, while demoting me to "Helped-by" status? Sure. But I disagree that I did most of the work. You tried all the avenues that I would have tried, saving me tons of time by spending yours. Thank you, Johannes
diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 30492eacdd..27b151aadf 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -240,8 +240,7 @@ jobs: needs: ci-config if: needs.ci-config.outputs.enabled == 'yes' concurrency: - group: ${{ matrix.vector.jobname }}-${{ matrix.vector.pool }}-${{ github.ref }} - cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }} + group: ${{ matrix.vector.jobname }}-${{ matrix.vector.pool }}-${{ github.sha }} strategy: fail-fast: false matrix:
At times, we may need to push the same commit to multiple branches in the same push. Rewinding 'next' to rebuild on top of 'master' soon after a release is such an occasion. Making sure 'main' stays in sync with 'master' to help those who expect that primary branch of the project is named either of these is another. We used to use the branch name as the "concurrency group" key, but by switching to use the commit object name would make sure the builds for the same commit would happen serially, and by the time the second job becomes ready to run, the first job's outcome would be available and mking it unnecessary to run the second job. Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> --- * There are tons of concurrency groups defined, but as a trial change, here is to cover the "regular" matrix that consumes the most resources (linux-asan-ubsan is the worst culprit, it seems). .github/workflows/main.yml | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)