[1/1] ci: new github-action for git-l10n code review

Message ID	20210822161325.22038-2-worldhello.net@gmail.com (mailing list archive)
State	Superseded
Headers	show Return-Path: <git-owner@kernel.org> From: Jiang Xin <worldhello.net@gmail.com> To: Git List <git@vger.kernel.org>, Junio C Hamano <gitster@pobox.com>, Alexander Shopov <ash@kambanaria.org>, Jordi Mas <jmas@softcatala.org>, =?utf-8?q?Matthias_R=C3=BCster?= <matthias.ruester@gmail.com>, Jimmy Angelakos <vyruss@hellug.gr>, =?utf-8?q?Christopher_D=C3=ADaz?= <christopher.diaz.riv@gmail.com>, =?utf-8?q?Jean-No=C3=ABl_Avila?= <jn.avila@free.fr>, Bagas Sanjaya <bagasdotme@gmail.com>, Alessandro Menti <alessandro.menti@alessandromenti.it>, Gwan-gyeong Mun <elongbug@gmail.com>, Arusekk <arek_koz@o2.pl>, Daniel Santos <hello@brighterdan.com>, Dimitriy Ryazantcev <DJm00n@mail.ru>, Peter Krefting <peter@softwolves.pp.se>, Emir SARI <bitigchi@me.com>, =?utf-8?b?VHLhuqduIE5n4buNYyBRdcOibg==?= <vnwildman@gmail.com>, Fangyi Zhou <me@fangyi.io>, Yi-Jyun Pan <pan93412@gmail.com> Cc: Jiang Xin <zhiyou.jx@alibaba-inc.com> Subject: [PATCH 1/1] ci: new github-action for git-l10n code review Date: Mon, 23 Aug 2021 00:13:25 +0800 Message-Id: <20210822161325.22038-2-worldhello.net@gmail.com> In-Reply-To: <20210822161325.22038-1-worldhello.net@gmail.com> References: <20210822161325.22038-1-worldhello.net@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	ci: new github-action for git-l10n code review \| expand [0/1] ci: new github-action for git-l10n code review [1/1] ci: new github-action for git-l10n code review

Jiang Xin Aug. 22, 2021, 4:13 p.m. UTC

From: Jiang Xin <zhiyou.jx@alibaba-inc.com>

Git l10n uses github pull request for code review. A helper program
"git-po-helper" can be used to check typos in ".po" files, validate
syntax, and check commit message. It would be convenient to integrate
this helper program to CI and add comments in pull request.

The new github-action workflow is added in ".github/workflows/l10n.yml",
which is disabled by default. To turn it on for the git-l10n related
repositories, such as "git-l10n/git-po", we can add a new branch named
"ci-config" and create a simple shell script at "ci/config/allow-l10n"
in this branch.

The new l10n workflow listens to two types of github events: push and
pull_request.

For a push event, it will scan commits one by one. If a commit does not
look like a l10n commit (no file in "po/" has been changed), it will
immediately fail without checking for further commits. While for a
pull_request event, all new introduced commits will be scanned.

"git-po-helper" will generate two kinds of suggestions, errors and
warnings. A l10n contributor should try to fix all the errors, and
should pay attention to the warnings. All the errors and warnings will
be reported in the last step of the l10n workflow as two message groups.
For a pull_request event, will create additional comments in pull
request to report the result.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
---
 .github/workflows/l10n.yml | 143 +++++++++++++++++++++++++++++++++++++
 1 file changed, 143 insertions(+)
 create mode 100644 .github/workflows/l10n.yml

Bagas Sanjaya Aug. 23, 2021, 6:28 a.m. UTC | #1

On 22/08/21 23.13, Jiang Xin wrote:
> +    - uses: actions/setup-go@v2
> +      with:
> +        go-version: ">=1.16"

Currently there is newer Go (1.17) [1], why don't you update to it?

[1]: https://golang.org/dl/go1.17.linux-amd64.tar.gz

Jiang Xin Aug. 23, 2021, 6:42 a.m. UTC | #2

On Mon, Aug 23, 2021 at 2:28 PM Bagas Sanjaya <bagasdotme@gmail.com> wrote:
>
> On 22/08/21 23.13, Jiang Xin wrote:
> > +    - uses: actions/setup-go@v2
> > +      with:
> > +        go-version: ">=1.16"
>
> Currently there is newer Go (1.17) [1], why don't you update to it?

Only go 1.16 and above can install a go package in "path@revision"
format. Use semversion ">=1.16" fits my needs and makes this workflow
file stable. It's not good to upgrade this yml file regularly.  And
I'm not sure "action/setup-go" has a cache for go 1.17.

--
Jiang Xin

Johannes Schindelin Aug. 23, 2021, 9:02 p.m. UTC | #3

Hi,

On Mon, 23 Aug 2021, Jiang Xin wrote:

> From: Jiang Xin <zhiyou.jx@alibaba-inc.com>
>
> Git l10n uses github pull request for code review. A helper program
> "git-po-helper" can be used to check typos in ".po" files, validate
> syntax, and check commit message. It would be convenient to integrate
> this helper program to CI and add comments in pull request.
>
> The new github-action workflow is added in ".github/workflows/l10n.yml",
> which is disabled by default. To turn it on for the git-l10n related
> repositories, such as "git-l10n/git-po", we can add a new branch named
> "ci-config" and create a simple shell script at "ci/config/allow-l10n"
> in this branch.
>
> The new l10n workflow listens to two types of github events: push and
> pull_request.
>
> For a push event, it will scan commits one by one. If a commit does not
> look like a l10n commit (no file in "po/" has been changed), it will
> immediately fail without checking for further commits. While for a
> pull_request event, all new introduced commits will be scanned.
>
> "git-po-helper" will generate two kinds of suggestions, errors and
> warnings. A l10n contributor should try to fix all the errors, and
> should pay attention to the warnings. All the errors and warnings will
> be reported in the last step of the l10n workflow as two message groups.
> For a pull_request event, will create additional comments in pull
> request to report the result.

It is a good idea to automate this.

I am a bit concerned that the `ci-config` approach, even if we use it in
the Git project itself, is quite cumbersome to use, though. So I hope that
we can find an alternative solution.

One such solution could be to make the `git-po-helper` job contingent on
part of the repository name. For example:

  git-po-helper:
    if: endsWith(github.repository, '/git-po')
    [...]

would skip the job unless the target repository's name is `git-po`.

A couple more questions/suggestions are below:

> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
> ---
>  .github/workflows/l10n.yml | 143 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 143 insertions(+)
>  create mode 100644 .github/workflows/l10n.yml
>
> diff --git a/.github/workflows/l10n.yml b/.github/workflows/l10n.yml
> new file mode 100644
> index 0000000000..60f162c121
> --- /dev/null
> +++ b/.github/workflows/l10n.yml
> @@ -0,0 +1,143 @@
> +name: git-l10n
> +
> +on: [push, pull_request]
> +
> +jobs:
> +  ci-config:
> +    runs-on: ubuntu-latest
> +    outputs:
> +      enabled: ${{ steps.check-l10n.outputs.enabled }}
> +    steps:
> +      - name: try to clone ci-config branch
> +        run: |
> +          git -c protocol.version=2 clone \
> +            --no-tags \
> +            --single-branch \
> +            -b ci-config \
> +            --depth 1 \
> +            --no-checkout \
> +            --filter=blob:none \
> +            https://github.com/${{ github.repository }} \
> +            config-repo &&
> +          cd config-repo &&
> +          git checkout HEAD -- ci/config || : ignore
> +      - id: check-l10n
> +        name: check whether CI is enabled for l10n
> +        run: |
> +          enabled=no
> +          if test -x config-repo/ci/config/allow-l10n &&
> +             config-repo/ci/config/allow-l10n '${{ github.ref }}'
> +          then
> +            enabled=yes
> +          fi
> +          echo "::set-output name=enabled::$enabled"
> +
> +  git-po-helper:
> +    needs: ci-config
> +    if: needs.ci-config.outputs.enabled == 'yes'
> +    runs-on: ubuntu-latest
> +    steps:
> +    - uses: actions/checkout@v2
> +      with:
> +        fetch-depth: '0'

There is code below to specifically fetch the base commit, but setting the
`fetch-depth` to zero means to do a full clone anyway.

So maybe the `git fetch` code below should be dropped, or the
`fetch-depth` override?

Since we cannot know how deep we need to fetch, though, I figure that it
would be indeed better to have a non-shallow clone (and a code comment
would help clarify this).

An even better alternative would be a partial clone, of course, but I fear
that there is no convenient way yet to configure `actions/checkout` to do
so.

> +    - uses: actions/setup-go@v2
> +      with:
> +        go-version: ">=1.16"
> +    - name: Install git-po-helper
> +      run: |

Since this is a one-liner, it would probably make sense to avoid that `|`
continuation.

> +        go install github.com/git-l10n/git-po-helper@main
> +    - name: Install other dependencies
> +      run: |
> +        sudo apt-get update -q &&
> +        sudo apt-get install -q -y gettext
> +    - id: check-commits
> +      name: Run git-po-helper
> +      run: |
> +        if test "${{ github.event_name }}" = "pull_request"
> +        then
> +          commit_from=${{ github.event.pull_request.base.sha }}
> +          commit_to=${{ github.event.pull_request.head.sha }}
> +        else
> +          commit_from=${{ github.event.before }}
> +          commit_to=${{ github.event.after }}
> +          if ! echo $commit_from | grep -q "^00*$"

This would probably read better as:

		case "$commit_from" in
		*[^0]*|'') ;; # not all zeros
		*)
			[...]
			;;
		esac

But we might not need it anyway. See the comment on the `git fetch`
command below.

> +          then
> +            if ! git rev-parse "$commit_from^{commit}"
> +            then
> +              git fetch origin $commit_from

As pointed out above, we cannot know how many commits we need to fetch.
Therefore, I would suggest to simply drop this `git fetch`.

> +            fi
> +          fi
> +        fi
> +        exit_code=0
> +        git-po-helper check-commits --github-action -- \
> +          $commit_from..$commit_to >git-po-helper.out 2>&1 ||

What does the `--github-action` option do? Might be good to document this
here.

> +        exit_code=$?
> +        echo "::set-output name=exit_code::$exit_code"
> +        has_error_msg=
> +        has_warning_msg=
> +        if test $exit_code -ne 0
> +        then
> +          has_error_msg=yes

Do we really need this `has_error_msg` variable? It would be much easier
to test the condition `env.ERROR_MSG != ''` in subsequent steps, and that
would already do the job, without having to go through output variables.

> +          if test "${{ github.event_name }}" = "pull_request"
> +          then
> +            echo "ERROR_MSG<<EOF" >>$GITHUB_ENV
> +            grep -v -e "^level=warning" -e WARNING git-po-helper.out |
> +              perl -pe 's/\e\[[0-9;]*m//g' >>$GITHUB_ENV

This is a bit dangerous. First of all, how can you be sure that
`git-po-helper.out` does not contain lines that consist of the letters
`EOF` (and would therefore interfere with the here-doc)?

Second, isn't it conceivable to have an `error:` line which contains the
characters `WARNING` too? That line would be filtered out...

Third, can `git-po-helper` be convinced _not_ to print color codes? The
output was redirected into a file, after all, and it does not go to a
terminal...

> +            echo "EOF" >>$GITHUB_ENV
> +          fi
> +        fi
> +        if grep -q -e "^level=warning" -e WARNING git-po-helper.out
> +        then
> +          has_warning_msg=yes
> +          if test "${{ github.event_name }}" = "pull_request"
> +          then
> +            echo "WARNING_MSG<<EOF" >>$GITHUB_ENV
> +            grep -v -e "^level=error" -e ERROR git-po-helper.out |
> +              perl -pe 's/\e\[[0-9;]*m//g' >>$GITHUB_ENV
> +            echo "EOF" >>$GITHUB_ENV
> +          fi
> +        fi
> +        echo "::set-output name=has_error_msg::$has_error_msg"
> +        echo "::set-output name=has_warning_msg::$has_warning_msg"
> +    - name: Report errors in comment for pull request
> +      uses: mshick/add-pr-comment@v1
> +      if: steps.check-commits.outputs.has_error_msg == 'yes' && github.event_name == 'pull_request'
> +      continue-on-error: true
> +      with:
> +        repo-token: ${{ secrets.GITHUB_TOKEN }}

This requires the `GITHUB_TOKEN` to have write permissions, which I would
highly recommend not to require.

Instead, it would probably make more sense to keep the output in the logs
of the workflow run.

> +        repo-token-user-login: 'github-actions[bot]'
> +        message: |
> +          Errors found by git-po-helper in workflow ${{ github.workflow }}:
> +          ```
> +          ${{ env.ERROR_MSG }}
> +          ```
> +    - name: Report warnings in comment for pull request
> +      uses: mshick/add-pr-comment@v1
> +      if: steps.check-commits.outputs.has_warning_msg == 'yes' && github.event_name == 'pull_request'
> +      continue-on-error: true
> +      with:
> +        repo-token: ${{ secrets.GITHUB_TOKEN }}
> +        repo-token-user-login: 'github-actions[bot]'
> +        message: |
> +          Warnings found by git-po-helper in workflow ${{ github.workflow }}:
> +          ```
> +          ${{ env.WARNING_MSG }}
> +          ```
> +    - name: Report and quit
> +      run: |
> +        if test "${{ steps.check-commits.outputs.has_error_msg }}" = "yes"

This would be easier to read and maintain if it was upgraded to an `if:`
condition:

    - name: Report errors
      if: step.check-commits.outputs.has_error_msg = "yes"
      run: |
        [...]

And then do the same for warnings.

Also, it would be more natural if the `Run git-po-helper` step was allowed
to fail when `git-po-helper` outputs errors. You would then have to use
this conditional in the `Report errors` step:

      if: failure() && step.check-commits.outputs.has_error_msg = "yes"

(compare to how Git's `CI/PR` workflow prints errors:
https://github.com/git/git/blob/v2.33.0/.github/workflows/main.yml#L120).

For the `Report warnings` step, you would however have to use something
slightly less intuitive:

      if: (success() || failure()) && step.check-commits.outputs.has_warning_msg = "yes"

Finally, I _do_ think that this line would be easier to read like this,
while basically doing the same thing but with less effort (because the
outputs would no longer have to be set):

      if: (success() || failure()) && env.WARNING_MSG != ""

Ciao,
Dscho

> +        then
> +          echo "::group::Errors found by git-po-helper"
> +          grep -v -e "^level=warning" -e WARNING git-po-helper.out
> +          echo "::endgroup::"
> +        fi
> +        if test "${{ steps.check-commits.outputs.has_warning_msg }}" = "yes"
> +        then
> +          echo "::group::Warnings found by git-po-helper"
> +          grep -v -e "^level=error" -e ERROR git-po-helper.out
> +          echo "::endgroup::"
> +        fi
> +        if test ${{ steps.check-commits.outputs.exit_code }} -ne 0
> +        then
> +          exit ${{ steps.check-commits.outputs.exit_code }}
> +        fi
> --
> 2.33.0
>
>

Junio C Hamano Aug. 23, 2021, 9:36 p.m. UTC | #4

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> For a push event, it will scan commits one by one. If a commit does not
>> look like a l10n commit (no file in "po/" has been changed), it will
>> immediately fail without checking for further commits. While for a
>> pull_request event, all new introduced commits will be scanned.
>>
>> "git-po-helper" will generate two kinds of suggestions, errors and
>> warnings. A l10n contributor should try to fix all the errors, and
>> should pay attention to the warnings. All the errors and warnings will
>> be reported in the last step of the l10n workflow as two message groups.
>> For a pull_request event, will create additional comments in pull
>> request to report the result.
>
> It is a good idea to automate this.
>
> I am a bit concerned that the `ci-config` approach, even if we use it in
> the Git project itself, is quite cumbersome to use, though. So I hope that
> we can find an alternative solution.
>
> One such solution could be to make the `git-po-helper` job contingent on
> part of the repository name. For example:
>
>   git-po-helper:
>     if: endsWith(github.repository, '/git-po')
>     [...]
>
> would skip the job unless the target repository's name is `git-po`.

Nice.

Can this be made into a matter purely local to git-l10n/git-po
repository and not git/git repository?  I am wondering if we can ee
if the current repository is git-l10n/git-po or its fork and run it
only if that is true.

Thanks.

Johannes Schindelin Aug. 24, 2021, 9:27 a.m. UTC | #5

Hi Junio,

On Mon, 23 Aug 2021, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> >> For a push event, it will scan commits one by one. If a commit does not
> >> look like a l10n commit (no file in "po/" has been changed), it will
> >> immediately fail without checking for further commits. While for a
> >> pull_request event, all new introduced commits will be scanned.
> >>
> >> "git-po-helper" will generate two kinds of suggestions, errors and
> >> warnings. A l10n contributor should try to fix all the errors, and
> >> should pay attention to the warnings. All the errors and warnings will
> >> be reported in the last step of the l10n workflow as two message groups.
> >> For a pull_request event, will create additional comments in pull
> >> request to report the result.
> >
> > It is a good idea to automate this.
> >
> > I am a bit concerned that the `ci-config` approach, even if we use it in
> > the Git project itself, is quite cumbersome to use, though. So I hope that
> > we can find an alternative solution.
> >
> > One such solution could be to make the `git-po-helper` job contingent on
> > part of the repository name. For example:
> >
> >   git-po-helper:
> >     if: endsWith(github.repository, '/git-po')
> >     [...]
> >
> > would skip the job unless the target repository's name is `git-po`.
>
> Nice.
>
> Can this be made into a matter purely local to git-l10n/git-po
> repository and not git/git repository? I am wondering if we can ee if
> the current repository is git-l10n/git-po or its fork and run it only if
> that is true.

The biggest problem is that there are forks of `git-l10n/git-po` that
accept PRs in their own right. That is what I tried to address by using
just the repository name, without the org/user prefix.

Ciao,
Dscho

Jiang Xin Aug. 24, 2021, 1:21 p.m. UTC | #6

On Tue, Aug 24, 2021 at 5:03 AM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> It is a good idea to automate this.
>
> I am a bit concerned that the `ci-config` approach, even if we use it in
> the Git project itself, is quite cumbersome to use, though. So I hope that
> we can find an alternative solution.
>
> One such solution could be to make the `git-po-helper` job contingent on
> part of the repository name. For example:
>
>   git-po-helper:
>     if: endsWith(github.repository, '/git-po')
>     [...]
>
> would skip the job unless the target repository's name is `git-po`.

Yes, this is a solution I also want to try at the very beginning. But
some l10n team leaders may fork their repositories directly from
git/git, and name their repo as "{OWNER}/git".  I want to try another
solution: check existence of file "ci/config/allow-l10n" in branch
"ci-config" using a GitHub API, instead of cloning the ci-config
branch and execute the shell script "ci/config/allow-l10n".

> A couple more questions/suggestions are below:
>
> > +  git-po-helper:
> > +    needs: ci-config
> > +    if: needs.ci-config.outputs.enabled == 'yes'
> > +    runs-on: ubuntu-latest
> > +    steps:
> > +    - uses: actions/checkout@v2
> > +      with:
> > +        fetch-depth: '0'
>
> There is code below to specifically fetch the base commit, but setting the
> `fetch-depth` to zero means to do a full clone anyway.
>
> So maybe the `git fetch` code below should be dropped, or the
> `fetch-depth` override?
>
> Since we cannot know how deep we need to fetch, though, I figure that it
> would be indeed better to have a non-shallow clone (and a code comment
> would help clarify this).

If we want do a shallow clone, maybe we can try like this:

1.  Make 'git-po-helper' works with a bare repository, so the initial clone
    can be a shallow bare repository without checkout.
2.  Will get two revisions, base and head revision, when the workflow is
    triggered. (In patch v1, base and head revision are named as
    commit_from and commit_to)
3. Fetch the base revision and head revision using command:
    git fetch origin --depth=100 <base> <head>

We used a fixed depth 100, because we don't know how many commits
these two revisions are diverged.

Will feed the result of "git rev-list <base>..<head>" to
"git-po-helper", if depth 100 is not deep enough, rev-list will fail,
and should try to
run "git rev-list <head> -100".

I think the first version of l10n.yml should use a full clone, and try
to refactor later to use a shallow or partial clone.

> An even better alternative would be a partial clone, of course, but I fear
> that there is no convenient way yet to configure `actions/checkout` to do
> so.

Good idea.

> > +    - uses: actions/setup-go@v2
> > +      with:
> > +        go-version: ">=1.16"
> > +    - name: Install git-po-helper
> > +      run: |
>
> Since this is a one-liner, it would probably make sense to avoid that `|`
> continuation.

Will do.

> > +      run: |
> > +        if test "${{ github.event_name }}" = "pull_request"
> > +        then
> > +          commit_from=${{ github.event.pull_request.base.sha }}
> > +          commit_to=${{ github.event.pull_request.head.sha }}
> > +        else
> > +          commit_from=${{ github.event.before }}
> > +          commit_to=${{ github.event.after }}
> > +          if ! echo $commit_from | grep -q "^00*$"
>
> This would probably read better as:
>
>                 case "$commit_from" in
>                 *[^0]*|'') ;; # not all zeros
>                 *)

It's better than my version "echo .. | grep".

>                         [...]
>                         ;;
>                 esac
>
> But we might not need it anyway. See the comment on the `git fetch`
> command below.
>
> > +          then
> > +            if ! git rev-parse "$commit_from^{commit}"
> > +            then
> > +              git fetch origin $commit_from
>
> As pointed out above, we cannot know how many commits we need to fetch.
> Therefore, I would suggest to simply drop this `git fetch`.

v2 renamed $commit_from to $base, and renamed $commit_to to $head.

For a force push, the base commit is missing from the initial clone,
and the missing revision can be only accessed (fetched) using a full
SHA. For a "pull_request_target" event, the "head" revision is also
missing, because "pull_request_target" only the base commit is
checkout.

> > +            fi
> > +          fi
> > +        fi
> > +        exit_code=0
> > +        git-po-helper check-commits --github-action -- \
> > +          $commit_from..$commit_to >git-po-helper.out 2>&1 ||
>
> What does the `--github-action` option do? Might be good to document this
> here.

The option "--github-action" will enable "--no-gpg" and
"--no-gettext-back-compatible" options. To make the workflow
"l10n.yml" stable, I introduced the "--github-action" option. You can
see I introduced another option "--github-action-event=<push |
pull_request_event>", and the boolean option "--github-action" can be
omitted.

> > +        exit_code=$?
> > +        echo "::set-output name=exit_code::$exit_code"
> > +        has_error_msg=
> > +        has_warning_msg=
> > +        if test $exit_code -ne 0
> > +        then
> > +          has_error_msg=yes
>
> Do we really need this `has_error_msg` variable? It would be much easier
> to test the condition `env.ERROR_MSG != ''` in subsequent steps, and that
> would already do the job, without having to go through output variables.

"env.ERROR_MSG" is an environment variable which contains multiple
lines.  Shell script like `if test ${{ env.ERROR_MSG }} != ""` may
break.

> > +          if test "${{ github.event_name }}" = "pull_request"
> > +          then
> > +            echo "ERROR_MSG<<EOF" >>$GITHUB_ENV
> > +            grep -v -e "^level=warning" -e WARNING git-po-helper.out |
> > +              perl -pe 's/\e\[[0-9;]*m//g' >>$GITHUB_ENV
>
> This is a bit dangerous. First of all, how can you be sure that
> `git-po-helper.out` does not contain lines that consist of the letters
> `EOF` (and would therefore interfere with the here-doc)?

Will replace possible "EOF$" from the output file.

> Second, isn't it conceivable to have an `error:` line which contains the
> characters `WARNING` too? That line would be filtered out...

Will report errors and warnings all together.

> Third, can `git-po-helper` be convinced _not_ to print color codes? The
> output was redirected into a file, after all, and it does not go to a
> terminal...

"git-po-helper" uses the package "logrus" for logging. The default
format of "logrus” to write log to file is for machines, not user
friendly. The only way is provide an additional option "ForceColor" to
it. So I must clear the color code from the output file, if I want to
create a comment for pull request. But the ansi colors are nice to
display in the report of the action workflow.

> > +            echo "EOF" >>$GITHUB_ENV
> > +          fi
> > +        fi
> > +        if grep -q -e "^level=warning" -e WARNING git-po-helper.out
> > +        then
> > +          has_warning_msg=yes
> > +          if test "${{ github.event_name }}" = "pull_request"
> > +          then
> > +            echo "WARNING_MSG<<EOF" >>$GITHUB_ENV
> > +            grep -v -e "^level=error" -e ERROR git-po-helper.out |
> > +              perl -pe 's/\e\[[0-9;]*m//g' >>$GITHUB_ENV
> > +            echo "EOF" >>$GITHUB_ENV
> > +          fi
> > +        fi
> > +        echo "::set-output name=has_error_msg::$has_error_msg"
> > +        echo "::set-output name=has_warning_msg::$has_warning_msg"
> > +    - name: Report errors in comment for pull request
> > +      uses: mshick/add-pr-comment@v1
> > +      if: steps.check-commits.outputs.has_error_msg == 'yes' && github.event_name == 'pull_request'
> > +      continue-on-error: true
> > +      with:
> > +        repo-token: ${{ secrets.GITHUB_TOKEN }}
>
> This requires the `GITHUB_TOKEN` to have write permissions, which I would
> highly recommend not to require.

The action "mshick/add-pr-comment@v1" needs this token. See:
https://github.com/mshick/add-pr-comment .

> Instead, it would probably make more sense to keep the output in the logs
> of the workflow run.

You can see this nice post from ecco:
https://github.community/t/github-actions-are-severely-limited-on-prs/18179

For a successful git-l10n workflow, there are no errors, but may have
warnings for possible typos found in a ".po" file.  There may be lots
of false positives in these warnings.  If I hide these warnings in the
log of a workflow, git-l10n contributors may not notice them. So I
need a way to create comments in pull requests.

See some typical code reviews for git-l10n:

* https://github.com/git-l10n/git-po/pull/541
* https://github.com/git-l10n/git-po/pull/555

> > +        repo-token-user-login: 'github-actions[bot]'
> > +        message: |
> > +          Errors found by git-po-helper in workflow ${{ github.workflow }}:
> > +          ```
> > +          ${{ env.ERROR_MSG }}
> > +          ```
> > +    - name: Report warnings in comment for pull request
> > +      uses: mshick/add-pr-comment@v1
> > +      if: steps.check-commits.outputs.has_warning_msg == 'yes' && github.event_name == 'pull_request'
> > +      continue-on-error: true
> > +      with:
> > +        repo-token: ${{ secrets.GITHUB_TOKEN }}
> > +        repo-token-user-login: 'github-actions[bot]'
> > +        message: |
> > +          Warnings found by git-po-helper in workflow ${{ github.workflow }}:
> > +          ```
> > +          ${{ env.WARNING_MSG }}
> > +          ```
> > +    - name: Report and quit
> > +      run: |
> > +        if test "${{ steps.check-commits.outputs.has_error_msg }}" = "yes"
>
> This would be easier to read and maintain if it was upgraded to an `if:`
> condition:
>
>     - name: Report errors
>       if: step.check-commits.outputs.has_error_msg = "yes"
>       run: |
>         [...]
>
> And then do the same for warnings.

When users check the log of a workflow, they will only expand the
failed step.  So warnings and errors are reported in one step.

> Also, it would be more natural if the `Run git-po-helper` step was allowed
> to fail when `git-po-helper` outputs errors. You would then have to use
> this conditional in the `Report errors` step:
>
>       if: failure() && step.check-commits.outputs.has_error_msg = "yes"
>
> (compare to how Git's `CI/PR` workflow prints errors:
> https://github.com/git/git/blob/v2.33.0/.github/workflows/main.yml#L120).
>
> For the `Report warnings` step, you would however have to use something
> slightly less intuitive:
>
>       if: (success() || failure()) && step.check-commits.outputs.has_warning_msg = "yes"

Will try.

> Finally, I _do_ think that this line would be easier to read like this,
> while basically doing the same thing but with less effort (because the
> outputs would no longer have to be set):
>
>       if: (success() || failure()) && env.WARNING_MSG != ""
>
> Ciao,
> Dscho

Jiang Xin Aug. 24, 2021, 1:36 p.m. UTC | #7

On Tue, Aug 24, 2021 at 5:36 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> >> For a push event, it will scan commits one by one. If a commit does not
> >> look like a l10n commit (no file in "po/" has been changed), it will
> >> immediately fail without checking for further commits. While for a
> >> pull_request event, all new introduced commits will be scanned.
> >>
> >> "git-po-helper" will generate two kinds of suggestions, errors and
> >> warnings. A l10n contributor should try to fix all the errors, and
> >> should pay attention to the warnings. All the errors and warnings will
> >> be reported in the last step of the l10n workflow as two message groups.
> >> For a pull_request event, will create additional comments in pull
> >> request to report the result.
> >
> > It is a good idea to automate this.
> >
> > I am a bit concerned that the `ci-config` approach, even if we use it in
> > the Git project itself, is quite cumbersome to use, though. So I hope that
> > we can find an alternative solution.
> >
> > One such solution could be to make the `git-po-helper` job contingent on
> > part of the repository name. For example:
> >
> >   git-po-helper:
> >     if: endsWith(github.repository, '/git-po')
> >     [...]
> >
> > would skip the job unless the target repository's name is `git-po`.
>
> Nice.
>
> Can this be made into a matter purely local to git-l10n/git-po
> repository and not git/git repository?  I am wondering if we can ee
> if the current repository is git-l10n/git-po or its fork and run it
> only if that is true.

I have read almost all the github documents on github actions, and
tried to find if I can use a local branch, such as "github-action" to
hold local github-actions, but no locky.

That is to say, the workflow file must be introduced to the master
branch of “git-l10n/git-po”. As "git-l10n/git-po" is a fork of
"git/git", the new workflow should be part of "git/git", and provide a
way to disable it by default.

--
Jiang Xin

Junio C Hamano Aug. 24, 2021, 7:04 p.m. UTC | #8

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> Can this be made into a matter purely local to git-l10n/git-po
>> repository and not git/git repository? I am wondering if we can ee if
>> the current repository is git-l10n/git-po or its fork and run it only if
>> that is true.
>
> The biggest problem is that there are forks of `git-l10n/git-po` that
> accept PRs in their own right. That is what I tried to address by using
> just the repository name, without the org/user prefix.

Well, it is why I wondered if we can see "if the repository is
git-l10n/git-po or its fork".  Perhaps I should have written "or
(recursively) its fork"?  I guess the repository name being git-po
may be a usable approximation ;-)

Junio C Hamano Aug. 24, 2021, 7:06 p.m. UTC | #9

Jiang Xin <worldhello.net@gmail.com> writes:

>> > One such solution could be to make the `git-po-helper` job contingent on
>> > part of the repository name. For example:
>> >
>> >   git-po-helper:
>> >     if: endsWith(github.repository, '/git-po')
>> >     [...]
>> >
>> > would skip the job unless the target repository's name is `git-po`.
>>
>> Nice.
>>
>> Can this be made into a matter purely local to git-l10n/git-po
>> repository and not git/git repository?  I am wondering if we can ee
>> if the current repository is git-l10n/git-po or its fork and run it
>> only if that is true.
>
> I have read almost all the github documents on github actions, and
> tried to find if I can use a local branch, such as "github-action" to
> hold local github-actions, but no locky.
>
> That is to say, the workflow file must be introduced to the master
> branch of “git-l10n/git-po”. As "git-l10n/git-po" is a fork of
> "git/git", the new workflow should be part of "git/git", and provide a
> way to disable it by default.

Yeah, that part I am agreeing with you.  I was just wondering if we
can do better than Dscho's heuristics (the repository name ending
with /git-po) to catch git-l10n/git-po or its fork to enable the
workflow in.

Jean-Noël Avila Aug. 24, 2021, 9:34 p.m. UTC | #10

Le mardi 24 août 2021, 11:27:44 CEST Johannes Schindelin a écrit :
> Hi Junio,
> 
> On Mon, 23 Aug 2021, Junio C Hamano wrote:
> 
> > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> >
> > >> For a push event, it will scan commits one by one. If a commit does not
> > >> look like a l10n commit (no file in "po/" has been changed), it will
> > >> immediately fail without checking for further commits. While for a
> > >> pull_request event, all new introduced commits will be scanned.
> > >>
> > >> "git-po-helper" will generate two kinds of suggestions, errors and
> > >> warnings. A l10n contributor should try to fix all the errors, and
> > >> should pay attention to the warnings. All the errors and warnings will
> > >> be reported in the last step of the l10n workflow as two message groups.
> > >> For a pull_request event, will create additional comments in pull
> > >> request to report the result.
> > >
> > > It is a good idea to automate this.
> > >
> > > I am a bit concerned that the `ci-config` approach, even if we use it in
> > > the Git project itself, is quite cumbersome to use, though. So I hope that
> > > we can find an alternative solution.
> > >
> > > One such solution could be to make the `git-po-helper` job contingent on
> > > part of the repository name. For example:
> > >
> > >   git-po-helper:
> > >     if: endsWith(github.repository, '/git-po')
> > >     [...]
> > >
> > > would skip the job unless the target repository's name is `git-po`.
> >
> > Nice.
> >
> > Can this be made into a matter purely local to git-l10n/git-po
> > repository and not git/git repository? I am wondering if we can ee if
> > the current repository is git-l10n/git-po or its fork and run it only if
> > that is true.
> 
> The biggest problem is that there are forks of `git-l10n/git-po` that
> accept PRs in their own right. That is what I tried to address by using
> just the repository name, without the org/user prefix.
> 
> Ciao,
> Dscho
> 

Well, I'm in this case, plus my repo is a fork of git/git. 

Would it not possible to formally require the presence of a "dumb" secret to run this rule? 

JN

Johannes Schindelin Aug. 25, 2021, 12:14 p.m. UTC | #11

Hi Xin,

On Tue, 24 Aug 2021, Jiang Xin wrote:

> On Tue, Aug 24, 2021 at 5:03 AM Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> > It is a good idea to automate this.
> >
> > I am a bit concerned that the `ci-config` approach, even if we use it in
> > the Git project itself, is quite cumbersome to use, though. So I hope that
> > we can find an alternative solution.
> >
> > One such solution could be to make the `git-po-helper` job contingent on
> > part of the repository name. For example:
> >
> >   git-po-helper:
> >     if: endsWith(github.repository, '/git-po')
> >     [...]
> >
> > would skip the job unless the target repository's name is `git-po`.
>
> Yes, this is a solution I also want to try at the very beginning. But
> some l10n team leaders may fork their repositories directly from
> git/git, and name their repo as "{OWNER}/git".  I want to try another
> solution: check existence of file "ci/config/allow-l10n" in branch
> "ci-config" using a GitHub API, instead of cloning the ci-config
> branch and execute the shell script "ci/config/allow-l10n".

I understood that you were trying to imitate what git/git does.

The problem, in git/git as well as in your patch, is that this is highly
cumbersome to use. Yes, it would be better if there was an easier UI to do
what you want to do, but the current reality is: there isn't.

The main reason why it is so cumbersome to use is that your chosen
strategy scatters the CI configuration so much that it puts a mental
burden on the users. I, for one, have no idea what is currently in my
`ci-config` branch. And new users will be forced to struggle to set up
their fork in such a way that the configuration does what they want it to
do.

And in this instance, there is not even any good reason to make it hard on
the users because most will likely not need that new workflow at all. That
workflow is primarily interesting for the l10n maintainers.

To make it easier on the vast majority of contributors, I therefore
suggested to introduce an easy-to-parse condition that lives not only in
the same branch but in the very same file as the workflow _and_ it does
what most users need it to do: cognitive burden penalty averted!

Even better: the condition I suggested can be _easily_ extended by the few
l10n maintainers to include their particular for as target repository.

> > A couple more questions/suggestions are below:
> >
> > > +  git-po-helper:
> > > +    needs: ci-config
> > > +    if: needs.ci-config.outputs.enabled == 'yes'
> > > +    runs-on: ubuntu-latest
> > > +    steps:
> > > +    - uses: actions/checkout@v2
> > > +      with:
> > > +        fetch-depth: '0'
> >
> > There is code below to specifically fetch the base commit, but setting the
> > `fetch-depth` to zero means to do a full clone anyway.
> >
> > So maybe the `git fetch` code below should be dropped, or the
> > `fetch-depth` override?
> >
> > Since we cannot know how deep we need to fetch, though, I figure that it
> > would be indeed better to have a non-shallow clone (and a code comment
> > would help clarify this).
>
> If we want do a shallow clone, maybe we can try like this:
>
> 1.  Make 'git-po-helper' works with a bare repository, so the initial clone
>     can be a shallow bare repository without checkout.
> 2.  Will get two revisions, base and head revision, when the workflow is
>     triggered. (In patch v1, base and head revision are named as
>     commit_from and commit_to)
> 3. Fetch the base revision and head revision using command:
>     git fetch origin --depth=100 <base> <head>
>
> We used a fixed depth 100, because we don't know how many commits
> these two revisions are diverged.

Okay, but you did not use a fixed depth 100: you used depth 0, which turns
off shallow clones in actions/checkout.

> Will feed the result of "git rev-list <base>..<head>" to
> "git-po-helper", if depth 100 is not deep enough, rev-list will fail,
> and should try to
> run "git rev-list <head> -100".
>
> I think the first version of l10n.yml should use a full clone, and try
> to refactor later to use a shallow or partial clone.

Given how often the workflow will run, I am not sure that it is worth the
effort to micro-optimize this. Just use a full clone and you're done.

> >                         [...]
> >                         ;;
> >                 esac
> >
> > But we might not need it anyway. See the comment on the `git fetch`
> > command below.
> >
> > > +          then
> > > +            if ! git rev-parse "$commit_from^{commit}"
> > > +            then
> > > +              git fetch origin $commit_from
> >
> > As pointed out above, we cannot know how many commits we need to fetch.
> > Therefore, I would suggest to simply drop this `git fetch`.
>
> v2 renamed $commit_from to $base, and renamed $commit_to to $head.
>
> For a force push, the base commit is missing from the initial clone,

Ah, that's the point I was missing.

> and the missing revision can be only accessed (fetched) using a full
> SHA. For a "pull_request_target" event, the "head" revision is also
> missing, because "pull_request_target" only the base commit is
> checkout.

If you truly want to optimize the fetch (which I don't think is worth the
effort, as I mentioned above), you could also start with a shallow clone,
then call

	git config remote.origin.promisor true
	git config remote.origin.partialCloneFilter blob:none
	rm .git/shallow

Subsequent Git operations should transparently fetch whatever is missing.

But again, this strikes me as seriously premature optimization.

> > > +            fi
> > > +          fi
> > > +        fi
> > > +        exit_code=0
> > > +        git-po-helper check-commits --github-action -- \
> > > +          $commit_from..$commit_to >git-po-helper.out 2>&1 ||
> >
> > What does the `--github-action` option do? Might be good to document this
> > here.
>
> The option "--github-action" will enable "--no-gpg" and
> "--no-gettext-back-compatible" options. To make the workflow
> "l10n.yml" stable, I introduced the "--github-action" option. You can
> see I introduced another option "--github-action-event=<push |
> pull_request_event>", and the boolean option "--github-action" can be
> omitted.

Okay, I take your word for it.

> > > +        exit_code=$?
> > > +        echo "::set-output name=exit_code::$exit_code"
> > > +        has_error_msg=
> > > +        has_warning_msg=
> > > +        if test $exit_code -ne 0
> > > +        then
> > > +          has_error_msg=yes
> >
> > Do we really need this `has_error_msg` variable? It would be much easier
> > to test the condition `env.ERROR_MSG != ''` in subsequent steps, and that
> > would already do the job, without having to go through output variables.
>
> "env.ERROR_MSG" is an environment variable which contains multiple
> lines.  Shell script like `if test ${{ env.ERROR_MSG }} != ""` may
> break.

They won't, if you use `if test "$ERROR_MSG" != ""`, which is more
canonical anyway.

> > Third, can `git-po-helper` be convinced _not_ to print color codes? The
> > output was redirected into a file, after all, and it does not go to a
> > terminal...
>
> "git-po-helper" uses the package "logrus" for logging. The default
> format of "logrus” to write log to file is for machines, not user
> friendly. The only way is provide an additional option "ForceColor" to
> it. So I must clear the color code from the output file, if I want to
> create a comment for pull request. But the ansi colors are nice to
> display in the report of the action workflow.

Ah. That makes sense. Maybe it would be useful information for the commit
message?

> > > +            echo "EOF" >>$GITHUB_ENV
> > > +          fi
> > > +        fi
> > > +        if grep -q -e "^level=warning" -e WARNING git-po-helper.out
> > > +        then
> > > +          has_warning_msg=yes
> > > +          if test "${{ github.event_name }}" = "pull_request"
> > > +          then
> > > +            echo "WARNING_MSG<<EOF" >>$GITHUB_ENV
> > > +            grep -v -e "^level=error" -e ERROR git-po-helper.out |
> > > +              perl -pe 's/\e\[[0-9;]*m//g' >>$GITHUB_ENV
> > > +            echo "EOF" >>$GITHUB_ENV
> > > +          fi
> > > +        fi
> > > +        echo "::set-output name=has_error_msg::$has_error_msg"
> > > +        echo "::set-output name=has_warning_msg::$has_warning_msg"
> > > +    - name: Report errors in comment for pull request
> > > +      uses: mshick/add-pr-comment@v1
> > > +      if: steps.check-commits.outputs.has_error_msg == 'yes' && github.event_name == 'pull_request'
> > > +      continue-on-error: true
> > > +      with:
> > > +        repo-token: ${{ secrets.GITHUB_TOKEN }}
> >
> > This requires the `GITHUB_TOKEN` to have write permissions, which I would
> > highly recommend not to require.
>
> The action "mshick/add-pr-comment@v1" needs this token. See:
> https://github.com/mshick/add-pr-comment .

Yes, I am fully aware.

What I tried to point out is that the `GITHUB_TOKEN` you receive _may_
have write permissions (it used to be the default), but these days you can
configure it to be read-only, as a repository admin. For details, see
https://github.blog/changelog/2021-04-20-github-actions-control-permissions-for-github_token/#setting-the-default-permissions-for-the-organization-or-repository

If GITHUB_TOKEN is configured to be read-only (which I, for one, do with
all repositories I can, for security reasons), then `add-pr-comment` might
fail.

This was the case for the `check-whitespace` workflow on
git-for-windows/git, which is why I fixed that workflow in cc00362125c
(ci(check-whitespace): stop requiring a read/write token, 2021-07-14).

It would not make sense to re-introduce the same issue in a new workflow.

> > Instead, it would probably make more sense to keep the output in the logs
> > of the workflow run.
>
> You can see this nice post from ecco:
> https://github.community/t/github-actions-are-severely-limited-on-prs/18179

Oh, I am aware of the problem. Probably a lot more than you think I am.
After all, I introduced the Azure Pipeline definition which offered not
only a very convenient way to get to the logs of failed test cases, but
also had statistics on flaky tests, and allowed monitoring all kinds of
insights.

And GitHub workflows have none of that.

At least at the moment. If you want to investigate a test failure, you
have to open the failed run, but that won't get you to the right spot yet.
Instead, it opens the log of the `prove` run, which only tells you which
test script(s) failed.

What you _then_ have to do is to expand the `ci/print-test-failures.sh`
step (which _succeeded_ and hence it is not expanded by default) and then
you have to click on the magnifying glass symbol (i.e. _not_ use your
browser's "Find" functionality, that won't work) and search for "not ok"
and then skim over all `# known breakage` entries.

So yes, I do know about the (current) limitations of the GitHub workflows
UI ;-)

> For a successful git-l10n workflow, there are no errors, but may have
> warnings for possible typos found in a ".po" file.  There may be lots
> of false positives in these warnings.  If I hide these warnings in the
> log of a workflow, git-l10n contributors may not notice them. So I
> need a way to create comments in pull requests.

Or the workflow runs need to fail, and PRs need to require those runs to
pass.

> See some typical code reviews for git-l10n:
>
> * https://github.com/git-l10n/git-po/pull/541
> * https://github.com/git-l10n/git-po/pull/555

Thank you for linking those! Now that you said it, I thought about a
different strategy we're using in the Git for Windows project (where we
have a scheduled workflow that monitors a few of the 150 components
bundled in Git for Windows' installer, to get notified if there are new
versions, and the workflow needs write permission in order to open new
tickets). We use an environment secret (and environments can be limited
appropriately, e.g. by requiring approvals from a specific team). For
details, see
https://github.com/git-for-windows/git/commit/1abb4477f462c3d155ec68d40c961a5e0604ddf8

I could imagine that you could make your workflow contingent on the
presence of, say, `GIT_PO_HELPER_GITHUB_TOKEN`. That would not only solve
the "read-only token" problem but also the "don't run everywhere, please!"
problem. You (and other l10n maintainers) only have to create a Personal
Access Token with `repo` permission and add it as a secret.

But ideally, you would test whether an environment of a given name exists,
and I am not aware if such a thing is possible yet with GitHub workflows.

Food for thought?

> > > +        repo-token-user-login: 'github-actions[bot]'
> > > +        message: |
> > > +          Errors found by git-po-helper in workflow ${{ github.workflow }}:
> > > +          ```
> > > +          ${{ env.ERROR_MSG }}
> > > +          ```
> > > +    - name: Report warnings in comment for pull request
> > > +      uses: mshick/add-pr-comment@v1
> > > +      if: steps.check-commits.outputs.has_warning_msg == 'yes' && github.event_name == 'pull_request'
> > > +      continue-on-error: true
> > > +      with:
> > > +        repo-token: ${{ secrets.GITHUB_TOKEN }}
> > > +        repo-token-user-login: 'github-actions[bot]'
> > > +        message: |
> > > +          Warnings found by git-po-helper in workflow ${{ github.workflow }}:
> > > +          ```
> > > +          ${{ env.WARNING_MSG }}
> > > +          ```
> > > +    - name: Report and quit
> > > +      run: |
> > > +        if test "${{ steps.check-commits.outputs.has_error_msg }}" = "yes"
> >
> > This would be easier to read and maintain if it was upgraded to an `if:`
> > condition:
> >
> >     - name: Report errors
> >       if: step.check-commits.outputs.has_error_msg = "yes"
> >       run: |
> >         [...]
> >
> > And then do the same for warnings.
>
> When users check the log of a workflow, they will only expand the
> failed step.  So warnings and errors are reported in one step.

Right. You need to make the step fail that shows the errors/warnings. (In
Git's `main.yml`, we don't...)

Ciao,
Dscho

Jiang Xin Aug. 26, 2021, 2:25 a.m. UTC | #12

Hi Dscho,

On Wed, Aug 25, 2021 at 8:14 PM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> > Yes, this is a solution I also want to try at the very beginning. But
> > some l10n team leaders may fork their repositories directly from
> > git/git, and name their repo as "{OWNER}/git".  I want to try another
> > solution: check existence of file "ci/config/allow-l10n" in branch
> > "ci-config" using a GitHub API, instead of cloning the ci-config
> > branch and execute the shell script "ci/config/allow-l10n".
>
> I understood that you were trying to imitate what git/git does.
>
> The problem, in git/git as well as in your patch, is that this is highly
> cumbersome to use. Yes, it would be better if there was an easier UI to do
> what you want to do, but the current reality is: there isn't.
>
> The main reason why it is so cumbersome to use is that your chosen
> strategy scatters the CI configuration so much that it puts a mental
> burden on the users. I, for one, have no idea what is currently in my
> `ci-config` branch. And new users will be forced to struggle to set up
> their fork in such a way that the configuration does what they want it to
> do.
>
> And in this instance, there is not even any good reason to make it hard on
> the users because most will likely not need that new workflow at all. That
> workflow is primarily interesting for the l10n maintainers.
>
> To make it easier on the vast majority of contributors, I therefore
> suggested to introduce an easy-to-parse condition that lives not only in
> the same branch but in the very same file as the workflow _and_ it does
> what most users need it to do: cognitive burden penalty averted!
>
> Even better: the condition I suggested can be _easily_ extended by the few
> l10n maintainers to include their particular for as target repository.
>

I agree with you that the "ci-config" job is cumbersome to use, and I
will follow your suggestions. I may also want to try to implement a
github action later to check the existence of a file in a branch using
GitHub API as an supplementary way.

> > > A couple more questions/suggestions are below:
> > >
> > > > +  git-po-helper:
> > > > +    needs: ci-config
> > > > +    if: needs.ci-config.outputs.enabled == 'yes'
> > > > +    runs-on: ubuntu-latest
> > > > +    steps:
> > > > +    - uses: actions/checkout@v2
> > > > +      with:
> > > > +        fetch-depth: '0'
> > >
> > > There is code below to specifically fetch the base commit, but setting the
> > > `fetch-depth` to zero means to do a full clone anyway.
> > >
> > > So maybe the `git fetch` code below should be dropped, or the
> > > `fetch-depth` override?
> > >
> > > Since we cannot know how deep we need to fetch, though, I figure that it
> > > would be indeed better to have a non-shallow clone (and a code comment
> > > would help clarify this).
> >
> > If we want do a shallow clone, maybe we can try like this:
> >
> > 1.  Make 'git-po-helper' works with a bare repository, so the initial clone
> >     can be a shallow bare repository without checkout.
> > 2.  Will get two revisions, base and head revision, when the workflow is
> >     triggered. (In patch v1, base and head revision are named as
> >     commit_from and commit_to)
> > 3. Fetch the base revision and head revision using command:
> >     git fetch origin --depth=100 <base> <head>
> >
> > We used a fixed depth 100, because we don't know how many commits
> > these two revisions are diverged.

Wrong tense.  My English! :sweat:
s/We used a fixed depth 100/I plan to use a fixed depth of 100/

>
> Okay, but you did not use a fixed depth 100: you used depth 0, which turns
> off shallow clones in actions/checkout.
>

>
> If you truly want to optimize the fetch (which I don't think is worth the
> effort, as I mentioned above), you could also start with a shallow clone,
> then call
>
>         git config remote.origin.promisor true
>         git config remote.origin.partialCloneFilter blob:none
>         rm .git/shallow
>
> Subsequent Git operations should transparently fetch whatever is missing.
>
> But again, this strikes me as seriously premature optimization.

I like this. Will try.

And I wonder it would be better if the action "actions/checkout" can
support "fetch-depth: -1". Which means it does not do fetch at all,
but only sets the correct ssh_key and auth token.

> > The action "mshick/add-pr-comment@v1" needs this token. See:
> > https://github.com/mshick/add-pr-comment .
>
> Yes, I am fully aware.
>
> What I tried to point out is that the `GITHUB_TOKEN` you receive _may_
> have write permissions (it used to be the default), but these days you can
> configure it to be read-only, as a repository admin. For details, see
> https://github.blog/changelog/2021-04-20-github-actions-control-permissions-for-github_token/#setting-the-default-permissions-for-the-organization-or-repository
>
> If GITHUB_TOKEN is configured to be read-only (which I, for one, do with
> all repositories I can, for security reasons), then `add-pr-comment` might
> fail.
>
> This was the case for the `check-whitespace` workflow on
> git-for-windows/git, which is why I fixed that workflow in cc00362125c
> (ci(check-whitespace): stop requiring a read/write token, 2021-07-14).
>
> It would not make sense to re-introduce the same issue in a new workflow.

I understand your concern.  I will setup read only permission for
github action for the repo.

> > > Instead, it would probably make more sense to keep the output in the logs
> > > of the workflow run.
> >
> > You can see this nice post from ecco:
> > https://github.community/t/github-actions-are-severely-limited-on-prs/18179
>
> Oh, I am aware of the problem. Probably a lot more than you think I am.
> After all, I introduced the Azure Pipeline definition which offered not
> only a very convenient way to get to the logs of failed test cases, but
> also had statistics on flaky tests, and allowed monitoring all kinds of
> insights.
>
> And GitHub workflows have none of that.
>
> At least at the moment. If you want to investigate a test failure, you
> have to open the failed run, but that won't get you to the right spot yet.
> Instead, it opens the log of the `prove` run, which only tells you which
> test script(s) failed.
>
> What you _then_ have to do is to expand the `ci/print-test-failures.sh`
> step (which _succeeded_ and hence it is not expanded by default) and then
> you have to click on the magnifying glass symbol (i.e. _not_ use your
> browser's "Find" functionality, that won't work) and search for "not ok"
> and then skim over all `# known breakage` entries.
>
> So yes, I do know about the (current) limitations of the GitHub workflows
> UI ;-)
>
> > For a successful git-l10n workflow, there are no errors, but may have
> > warnings for possible typos found in a ".po" file.  There may be lots
> > of false positives in these warnings.  If I hide these warnings in the
> > log of a workflow, git-l10n contributors may not notice them. So I
> > need a way to create comments in pull requests.
>
> Or the workflow runs need to fail, and PRs need to require those runs to
> pass.
>
> > See some typical code reviews for git-l10n:
> >
> > * https://github.com/git-l10n/git-po/pull/541
> > * https://github.com/git-l10n/git-po/pull/555
>
> Thank you for linking those! Now that you said it, I thought about a
> different strategy we're using in the Git for Windows project (where we
> have a scheduled workflow that monitors a few of the 150 components
> bundled in Git for Windows' installer, to get notified if there are new
> versions, and the workflow needs write permission in order to open new
> tickets). We use an environment secret (and environments can be limited
> appropriately, e.g. by requiring approvals from a specific team). For
> details, see
> https://github.com/git-for-windows/git/commit/1abb4477f462c3d155ec68d40c961a5e0604ddf8
>
> I could imagine that you could make your workflow contingent on the
> presence of, say, `GIT_PO_HELPER_GITHUB_TOKEN`. That would not only solve
> the "read-only token" problem but also the "don't run everywhere, please!"
> problem. You (and other l10n maintainers) only have to create a Personal
> Access Token with `repo` permission and add it as a secret.
>
> But ideally, you would test whether an environment of a given name exists,
> and I am not aware if such a thing is possible yet with GitHub workflows.
>
> Food for thought?

To make it simple, I want to limit the permissions of the token in
"l10n.yml' and use the default `GITHUB_TOKEN`. like this:

          git-po-helper:
            needs: l10n-config
            if: needs.l10n-config.outputs.enabled == 'yes'
            runs-on: ubuntu-latest
    +       permissions:
    +          pull-requests: write
            steps:

It can work even if the administrator turns off
'read-write-permission' for action of the repository.

--
Regards,
Jiang Xin

Jeff King Aug. 27, 2021, 2:10 a.m. UTC | #13

On Wed, Aug 25, 2021 at 02:14:43PM +0200, Johannes Schindelin wrote:

> > Yes, this is a solution I also want to try at the very beginning. But
> > some l10n team leaders may fork their repositories directly from
> > git/git, and name their repo as "{OWNER}/git".  I want to try another
> > solution: check existence of file "ci/config/allow-l10n" in branch
> > "ci-config" using a GitHub API, instead of cloning the ci-config
> > branch and execute the shell script "ci/config/allow-l10n".
> 
> I understood that you were trying to imitate what git/git does.
> 
> The problem, in git/git as well as in your patch, is that this is highly
> cumbersome to use. Yes, it would be better if there was an easier UI to do
> what you want to do, but the current reality is: there isn't.
> 
> The main reason why it is so cumbersome to use is that your chosen
> strategy scatters the CI configuration so much that it puts a mental
> burden on the users. I, for one, have no idea what is currently in my
> `ci-config` branch. And new users will be forced to struggle to set up
> their fork in such a way that the configuration does what they want it to
> do.
> 
> And in this instance, there is not even any good reason to make it hard on
> the users because most will likely not need that new workflow at all. That
> workflow is primarily interesting for the l10n maintainers.

Just adding my two cents as the person who created the "ci-config"
mechanism: yes, it's absolutely horrible, and should be avoided if at
all possible. :)

Your repo-name solution seems like a quite reasonable alternative.

(I'd still love for there to be a way to selectively disable CI on
certain branches, but I didn't find another one, short of enforcing
branch-naming conventions that the whole project must agree on).

-Peff

Jiang Xin Aug. 27, 2021, 2:49 a.m. UTC | #14

On Fri, Aug 27, 2021 at 10:10 AM Jeff King <peff@peff.net> wrote:
>
> On Wed, Aug 25, 2021 at 02:14:43PM +0200, Johannes Schindelin wrote:
>
> > > Yes, this is a solution I also want to try at the very beginning. But
> > > some l10n team leaders may fork their repositories directly from
> > > git/git, and name their repo as "{OWNER}/git".  I want to try another
> > > solution: check existence of file "ci/config/allow-l10n" in branch
> > > "ci-config" using a GitHub API, instead of cloning the ci-config
> > > branch and execute the shell script "ci/config/allow-l10n".
> >
> > I understood that you were trying to imitate what git/git does.
> >
> > The problem, in git/git as well as in your patch, is that this is highly
> > cumbersome to use. Yes, it would be better if there was an easier UI to do
> > what you want to do, but the current reality is: there isn't.
> >
> > The main reason why it is so cumbersome to use is that your chosen
> > strategy scatters the CI configuration so much that it puts a mental
> > burden on the users. I, for one, have no idea what is currently in my
> > `ci-config` branch. And new users will be forced to struggle to set up
> > their fork in such a way that the configuration does what they want it to
> > do.
> >
> > And in this instance, there is not even any good reason to make it hard on
> > the users because most will likely not need that new workflow at all. That
> > workflow is primarily interesting for the l10n maintainers.
>
> Just adding my two cents as the person who created the "ci-config"
> mechanism: yes, it's absolutely horrible, and should be avoided if at
> all possible. :)
>
> Your repo-name solution seems like a quite reasonable alternative.
>
> (I'd still love for there to be a way to selectively disable CI on
> certain branches, but I didn't find another one, short of enforcing
> branch-naming conventions that the whole project must agree on).

Yesterday, I created a new github-action:

    https://github.com/marketplace/actions/file-exists-action

And want to use this action in the "ci-config (l10n-config)" job like
this, but it appears slower than before:

-- snip begins --
jobs:
  l10n-config:
    runs-on: ubuntu-latest
    outputs:
      enabled: ${{ steps.check-repo-name.outputs.matched == 'yes' ||
steps.check-file-exists.outputs.exists }}
    steps:
      - id: check-repo-name
        ... ...
      - id: check-file-exists
        name: Check file in branch
        if: steps.check-repo-name.outputs.matched != 'yes'
        uses: jiangxin/file-exists-action@v1
        with:
          ref: ci-config
          path: ci/config/allow-l10n
-- snip ends --

The execution of the "l10n-config" job will take 4 seconds vs 2
seconds before. This may because the new action
"jiangxin/file-exists-action" loads another VM to execute.

So in patch v3, I will remove the entirely "ci-config (l10n-config)"
job, and use an if condition like this:

-- snip begins --
name: git-l10n
on: [push, pull_request_target]
jobs:
  git-po-helper:
    if: endsWith(github.repository, '/git-po') ||
contains(github.head_ref, 'l10n') || contains(github.ref, 'l10n')
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
    steps:
      ... ...
--snip ends --

Will check the target repository name first.  If l10n leader has a
fork repository with different repo name, push to a special branch
name, such as "l10n/git-2.34", will also trigger the git-l10n
workflow.

--
Jiang Xin

Jiang Xin Aug. 31, 2021, 1:03 a.m. UTC | #15

On Tue, Aug 31, 2021 at 4:49 AM Jean-Noël AVILA <jn.avila@free.fr> wrote:
>
> Le mardi 24 août 2021, 11:27:44 CEST Johannes Schindelin a écrit :
> >
> > The biggest problem is that there are forks of `git-l10n/git-po` that
> > accept PRs in their own right. That is what I tried to address by using
> > just the repository name, without the org/user prefix.
> >
> > Ciao,
> > Dscho
> >
>
> Well, I'm in this case, plus my repo is a fork of git/git.
>
> Would it not possible to formally require the presence of a "dumb" secret to run this rule?
>

It is not supported to use secrets/env in if condition. See this post:

    https://github.community/t/jobs-job-id-if-does-not-work-with-env-secrets/16928

In patch v3, will check both repo name and branch names. See:

    https://lore.kernel.org/git/20210827071320.14183-1-worldhello.net@gmail.com/

--
Jiang Xin

[1/1] ci: new github-action for git-l10n code review

Commit Message

Comments

Patch