[v5,1/4] Jobs based on custom runners: documentation and configuration placeholder

Message ID 20210219215838.752547-2-crosa@redhat.com (mailing list archive)
State New, archived
Series GitLab Custom Runners and Jobs (was: QEMU Gating CI)

Commit Message

Cleber Rosa Feb. 19, 2021, 9:58 p.m. UTC
As described in the included documentation, the "custom runner" jobs
extend the GitLab CI jobs already in place.  One of their primary
goals is catching and preventing regressions on a wider range of host
systems than the ones provided by GitLab's shared runners.

This sets the stage for other community members to add their own
machine configuration documentation/scripts, and accompanying job
definitions.  As a general rule, those newly added contributed jobs
should run as "non-gating" (AKA "allow_failure: true") until their
reliability is verified.

Signed-off-by: Cleber Rosa <crosa@redhat.com>
---
 .gitlab-ci.d/custom-runners.yml | 14 ++++++++++++++
 .gitlab-ci.yml                  |  1 +
 docs/devel/ci.rst               | 28 ++++++++++++++++++++++++++++
 docs/devel/index.rst            |  1 +
 4 files changed, 44 insertions(+)
 create mode 100644 .gitlab-ci.d/custom-runners.yml
 create mode 100644 docs/devel/ci.rst

Comments

Alex Bennée Feb. 23, 2021, 11:18 a.m. UTC | #1
Cleber Rosa <crosa@redhat.com> writes:

> As described in the included documentation, the "custom runner" jobs
> extend the GitLab CI jobs already in place.  One of their primary
> goals of catching and preventing regressions on a wider number of host
> systems than the ones provided by GitLab's shared runners.
>
> This sets the stage in which other community members can add their own
> machine configuration documentation/scripts, and accompanying job
> definitions.  As a general rule, those newly added contributed jobs
> should run as "non-gating", until their reliability is verified (AKA
> "allow_failure: true").
>
> Signed-off-by: Cleber Rosa <crosa@redhat.com>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Thomas Huth Feb. 23, 2021, 11:25 a.m. UTC | #2
On 19/02/2021 22.58, Cleber Rosa wrote:
> As described in the included documentation, the "custom runner" jobs
> extend the GitLab CI jobs already in place.  One of their primary
> goals of catching and preventing regressions on a wider number of host
> systems than the ones provided by GitLab's shared runners.
> 
> This sets the stage in which other community members can add their own
> machine configuration documentation/scripts, and accompanying job
> definitions.  As a general rule, those newly added contributed jobs
> should run as "non-gating", until their reliability is verified (AKA
> "allow_failure: true").
> 
> Signed-off-by: Cleber Rosa <crosa@redhat.com>
> ---
>   .gitlab-ci.d/custom-runners.yml | 14 ++++++++++++++
>   .gitlab-ci.yml                  |  1 +
>   docs/devel/ci.rst               | 28 ++++++++++++++++++++++++++++
>   docs/devel/index.rst            |  1 +
>   4 files changed, 44 insertions(+)
>   create mode 100644 .gitlab-ci.d/custom-runners.yml
>   create mode 100644 docs/devel/ci.rst
> 
> diff --git a/.gitlab-ci.d/custom-runners.yml b/.gitlab-ci.d/custom-runners.yml
> new file mode 100644
> index 0000000000..3004da2bda
> --- /dev/null
> +++ b/.gitlab-ci.d/custom-runners.yml
> @@ -0,0 +1,14 @@
> +# The CI jobs defined here require GitLab runners installed and
> +# registered on machines that match their operating system names,
> +# versions and architectures.  This is in contrast to the other CI
> +# jobs that are intended to run on GitLab's "shared" runners.
> +
> +# Different than the default approach on "shared" runners, based on
> +# containers, the custom runners have no such *requirement*, as those
> +# jobs should be capable of running on operating systems with no
> +# compatible container implementation, or no support from
> +# gitlab-runner.  To avoid problems that gitlab-runner can cause while
> +# reusing the GIT repository, let's enable the recursive submodule
> +# strategy.
> +variables:
> +  GIT_SUBMODULE_STRATEGY: recursive

Is it really necessary? I thought our configure script would take care of 
the submodules?

Apart from that:
Acked-by: Thomas Huth <thuth@redhat.com>
Philippe Mathieu-Daudé Feb. 23, 2021, 4:37 p.m. UTC | #3
On 2/23/21 12:25 PM, Thomas Huth wrote:
> On 19/02/2021 22.58, Cleber Rosa wrote:
>> As described in the included documentation, the "custom runner" jobs
>> extend the GitLab CI jobs already in place.  One of their primary
>> goals of catching and preventing regressions on a wider number of host
>> systems than the ones provided by GitLab's shared runners.
>>
>> This sets the stage in which other community members can add their own
>> machine configuration documentation/scripts, and accompanying job
>> definitions.  As a general rule, those newly added contributed jobs
>> should run as "non-gating", until their reliability is verified (AKA
>> "allow_failure: true").
>>
>> Signed-off-by: Cleber Rosa <crosa@redhat.com>
>> ---
>>   .gitlab-ci.d/custom-runners.yml | 14 ++++++++++++++
>>   .gitlab-ci.yml                  |  1 +
>>   docs/devel/ci.rst               | 28 ++++++++++++++++++++++++++++
>>   docs/devel/index.rst            |  1 +
>>   4 files changed, 44 insertions(+)
>>   create mode 100644 .gitlab-ci.d/custom-runners.yml
>>   create mode 100644 docs/devel/ci.rst
>>
>> diff --git a/.gitlab-ci.d/custom-runners.yml
>> b/.gitlab-ci.d/custom-runners.yml
>> new file mode 100644
>> index 0000000000..3004da2bda
>> --- /dev/null
>> +++ b/.gitlab-ci.d/custom-runners.yml
>> @@ -0,0 +1,14 @@
>> +# The CI jobs defined here require GitLab runners installed and
>> +# registered on machines that match their operating system names,
>> +# versions and architectures.  This is in contrast to the other CI
>> +# jobs that are intended to run on GitLab's "shared" runners.
>> +
>> +# Different than the default approach on "shared" runners, based on
>> +# containers, the custom runners have no such *requirement*, as those
>> +# jobs should be capable of running on operating systems with no
>> +# compatible container implementation, or no support from
>> +# gitlab-runner.  To avoid problems that gitlab-runner can cause while
>> +# reusing the GIT repository, let's enable the recursive submodule
>> +# strategy.
>> +variables:
>> +  GIT_SUBMODULE_STRATEGY: recursive
> 
> Is it really necessary? I thought our configure script would take care
> of the submodules?

Well, if there is a failure during the first clone (I got one network
timeout in the middle) then next time it doesn't work:

Updating/initializing submodules recursively...
Synchronizing submodule url for 'capstone'
Synchronizing submodule url for 'dtc'
Synchronizing submodule url for 'meson'
Synchronizing submodule url for 'roms/QemuMacDrivers'
Synchronizing submodule url for 'roms/SLOF'
Synchronizing submodule url for 'roms/edk2'
Synchronizing submodule url for
'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
Synchronizing submodule url for
'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli'
Synchronizing submodule url for
'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/esaxx'
Synchronizing submodule url for
'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/libdivsufsort'
Synchronizing submodule url for
'roms/edk2/CryptoPkg/Library/OpensslLib/openssl'
Synchronizing submodule url for
'roms/edk2/MdeModulePkg/Library/BrotliCustomDecompressLib/brotli'
Synchronizing submodule url for
'roms/edk2/MdeModulePkg/Universal/RegularExpressionDxe/oniguruma'
Synchronizing submodule url for
'roms/edk2/UnitTestFrameworkPkg/Library/CmockaLib/cmocka'
Synchronizing submodule url for 'roms/ipxe'
Synchronizing submodule url for 'roms/openbios'
Synchronizing submodule url for 'roms/opensbi'
Synchronizing submodule url for 'roms/qboot'
Synchronizing submodule url for 'roms/qemu-palcode'
Synchronizing submodule url for 'roms/seabios'
Synchronizing submodule url for 'roms/seabios-hppa'
Synchronizing submodule url for 'roms/sgabios'
Synchronizing submodule url for 'roms/skiboot'
Synchronizing submodule url for 'roms/u-boot'
Synchronizing submodule url for 'roms/u-boot-sam460ex'
Synchronizing submodule url for 'roms/vbootrom'
Synchronizing submodule url for 'slirp'
Synchronizing submodule url for 'tests/fp/berkeley-softfloat-3'
Synchronizing submodule url for 'tests/fp/berkeley-testfloat-3'
Synchronizing submodule url for 'ui/keycodemapdb'
Entering 'capstone'
Entering 'dtc'
Entering 'meson'
Entering 'roms/QemuMacDrivers'
Entering 'roms/SLOF'
Entering 'roms/edk2'
Entering 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
Entering 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli'
Entering 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/esaxx'
Entering
'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/libdivsufsort'
Entering 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl'
Entering 'roms/edk2/MdeModulePkg/Library/BrotliCustomDecompressLib/brotli'
Entering 'roms/edk2/MdeModulePkg/Universal/RegularExpressionDxe/oniguruma'
Entering 'roms/edk2/UnitTestFrameworkPkg/Library/CmockaLib/cmocka'
Entering 'roms/ipxe'
Entering 'roms/openbios'
Entering 'roms/opensbi'
Entering 'roms/qboot'
Entering 'roms/qemu-palcode'
Entering 'roms/seabios'
Entering 'roms/seabios-hppa'
Entering 'roms/sgabios'
Entering 'roms/skiboot'
Entering 'roms/u-boot'
Entering 'roms/u-boot-sam460ex'
Entering 'roms/vbootrom'
Entering 'slirp'
Entering 'tests/fp/berkeley-softfloat-3'
Entering 'tests/fp/berkeley-testfloat-3'
Entering 'ui/keycodemapdb'
Entering 'capstone'
HEAD is now at f8b1b833 fix CS_ mips_ OP structure comment error (#1674)
Entering 'dtc'
HEAD is now at 85e5d83 Makefile: when building libfdt only, do not add
unneeded deps
Entering 'meson'
HEAD is now at 776acd2a8 Bump versions to 0.55.3 for release
Entering 'roms/QemuMacDrivers'
HEAD is now at 90c488d Merge pull request #3 from
mcayland/fix/unbreak-256-color-mode
Entering 'roms/SLOF'
HEAD is now at e18ddad version: update to 20200717
Entering 'roms/edk2'
HEAD is now at 06dc822d04 Revert ".pytool/EccCheck: Disable Ecc error
code 10014 for open CI"
Entering 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
HEAD is now at b64af41 Fix typo in function
'softfloat_propagateNaNF128M' for RISC-V.
Entering 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli'
HEAD is now at 666c328 Make types of variable match (#796)
Entering 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/esaxx'
HEAD is now at ca7cb33 move to git
Entering
'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/libdivsufsort'
HEAD is now at 5f60d6f Merge pull request #7 from kloetzl/master
Entering 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl'
Entering 'roms/edk2/MdeModulePkg/Library/BrotliCustomDecompressLib/brotli'
HEAD is now at 63be8a9 unichr was removed in Python 3 because all str
are Unicode (#877)
Entering 'roms/edk2/MdeModulePkg/Universal/RegularExpressionDxe/oniguruma'
HEAD is now at b2c1da6 add ONIG_OPTION_CALLBACK_EACH_MATCH test
Entering 'roms/edk2/UnitTestFrameworkPkg/Library/CmockaLib/cmocka'
HEAD is now at 160dffe Don't use non-literal format strings
Entering 'roms/ipxe'
HEAD is now at 4bd064de [build] Fix building on older versions of gcc
Entering 'roms/openbios'
HEAD is now at 7f28286 PPC: mark first 4 pages of physical and virtual
memory as unavailable
Entering 'roms/opensbi'
HEAD is now at a98258d include: Bump-up version to 0.8
Entering 'roms/qboot'
HEAD is now at a5300c4 qboot: Disable PIE for ELF binary build step
Entering 'roms/qemu-palcode'
HEAD is now at bf0e136 Report machine checks to the kernel
Entering 'roms/seabios'
HEAD is now at 155821a docs: Note v1.14.0 release
Entering 'roms/seabios-hppa'
HEAD is now at 73b740f7 parisc: Set text planes and used_bits in STI fields
Entering 'roms/sgabios'
HEAD is now at cbaee52 SGABIOS: fix wrong video attrs for int 10h, ah==13h
Entering 'roms/skiboot'
HEAD is now at 3a6fdede skiboot v6.4 release notes
Entering 'roms/u-boot'
HEAD is now at d3689267f9 Prepare v2019.01
Entering 'roms/u-boot-sam460ex'
HEAD is now at 60b3916 Add README to clarify relation to U-Boot and
ACube's version
Entering 'roms/vbootrom'
HEAD is now at 0c37a43 Merge pull request #1 from google/disable-build-id
Entering 'slirp'
HEAD is now at 8f43a99 Merge branch 'stable-4.2' into 'stable-4.2'
Entering 'tests/fp/berkeley-softfloat-3'
HEAD is now at b64af41 Fix typo in function
'softfloat_propagateNaNF128M' for RISC-V.
Entering 'tests/fp/berkeley-testfloat-3'
HEAD is now at 5a59dce fail: constify fail_programName
Entering 'ui/keycodemapdb'
HEAD is now at 6119e6e Fix scan codes for Korean keys
fatal: Needed a single revision
Unable to find current revision in submodule path
'roms/edk2/CryptoPkg/Library/OpensslLib/openssl'
Failed to recurse into submodule path 'roms/edk2'
ERROR: Job failed: exit status 1
Cleber Rosa Feb. 23, 2021, 4:47 p.m. UTC | #4
On Tue, Feb 23, 2021 at 05:37:04PM +0100, Philippe Mathieu-Daudé wrote:
> On 2/23/21 12:25 PM, Thomas Huth wrote:
> > On 19/02/2021 22.58, Cleber Rosa wrote:
> >> As described in the included documentation, the "custom runner" jobs
> >> extend the GitLab CI jobs already in place.  One of their primary
> >> goals of catching and preventing regressions on a wider number of host
> >> systems than the ones provided by GitLab's shared runners.
> >>
> >> This sets the stage in which other community members can add their own
> >> machine configuration documentation/scripts, and accompanying job
> >> definitions.  As a general rule, those newly added contributed jobs
> >> should run as "non-gating", until their reliability is verified (AKA
> >> "allow_failure: true").
> >>
> >> Signed-off-by: Cleber Rosa <crosa@redhat.com>
> >> ---
> >>   .gitlab-ci.d/custom-runners.yml | 14 ++++++++++++++
> >>   .gitlab-ci.yml                  |  1 +
> >>   docs/devel/ci.rst               | 28 ++++++++++++++++++++++++++++
> >>   docs/devel/index.rst            |  1 +
> >>   4 files changed, 44 insertions(+)
> >>   create mode 100644 .gitlab-ci.d/custom-runners.yml
> >>   create mode 100644 docs/devel/ci.rst
> >>
> >> diff --git a/.gitlab-ci.d/custom-runners.yml
> >> b/.gitlab-ci.d/custom-runners.yml
> >> new file mode 100644
> >> index 0000000000..3004da2bda
> >> --- /dev/null
> >> +++ b/.gitlab-ci.d/custom-runners.yml
> >> @@ -0,0 +1,14 @@
> >> +# The CI jobs defined here require GitLab runners installed and
> >> +# registered on machines that match their operating system names,
> >> +# versions and architectures.  This is in contrast to the other CI
> >> +# jobs that are intended to run on GitLab's "shared" runners.
> >> +
> >> +# Different than the default approach on "shared" runners, based on
> >> +# containers, the custom runners have no such *requirement*, as those
> >> +# jobs should be capable of running on operating systems with no
> >> +# compatible container implementation, or no support from
> >> +# gitlab-runner.  To avoid problems that gitlab-runner can cause while
> >> +# reusing the GIT repository, let's enable the recursive submodule
> >> +# strategy.
> >> +variables:
> >> +  GIT_SUBMODULE_STRATEGY: recursive
> > 
> > Is it really necessary? I thought our configure script would take care
> > of the submodules?
>

I've done a lot of testing on bare metal systems, and the problems
that come from reusing the same system and failed cleanups can be very
frustrating.  It's unfortunate that we need this, but it was the
simplest and most reliable solution I found.  :/

Having said that, I noticed after I posted this series that this is
affecting all other jobs.  We don't need it in the jobs based
on containers (for obvious reasons), so I see two options:

1) have it enabled on all jobs for consistency

2) have it enabled only on jobs that will reuse the repo
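
Option 2 could look roughly like the sketch below, with the variable
moved from the top-level "variables:" block into each job that reuses
the repository.  The job name, the runner tag and the script lines are
hypothetical placeholders, not part of this series.

```yaml
# Sketch only: job name, tag and script are hypothetical placeholders.
example-custom-runner-job:
  # Non-gating until the job's reliability is verified.
  allow_failure: true
  tags:
    - example-custom-runner-tag
  variables:
    # Job-level variables apply to this job only, so the jobs based
    # on containers keep GitLab's default submodule strategy (none).
    GIT_SUBMODULE_STRATEGY: recursive
  script:
    - mkdir -p build
    - cd build
    - ../configure
    - make
```

With this layout the top-level "variables:" entry in
custom-runners.yml would no longer be needed.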

> Well, if there is a failure during the first clone (I got one network
> timeout in the middle) then next time it doesn't work:
> 
> Updating/initializing submodules recursively...
> Synchronizing submodule url for 'capstone'
> Synchronizing submodule url for 'dtc'
> Synchronizing submodule url for 'meson'
> Synchronizing submodule url for 'roms/QemuMacDrivers'
> Synchronizing submodule url for 'roms/SLOF'
> Synchronizing submodule url for 'roms/edk2'
> Synchronizing submodule url for
> 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
> Synchronizing submodule url for
> 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli'
> Synchronizing submodule url for
> 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/esaxx'
> Synchronizing submodule url for
> 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/libdivsufsort'
> Synchronizing submodule url for
> 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl'
> Synchronizing submodule url for
> 'roms/edk2/MdeModulePkg/Library/BrotliCustomDecompressLib/brotli'
> Synchronizing submodule url for
> 'roms/edk2/MdeModulePkg/Universal/RegularExpressionDxe/oniguruma'
> Synchronizing submodule url for
> 'roms/edk2/UnitTestFrameworkPkg/Library/CmockaLib/cmocka'
> Synchronizing submodule url for 'roms/ipxe'
> Synchronizing submodule url for 'roms/openbios'
> Synchronizing submodule url for 'roms/opensbi'
> Synchronizing submodule url for 'roms/qboot'
> Synchronizing submodule url for 'roms/qemu-palcode'
> Synchronizing submodule url for 'roms/seabios'
> Synchronizing submodule url for 'roms/seabios-hppa'
> Synchronizing submodule url for 'roms/sgabios'
> Synchronizing submodule url for 'roms/skiboot'
> Synchronizing submodule url for 'roms/u-boot'
> Synchronizing submodule url for 'roms/u-boot-sam460ex'
> Synchronizing submodule url for 'roms/vbootrom'
> Synchronizing submodule url for 'slirp'
> Synchronizing submodule url for 'tests/fp/berkeley-softfloat-3'
> Synchronizing submodule url for 'tests/fp/berkeley-testfloat-3'
> Synchronizing submodule url for 'ui/keycodemapdb'
> Entering 'capstone'
> Entering 'dtc'
> Entering 'meson'
> Entering 'roms/QemuMacDrivers'
> Entering 'roms/SLOF'
> Entering 'roms/edk2'
> Entering 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
> Entering 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli'
> Entering 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/esaxx'
> Entering
> 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/libdivsufsort'
> Entering 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl'
> Entering 'roms/edk2/MdeModulePkg/Library/BrotliCustomDecompressLib/brotli'
> Entering 'roms/edk2/MdeModulePkg/Universal/RegularExpressionDxe/oniguruma'
> Entering 'roms/edk2/UnitTestFrameworkPkg/Library/CmockaLib/cmocka'
> Entering 'roms/ipxe'
> Entering 'roms/openbios'
> Entering 'roms/opensbi'
> Entering 'roms/qboot'
> Entering 'roms/qemu-palcode'
> Entering 'roms/seabios'
> Entering 'roms/seabios-hppa'
> Entering 'roms/sgabios'
> Entering 'roms/skiboot'
> Entering 'roms/u-boot'
> Entering 'roms/u-boot-sam460ex'
> Entering 'roms/vbootrom'
> Entering 'slirp'
> Entering 'tests/fp/berkeley-softfloat-3'
> Entering 'tests/fp/berkeley-testfloat-3'
> Entering 'ui/keycodemapdb'
> Entering 'capstone'
> HEAD is now at f8b1b833 fix CS_ mips_ OP structure comment error (#1674)
> Entering 'dtc'
> HEAD is now at 85e5d83 Makefile: when building libfdt only, do not add
> unneeded deps
> Entering 'meson'
> HEAD is now at 776acd2a8 Bump versions to 0.55.3 for release
> Entering 'roms/QemuMacDrivers'
> HEAD is now at 90c488d Merge pull request #3 from
> mcayland/fix/unbreak-256-color-mode
> Entering 'roms/SLOF'
> HEAD is now at e18ddad version: update to 20200717
> Entering 'roms/edk2'
> HEAD is now at 06dc822d04 Revert ".pytool/EccCheck: Disable Ecc error
> code 10014 for open CI"
> Entering 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
> HEAD is now at b64af41 Fix typo in function
> 'softfloat_propagateNaNF128M' for RISC-V.
> Entering 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli'
> HEAD is now at 666c328 Make types of variable match (#796)
> Entering 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/esaxx'
> HEAD is now at ca7cb33 move to git
> Entering
> 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/libdivsufsort'
> HEAD is now at 5f60d6f Merge pull request #7 from kloetzl/master
> Entering 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl'
> Entering 'roms/edk2/MdeModulePkg/Library/BrotliCustomDecompressLib/brotli'
> HEAD is now at 63be8a9 unichr was removed in Python 3 because all str
> are Unicode (#877)
> Entering 'roms/edk2/MdeModulePkg/Universal/RegularExpressionDxe/oniguruma'
> HEAD is now at b2c1da6 add ONIG_OPTION_CALLBACK_EACH_MATCH test
> Entering 'roms/edk2/UnitTestFrameworkPkg/Library/CmockaLib/cmocka'
> HEAD is now at 160dffe Don't use non-literal format strings
> Entering 'roms/ipxe'
> HEAD is now at 4bd064de [build] Fix building on older versions of gcc
> Entering 'roms/openbios'
> HEAD is now at 7f28286 PPC: mark first 4 pages of physical and virtual
> memory as unavailable
> Entering 'roms/opensbi'
> HEAD is now at a98258d include: Bump-up version to 0.8
> Entering 'roms/qboot'
> HEAD is now at a5300c4 qboot: Disable PIE for ELF binary build step
> Entering 'roms/qemu-palcode'
> HEAD is now at bf0e136 Report machine checks to the kernel
> Entering 'roms/seabios'
> HEAD is now at 155821a docs: Note v1.14.0 release
> Entering 'roms/seabios-hppa'
> HEAD is now at 73b740f7 parisc: Set text planes and used_bits in STI fields
> Entering 'roms/sgabios'
> HEAD is now at cbaee52 SGABIOS: fix wrong video attrs for int 10h, ah==13h
> Entering 'roms/skiboot'
> HEAD is now at 3a6fdede skiboot v6.4 release notes
> Entering 'roms/u-boot'
> HEAD is now at d3689267f9 Prepare v2019.01
> Entering 'roms/u-boot-sam460ex'
> HEAD is now at 60b3916 Add README to clarify relation to U-Boot and
> ACube's version
> Entering 'roms/vbootrom'
> HEAD is now at 0c37a43 Merge pull request #1 from google/disable-build-id
> Entering 'slirp'
> HEAD is now at 8f43a99 Merge branch 'stable-4.2' into 'stable-4.2'
> Entering 'tests/fp/berkeley-softfloat-3'
> HEAD is now at b64af41 Fix typo in function
> 'softfloat_propagateNaNF128M' for RISC-V.
> Entering 'tests/fp/berkeley-testfloat-3'
> HEAD is now at 5a59dce fail: constify fail_programName
> Entering 'ui/keycodemapdb'
> HEAD is now at 6119e6e Fix scan codes for Korean keys
> fatal: Needed a single revision
> Unable to find current revision in submodule path
> 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl'
> Failed to recurse into submodule path 'roms/edk2'
> ERROR: Job failed: exit status 1
> 

Yes, I've also found similar issues during my jobs.

Regards,
- Cleber.
Philippe Mathieu-Daudé Feb. 23, 2021, 5:24 p.m. UTC | #5
On 2/23/21 5:47 PM, Cleber Rosa wrote:
> On Tue, Feb 23, 2021 at 05:37:04PM +0100, Philippe Mathieu-Daudé wrote:
>> On 2/23/21 12:25 PM, Thomas Huth wrote:
>>> On 19/02/2021 22.58, Cleber Rosa wrote:
>>>> As described in the included documentation, the "custom runner" jobs
>>>> extend the GitLab CI jobs already in place.  One of their primary
>>>> goals of catching and preventing regressions on a wider number of host
>>>> systems than the ones provided by GitLab's shared runners.
>>>>
>>>> This sets the stage in which other community members can add their own
>>>> machine configuration documentation/scripts, and accompanying job
>>>> definitions.  As a general rule, those newly added contributed jobs
>>>> should run as "non-gating", until their reliability is verified (AKA
>>>> "allow_failure: true").
>>>>
>>>> Signed-off-by: Cleber Rosa <crosa@redhat.com>
>>>> ---
>>>>   .gitlab-ci.d/custom-runners.yml | 14 ++++++++++++++
>>>>   .gitlab-ci.yml                  |  1 +
>>>>   docs/devel/ci.rst               | 28 ++++++++++++++++++++++++++++
>>>>   docs/devel/index.rst            |  1 +
>>>>   4 files changed, 44 insertions(+)
>>>>   create mode 100644 .gitlab-ci.d/custom-runners.yml
>>>>   create mode 100644 docs/devel/ci.rst
>>>>
>>>> diff --git a/.gitlab-ci.d/custom-runners.yml
>>>> b/.gitlab-ci.d/custom-runners.yml
>>>> new file mode 100644
>>>> index 0000000000..3004da2bda
>>>> --- /dev/null
>>>> +++ b/.gitlab-ci.d/custom-runners.yml
>>>> @@ -0,0 +1,14 @@
>>>> +# The CI jobs defined here require GitLab runners installed and
>>>> +# registered on machines that match their operating system names,
>>>> +# versions and architectures.  This is in contrast to the other CI
>>>> +# jobs that are intended to run on GitLab's "shared" runners.
>>>> +
>>>> +# Different than the default approach on "shared" runners, based on
>>>> +# containers, the custom runners have no such *requirement*, as those
>>>> +# jobs should be capable of running on operating systems with no
>>>> +# compatible container implementation, or no support from
>>>> +# gitlab-runner.  To avoid problems that gitlab-runner can cause while
>>>> +# reusing the GIT repository, let's enable the recursive submodule
>>>> +# strategy.
>>>> +variables:
>>>> +  GIT_SUBMODULE_STRATEGY: recursive
>>>
>>> Is it really necessary? I thought our configure script would take care
>>> of the submodules?
>>
> 
> I've done a lot of testing on bare metal systems, and the problems
> that come from reusing the same system and failed cleanups can be very
> frustrating.  It's unfortunate that we need this, but it was the
> simplest and most reliable solution I found.  :/
> 
> Having said that, I noticed after I posted this series that this is
> affecting all other jobs.  We don't need it that in the jobs based
> on containers (for obvious reasons), so I see two options:
> 
> 1) have it enabled on all jobs for consistency
> 
> 2) have it enabled only on jobs that will reuse the repo
> 
>> Well, if there is a failure during the first clone (I got one network
>> timeout in the middle) 

[This network failure is pasted at the end]

>> then next time it doesn't work:
>>
>> Updating/initializing submodules recursively...
>> Synchronizing submodule url for 'capstone'
>> Synchronizing submodule url for 'dtc'
>> Synchronizing submodule url for 'meson'
>> Synchronizing submodule url for 'roms/QemuMacDrivers'
>> Synchronizing submodule url for 'roms/SLOF'
>> Synchronizing submodule url for 'roms/edk2'
>> Synchronizing submodule url for
>> 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
>> Synchronizing submodule url for
>> 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli'
>> Synchronizing submodule url for
>> 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/esaxx'
>> Synchronizing submodule url for
>> 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/libdivsufsort'
>> Synchronizing submodule url for
>> 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl'
>> Synchronizing submodule url for
>> 'roms/edk2/MdeModulePkg/Library/BrotliCustomDecompressLib/brotli'
>> Synchronizing submodule url for
>> 'roms/edk2/MdeModulePkg/Universal/RegularExpressionDxe/oniguruma'
>> Synchronizing submodule url for
>> 'roms/edk2/UnitTestFrameworkPkg/Library/CmockaLib/cmocka'
>> Synchronizing submodule url for 'roms/ipxe'
>> Synchronizing submodule url for 'roms/openbios'
>> Synchronizing submodule url for 'roms/opensbi'
>> Synchronizing submodule url for 'roms/qboot'
>> Synchronizing submodule url for 'roms/qemu-palcode'
>> Synchronizing submodule url for 'roms/seabios'
>> Synchronizing submodule url for 'roms/seabios-hppa'
>> Synchronizing submodule url for 'roms/sgabios'
>> Synchronizing submodule url for 'roms/skiboot'
>> Synchronizing submodule url for 'roms/u-boot'
>> Synchronizing submodule url for 'roms/u-boot-sam460ex'
>> Synchronizing submodule url for 'roms/vbootrom'
>> Synchronizing submodule url for 'slirp'
>> Synchronizing submodule url for 'tests/fp/berkeley-softfloat-3'
>> Synchronizing submodule url for 'tests/fp/berkeley-testfloat-3'
>> Synchronizing submodule url for 'ui/keycodemapdb'
>> Entering 'capstone'
>> Entering 'dtc'
>> Entering 'meson'
>> Entering 'roms/QemuMacDrivers'
>> Entering 'roms/SLOF'
>> Entering 'roms/edk2'
>> Entering 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
>> Entering 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli'
>> Entering 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/esaxx'
>> Entering
>> 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/libdivsufsort'
>> Entering 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl'
>> Entering 'roms/edk2/MdeModulePkg/Library/BrotliCustomDecompressLib/brotli'
>> Entering 'roms/edk2/MdeModulePkg/Universal/RegularExpressionDxe/oniguruma'
>> Entering 'roms/edk2/UnitTestFrameworkPkg/Library/CmockaLib/cmocka'
>> Entering 'roms/ipxe'
>> Entering 'roms/openbios'
>> Entering 'roms/opensbi'
>> Entering 'roms/qboot'
>> Entering 'roms/qemu-palcode'
>> Entering 'roms/seabios'
>> Entering 'roms/seabios-hppa'
>> Entering 'roms/sgabios'
>> Entering 'roms/skiboot'
>> Entering 'roms/u-boot'
>> Entering 'roms/u-boot-sam460ex'
>> Entering 'roms/vbootrom'
>> Entering 'slirp'
>> Entering 'tests/fp/berkeley-softfloat-3'
>> Entering 'tests/fp/berkeley-testfloat-3'
>> Entering 'ui/keycodemapdb'
>> Entering 'capstone'
>> HEAD is now at f8b1b833 fix CS_ mips_ OP structure comment error (#1674)
>> Entering 'dtc'
>> HEAD is now at 85e5d83 Makefile: when building libfdt only, do not add
>> unneeded deps
>> Entering 'meson'
>> HEAD is now at 776acd2a8 Bump versions to 0.55.3 for release
>> Entering 'roms/QemuMacDrivers'
>> HEAD is now at 90c488d Merge pull request #3 from
>> mcayland/fix/unbreak-256-color-mode
>> Entering 'roms/SLOF'
>> HEAD is now at e18ddad version: update to 20200717
>> Entering 'roms/edk2'
>> HEAD is now at 06dc822d04 Revert ".pytool/EccCheck: Disable Ecc error
>> code 10014 for open CI"
>> Entering 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
>> HEAD is now at b64af41 Fix typo in function
>> 'softfloat_propagateNaNF128M' for RISC-V.
>> Entering 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli'
>> HEAD is now at 666c328 Make types of variable match (#796)
>> Entering 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/esaxx'
>> HEAD is now at ca7cb33 move to git
>> Entering
>> 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/libdivsufsort'
>> HEAD is now at 5f60d6f Merge pull request #7 from kloetzl/master
>> Entering 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl'
>> Entering 'roms/edk2/MdeModulePkg/Library/BrotliCustomDecompressLib/brotli'
>> HEAD is now at 63be8a9 unichr was removed in Python 3 because all str
>> are Unicode (#877)
>> Entering 'roms/edk2/MdeModulePkg/Universal/RegularExpressionDxe/oniguruma'
>> HEAD is now at b2c1da6 add ONIG_OPTION_CALLBACK_EACH_MATCH test
>> Entering 'roms/edk2/UnitTestFrameworkPkg/Library/CmockaLib/cmocka'
>> HEAD is now at 160dffe Don't use non-literal format strings
>> Entering 'roms/ipxe'
>> HEAD is now at 4bd064de [build] Fix building on older versions of gcc
>> Entering 'roms/openbios'
>> HEAD is now at 7f28286 PPC: mark first 4 pages of physical and virtual
>> memory as unavailable
>> Entering 'roms/opensbi'
>> HEAD is now at a98258d include: Bump-up version to 0.8
>> Entering 'roms/qboot'
>> HEAD is now at a5300c4 qboot: Disable PIE for ELF binary build step
>> Entering 'roms/qemu-palcode'
>> HEAD is now at bf0e136 Report machine checks to the kernel
>> Entering 'roms/seabios'
>> HEAD is now at 155821a docs: Note v1.14.0 release
>> Entering 'roms/seabios-hppa'
>> HEAD is now at 73b740f7 parisc: Set text planes and used_bits in STI fields
>> Entering 'roms/sgabios'
>> HEAD is now at cbaee52 SGABIOS: fix wrong video attrs for int 10h, ah==13h
>> Entering 'roms/skiboot'
>> HEAD is now at 3a6fdede skiboot v6.4 release notes
>> Entering 'roms/u-boot'
>> HEAD is now at d3689267f9 Prepare v2019.01
>> Entering 'roms/u-boot-sam460ex'
>> HEAD is now at 60b3916 Add README to clarify relation to U-Boot and
>> ACube's version
>> Entering 'roms/vbootrom'
>> HEAD is now at 0c37a43 Merge pull request #1 from google/disable-build-id
>> Entering 'slirp'
>> HEAD is now at 8f43a99 Merge branch 'stable-4.2' into 'stable-4.2'
>> Entering 'tests/fp/berkeley-softfloat-3'
>> HEAD is now at b64af41 Fix typo in function
>> 'softfloat_propagateNaNF128M' for RISC-V.
>> Entering 'tests/fp/berkeley-testfloat-3'
>> HEAD is now at 5a59dce fail: constify fail_programName
>> Entering 'ui/keycodemapdb'
>> HEAD is now at 6119e6e Fix scan codes for Korean keys
>> fatal: Needed a single revision
>> Unable to find current revision in submodule path
>> 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl'
>> Failed to recurse into submodule path 'roms/edk2'
>> ERROR: Job failed: exit status 1
>>
> 
> Yes, I've also found similar issues during my jobs.

The problem with the recursive GIT_SUBMODULE_STRATEGY is that it
tries to clone submodules of submodules (here EDK2's), which QEMU
doesn't use:

Submodule 'boringssl' (https://boringssl.googlesource.com/boringssl)
registered for path
'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/boringssl'
Submodule 'krb5' (https://github.com/krb5/krb5) registered for path
'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/krb5'
Submodule 'pyca.cryptography' (https://github.com/pyca/cryptography.git)
registered for path
'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/pyca-cryptography'
Cloning into
'/var/lib/gitlab-runner/builds/shWMsY1a/0/philmd/qemu/roms/edk2/CryptoPkg/Library/OpensslLib/openssl/boringssl'...
fatal: unable to access 'https://boringssl.googlesource.com/boringssl/':
Failed to connect to boringssl.googlesource.com port 443: Connection
timed out
fatal: clone of 'https://boringssl.googlesource.com/boringssl' into
submodule path
'/var/lib/gitlab-runner/builds/shWMsY1a/0/philmd/qemu/roms/edk2/CryptoPkg/Library/OpensslLib/openssl/boringssl'
failed
Failed to clone 'boringssl'. Retry scheduled

I don't think we want to clone the unused boringssl in the
QEMU namespace.
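
A possible way to limit this, sketched under the assumption that a job
needs only QEMU's top-level submodules: GitLab documents a "normal"
value for GIT_SUBMODULE_STRATEGY that does not recurse into nested
submodules, and variables can be set per job rather than globally.
The job name and script below are hypothetical and not part of this
series.

```yaml
# Hypothetical job; only the variables block is the point of this sketch.
example-custom-runner-job:
  variables:
    # "normal" initializes top-level submodules only (roms/edk2 itself is
    # still fetched, but not the submodules nested underneath it).
    GIT_SUBMODULE_STRATEGY: normal
  script:
    - ./configure
    - make
```

Whether "normal" is sufficient depends on what a given job builds; that
has not been verified here.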
Philippe Mathieu-Daudé Feb. 23, 2021, 5:34 p.m. UTC | #6
On 2/23/21 6:24 PM, Philippe Mathieu-Daudé wrote:
> On 2/23/21 5:47 PM, Cleber Rosa wrote:
>> On Tue, Feb 23, 2021 at 05:37:04PM +0100, Philippe Mathieu-Daudé wrote:
>>> On 2/23/21 12:25 PM, Thomas Huth wrote:
>>>> On 19/02/2021 22.58, Cleber Rosa wrote:
>>>>> As described in the included documentation, the "custom runner" jobs
>>>>> extend the GitLab CI jobs already in place.  One of their primary
>>>>> goals of catching and preventing regressions on a wider number of host
>>>>> systems than the ones provided by GitLab's shared runners.
>>>>>
>>>>> This sets the stage in which other community members can add their own
>>>>> machine configuration documentation/scripts, and accompanying job
>>>>> definitions.  As a general rule, those newly added contributed jobs
>>>>> should run as "non-gating", until their reliability is verified (AKA
>>>>> "allow_failure: true").
>>>>>
>>>>> Signed-off-by: Cleber Rosa <crosa@redhat.com>
>>>>> ---
>>>>>   .gitlab-ci.d/custom-runners.yml | 14 ++++++++++++++
>>>>>   .gitlab-ci.yml                  |  1 +
>>>>>   docs/devel/ci.rst               | 28 ++++++++++++++++++++++++++++
>>>>>   docs/devel/index.rst            |  1 +
>>>>>   4 files changed, 44 insertions(+)
>>>>>   create mode 100644 .gitlab-ci.d/custom-runners.yml
>>>>>   create mode 100644 docs/devel/ci.rst
>>>>>
>>>>> diff --git a/.gitlab-ci.d/custom-runners.yml
>>>>> b/.gitlab-ci.d/custom-runners.yml
>>>>> new file mode 100644
>>>>> index 0000000000..3004da2bda
>>>>> --- /dev/null
>>>>> +++ b/.gitlab-ci.d/custom-runners.yml
>>>>> @@ -0,0 +1,14 @@
>>>>> +# The CI jobs defined here require GitLab runners installed and
>>>>> +# registered on machines that match their operating system names,
>>>>> +# versions and architectures.  This is in contrast to the other CI
>>>>> +# jobs that are intended to run on GitLab's "shared" runners.
>>>>> +
>>>>> +# Different than the default approach on "shared" runners, based on
>>>>> +# containers, the custom runners have no such *requirement*, as those
>>>>> +# jobs should be capable of running on operating systems with no
>>>>> +# compatible container implementation, or no support from
>>>>> +# gitlab-runner.  To avoid problems that gitlab-runner can cause while
>>>>> +# reusing the GIT repository, let's enable the recursive submodule
>>>>> +# strategy.
>>>>> +variables:
>>>>> +  GIT_SUBMODULE_STRATEGY: recursive
>>>>
>>>> Is it really necessary? I thought our configure script would take care
>>>> of the submodules?
>>>
>>
>> I've done a lot of testing on bare metal systems, and the problems
>> that come from reusing the same system and failed cleanups can be very
>> frustrating.  It's unfortunate that we need this, but it was the
>> simplest and most reliable solution I found.  :/
>>
>> Having said that, I noticed after I posted this series that this is
>> affecting all other jobs.  We don't need it that in the jobs based
>> on containers (for obvious reasons), so I see two options:
>>
>> 1) have it enabled on all jobs for consistency
>>
>> 2) have it enabled only on jobs that will reuse the repo
>>
>>> Well, if there is a failure during the first clone (I got one network
>>> timeout in the middle) 
> 
> [This network failure is pasted at the end]
> 
>>> then next time it doesn't work:
>>>
>>> Updating/initializing submodules recursively...
>>> Synchronizing submodule url for 'capstone'
>>> Synchronizing submodule url for 'dtc'
>>> Synchronizing submodule url for 'meson'
>>> Synchronizing submodule url for 'roms/QemuMacDrivers'
>>> Synchronizing submodule url for 'roms/SLOF'
>>> Synchronizing submodule url for 'roms/edk2'
>>> Synchronizing submodule url for
>>> 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
>>> Synchronizing submodule url for
>>> 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli'
>>> Synchronizing submodule url for
>>> 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/esaxx'
>>> Synchronizing submodule url for
>>> 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/libdivsufsort'
>>> Synchronizing submodule url for
>>> 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl'
>>> Synchronizing submodule url for
>>> 'roms/edk2/MdeModulePkg/Library/BrotliCustomDecompressLib/brotli'
>>> Synchronizing submodule url for
>>> 'roms/edk2/MdeModulePkg/Universal/RegularExpressionDxe/oniguruma'
>>> Synchronizing submodule url for
>>> 'roms/edk2/UnitTestFrameworkPkg/Library/CmockaLib/cmocka'

So far, besides the repositories useful for QEMU, I cloned:

- boringssl
- krb5
- pyca-cryptography
- esaxx
- libdivsufsort
- oniguruma
- openssl
- brotli
- cmocka

But it reached the runner time limit of 2h.

The directory reports 3GB of source code.

I don't think the series was tested enough before posting, so
I'm stopping my experiments here.

Regards,

Phil.
Daniel P. Berrangé Feb. 23, 2021, 5:45 p.m. UTC | #7
On Tue, Feb 23, 2021 at 11:47:18AM -0500, Cleber Rosa wrote:
> On Tue, Feb 23, 2021 at 05:37:04PM +0100, Philippe Mathieu-Daudé wrote:
> > On 2/23/21 12:25 PM, Thomas Huth wrote:
> > > On 19/02/2021 22.58, Cleber Rosa wrote:
> > >> As described in the included documentation, the "custom runner" jobs
> > >> extend the GitLab CI jobs already in place.  One of their primary
> > >> goals of catching and preventing regressions on a wider number of host
> > >> systems than the ones provided by GitLab's shared runners.
> > >>
> > >> This sets the stage in which other community members can add their own
> > >> machine configuration documentation/scripts, and accompanying job
> > >> definitions.  As a general rule, those newly added contributed jobs
> > >> should run as "non-gating", until their reliability is verified (AKA
> > >> "allow_failure: true").
> > >>
> > >> Signed-off-by: Cleber Rosa <crosa@redhat.com>
> > >> ---
> > >>   .gitlab-ci.d/custom-runners.yml | 14 ++++++++++++++
> > >>   .gitlab-ci.yml                  |  1 +
> > >>   docs/devel/ci.rst               | 28 ++++++++++++++++++++++++++++
> > >>   docs/devel/index.rst            |  1 +
> > >>   4 files changed, 44 insertions(+)
> > >>   create mode 100644 .gitlab-ci.d/custom-runners.yml
> > >>   create mode 100644 docs/devel/ci.rst
> > >>
> > >> diff --git a/.gitlab-ci.d/custom-runners.yml
> > >> b/.gitlab-ci.d/custom-runners.yml
> > >> new file mode 100644
> > >> index 0000000000..3004da2bda
> > >> --- /dev/null
> > >> +++ b/.gitlab-ci.d/custom-runners.yml
> > >> @@ -0,0 +1,14 @@
> > >> +# The CI jobs defined here require GitLab runners installed and
> > >> +# registered on machines that match their operating system names,
> > >> +# versions and architectures.  This is in contrast to the other CI
> > >> +# jobs that are intended to run on GitLab's "shared" runners.
> > >> +
> > >> +# Different than the default approach on "shared" runners, based on
> > >> +# containers, the custom runners have no such *requirement*, as those
> > >> +# jobs should be capable of running on operating systems with no
> > >> +# compatible container implementation, or no support from
> > >> +# gitlab-runner.  To avoid problems that gitlab-runner can cause while
> > >> +# reusing the GIT repository, let's enable the recursive submodule
> > >> +# strategy.
> > >> +variables:
> > >> +  GIT_SUBMODULE_STRATEGY: recursive
> > > 
> > > Is it really necessary? I thought our configure script would take care
> > > of the submodules?
> >
> 
> I've done a lot of testing on bare metal systems, and the problems
> that come from reusing the same system and failed cleanups can be very
> frustrating.  It's unfortunate that we need this, but it was the
> simplest and most reliable solution I found.  :/

Hmmm, this makes it sound like the job is not being run in a
fresh pristine checkout. IMHO we need to guarantee that in
general, at which point submodules should "just work", unless
the runner is blocking network access?



Regards,
Daniel
Cleber Rosa Feb. 23, 2021, 6:09 p.m. UTC | #8
On Tue, Feb 23, 2021 at 06:34:07PM +0100, Philippe Mathieu-Daudé wrote:
> On 2/23/21 6:24 PM, Philippe Mathieu-Daudé wrote:
> > On 2/23/21 5:47 PM, Cleber Rosa wrote:
> >> On Tue, Feb 23, 2021 at 05:37:04PM +0100, Philippe Mathieu-Daudé wrote:
> >>> On 2/23/21 12:25 PM, Thomas Huth wrote:
> >>>> On 19/02/2021 22.58, Cleber Rosa wrote:
> >>>>> As described in the included documentation, the "custom runner" jobs
> >>>>> extend the GitLab CI jobs already in place.  One of their primary
> >>>>> goals of catching and preventing regressions on a wider number of host
> >>>>> systems than the ones provided by GitLab's shared runners.
> >>>>>
> >>>>> This sets the stage in which other community members can add their own
> >>>>> machine configuration documentation/scripts, and accompanying job
> >>>>> definitions.  As a general rule, those newly added contributed jobs
> >>>>> should run as "non-gating", until their reliability is verified (AKA
> >>>>> "allow_failure: true").
> >>>>>
> >>>>> Signed-off-by: Cleber Rosa <crosa@redhat.com>
> >>>>> ---
> >>>>>   .gitlab-ci.d/custom-runners.yml | 14 ++++++++++++++
> >>>>>   .gitlab-ci.yml                  |  1 +
> >>>>>   docs/devel/ci.rst               | 28 ++++++++++++++++++++++++++++
> >>>>>   docs/devel/index.rst            |  1 +
> >>>>>   4 files changed, 44 insertions(+)
> >>>>>   create mode 100644 .gitlab-ci.d/custom-runners.yml
> >>>>>   create mode 100644 docs/devel/ci.rst
> >>>>>
> >>>>> diff --git a/.gitlab-ci.d/custom-runners.yml
> >>>>> b/.gitlab-ci.d/custom-runners.yml
> >>>>> new file mode 100644
> >>>>> index 0000000000..3004da2bda
> >>>>> --- /dev/null
> >>>>> +++ b/.gitlab-ci.d/custom-runners.yml
> >>>>> @@ -0,0 +1,14 @@
> >>>>> +# The CI jobs defined here require GitLab runners installed and
> >>>>> +# registered on machines that match their operating system names,
> >>>>> +# versions and architectures.  This is in contrast to the other CI
> >>>>> +# jobs that are intended to run on GitLab's "shared" runners.
> >>>>> +
> >>>>> +# Different than the default approach on "shared" runners, based on
> >>>>> +# containers, the custom runners have no such *requirement*, as those
> >>>>> +# jobs should be capable of running on operating systems with no
> >>>>> +# compatible container implementation, or no support from
> >>>>> +# gitlab-runner.  To avoid problems that gitlab-runner can cause while
> >>>>> +# reusing the GIT repository, let's enable the recursive submodule
> >>>>> +# strategy.
> >>>>> +variables:
> >>>>> +  GIT_SUBMODULE_STRATEGY: recursive
> >>>>
> >>>> Is it really necessary? I thought our configure script would take care
> >>>> of the submodules?
> >>>
> >>
> >> I've done a lot of testing on bare metal systems, and the problems
> >> that come from reusing the same system and failed cleanups can be very
> >> frustrating.  It's unfortunate that we need this, but it was the
> >> simplest and most reliable solution I found.  :/
> >>
> >> Having said that, I noticed after I posted this series that this is
> >> affecting all other jobs.  We don't need it that in the jobs based
> >> on containers (for obvious reasons), so I see two options:
> >>
> >> 1) have it enabled on all jobs for consistency
> >>
> >> 2) have it enabled only on jobs that will reuse the repo
> >>
> >>> Well, if there is a failure during the first clone (I got one network
> >>> timeout in the middle) 
> > 
> > [This network failure is pasted at the end]
> > 
> >>> then next time it doesn't work:
> >>>
> >>> Updating/initializing submodules recursively...
> >>> Synchronizing submodule url for 'capstone'
> >>> Synchronizing submodule url for 'dtc'
> >>> Synchronizing submodule url for 'meson'
> >>> Synchronizing submodule url for 'roms/QemuMacDrivers'
> >>> Synchronizing submodule url for 'roms/SLOF'
> >>> Synchronizing submodule url for 'roms/edk2'
> >>> Synchronizing submodule url for
> >>> 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
> >>> Synchronizing submodule url for
> >>> 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli'
> >>> Synchronizing submodule url for
> >>> 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/esaxx'
> >>> Synchronizing submodule url for
> >>> 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/libdivsufsort'
> >>> Synchronizing submodule url for
> >>> 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl'
> >>> Synchronizing submodule url for
> >>> 'roms/edk2/MdeModulePkg/Library/BrotliCustomDecompressLib/brotli'
> >>> Synchronizing submodule url for
> >>> 'roms/edk2/MdeModulePkg/Universal/RegularExpressionDxe/oniguruma'
> >>> Synchronizing submodule url for
> >>> 'roms/edk2/UnitTestFrameworkPkg/Library/CmockaLib/cmocka'
> 
> So far, beside the repository useful for QEMU, I cloned:
> 
> - boringssl
> - krb5
> - pyca-cryptography
> - esaxx
> - libdivsufsort
> - oniguruma
> - openssl
> - brotli
> - cmocka
>

Hi Phil,

I'm not following what you meant by "I cloned"... Are you experimenting
with this on a machine of your own and manually cloning the submodules?

> But reach the runner time limit of 2h.
>
> The directory reports 3GB of source code.
> 
> I don't think the series has been tested enough before posting,

Please take into consideration that this series, although simple in
content, touches and interacts with a lot of moving pieces, and
possibly with personal systems that I did not, and will not, have
access to.  As far as public proof of testing goes, you can see a
pipeline with this version of the series here:

   https://gitlab.com/cleber.gnu/qemu/-/pipelines/258982039/builds

As I said elsewhere, I only noticed the recursive submodule being
applied to the existing jobs after I submitted the series.  Mea culpa.
But:

 * none of the jobs took noticeably longer than the previous baseline
 * there was one *container build failure* (safe to say it's not
   related)
 * all other jobs passed successfully

And, along with the previous versions, this series was tested on all
the previously included architectures and operating systems.  It's
unfortunate that because of your experience at this time (my
apologies), you don't realize the amount of testing done so far.

> I'm stopping here my experiments.
>
> Regards,
> 
> Phil.
> 

I honestly appreciate your help here up to this point.

Regards,
- Cleber.
Wainer dos Santos Moschetta Feb. 23, 2021, 9:34 p.m. UTC | #9
Hi,

On 2/23/21 2:45 PM, Daniel P. Berrangé wrote:
> On Tue, Feb 23, 2021 at 11:47:18AM -0500, Cleber Rosa wrote:
>> On Tue, Feb 23, 2021 at 05:37:04PM +0100, Philippe Mathieu-Daudé wrote:
>>> On 2/23/21 12:25 PM, Thomas Huth wrote:
>>>> On 19/02/2021 22.58, Cleber Rosa wrote:
>>>>> As described in the included documentation, the "custom runner" jobs
>>>>> extend the GitLab CI jobs already in place.  One of their primary
>>>>> goals of catching and preventing regressions on a wider number of host
>>>>> systems than the ones provided by GitLab's shared runners.
>>>>>
>>>>> This sets the stage in which other community members can add their own
>>>>> machine configuration documentation/scripts, and accompanying job
>>>>> definitions.  As a general rule, those newly added contributed jobs
>>>>> should run as "non-gating", until their reliability is verified (AKA
>>>>> "allow_failure: true").
>>>>>
>>>>> Signed-off-by: Cleber Rosa <crosa@redhat.com>
>>>>> ---
>>>>>    .gitlab-ci.d/custom-runners.yml | 14 ++++++++++++++
>>>>>    .gitlab-ci.yml                  |  1 +
>>>>>    docs/devel/ci.rst               | 28 ++++++++++++++++++++++++++++
>>>>>    docs/devel/index.rst            |  1 +
>>>>>    4 files changed, 44 insertions(+)
>>>>>    create mode 100644 .gitlab-ci.d/custom-runners.yml
>>>>>    create mode 100644 docs/devel/ci.rst
>>>>>
>>>>> diff --git a/.gitlab-ci.d/custom-runners.yml
>>>>> b/.gitlab-ci.d/custom-runners.yml
>>>>> new file mode 100644
>>>>> index 0000000000..3004da2bda
>>>>> --- /dev/null
>>>>> +++ b/.gitlab-ci.d/custom-runners.yml
>>>>> @@ -0,0 +1,14 @@
>>>>> +# The CI jobs defined here require GitLab runners installed and
>>>>> +# registered on machines that match their operating system names,
>>>>> +# versions and architectures.  This is in contrast to the other CI
>>>>> +# jobs that are intended to run on GitLab's "shared" runners.
>>>>> +
>>>>> +# Different than the default approach on "shared" runners, based on
>>>>> +# containers, the custom runners have no such *requirement*, as those
>>>>> +# jobs should be capable of running on operating systems with no
>>>>> +# compatible container implementation, or no support from
>>>>> +# gitlab-runner.  To avoid problems that gitlab-runner can cause while
>>>>> +# reusing the GIT repository, let's enable the recursive submodule
>>>>> +# strategy.
>>>>> +variables:
>>>>> +  GIT_SUBMODULE_STRATEGY: recursive
>>>> Is it really necessary? I thought our configure script would take care
>>>> of the submodules?
>> I've done a lot of testing on bare metal systems, and the problems
>> that come from reusing the same system and failed cleanups can be very
>> frustrating.  It's unfortunate that we need this, but it was the
>> simplest and most reliable solution I found.  :/
> Hmmm, this makes it sound like the job is not being run in a
> fresh pristine checkout. IMHO we need to guarantee that in
> general, at which point submodules should "just work", unless
> the running is blocking network access ?

Setting the git strategy may work out:

https://docs.gitlab.com/ee/ci/runners/README.html#git-strategy
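
As a rough sketch of that idea (the job name and script are
hypothetical, and this has not been tried against this series),
GIT_STRATEGY: clone asks the runner to clone the repository from
scratch for every job instead of reusing an existing working copy:

```yaml
# Hypothetical job; only the variables block is the point of this sketch.
example-custom-runner-job:
  variables:
    # Clone from scratch on every job rather than fetching into a reused
    # working copy; slower, but each job starts from a pristine checkout.
    GIT_STRATEGY: clone
  script:
    - ./configure
    - make
```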

- Wainer

>
>
>
> Regards,
> Daniel
Philippe Mathieu-Daudé Feb. 24, 2021, 11:57 a.m. UTC | #10
On 2/23/21 7:09 PM, Cleber Rosa wrote:
> On Tue, Feb 23, 2021 at 06:34:07PM +0100, Philippe Mathieu-Daudé wrote:
>> On 2/23/21 6:24 PM, Philippe Mathieu-Daudé wrote:
>>> On 2/23/21 5:47 PM, Cleber Rosa wrote:
>>>> On Tue, Feb 23, 2021 at 05:37:04PM +0100, Philippe Mathieu-Daudé wrote:
>>>>> On 2/23/21 12:25 PM, Thomas Huth wrote:
>>>>>> On 19/02/2021 22.58, Cleber Rosa wrote:
>>>>>>> As described in the included documentation, the "custom runner" jobs
>>>>>>> extend the GitLab CI jobs already in place.  One of their primary
>>>>>>> goals of catching and preventing regressions on a wider number of host
>>>>>>> systems than the ones provided by GitLab's shared runners.
>>>>>>>
>>>>>>> This sets the stage in which other community members can add their own
>>>>>>> machine configuration documentation/scripts, and accompanying job
>>>>>>> definitions.  As a general rule, those newly added contributed jobs
>>>>>>> should run as "non-gating", until their reliability is verified (AKA
>>>>>>> "allow_failure: true").
>>>>>>>
>>>>>>> Signed-off-by: Cleber Rosa <crosa@redhat.com>
>>>>>>> ---
>>>>>>>   .gitlab-ci.d/custom-runners.yml | 14 ++++++++++++++
>>>>>>>   .gitlab-ci.yml                  |  1 +
>>>>>>>   docs/devel/ci.rst               | 28 ++++++++++++++++++++++++++++
>>>>>>>   docs/devel/index.rst            |  1 +
>>>>>>>   4 files changed, 44 insertions(+)
>>>>>>>   create mode 100644 .gitlab-ci.d/custom-runners.yml
>>>>>>>   create mode 100644 docs/devel/ci.rst
>>>>>>>
>>>>>>> diff --git a/.gitlab-ci.d/custom-runners.yml
>>>>>>> b/.gitlab-ci.d/custom-runners.yml
>>>>>>> new file mode 100644
>>>>>>> index 0000000000..3004da2bda
>>>>>>> --- /dev/null
>>>>>>> +++ b/.gitlab-ci.d/custom-runners.yml
>>>>>>> @@ -0,0 +1,14 @@
>>>>>>> +# The CI jobs defined here require GitLab runners installed and
>>>>>>> +# registered on machines that match their operating system names,
>>>>>>> +# versions and architectures.  This is in contrast to the other CI
>>>>>>> +# jobs that are intended to run on GitLab's "shared" runners.
>>>>>>> +
>>>>>>> +# Different than the default approach on "shared" runners, based on
>>>>>>> +# containers, the custom runners have no such *requirement*, as those
>>>>>>> +# jobs should be capable of running on operating systems with no
>>>>>>> +# compatible container implementation, or no support from
>>>>>>> +# gitlab-runner.  To avoid problems that gitlab-runner can cause while
>>>>>>> +# reusing the GIT repository, let's enable the recursive submodule
>>>>>>> +# strategy.
>>>>>>> +variables:
>>>>>>> +  GIT_SUBMODULE_STRATEGY: recursive
>>>>>>
>>>>>> Is it really necessary? I thought our configure script would take care
>>>>>> of the submodules?
>>>>>
>>>>
>>>> I've done a lot of testing on bare metal systems, and the problems
>>>> that come from reusing the same system and failed cleanups can be very
>>>> frustrating.  It's unfortunate that we need this, but it was the
>>>> simplest and most reliable solution I found.  :/
>>>>
>>>> Having said that, I noticed after I posted this series that this is
>>>> affecting all other jobs.  We don't need it that in the jobs based
>>>> on containers (for obvious reasons), so I see two options:
>>>>
>>>> 1) have it enabled on all jobs for consistency
>>>>
>>>> 2) have it enabled only on jobs that will reuse the repo
>>>>
>>>>> Well, if there is a failure during the first clone (I got one network
>>>>> timeout in the middle) 
>>>
>>> [This network failure is pasted at the end]
>>>
>>>>> then next time it doesn't work:
>>>>>
>>>>> Updating/initializing submodules recursively...
>>>>> Synchronizing submodule url for 'capstone'
>>>>> Synchronizing submodule url for 'dtc'
>>>>> Synchronizing submodule url for 'meson'
>>>>> Synchronizing submodule url for 'roms/QemuMacDrivers'
>>>>> Synchronizing submodule url for 'roms/SLOF'
>>>>> Synchronizing submodule url for 'roms/edk2'
>>>>> Synchronizing submodule url for
>>>>> 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
>>>>> Synchronizing submodule url for
>>>>> 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli'
>>>>> Synchronizing submodule url for
>>>>> 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/esaxx'
>>>>> Synchronizing submodule url for
>>>>> 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/libdivsufsort'
>>>>> Synchronizing submodule url for
>>>>> 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl'
>>>>> Synchronizing submodule url for
>>>>> 'roms/edk2/MdeModulePkg/Library/BrotliCustomDecompressLib/brotli'
>>>>> Synchronizing submodule url for
>>>>> 'roms/edk2/MdeModulePkg/Universal/RegularExpressionDxe/oniguruma'
>>>>> Synchronizing submodule url for
>>>>> 'roms/edk2/UnitTestFrameworkPkg/Library/CmockaLib/cmocka'
>>
>> So far, beside the repository useful for QEMU, I cloned:
>>
>> - boringssl
>> - krb5
>> - pyca-cryptography
>> - esaxx
>> - libdivsufsort
>> - oniguruma
>> - openssl
>> - brotli
>> - cmocka
>>
> 
> Hi Phil,
> 
> I'm not following what you meant by "I cloned"... Are you experimenting
> with this on a machine of your own and manually cloning the submodules?

I meant "my test runner has been cloning ..."

>> But reach the runner time limit of 2h.

The first failure was 1h, I raised the job limit to the maximum
I could use for this runner, 2h.

>> The directory reports 3GB of source code.
>>
>> I don't think the series has been tested enough before posting,
> 
> Please take into consideration that this series, although simple in
> content, touches and interacts with a lot of moving pieces, and
> possibly with personal systems that I did not have, or will have,
> access to.  As far as public testing proof goes, you can see a
> pipeline here with this version of this series here:
> 
>    https://gitlab.com/cleber.gnu/qemu/-/pipelines/258982039/builds

Expand the timeout and retry the same job on the same runner
several times:

diff --git a/.gitlab-ci.d/custom-runners.yml
b/.gitlab-ci.d/custom-runners.yml
@@ -17,6 +17,7 @@ variables:
 # setup by the scripts/ci/setup/build-environment.yml task
 # "Install basic packages to build QEMU on Ubuntu 18.04/20.04"
 ubuntu-18.04-s390x-all-linux-static:
+ timeout: 2h 30m
  allow_failure: true
  needs: []
  stage: build

Each time it will clone more submodules.

I stopped at the 3rd attempt.

> As I said elsewhere, I only noticed the recursive submodule being
> applied to the existing jobs after I submitted the series.  Mea culpa.
> But:
> 
>  * none of the jobs took noticeably longer than the previous baseline
>  * there was one *container build failure* (safe to say it's not
>    related)
>  * all other jobs passed successfully

I had less luck then (see the docker-dind jobs started on the custom
runner commented elsewhere in this thread).

> And, along with the previous versions, this series were tested on all
> the previously included architectures and operating systems.  It's
> unfortunate that because of your experience at this time (my
> apologies), you don't realize the amount of testing done so far.

As I commented to Erik on IRC, the single difference on my side
is that I used the distribution's runner, not the official one:

$ sudo apt-get install gitlab-runner docker.io

Then I registered it, changing the path (/usr/bin/gitlab-runner instead
of /usr/local/bin/gitlab-runner). Everything else was left unchanged.

>> I'm stopping here my experiments.
>>
>> Regards,
>>
>> Phil.
>>
> 
> I honestly appreciate your help here up to this point.
> 
> Regards,
> - Cleber.
>
Cleber Rosa Feb. 24, 2021, 3:47 p.m. UTC | #11
On Wed, Feb 24, 2021 at 12:57:52PM +0100, Philippe Mathieu-Daudé wrote:
> On 2/23/21 7:09 PM, Cleber Rosa wrote:
> > Hi Phil,
> > 
> > I'm not following what you meant by "I cloned"... Are you experimenting
> > with this on a machine of your own and manually cloning the submodules?
> 
> I meant "my test runner has been cloning ..."
> 
> >> But reach the runner time limit of 2h.
> 
> The first failure was 1h, I raised the job limit to the maximum
> I could use for this runner, 2h.
> 
> >> The directory reports 3GB of source code.
> >>
> >> I don't think the series has been tested enough before posting,
> > 
> > Please take into consideration that this series, although simple in
> > content, touches and interacts with a lot of moving pieces, and
> > possibly with personal systems that I did not have, or will have,
> > access to.  As far as public testing proof goes, you can see a
> > pipeline here with this version of this series here:
> > 
> >    https://gitlab.com/cleber.gnu/qemu/-/pipelines/258982039/builds
> 
> Expand the timeout and retry the same job on the same runner
> various times:
> 
> diff --git a/.gitlab-ci.d/custom-runners.yml
> b/.gitlab-ci.d/custom-runners.yml
> @@ -17,6 +17,7 @@ variables:
>  # setup by the scripts/ci/setup/build-environment.yml task
>  # "Install basic packages to build QEMU on Ubuntu 18.04/20.04"
>  ubuntu-18.04-s390x-all-linux-static:
> + timeout: 2h 30m
>   allow_failure: true
>   needs: []
>   stage: build
> 
> Each time it will clone more submodules.
> 
> I stopped at the 3rd intent.
> 
> > As I said elsewhere, I only noticed the recursive submodule being
> > applied to the existing jobs after I submitted the series.  Mea culpa.
> > But:
> > 
> >  * none of the jobs took noticeably longer than the previous baseline
> >  * there was one *container build failure* (safe to say it's not
> >    related)
> >  * all other jobs passed successfully
> 
> I had less luck then (see the docker-dind jobs started on the custom
> runner commented elsewhere in this thread).
>

Hi Phil,

I replied to this issue elsewhere too... I assume you missed the
documentation and did not uncheck the "Run untagged jobs" option as
instructed.
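
For illustration only: a job can be restricted to matching runners with
the "tags" keyword, so that a runner with "Run untagged jobs" unchecked
picks up only jobs carrying its tags.  The job and tag names below are
hypothetical placeholders, not the ones used by this series:

```yaml
# Hypothetical job and tag names, shown only to illustrate tag matching.
example-custom-runner-job:
  # Only runners registered with all of these tags will pick up this job.
  tags:
    - example_os_version
    - example_arch
  allow_failure: true
  script:
    - ./configure
    - make
```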

> > And, along with the previous versions, this series were tested on all
> > the previously included architectures and operating systems.  It's
> > unfortunate that because of your experience at this time (my
> > apologies), you don't realize the amount of testing done so far.
> 
> As I commented to Erik on IRC, the single difference I did
> is use the distribution runner, not the official one:
> 
> $ sudo apt-get install gitlab-runner docker.io
> 
> Then registered changing the path (/usr/bin/gitlab-runner instead
> of /usr/local/bin/gitlab-runner). Everything else left unchanged.
>

Assuming you did your experiments on Ubuntu 20.04:

   # dpkg -l gitlab-runner
  ||/ Name           Version              Architecture Description
  +++-==============-====================-============-=====================================================
  ii  gitlab-runner  11.2.0+dfsg-2ubuntu1 amd64        GitLab Runner - runs continuous integration (CI) jobs

This supposedly "single" difference actually amounts to thousands of
changes (not counting the possible downstream patches, differences with
regard to packaging, etc):

  [gitlab-runner]$ git log --no-merges --oneline v11.2.0..v13.1.1 | wc -l
  1477

Version 13.1.1 referred above is what you'd get *if* using the
playbook.

Like I said before, I very much appreciate your help reviewing this,
but unfortunately what you used was *WAY OFF* what was proposed.

And you're right, this version was not tested enough (on an
environment similar to what you used) before it was posted.

Regards,
- Cleber.

Patch

diff --git a/.gitlab-ci.d/custom-runners.yml b/.gitlab-ci.d/custom-runners.yml
new file mode 100644
index 0000000000..3004da2bda
--- /dev/null
+++ b/.gitlab-ci.d/custom-runners.yml
@@ -0,0 +1,14 @@ 
+# The CI jobs defined here require GitLab runners installed and
+# registered on machines that match their operating system names,
+# versions and architectures.  This is in contrast to the other CI
+# jobs that are intended to run on GitLab's "shared" runners.
+
+# Different than the default approach on "shared" runners, based on
+# containers, the custom runners have no such *requirement*, as those
+# jobs should be capable of running on operating systems with no
+# compatible container implementation, or no support from
+# gitlab-runner.  To avoid problems that gitlab-runner can cause while
+# reusing the GIT repository, let's enable the recursive submodule
+# strategy.
+variables:
+  GIT_SUBMODULE_STRATEGY: recursive
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 8b6d495288..ae19442e93 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -12,6 +12,7 @@  include:
   - local: '/.gitlab-ci.d/opensbi.yml'
   - local: '/.gitlab-ci.d/containers.yml'
   - local: '/.gitlab-ci.d/crossbuilds.yml'
+  - local: '/.gitlab-ci.d/custom-runners.yml'
 
 .native_build_job_template: &native_build_job_definition
   stage: build
diff --git a/docs/devel/ci.rst b/docs/devel/ci.rst
new file mode 100644
index 0000000000..585b7bf4b8
--- /dev/null
+++ b/docs/devel/ci.rst
@@ -0,0 +1,28 @@ 
+==
+CI
+==
+
+QEMU has configurations enabled for a number of different CI services.
+The most up to date information about them and their status can be
+found at::
+
+   https://wiki.qemu.org/Testing/CI
+
+Jobs on Custom Runners
+======================
+
+Besides the jobs run under the various CI systems listed before, there
+are a number of additional jobs that will run before an actual merge.
+These use the same GitLab CI service/framework already used for all
+other GitLab based CI jobs, but rely on additional systems, not the
+ones provided by GitLab as "shared runners".
+
+The architecture of GitLab's CI service allows different machines to
+be set up with GitLab's "agent", called gitlab-runner, which will take
+care of running jobs created by events such as a push to a branch.
+Here, the combination of a machine, properly configured with GitLab's
+gitlab-runner, is called a "custom runner".
+
+The GitLab CI job definitions for the custom runners are located under::
+
+  .gitlab-ci.d/custom-runners.yml
diff --git a/docs/devel/index.rst b/docs/devel/index.rst
index 22854e334d..b178448a91 100644
--- a/docs/devel/index.rst
+++ b/docs/devel/index.rst
@@ -23,6 +23,7 @@  Contents:
    migration
    atomics
    stable-process
+   ci
    qtest
    decodetree
    secure-coding-practices