[v9,0/3] Acceptance test: Add "boot_linux" acceptance test

This acceptance test validates that a full-blown Linux guest can
successfully boot in QEMU.  In this specific case, the guest chosen is
Fedora version 31.  It covers the following architectures and
machine types:

 * x86_64, pc-i440fx and pc-q35 machine types, with TCG and KVM as
   accelerators

 * aarch64 and virt machine type, with TCG and KVM as accelerators

 * ppc64 and pseries machine type with TCG as accelerator

 * s390x and s390-ccw-virtio machine type with TCG as accelerator

This has been tested on x86_64, ppc64le and aarch64 hosts and has
been running reliably (in my experience) on Travis CI.

Git:
  - URI: https://github.com/clebergnu/qemu/tree/test_boot_linux_v9
  - Remote: https://github.com/clebergnu/qemu
  - Branch: test_boot_linux_v9

Travis CI:
  - Build: https://travis-ci.org/clebergnu/qemu/builds/652694503

Previous versions:
  - v8: https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg04095.html
  - v7: https://lists.gnu.org/archive/html/qemu-devel/2019-11/msg00220.html
  - v6: https://lists.gnu.org/archive/html/qemu-devel/2019-06/msg01202.html
  - v5: https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg04652.html
  - v4: https://lists.gnu.org/archive/html/qemu-devel/2019-02/msg02032.html
  - v3: https://lists.gnu.org/archive/html/qemu-devel/2019-02/msg01677.html
  - v2: https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg04318.html
  - v1: http://lists.nongnu.org/archive/html/qemu-devel/2018-09/msg02530.html

Changes from v8:
================

 * Renamed "BLD_DIR" to "BUILD_DIR", "SRC_DIR" to "SOURCE_DIR" and
   dropped the "LNK_DIR" variable in
   tests/acceptance/avocado_qemu/__init__.py

 * Changed memory allocation to 1024 MB, so that it puts less pressure
   on the host memory, and should be compatible with 32-bit hosts
   (I've found no significant effects on the test times)

 * Explicitly enabled TCG and skip tests if it's not available

 * Added tags for when accel is TCG ("accel:tcg")

 * Added additional tags for the "pc" alias, that is, "pc-i440fx"

 * Renamed tests to make the machine
   type and accelerator more explicit:
   - BootLinuxX8664.test_pc => BootLinuxX8664.test_pc_i440fx_tcg
   - BootLinuxX8664.test_pc_kvm => BootLinuxX8664.test_pc_i440fx_kvm
   - BootLinuxX8664.test_q35 => BootLinuxX8664.test_pc_q35_tcg
   - BootLinuxX8664.test_kvm_q35 => BootLinuxX8664.test_pc_q35_kvm
   - BootLinuxAarch64.test_virt => BootLinuxAarch64.test_virt_tcg
   - BootLinuxAarch64.test_kvm_virt => BootLinuxAarch64.test_virt_kvm
   - BootLinuxPPC64.test_pseries => BootLinuxPPC64.test_pseries_tcg
   - BootLinuxS390X.test_s390_ccw_virtio =>
     BootLinuxS390X.test_s390_ccw_virtio_tcg

 * Renamed target "get-vmimage" to "get-vm-images", and added a help
   entry under "check-help".

 * Bumped pycdlib version to 1.9.0, which contains a fix for an
   endianness bug that was seen on s390x hosts.

Changes from v7:
================

This version drops a number of commits that had already been reviewed
and have been merged:

 * Dropped commit "Acceptance tests: use relative location for tests",
   already present in the latest master.

 * Dropped commit "Acceptance tests: use avocado tags for machine type",
   already present in the latest master.

 * Dropped commit "Acceptance tests: introduce utility method for tags
   unique vals", already present in the latest master.

With regard to the handling of the build directory, and the usage of
a qemu-img binary from the build tree, the following changed:

 * Dropped commit "Acceptance tests: add the build directory to the
   system PATH", because the qemu-img binary to be used is now
   explicitly defined, instead of relying on the modification of the
   PATH environment variable.

 * Dropped commit "Acceptance tests: depend on qemu-img", replaced by
   explicitly setting the qemu-img binary to be used for snapshot
   generation.  Also, the newly added "--enable-tools" configure line
   on Travis CI makes sure that a matching qemu-img binary is
   available on CI.

 * Dropped commit "Acceptance tests: keep a stable reference to the
   QEMU build dir", replaced by a different approach that introduces
   variables tracking the build dir, source dir and link (from build
   to source) dir.

 * New commit "Acceptance tests: introduce BLD_DIR, SRC_DIR and
   LNK_DIR".

 * New commit "Acceptance tests: add make targets to download images",
   which downloads the cloud images, aka vmimages, before the test
   execution itself.

 * New commit "[TO BE REMOVED] Use Avocado master branch + vmimage fix"
   to facilitate the review/test of this version.

Additionally:

 * The check for the availability of KVM now makes use of the
   strengthened qemu.accel.kvm_available() and passes the QEMU binary
   as an argument to make sure KVM support is compiled into that
   binary.

 * The timeout was increased to 900 seconds.  This is just one extra
   step to avoid false negatives on very slow systems.  As a
   comparison, on Travis CI, on an x86_64 host, the slowest test takes
   around 250 seconds (boot_linux.py:BootLinuxAarch64.test_virt).  On
   x86_64 systems with KVM enabled, my experience is that a test will
   take around 15 seconds.

Changes from v6:
================

 * Bumped Fedora to the most recently released version (31).

 * Included new architectures (ppc64 and s390x), consolidating all
   tests into the same commit.

 * New commit: "Acceptance tests: use avocado tags for machine type"

 * New commit: "Acceptance tests: introduce utility method for tags
   unique vals"

 * New commit: "Acceptance test x86_cpu_model_versions: use default
   vm", needed to normalize the use of the machine type tags

 * Added a lot of leniency to the test setup (and reliability to the
   test/job), canceling the test if there are any failures while
   downloading/preparing the boot images.

 * Made use of Avocado's data drainer a regular feature (dropped the
   commit with RFC) and squashed it.

 * Bumped pycdlib version to 1.8.0

 * Dropped explicit "--enable-slirp=git" (added on v5) from the Travis
   CI configure line, as the default configuration on Travis CI now
   results in user networking capabilities.

Changes from v5:
================

 * Added explicit "--enable-slirp=git" to the Travis CI configure line,
   as these tests depend on "-netdev user" like networking.

 * Bumped Fedora to the most recently released version (30).

 * Changed the "checksum" parameter to 'sha256' and used the same
   hashes as provided by the Fedora project (instead of using
   Avocado's default sha1 and computing and using a different hash
   value).

 * New commit: Add "boot_linux" test for aarch64 and virt machine type

 * New commit: [RFC]: use Avocado data drainer for console logging

Changes from v4:
================

 * New commit "Acceptance tests: use relative location for tests"

 * New commit "Acceptance tests: keep a stable reference to the QEMU
   build dir"

 * Pinned the Fedora 29 image by adding a checksum.  The goal is to
   never allow more than one component to change at a time (the one
   allowed to change is QEMU itself).  Updates to the image should be
   manual.  (Based on comments from Cornelia)

 * Moved the downloading of the Fedora 29 cloud image to the test
   setUp() method, canceling the test if the image cannot be
   downloaded.

 * Removed the ":avocado: enable" tag, given that Avocado versions
   68.0 and later operate in a "recursive by default" manner, which is
   able to correctly identify this as an Avocado test.
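
The image pinning described above (a fixed sha256 checked before the
image is used) can be sketched roughly as follows.  This is only an
illustration using the Python standard library, not the code path used
by avocado.utils.vmimage; the helper names sha256_of() and
image_matches() are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so that a large cloud image does not
    have to fit in memory.  (Hypothetical helper, for illustration.)
    """
    digest = hashlib.sha256()
    with open(path, "rb") as image:
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_matches(path, expected_sha256):
    """Return True only if the file's digest equals the pinned value."""
    return sha256_of(path) == expected_sha256.lower()
```

A test setUp() could cancel the test when image_matches() returns
False, so that a changed or corrupted image is never mistaken for a
QEMU regression.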

Changes from v3:
================

 * New patch "Acceptance tests: depend on qemu-img"

Known Issues on v3 (no longer applicable):
==========================================

 * A recent TCG performance regression[1] affects this test in a
   number of ways:

   - The test execution may time out by itself

   - The generation of SSH host keys in the guest's first boot is also
     affected (possibly also a timeout)

   - The cloud-init "phone home" feature attempts to read the host
     keys and fails, causing the test to time out and fail

   These are not observed anymore once the fix[2] is applied.

[1] - https://lists.gnu.org/archive/html/qemu-devel/2019-02/msg00338.html
[2] - https://lists.gnu.org/archive/html/qemu-devel/2019-02/msg01129.html

Changes from v2:
================

 * Updated the tag to include the "arch:" key, in a similar fashion to
   the tests in the "Acceptance Tests: target architecture support"
   series:

   - https://lists.gnu.org/archive/html/qemu-devel/2019-02/msg00369.html

 * Renamed the test method to test_x86_64_pc, again similarly to the
   boot_linux_console.py tests in the series mentioned before.

 * Set the machine type explicitly, again similarly to the
   boot_linux_console.py tests in the series mentioned before.

 * Added messages after the launch of the VM, to let test runners know
   that the test now waits for a boot confirmation from the guest
   (Eduardo).

 * Updated the commit message to reflect the fact that this version
   does not allow for parameterization of the guest OS, version, etc.

 * Dropped the RFC prefix on patch "RFC: Acceptance tests: add the
   build directory to the system PATH"

 * Changed the comments on "RFC: Acceptance tests: add the build
   directory to the system PATH" to make it clear that the addition of
   the build directory to the PATH may influence other utility code.
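
As an illustration of the "key:value" tags mentioned above, the sketch
below shows a docstring carrying ":avocado: tags=" lines and a small
parser that groups the values by key.  The parser (parse_tags) is a
hypothetical stand-in written for this example, not Avocado's own
implementation, and the class is a plain Python class rather than a
real avocado_qemu test.  The "machine:pc" tag value is an assumption
about the tag spelling.

```python
import re

# Matches one ':avocado: tags=...' line and captures the tag list.
TAGS_RE = re.compile(r":avocado:\s+tags=(\S+)")


def parse_tags(docstring):
    """Collect tags from ':avocado: tags=' lines, grouped by key.

    Hypothetical helper: 'arch:x86_64' ends up as {'arch': {'x86_64'}}.
    """
    tags = {}
    for match in TAGS_RE.finditer(docstring or ""):
        for item in match.group(1).split(","):
            key, _, value = item.partition(":")
            tags.setdefault(key, set()).add(value)
    return tags


class BootLinuxX8664:
    # Stand-in for the real test class; no Avocado import is needed.
    def test_pc_i440fx_tcg(self):
        """
        :avocado: tags=arch:x86_64
        :avocado: tags=machine:pc
        :avocado: tags=accel:tcg
        """
```

Calling parse_tags(BootLinuxX8664.test_pc_i440fx_tcg.__doc__) groups
the three tags under the "arch", "machine" and "accel" keys, which is
the kind of information a runner needs to select or skip a test for a
given target binary.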

Changes from v1:
================

 * The commit message was adjusted, removing the reference to the
   avocado.utils.vmimage encoding issue on previous Avocado versions
   (<= 64.0) and the fix that would be (and was) included in Avocado
   version 65.0.

 * Effectively added pycdlib==1.6.0 to the requirements.txt file,
   added on a56931eef3, and also adjusted the commit message to
   reflect that.

 * Updated the default version of the guest OS, from Fedora 28 to 29.
   Besides possible improvements in the (virtual) hardware coverage,
   it brings a performance improvement on the order of 20% to the
   test.

 * Removed all direct parameter usage, because some parameters and
   their default values implemented in the test would prevent it from
   running on some environments.  Example: the "accel" parameter had a
   default value of "kvm", which would prevent this test, which boots
   an x86_64 OS, from running on a host arch different from x86_64.

   I recognize that it's desirable to make tests reusable and
   parameterized (that was the reason for the first version doing so),
   but the mechanism to be used to define the architectures that a
   given test should support is still an open issue, and has been
   discussed in other threads.  I'll follow up on those discussions
   with a proposal, and until then, removing those aspects from this
   test implementation seemed to be the best option.

   A caveat: this test currently adds the same tag (x86_64) and
   follows other assumptions made on "boot_linux_console.py", that is,
   that an x86_64 target binary will be used to run it.  If a user is
   in an environment that does not have an x86_64 target binary, they
   could filter those tests out with: "avocado run
   --filter-by-tags='-x86_64' tests/acceptance".

 * Removed most arguments to the QEMU command line for pretty much the
   same reasons described above, and by following the general
   perception that I could grasp from other discussions that QEMU
   defaults should preferably be used.
   This test, as well as others, can and should be extended later to
   allow for different test scenarios by passing well documented
   parameter values.  That is, they should respect well-known
   parameters such as "accel" mentioned above, so that the same test
   can run with KVM or TCG.

 * Changed the value of the memory argument to 1024, which based on my
   experiments and observations is the minimum amount of RAM for the
   Fedora 29 cloud image to successfully boot on QEMU.  I know there's
   no such thing as a "one size fits all", especially for QEMU, but
   this makes me wonder whether an x86_64 machine type shouldn't have
   its default_ram_size bumped to a number practical enough to run
   modern operating systems.

 * Added a new patch "RFC: Acceptance tests: add the build directory
   to the system PATH", which is supposed to gather feedback on how to
   make built binaries, such as qemu-img, available to code used by
   the test code.  The specific situation here is that vmimage, part
   of the avocado.utils libraries, makes use of qemu-img to create
   snapshot files.  Even though we could require qemu-img to be
   installed system wide as a dependency of tests, it actually goes
   against the goal of testing all QEMU things from the source/build
   tree.  This became apparent with tests running on environments
   such as Travis CI, which don't necessarily have qemu-img available
   elsewhere.

Cleber Rosa (3):
  Acceptance tests: introduce BUILD_DIR and SOURCE_DIR
  Acceptance test: add "boot_linux" tests
  Acceptance tests: add make targets to download images

 .travis.yml                               |   2 +-
 tests/Makefile.include                    |  19 +-
 tests/acceptance/avocado_qemu/__init__.py |  25 ++-
 tests/acceptance/boot_linux.py            | 215 ++++++++++++++++++++++
 tests/requirements.txt                    |   3 +-
 5 files changed, 254 insertions(+), 10 deletions(-)
 create mode 100644 tests/acceptance/boot_linux.py

Message ID	20200220020652.16276-1-crosa@redhat.com (mailing list archive)
Headers
Return-Path: <SRS0=7FAs=4I=nongnu.org=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@kernel.org>
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EFA7C2465D
From: Cleber Rosa <crosa@redhat.com>
To: qemu-devel@nongnu.org
Subject: [PATCH v9 0/3] Acceptance test: Add "boot_linux" acceptance test
Date: Wed, 19 Feb 2020 21:06:49 -0500
Message-Id: <20200220020652.16276-1-crosa@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable
Precedence: list
Cc: Fam Zheng <fam@euphon.net>, Eduardo Habkost <ehabkost@redhat.com>,
 =?utf-8?q?Alex_Benn=C3=A9e?= <alex.bennee@linaro.org>,
 Wainer dos Santos Moschetta <wainersm@redhat.com>,
 Willian Rampazzo <wrampazz@redhat.com>, Cleber Rosa <crosa@redhat.com>,
 =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= <philmd@redhat.com>,
 Beraldo Leal <bleal@redhat.com>
Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org
Sender: "Qemu-devel" <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org>
Series	Acceptance test: Add "boot_linux" acceptance test
  [v9,0/3] Acceptance test: Add "boot_linux" acceptance test
  [v9,1/3] Acceptance tests: introduce BUILD_DIR and SOURCE_DIR
  [v9,2/3] Acceptance test: add "boot_linux" tests
  [v9,3/3] Acceptance tests: add make targets to download images
