Message ID | 20200619020602.118306-1-david@gibson.dropbear.id.au (mailing list archive) |
---|---|
Headers | show |
Series | Generalize memory encryption models | expand |
Patchew URL: https://patchew.org/QEMU/20200619020602.118306-1-david@gibson.dropbear.id.au/ Hi, This series failed the asan build test. Please find the testing commands and their output below. If you have Docker installed, you can probably reproduce it locally. === TEST SCRIPT BEGIN === #!/bin/bash export ARCH=x86_64 make docker-image-fedora V=1 NETWORK=1 time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1 === TEST SCRIPT END === CC contrib/vhost-user-input/main.o LINK tests/qemu-iotests/socket_scm_helper GEN docs/interop/qemu-qmp-ref.html /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) GEN docs/interop/qemu-qmp-ref.txt GEN docs/interop/qemu-qmp-ref.7 CC qga/commands.o --- SIGN pc-bios/optionrom/pvh.bin LINK elf2dmp CC qemu-img.o /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) LINK qemu-io LINK qemu-edid /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) LINK fsdev/virtfs-proxy-helper /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) LINK scsi/qemu-pr-helper LINK qemu-bridge-helper AR libvhost-user.a --- GEN docs/interop/qemu-ga-ref.txt GEN docs/interop/qemu-ga-ref.7 LINK qemu-keymap /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) LINK ivshmem-client /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) LINK ivshmem-server /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) LINK qemu-nbd LINK qemu-storage-daemon /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) LINK virtiofsd /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) LINK vhost-user-input LINK qemu-ga /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) LINK qemu-img /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) GEN x86_64-softmmu/hmp-commands.h GEN x86_64-softmmu/config-devices.h GEN x86_64-softmmu/hmp-commands-info.h --- CC x86_64-softmmu/gdbstub-xml.o CC x86_64-softmmu/trace/generated-helpers.o LINK x86_64-softmmu/qemu-system-x86_64 /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) common.rc: line 50: test: check: binary operator expected (printf '#define QEMU_PKGVERSION ""\n'; printf '#define QEMU_FULL_VERSION "5.0.50"\n'; ) > qemu-version.h.tmp make -C /tmp/qemu-test/src/slirp BUILD_DIR="/tmp/qemu-test/build/slirp" PKG_CONFIG="pkg-config" CC="clang" AR="ar" LD="ld" RANLIB="ranlib" CFLAGS="-I/usr/include/pixman-1 -Werror -fsanitize=undefined -fsanitize=address -pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -fPIE -DPIE -m64 -mcx16 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -std=gnu99 -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wno-initializer-overrides -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-string-plus-int -Wno-typedef-redefinition -Wno-tautological-type-limit-compare -fstack-protector-strong -I/usr/include/p11-kit-1 -DSTRUCT_IOVEC_DEFINED -I/usr/include/libpng16 -I/usr/include/spice-1 -I/usr/include/spice-server -I/usr/include/cacard -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -I/usr/include/nss3 -I/usr/include/nspr4 -pthread -I/usr/include/libmount -I/usr/include/blkid -I/usr/include/pixman-1 -I/tmp/qemu-test/src/tests -I/tmp/qemu-test/src/tests/qtest -g " LDFLAGS="-Wl,--warn-common -fsanitize=undefined -fsanitize=address -Wl,-z,relro -Wl,-z,now -pie -m64 -fstack-protector-strong" --- clang -iquote /tmp/qemu-test/build/tests -iquote tests -iquote /tmp/qemu-test/src/tcg/i386 -isystem /tmp/qemu-test/src/linux-headers -isystem /tmp/qemu-test/build/linux-headers -iquote . -iquote /tmp/qemu-test/src -iquote /tmp/qemu-test/src/accel/tcg -iquote /tmp/qemu-test/src/include -iquote /tmp/qemu-test/src/disas/libvixl -I/usr/include/pixman-1 -Werror -fsanitize=undefined -fsanitize=address -pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -fPIE -DPIE -m64 -mcx16 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -std=gnu99 -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wno-initializer-overrides -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-string-plus-int -Wno-typedef-redefinition -Wno-tautological-type-limit-compare -fstack-protector-strong -I/usr/include/p11-kit-1 -DSTRUCT_IOVEC_DEFINED -I/usr/include/libpng16 -I/usr/include/spice-1 -I/usr/include/spice-server -I/usr/include/cacard -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -I/usr/include/nss3 -I/usr/include/nspr4 -pthread -I/usr/include/libmount -I/usr/include/blkid -I/usr/include/pixman-1 -I/tmp/qemu-test/src/tests -I/tmp/qemu-test/src/tests/qtest -MMD -MP -MT tests/test-crypto-cipher.o -MF tests/test-crypto-cipher.d -g -c -o tests/test-crypto-cipher.o /tmp/qemu-test/src/tests/test-crypto-cipher.c clang -iquote /tmp/qemu-test/build/tests -iquote tests -iquote /tmp/qemu-test/src/tcg/i386 -isystem /tmp/qemu-test/src/linux-headers -isystem /tmp/qemu-test/build/linux-headers -iquote . -iquote /tmp/qemu-test/src -iquote /tmp/qemu-test/src/accel/tcg -iquote /tmp/qemu-test/src/include -iquote /tmp/qemu-test/src/disas/libvixl -I/usr/include/pixman-1 -Werror -fsanitize=undefined -fsanitize=address -pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -fPIE -DPIE -m64 -mcx16 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -std=gnu99 -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wno-initializer-overrides -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-string-plus-int -Wno-typedef-redefinition -Wno-tautological-type-limit-compare -fstack-protector-strong -I/usr/include/p11-kit-1 -DSTRUCT_IOVEC_DEFINED -I/usr/include/libpng16 -I/usr/include/spice-1 -I/usr/include/spice-server -I/usr/include/cacard -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -I/usr/include/nss3 -I/usr/include/nspr4 -pthread -I/usr/include/libmount -I/usr/include/blkid -I/usr/include/pixman-1 -I/tmp/qemu-test/src/tests -I/tmp/qemu-test/src/tests/qtest -MMD -MP -MT tests/test-crypto-secret.o -MF tests/test-crypto-secret.d -g -c -o tests/test-crypto-secret.o /tmp/qemu-test/src/tests/test-crypto-secret.c clang -iquote /tmp/qemu-test/build/tests -iquote tests -iquote /tmp/qemu-test/src/tcg/i386 -isystem /tmp/qemu-test/src/linux-headers -isystem /tmp/qemu-test/build/linux-headers -iquote . -iquote /tmp/qemu-test/src -iquote /tmp/qemu-test/src/accel/tcg -iquote /tmp/qemu-test/src/include -iquote /tmp/qemu-test/src/disas/libvixl -I/usr/include/pixman-1 -Werror -fsanitize=undefined -fsanitize=address -pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -fPIE -DPIE -m64 -mcx16 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -std=gnu99 -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wno-initializer-overrides -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-string-plus-int -Wno-typedef-redefinition -Wno-tautological-type-limit-compare -fstack-protector-strong -I/usr/include/p11-kit-1 -DSTRUCT_IOVEC_DEFINED -I/usr/include/libpng16 -I/usr/include/spice-1 -I/usr/include/spice-server -I/usr/include/cacard -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -I/usr/include/nss3 -I/usr/include/nspr4 -pthread -I/usr/include/libmount -I/usr/include/blkid -I/usr/include/pixman-1 -I/tmp/qemu-test/src/tests -I/tmp/qemu-test/src/tests/qtest -MMD -MP -MT tests/test-crypto-tlscredsx509.o -MF tests/test-crypto-tlscredsx509.d -g -c -o tests/test-crypto-tlscredsx509.o /tmp/qemu-test/src/tests/test-crypto-tlscredsx509.c /tmp/qemu-test/src/tests/qht-bench.c:287:29: error: implicit conversion from 'unsigned long' to 'double' changes value from 18446744073709551615 to 18446744073709551616 [-Werror,-Wimplicit-int-float-conversion] *threshold = rate * UINT64_MAX; ~ ^~~~~~~~~~ /usr/include/stdint.h:130:23: note: expanded from macro 'UINT64_MAX' --- 18446744073709551615UL ^~~~~~~~~~~~~~~~~~~~~~ 1 error generated. make: *** [/tmp/qemu-test/src/rules.mak:69: tests/qht-bench.o] Error 1 make: *** Waiting for unfinished jobs.... Traceback (most recent call last): File "./tests/docker/docker.py", line 669, in <module> --- raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=0f30ada1843f4b7a8b54df19919b5462', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=x86_64-softmmu', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-wdzkmbob/src/docker-src.2020-06-18-22.37.15.1263:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-debug']' returned non-zero exit status 2. filter=--filter=label=com.qemu.instance.uuid=0f30ada1843f4b7a8b54df19919b5462 make[1]: *** [docker-run] Error 1 make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-wdzkmbob/src' make: *** [docker-run-test-debug@fedora] Error 2 real 5m41.189s user 0m7.985s The full log is available at http://patchew.org/logs/20200619020602.118306-1-david@gibson.dropbear.id.au/testing.asan/?type=message. --- Email generated automatically by Patchew [https://patchew.org/]. Please send your feedback to patchew-devel@redhat.com
On 19.06.20 04:05, David Gibson wrote: > A number of hardware platforms are implementing mechanisms whereby the > hypervisor does not have unfettered access to guest memory, in order > to mitigate the security impact of a compromised hypervisor. > > AMD's SEV implements this with in-cpu memory encryption, and Intel has > its own memory encryption mechanism. POWER has an upcoming mechanism > to accomplish this in a different way, using a new memory protection > level plus a small trusted ultravisor. s390 also has a protected > execution environment. Each architecture finds its own way to vandalize the original architecture, some in more extreme/obscure ways than others. I guess in the long term we'll regret most of that, but what do I know :) > > The current code (committed or draft) for these features has each > platform's version configured entirely differently. That doesn't seem > ideal for users, or particularly for management layers. > > AMD SEV introduces a notionally generic machine option > "machine-encryption", but it doesn't actually cover any cases other > than SEV. > > This series is a proposal to at least partially unify configuration > for these mechanisms, by renaming and generalizing AMD's > "memory-encryption" property. It is replaced by a > "host-trust-limitation" property pointing to a platform specific > object which configures and manages the specific details. > I consider the property name sub-optimal. Yes, I am aware that there are other approaches being discussed on the KVM list to disallow access to guest memory without memory encryption. (most of them sound like people are trying to convert KVM into XEN, but again, what do I know ... :) ) "host-trust-limitation" sounds like "I am the hypervisor, I configure limited trust into myself". Also, "untrusted-host" would be a little bit nicer (I think trust is a black/white thing). However, once we have multiple options to protect a guest (memory encryption, unmapping guest pages ,...) the name will no longer really suffice to configure QEMU, no? > For now this series covers just AMD SEV and POWER PEF. I'm hoping it > can be extended to cover the Intel and s390 mechanisms as well, > though. The only approach on s390x to not glue command line properties to the cpu model would be to remove the CPU model feature and replace it by the command line parameter. But that would, of course, be an incompatible break. How do upper layers actually figure out if memory encryption etc is available? on s390x, it's simply via the expanded host CPU model.
On Fri, 19 Jun 2020 10:28:22 +0200 David Hildenbrand <david@redhat.com> wrote: > On 19.06.20 04:05, David Gibson wrote: > > A number of hardware platforms are implementing mechanisms whereby the > > hypervisor does not have unfettered access to guest memory, in order > > to mitigate the security impact of a compromised hypervisor. > > > > AMD's SEV implements this with in-cpu memory encryption, and Intel has > > its own memory encryption mechanism. POWER has an upcoming mechanism > > to accomplish this in a different way, using a new memory protection > > level plus a small trusted ultravisor. s390 also has a protected > > execution environment. > > Each architecture finds its own way to vandalize the original > architecture, some in more extreme/obscure ways than others. I guess in > the long term we'll regret most of that, but what do I know :) > > > > > The current code (committed or draft) for these features has each > > platform's version configured entirely differently. That doesn't seem > > ideal for users, or particularly for management layers. > > > > AMD SEV introduces a notionally generic machine option > > "machine-encryption", but it doesn't actually cover any cases other > > than SEV. > > > > This series is a proposal to at least partially unify configuration > > for these mechanisms, by renaming and generalizing AMD's > > "memory-encryption" property. It is replaced by a > > "host-trust-limitation" property pointing to a platform specific > > object which configures and manages the specific details. > > > > I consider the property name sub-optimal. Yes, I am aware that there are > other approaches being discussed on the KVM list to disallow access to > guest memory without memory encryption. (most of them sound like people > are trying to convert KVM into XEN, but again, what do I know ... :) ) > > "host-trust-limitation" sounds like "I am the hypervisor, I configure > limited trust into myself". Also, "untrusted-host" would be a little bit > nicer (I think trust is a black/white thing). > > However, once we have multiple options to protect a guest (memory > encryption, unmapping guest pages ,...) the name will no longer really > suffice to configure QEMU, no? Hm... we could have a property that accepts bits indicating where the actual limitation lies. Different parts of the code could then make more fine-grained decisions of what needs to be done. Feels a bit overengineered today; but maybe there's already stuff with different semantics in the pipeline somewhere? > > > For now this series covers just AMD SEV and POWER PEF. I'm hoping it > > can be extended to cover the Intel and s390 mechanisms as well, > > though. > > The only approach on s390x to not glue command line properties to the > cpu model would be to remove the CPU model feature and replace it by the > command line parameter. But that would, of course, be an incompatible break. Yuck. We still need to provide the cpu feature to the *guest* in any case, no? > > How do upper layers actually figure out if memory encryption etc is > available? on s390x, it's simply via the expanded host CPU model. >
On Fri, Jun 19, 2020 at 10:28:22AM +0200, David Hildenbrand wrote: > On 19.06.20 04:05, David Gibson wrote: > > A number of hardware platforms are implementing mechanisms whereby the > > hypervisor does not have unfettered access to guest memory, in order > > to mitigate the security impact of a compromised hypervisor. > > > > AMD's SEV implements this with in-cpu memory encryption, and Intel has > > its own memory encryption mechanism. POWER has an upcoming mechanism > > to accomplish this in a different way, using a new memory protection > > level plus a small trusted ultravisor. s390 also has a protected > > execution environment. > > Each architecture finds its own way to vandalize the original > architecture, some in more extreme/obscure ways than others. I guess in > the long term we'll regret most of that, but what do I know :) Well, sure, but that's no *more* true if we start from a common point. > > The current code (committed or draft) for these features has each > > platform's version configured entirely differently. That doesn't seem > > ideal for users, or particularly for management layers. > > > > AMD SEV introduces a notionally generic machine option > > "machine-encryption", but it doesn't actually cover any cases other > > than SEV. > > > > This series is a proposal to at least partially unify configuration > > for these mechanisms, by renaming and generalizing AMD's > > "memory-encryption" property. It is replaced by a > > "host-trust-limitation" property pointing to a platform specific > > object which configures and manages the specific details. > > I consider the property name sub-optimal. Yes, I am aware that there are > other approaches being discussed on the KVM list to disallow access to > guest memory without memory encryption. (most of them sound like people > are trying to convert KVM into XEN, but again, what do I know ... :) ) I don't love the name, it's just the best I've thought of so far. > "host-trust-limitation" sounds like "I am the hypervisor, I configure > limited trust into myself". Which is kind of... accurate... > Also, "untrusted-host" would be a little bit > nicer (I think trust is a black/white thing). > However, once we have multiple options to protect a guest (memory > encryption, unmapping guest pages ,...) the name will no longer really > suffice to configure QEMU, no? That's why it takes a parameter. It points to an object which can itself have more properties to configure the details. SEV already needs that set up, though for both PEF and s390 PV we could pre-create a standard htl object. > > For now this series covers just AMD SEV and POWER PEF. I'm hoping it > > can be extended to cover the Intel and s390 mechanisms as well, > > though. > > The only approach on s390x to not glue command line properties to the > cpu model would be to remove the CPU model feature and replace it by the > command line parameter. But that would, of course, be an incompatible break. I don't really understand why you're so against setting the cpu default parameters from the machine. The machine already sets basic configuration for all sorts of devices in the VM, that's kind of what it's for. > How do upper layers actually figure out if memory encryption etc is > available? on s390x, it's simply via the expanded host CPU model. Haven't really tackled that yet. But one way that works for multiple systems has got to be better than a separate one for each, right?
>> "host-trust-limitation" sounds like "I am the hypervisor, I configure >> limited trust into myself". Also, "untrusted-host" would be a little bit >> nicer (I think trust is a black/white thing). >> >> However, once we have multiple options to protect a guest (memory >> encryption, unmapping guest pages ,...) the name will no longer really >> suffice to configure QEMU, no? > > Hm... we could have a property that accepts bits indicating where the > actual limitation lies. Different parts of the code could then make > more fine-grained decisions of what needs to be done. Feels a bit > overengineered today; but maybe there's already stuff with different > semantics in the pipeline somewhere? > >> >>> For now this series covers just AMD SEV and POWER PEF. I'm hoping it >>> can be extended to cover the Intel and s390 mechanisms as well, >>> though. >> >> The only approach on s390x to not glue command line properties to the >> cpu model would be to remove the CPU model feature and replace it by the >> command line parameter. But that would, of course, be an incompatible break. > > Yuck. > > We still need to provide the cpu feature to the *guest* in any case, no? Yeah, but that could be wired up internally. Wouldn't consider it clean, though (I second the "overengineered" above).
>> However, once we have multiple options to protect a guest (memory >> encryption, unmapping guest pages ,...) the name will no longer really >> suffice to configure QEMU, no? > > That's why it takes a parameter. It points to an object which can > itself have more properties to configure the details. SEV already > needs that set up, though for both PEF and s390 PV we could pre-create > a standard htl object. Ah, okay, that's the "platform specific object which configures and manages the specific details". It would have been nice in the cover letter to show some examples of how that would look like. So it's wrapping architecture-specific data in a common parameter. Hmm. > >>> For now this series covers just AMD SEV and POWER PEF. I'm hoping it >>> can be extended to cover the Intel and s390 mechanisms as well, >>> though. >> >> The only approach on s390x to not glue command line properties to the >> cpu model would be to remove the CPU model feature and replace it by the >> command line parameter. But that would, of course, be an incompatible break. > > I don't really understand why you're so against setting the cpu > default parameters from the machine. The machine already sets basic > configuration for all sorts of devices in the VM, that's kind of what > it's for. It's a general design philosophy that the CPU model (especially the host CPU model) does not depend on other command line parameters (except the accelerator, and I think in corner cases on the machine). Necessary for reliable host model probing by libvirt, for example. We also don't have similar things for nested virt. > >> How do upper layers actually figure out if memory encryption etc is >> available? on s390x, it's simply via the expanded host CPU model. > > Haven't really tackled that yet. But one way that works for multiple > systems has got to be better than a separate one for each, right? I think that's an important piece. Especially once multiple different approaches are theoretically available one wants to sense from upper layers. At least on s390x, it really is like just another CPU-visible feature that tells the guest that it can switch to protected mode.
On Fri, 19 Jun 2020 11:56:49 +0200 David Hildenbrand <david@redhat.com> wrote: > >>> For now this series covers just AMD SEV and POWER PEF. I'm hoping it > >>> can be extended to cover the Intel and s390 mechanisms as well, > >>> though. > >> > >> The only approach on s390x to not glue command line properties to the > >> cpu model would be to remove the CPU model feature and replace it by the > >> command line parameter. But that would, of course, be an incompatible break. > > > > Yuck. > > > > We still need to provide the cpu feature to the *guest* in any case, no? > > Yeah, but that could be wired up internally. Wouldn't consider it clean, > though (I second the "overengineered" above). Could an internally wired-up cpu feature be introspected? Also, what happens if new cpu features are introduced that have a dependency on or a conflict with this one?
On 19.06.20 12:05, Cornelia Huck wrote: > On Fri, 19 Jun 2020 11:56:49 +0200 > David Hildenbrand <david@redhat.com> wrote: > >>>>> For now this series covers just AMD SEV and POWER PEF. I'm hoping it >>>>> can be extended to cover the Intel and s390 mechanisms as well, >>>>> though. >>>> >>>> The only approach on s390x to not glue command line properties to the >>>> cpu model would be to remove the CPU model feature and replace it by the >>>> command line parameter. But that would, of course, be an incompatible break. >>> >>> Yuck. >>> >>> We still need to provide the cpu feature to the *guest* in any case, no? >> >> Yeah, but that could be wired up internally. Wouldn't consider it clean, >> though (I second the "overengineered" above). > > Could an internally wired-up cpu feature be introspected? Also, what Nope. It would just be e.g., a "machine feature" indicated to the guest via the STFL interface/instruction. I was tackling the introspect part when asking David how to sense from upper layers. It would have to be sense via a different interface as it would not longer be modeled as part of CPU features in QEMU. > happens if new cpu features are introduced that have a dependency on or > a conflict with this one? Conflict: bail out in QEMU when incompatible options are specified. Dependency: warn and continue/fixup (e.g., mask off?) Not clean I think.
On Fri, 19 Jun 2020 12:10:13 +0200 David Hildenbrand <david@redhat.com> wrote: > On 19.06.20 12:05, Cornelia Huck wrote: > > On Fri, 19 Jun 2020 11:56:49 +0200 > > David Hildenbrand <david@redhat.com> wrote: > > > >>>>> For now this series covers just AMD SEV and POWER PEF. I'm hoping it > >>>>> can be extended to cover the Intel and s390 mechanisms as well, > >>>>> though. > >>>> > >>>> The only approach on s390x to not glue command line properties to the > >>>> cpu model would be to remove the CPU model feature and replace it by the > >>>> command line parameter. But that would, of course, be an incompatible break. > >>> > >>> Yuck. > >>> > >>> We still need to provide the cpu feature to the *guest* in any case, no? > >> > >> Yeah, but that could be wired up internally. Wouldn't consider it clean, > >> though (I second the "overengineered" above). > > > > Could an internally wired-up cpu feature be introspected? Also, what > > Nope. It would just be e.g., a "machine feature" indicated to the guest > via the STFL interface/instruction. I was tackling the introspect part > when asking David how to sense from upper layers. It would have to be > sense via a different interface as it would not longer be modeled as > part of CPU features in QEMU. > > > happens if new cpu features are introduced that have a dependency on or > > a conflict with this one? > > Conflict: bail out in QEMU when incompatible options are specified. > Dependency: warn and continue/fixup (e.g., mask off?) Masking off would likely be surprising to the user. > Not clean I think. I agree. Still unsure how to bring this new machine property and the cpu feature together. Would be great to have the same interface everywhere, but having two distinct command line objects depend on each other sucks. Automatically setting the feature bit if pv is supported complicates things further. (Is there any requirement that the machine object has been already set up before the cpu features are processed? Or the other way around?) Does this have any implications when probing with the 'none' machine?
On 19.06.20 04:05, David Gibson wrote: > A number of hardware platforms are implementing mechanisms whereby the > hypervisor does not have unfettered access to guest memory, in order > to mitigate the security impact of a compromised hypervisor. > > AMD's SEV implements this with in-cpu memory encryption, and Intel has > its own memory encryption mechanism. POWER has an upcoming mechanism > to accomplish this in a different way, using a new memory protection > level plus a small trusted ultravisor. s390 also has a protected > execution environment. > > The current code (committed or draft) for these features has each > platform's version configured entirely differently. That doesn't seem > ideal for users, or particularly for management layers. > > AMD SEV introduces a notionally generic machine option > "machine-encryption", but it doesn't actually cover any cases other > than SEV. > > This series is a proposal to at least partially unify configuration > for these mechanisms, by renaming and generalizing AMD's > "memory-encryption" property. It is replaced by a > "host-trust-limitation" property pointing to a platform specific > object which configures and manages the specific details. > > For now this series covers just AMD SEV and POWER PEF. I'm hoping it > can be extended to cover the Intel and s390 mechanisms as well, > though. Let me try to summarize what I understand what you try to achieve: one command line parameter for all platforms that common across all platforms: - disable KSM - by default enables iommu_platform per platform: - setup the necessary encryption scheme when appropriate - block migration -.... The tricky part is certainly the per platform thing. For example on s390 we just have a cpumodel flag that provides interfaces to the guest to switch into protected mode via the ultravisor. This works perfectly fine with the host model, so no need to configure anything. The platform code then disables KSM _on_switchover_ and not in general. Because the guest CAN switch into protected, but it does not have to. So this feels really hard to do right. Would a virtual BoF on KVM forum be too late? We had a BoF on protected guests last year and that was valuable.
On Mon, 22 Jun 2020 16:27:28 +0200 Christian Borntraeger <borntraeger@de.ibm.com> wrote: > On 19.06.20 04:05, David Gibson wrote: > > A number of hardware platforms are implementing mechanisms whereby the > > hypervisor does not have unfettered access to guest memory, in order > > to mitigate the security impact of a compromised hypervisor. > > > > AMD's SEV implements this with in-cpu memory encryption, and Intel has > > its own memory encryption mechanism. POWER has an upcoming mechanism > > to accomplish this in a different way, using a new memory protection > > level plus a small trusted ultravisor. s390 also has a protected > > execution environment. > > > > The current code (committed or draft) for these features has each > > platform's version configured entirely differently. That doesn't seem > > ideal for users, or particularly for management layers. > > > > AMD SEV introduces a notionally generic machine option > > "machine-encryption", but it doesn't actually cover any cases other > > than SEV. > > > > This series is a proposal to at least partially unify configuration > > for these mechanisms, by renaming and generalizing AMD's > > "memory-encryption" property. It is replaced by a > > "host-trust-limitation" property pointing to a platform specific > > object which configures and manages the specific details. > > > > For now this series covers just AMD SEV and POWER PEF. I'm hoping it > > can be extended to cover the Intel and s390 mechanisms as well, > > though. > > Let me try to summarize what I understand what you try to achieve: > one command line parameter for all platforms that > > common across all platforms: > - disable KSM > - by default enables iommu_platform > > > per platform: > - setup the necessary encryption scheme when appropriate > - block migration > -.... > > > The tricky part is certainly the per platform thing. For example on > s390 we just have a cpumodel flag that provides interfaces to the guest > to switch into protected mode via the ultravisor. This works perfectly > fine with the host model, so no need to configure anything. The platform > code then disables KSM _on_switchover_ and not in general. Because the > guest CAN switch into protected, but it does not have to. > > So this feels really hard to do right. Would a virtual BoF on KVM forum > be too late? We had a BoF on protected guests last year and that was > valuable. Maybe we can do some kind of call to discuss this earlier? (Maybe in the KVM call slot on Tuesdays?) I think it would be really helpful if everybody would have at least a general understanding about how encryption/protection works on the different architectures.
On Mon, Jun 22, 2020 at 02:02:54PM +0200, Cornelia Huck wrote: > On Fri, 19 Jun 2020 12:10:13 +0200 > David Hildenbrand <david@redhat.com> wrote: > > > On 19.06.20 12:05, Cornelia Huck wrote: > > > On Fri, 19 Jun 2020 11:56:49 +0200 > > > David Hildenbrand <david@redhat.com> wrote: > > > > > >>>>> For now this series covers just AMD SEV and POWER PEF. I'm hoping it > > >>>>> can be extended to cover the Intel and s390 mechanisms as well, > > >>>>> though. > > >>>> > > >>>> The only approach on s390x to not glue command line properties to the > > >>>> cpu model would be to remove the CPU model feature and replace it by the > > >>>> command line parameter. But that would, of course, be an incompatible break. > > >>> > > >>> Yuck. > > >>> > > >>> We still need to provide the cpu feature to the *guest* in any case, no? > > >> > > >> Yeah, but that could be wired up internally. Wouldn't consider it clean, > > >> though (I second the "overengineered" above). > > > > > > Could an internally wired-up cpu feature be introspected? Also, what > > > > Nope. It would just be e.g., a "machine feature" indicated to the guest > > via the STFL interface/instruction. I was tackling the introspect part > > when asking David how to sense from upper layers. It would have to be > > sense via a different interface as it would not longer be modeled as > > part of CPU features in QEMU. > > > > > happens if new cpu features are introduced that have a dependency on or > > > a conflict with this one? > > > > Conflict: bail out in QEMU when incompatible options are specified. > > Dependency: warn and continue/fixup (e.g., mask off?) > > Masking off would likely be surprising to the user. > > > Not clean I think. > > I agree. > > Still unsure how to bring this new machine property and the cpu feature > together. Would be great to have the same interface everywhere, but > having two distinct command line objects depend on each other sucks. Kinda, but the reality is that hardware - virtual and otherwise - frequently doesn't have entirely orthogonal configuration for each of its components. This is by no means new in that regard. > Automatically setting the feature bit if pv is supported complicates > things further. AIUI, on s390 the "unpack" feature is available by default on recent models. In that case you could do this: * Don't modify either cpu or HTL options based on each other * Bail out if the user specifies a non "unpack" secure CPU along with the HTL option Cases of note: - User specifies an old CPU model + htl or explicitly sets unpack=off + htl => fails with an error, correctly - User specifies modern/default cpu + htl, with secure aware guest => works as a secure guest - User specifies modern/default cpu + htl, with non secure aware guest => works, though not secure (and maybe slower than neccessary) - User specifies modern/default cpu, no htl, with non-secure guest => works, "unpack" feature is present but unused - User specifies modern/default cpu, no htl, secure guest => this is the worst one. It kind of works by accident if you've also manually specified whatever virtio (and anything else) options are necessary. Ugly, but no different from the situation right now, IIUC > (Is there any requirement that the machine object has been already set > up before the cpu features are processed? Or the other way around?) CPUs are usually created by the machine, so I believe we can count on the machine object being there first. > Does this have any implications when probing with the 'none' machine? I'm not sure. In your case, I guess the cpu bit would still show up as before, so it would tell you base feature availability, but not whether you can use the new configuration option. Since the HTL option is generic, you could still set it on the "none" machine, though it wouldn't really have any effect. That is, if you could create a suitable object to point it at, which would depend on ... details.
On Fri, Jun 19, 2020 at 12:04:25PM +0200, David Hildenbrand wrote: > >> However, once we have multiple options to protect a guest (memory > >> encryption, unmapping guest pages ,...) the name will no longer really > >> suffice to configure QEMU, no? > > > > That's why it takes a parameter. It points to an object which can > > itself have more properties to configure the details. SEV already > > needs that set up, though for both PEF and s390 PV we could pre-create > > a standard htl object. > > Ah, okay, that's the "platform specific object which configures and > manages the specific details". It would have been nice in the cover > letter to show some examples of how that would look like. Ok, I can try to add some. > So it's wrapping architecture-specific data in a common > parameter. Hmm. Well, I don't know I'd say "wrapping". You have a common parameter that points to an object with a well defined interface. The available implementations of that object will tend to be either zero or one per architecture, but there's no theoretical reason it has to be. Indeed we expect at least 2 for x86 (SEV and the Intel one who's name I never remember). Extra ones are entirely plausible for POWER and maybe s390 too, when an updated version of PEF or PV inevitably rolls around. Some sort of new HTL scheme which could work across multiple archs is much less likely, but it's not totally impossible either. > >>> For now this series covers just AMD SEV and POWER PEF. I'm hoping it > >>> can be extended to cover the Intel and s390 mechanisms as well, > >>> though. > >> > >> The only approach on s390x to not glue command line properties to the > >> cpu model would be to remove the CPU model feature and replace it by the > >> command line parameter. But that would, of course, be an incompatible break. > > > > I don't really understand why you're so against setting the cpu > > default parameters from the machine. The machine already sets basic > > configuration for all sorts of devices in the VM, that's kind of what > > it's for. > > It's a general design philosophy that the CPU model (especially the host > CPU model) does not depend on other command line parameters (except the > accelerator, and I think in corner cases on the machine). Necessary for > reliable host model probing by libvirt, for example. Ok, I've proposed a revision which doesn't require altering the CPU model elsewhere in this thread. > We also don't have similar things for nested virt. I'm not sure what you're getting at there. > >> How do upper layers actually figure out if memory encryption etc is > >> available? on s390x, it's simply via the expanded host CPU model. > > > > Haven't really tackled that yet. But one way that works for multiple > > systems has got to be better than a separate one for each, right? > > I think that's an important piece. Especially once multiple different > approaches are theoretically available one wants to sense from upper layers. Fair point. So... IIRC there's a general way of looking at available properties for any object, including the machine. So we can probe for availability of the "host-trust-limitation" property itself easily enough. I guess we do need a way of probing for what implementations of the htl interface are available. And, if we go down that path, if there are any pre-generated htl objects available. > At least on s390x, it really is like just another CPU-visible feature > that tells the guest that it can switch to protected mode. Right.. which is great for you, since you already have a nice orthogonal interface for that. On POWER, (a) CPU model isn't enough since you need a running ultravisor as well and (b) CPU feature detection is already a real mess for.. reasons.
On Mon, Jun 22, 2020 at 04:27:28PM +0200, Christian Borntraeger wrote: > On 19.06.20 04:05, David Gibson wrote: > > A number of hardware platforms are implementing mechanisms whereby the > > hypervisor does not have unfettered access to guest memory, in order > > to mitigate the security impact of a compromised hypervisor. > > > > AMD's SEV implements this with in-cpu memory encryption, and Intel has > > its own memory encryption mechanism. POWER has an upcoming mechanism > > to accomplish this in a different way, using a new memory protection > > level plus a small trusted ultravisor. s390 also has a protected > > execution environment. > > > > The current code (committed or draft) for these features has each > > platform's version configured entirely differently. That doesn't seem > > ideal for users, or particularly for management layers. > > > > AMD SEV introduces a notionally generic machine option > > "machine-encryption", but it doesn't actually cover any cases other > > than SEV. > > > > This series is a proposal to at least partially unify configuration > > for these mechanisms, by renaming and generalizing AMD's > > "memory-encryption" property. It is replaced by a > > "host-trust-limitation" property pointing to a platform specific > > object which configures and manages the specific details. > > > > For now this series covers just AMD SEV and POWER PEF. I'm hoping it > > can be extended to cover the Intel and s390 mechanisms as well, > > though. > > Let me try to summarize what I understand what you try to achieve: > one command line parameter for all platforms that > > common across all platforms: > - disable KSM > - by default enables iommu_platform Pretty much, yes. Plus, in future if we discover other things that don't make sense in the context of a guest whose memory we can't freely access, it can check for those and set sane defaults accordingly. > per platform: > - setup the necessary encryption scheme when appropriate > - block migration That's true for now, but I believe there are plans to make secure guests migratable, so that's not an inherent property. > -.... > > > The tricky part is certainly the per platform thing. For example on > s390 we just have a cpumodel flag that provides interfaces to the guest > to switch into protected mode via the ultravisor. This works perfectly > fine with the host model, so no need to configure anything. The platform > code then disables KSM _on_switchover_ and not in general. Right, because your platform code is aware of the switchover. On POWER, we aren't. > Because the > guest CAN switch into protected, but it does not have to. > > So this feels really hard to do right. Would a virtual BoF on KVM forum > be too late? We had a BoF on protected guests last year and that was > valuable.
On Wed, Jun 24, 2020 at 09:06:48AM +0200, Cornelia Huck wrote: > On Mon, 22 Jun 2020 16:27:28 +0200 > Christian Borntraeger <borntraeger@de.ibm.com> wrote: > > > On 19.06.20 04:05, David Gibson wrote: > > > A number of hardware platforms are implementing mechanisms whereby the > > > hypervisor does not have unfettered access to guest memory, in order > > > to mitigate the security impact of a compromised hypervisor. > > > > > > AMD's SEV implements this with in-cpu memory encryption, and Intel has > > > its own memory encryption mechanism. POWER has an upcoming mechanism > > > to accomplish this in a different way, using a new memory protection > > > level plus a small trusted ultravisor. s390 also has a protected > > > execution environment. > > > > > > The current code (committed or draft) for these features has each > > > platform's version configured entirely differently. That doesn't seem > > > ideal for users, or particularly for management layers. > > > > > > AMD SEV introduces a notionally generic machine option > > > "machine-encryption", but it doesn't actually cover any cases other > > > than SEV. > > > > > > This series is a proposal to at least partially unify configuration > > > for these mechanisms, by renaming and generalizing AMD's > > > "memory-encryption" property. It is replaced by a > > > "host-trust-limitation" property pointing to a platform specific > > > object which configures and manages the specific details. > > > > > > For now this series covers just AMD SEV and POWER PEF. I'm hoping it > > > can be extended to cover the Intel and s390 mechanisms as well, > > > though. > > > > Let me try to summarize what I understand what you try to achieve: > > one command line parameter for all platforms that > > > > common across all platforms: > > - disable KSM > > - by default enables iommu_platform > > > > > > per platform: > > - setup the necessary encryption scheme when appropriate > > - block migration > > -.... > > > > > > The tricky part is certainly the per platform thing. For example on > > s390 we just have a cpumodel flag that provides interfaces to the guest > > to switch into protected mode via the ultravisor. This works perfectly > > fine with the host model, so no need to configure anything. The platform > > code then disables KSM _on_switchover_ and not in general. Because the > > guest CAN switch into protected, but it does not have to. > > > > So this feels really hard to do right. Would a virtual BoF on KVM forum > > be too late? We had a BoF on protected guests last year and that was > > valuable. > > Maybe we can do some kind of call to discuss this earlier? (Maybe in > the KVM call slot on Tuesdays?) I think it would be really helpful if > everybody would have at least a general understanding about how > encryption/protection works on the different architectures. Yes, I think this would be a good idea. KVM Forum is probably later than we want, plus since it is virtual, I probably won't be shifting into the right timezone to attend much of it. I don't know when that Tuesday KVM call is. Generally the best available time for Australia/Europe meetings this time of year is 9am CET / 5pm AEST. As a once off I could go somewhat later into my evening, though.
On Thu, Jun 25, 2020 at 03:47:23PM +1000, David Gibson wrote: > On Wed, Jun 24, 2020 at 09:06:48AM +0200, Cornelia Huck wrote: > > On Mon, 22 Jun 2020 16:27:28 +0200 > > Christian Borntraeger <borntraeger@de.ibm.com> wrote: > > > > > On 19.06.20 04:05, David Gibson wrote: > > > > A number of hardware platforms are implementing mechanisms whereby the > > > > hypervisor does not have unfettered access to guest memory, in order > > > > to mitigate the security impact of a compromised hypervisor. > > > > > > > > AMD's SEV implements this with in-cpu memory encryption, and Intel has > > > > its own memory encryption mechanism. POWER has an upcoming mechanism > > > > to accomplish this in a different way, using a new memory protection > > > > level plus a small trusted ultravisor. s390 also has a protected > > > > execution environment. > > > > > > > > The current code (committed or draft) for these features has each > > > > platform's version configured entirely differently. That doesn't seem > > > > ideal for users, or particularly for management layers. > > > > > > > > AMD SEV introduces a notionally generic machine option > > > > "machine-encryption", but it doesn't actually cover any cases other > > > > than SEV. > > > > > > > > This series is a proposal to at least partially unify configuration > > > > for these mechanisms, by renaming and generalizing AMD's > > > > "memory-encryption" property. It is replaced by a > > > > "host-trust-limitation" property pointing to a platform specific > > > > object which configures and manages the specific details. > > > > > > > > For now this series covers just AMD SEV and POWER PEF. I'm hoping it > > > > can be extended to cover the Intel and s390 mechanisms as well, > > > > though. > > > > > > Let me try to summarize what I understand what you try to achieve: > > > one command line parameter for all platforms that > > > > > > common across all platforms: > > > - disable KSM > > > - by default enables iommu_platform > > > > > > > > > per platform: > > > - setup the necessary encryption scheme when appropriate > > > - block migration > > > -.... > > > > > > > > > The tricky part is certainly the per platform thing. For example on > > > s390 we just have a cpumodel flag that provides interfaces to the guest > > > to switch into protected mode via the ultravisor. This works perfectly > > > fine with the host model, so no need to configure anything. The platform > > > code then disables KSM _on_switchover_ and not in general. Because the > > > guest CAN switch into protected, but it does not have to. > > > > > > So this feels really hard to do right. Would a virtual BoF on KVM forum > > > be too late? We had a BoF on protected guests last year and that was > > > valuable. > > > > Maybe we can do some kind of call to discuss this earlier? (Maybe in > > the KVM call slot on Tuesdays?) I think it would be really helpful if > > everybody would have at least a general understanding about how > > encryption/protection works on the different architectures. > > Yes, I think this would be a good idea. KVM Forum is probably later > than we want, plus since it is virtual, I probably won't be shifting > into the right timezone to attend much of it. > > I don't know when that Tuesday KVM call is. Generally the best > available time for Australia/Europe meetings this time of year is 9am > CET / 5pm AEST. As a once off I could go somewhat later into my > evening, though. Oh.. plus I'm on holiday next week and the one after (27 Jun - 12 Jul).
> >> So it's wrapping architecture-specific data in a common >> parameter. Hmm. > > Well, I don't know I'd say "wrapping". You have a common parameter > that points to an object with a well defined interface. The available > implementations of that object will tend to be either zero or one per > architecture, but there's no theoretical reason it has to be. Indeed > we expect at least 2 for x86 (SEV and the Intel one who's name I never > remember). Extra ones are entirely plausible for POWER and maybe s390 > too, when an updated version of PEF or PV inevitably rolls around. > > Some sort of new HTL scheme which could work across multiple archs is > much less likely, but it's not totally impossible either. > >>>>> For now this series covers just AMD SEV and POWER PEF. I'm hoping it >>>>> can be extended to cover the Intel and s390 mechanisms as well, >>>>> though. >>>> >>>> The only approach on s390x to not glue command line properties to the >>>> cpu model would be to remove the CPU model feature and replace it by the >>>> command line parameter. But that would, of course, be an incompatible break. >>> >>> I don't really understand why you're so against setting the cpu >>> default parameters from the machine. The machine already sets basic >>> configuration for all sorts of devices in the VM, that's kind of what >>> it's for. >> >> It's a general design philosophy that the CPU model (especially the host >> CPU model) does not depend on other command line parameters (except the >> accelerator, and I think in corner cases on the machine). Necessary for >> reliable host model probing by libvirt, for example. > > Ok, I've proposed a revision which doesn't require altering the CPU > model elsewhere in this thread. > >> We also don't have similar things for nested virt. > > I'm not sure what you're getting at there. Sorry, back when we introduced nested virt there was a similar (internal?) discussion, to enable/disable it via a machine flag and not via 1..X CPU features. We went for the latter, because it matches the actual architecture and allows for easy migration checks etc. Nested virt also collides with some features currently (e.g., huge page backing for the guest), but not as severe as encrypted virtualization. > >>>> How do upper layers actually figure out if memory encryption etc is >>>> available? on s390x, it's simply via the expanded host CPU model. >>> >>> Haven't really tackled that yet. But one way that works for multiple >>> systems has got to be better than a separate one for each, right? >> >> I think that's an important piece. Especially once multiple different >> approaches are theoretically available one wants to sense from upper layers. > > Fair point. > > So... IIRC there's a general way of looking at available properties > for any object, including the machine. So we can probe for > availability of the "host-trust-limitation" property itself easily > enough. You can have a look at how it's currently probed by libvirt in https://www.redhat.com/archives/libvir-list/2020-June/msg00518.html For now, the s390x check consists of - checking if /sys/firmware/uv is available - checking if the kernel cmdline contains 'prot_virt=1' The sev check is - checking if /sys/module/kvm_amd/parameters/sev contains the value '1' - checking if /dev/sev So at least libvirt does not sense via the CPU model on s390x yet. > > I guess we do need a way of probing for what implementations of the > htl interface are available. And, if we go down that path, if there > are any pre-generated htl objects available. > >> At least on s390x, it really is like just another CPU-visible feature >> that tells the guest that it can switch to protected mode. > > Right.. which is great for you, since you already have a nice > orthogonal interface for that. On POWER, (a) CPU model isn't enough > since you need a running ultravisor as well and (b) CPU feature > detection is already a real mess for.. reasons. I can understand the pain of the latter ... :)
>> Still unsure how to bring this new machine property and the cpu feature >> together. Would be great to have the same interface everywhere, but >> having two distinct command line objects depend on each other sucks. > > Kinda, but the reality is that hardware - virtual and otherwise - > frequently doesn't have entirely orthogonal configuration for each of > its components. This is by no means new in that regard. > >> Automatically setting the feature bit if pv is supported complicates >> things further. > > AIUI, on s390 the "unpack" feature is available by default on recent > models. In that case you could do this: > > * Don't modify either cpu or HTL options based on each other > * Bail out if the user specifies a non "unpack" secure CPU along with > the HTL option > > Cases of note: > - User specifies an old CPU model + htl > or explicitly sets unpack=off + htl > => fails with an error, correctly > - User specifies modern/default cpu + htl, with secure aware guest > => works as a secure guest > - User specifies modern/default cpu + htl, with non secure aware guest > => works, though not secure (and maybe slower than neccessary) > - User specifies modern/default cpu, no htl, with non-secure guest > => works, "unpack" feature is present but unused > - User specifies modern/default cpu, no htl, secure guest > => this is the worst one. It kind of works by accident if > you've also manually specified whatever virtio (and > anything else) options are necessary. Ugly, but no > different from the situation right now, IIUC > >> (Is there any requirement that the machine object has been already set >> up before the cpu features are processed? Or the other way around?) > > CPUs are usually created by the machine, so I believe we can count on > the machine object being there first. CPU model initialization is one of the first things machine initialization code does on s390x. static void ccw_init(MachineState *machine) { [... memory init ...] s390_sclp_init(); s390_memory_init(machine->ram); /* init CPUs (incl. CPU model) early so s390_has_feature() works */ s390_init_cpus(machine); [...] } > >> Does this have any implications when probing with the 'none' machine? > > I'm not sure. In your case, I guess the cpu bit would still show up > as before, so it would tell you base feature availability, but not > whether you can use the new configuration option. > > Since the HTL option is generic, you could still set it on the "none" > machine, though it wouldn't really have any effect. That is, if you > could create a suitable object to point it at, which would depend on > ... details. > The important point is that we never want the (expanded) host cpu model look different when either specifying or not specifying the HTL property. We don't want to run into issues where libvirt probes and gets host model X, but when using that probed model (automatically) for a guest domain, we suddenly cannot run X anymore.
On Thu, 25 Jun 2020 08:59:00 +0200 David Hildenbrand <david@redhat.com> wrote: > >>>> How do upper layers actually figure out if memory encryption etc is > >>>> available? on s390x, it's simply via the expanded host CPU model. > >>> > >>> Haven't really tackled that yet. But one way that works for multiple > >>> systems has got to be better than a separate one for each, right? > >> > >> I think that's an important piece. Especially once multiple different > >> approaches are theoretically available one wants to sense from upper layers. > > > > Fair point. > > > > So... IIRC there's a general way of looking at available properties > > for any object, including the machine. So we can probe for > > availability of the "host-trust-limitation" property itself easily > > enough. > > You can have a look at how it's currently probed by libvirt in > > https://www.redhat.com/archives/libvir-list/2020-June/msg00518.html > > For now, the s390x check consists of > - checking if /sys/firmware/uv is available > - checking if the kernel cmdline contains 'prot_virt=1' > > The sev check is > - checking if /sys/module/kvm_amd/parameters/sev contains the > value '1' > - checking if /dev/sev > > So at least libvirt does not sense via the CPU model on s390x yet. It checks for 158 (which is apparently 'host supports secure execution'). IIUC, only 161 ('unpack facility') is relevant for the guest... does that also show up on the host? (I guess it does, as it describes an ultravisor feature, IIUC.) If it is always implied, libvirt probably does not need an extra check.
On Thu, Jun 25, 2020 at 09:06:05AM +0200, David Hildenbrand wrote: > >> Still unsure how to bring this new machine property and the cpu feature > >> together. Would be great to have the same interface everywhere, but > >> having two distinct command line objects depend on each other sucks. > > > > Kinda, but the reality is that hardware - virtual and otherwise - > > frequently doesn't have entirely orthogonal configuration for each of > > its components. This is by no means new in that regard. > > > >> Automatically setting the feature bit if pv is supported complicates > >> things further. > > > > AIUI, on s390 the "unpack" feature is available by default on recent > > models. In that case you could do this: > > > > * Don't modify either cpu or HTL options based on each other > > * Bail out if the user specifies a non "unpack" secure CPU along with > > the HTL option > > > > Cases of note: > > - User specifies an old CPU model + htl > > or explicitly sets unpack=off + htl > > => fails with an error, correctly > > - User specifies modern/default cpu + htl, with secure aware guest > > => works as a secure guest > > - User specifies modern/default cpu + htl, with non secure aware guest > > => works, though not secure (and maybe slower than neccessary) > > - User specifies modern/default cpu, no htl, with non-secure guest > > => works, "unpack" feature is present but unused > > - User specifies modern/default cpu, no htl, secure guest > > => this is the worst one. It kind of works by accident if > > you've also manually specified whatever virtio (and > > anything else) options are necessary. Ugly, but no > > different from the situation right now, IIUC > > > >> (Is there any requirement that the machine object has been already set > >> up before the cpu features are processed? Or the other way around?) > > > > CPUs are usually created by the machine, so I believe we can count on > > the machine object being there first. > > CPU model initialization is one of the first things machine > initialization code does on s390x. As it is for most platforms, but still, the values of machine properties are available to you at this point. > static void ccw_init(MachineState *machine) > { > [... memory init ...] > s390_sclp_init(); > s390_memory_init(machine->ram); > /* init CPUs (incl. CPU model) early so s390_has_feature() works */ > s390_init_cpus(machine); > [...] > } > > > > >> Does this have any implications when probing with the 'none' machine? > > > > I'm not sure. In your case, I guess the cpu bit would still show up > > as before, so it would tell you base feature availability, but not > > whether you can use the new configuration option. > > > > Since the HTL option is generic, you could still set it on the "none" > > machine, though it wouldn't really have any effect. That is, if you > > could create a suitable object to point it at, which would depend on > > ... details. > > > > The important point is that we never want the (expanded) host cpu model > look different when either specifying or not specifying the HTL > property. Ah, yes, I see your point. So my current suggestion will satisfy that, basically it is: cpu has unpack (inc. by default) && htl specified => works (allowing secure), as expected !cpu has unpack && htl specified => bails out with an error !cpu has unpack && !htl specified => works for a non-secure guest, as expected => guest will fail if it attempts to go secure cpu has unpack && !htl specified => works as expected for a non-secure guest (unpack feature is present, but unused) => secure guest may work "by accident", but only if all virtio properties have the right values, which is the user's problem That last case is kinda ugly, but I think it's tolerable. > We don't want to run into issues where libvirt probes and gets > host model X, but when using that probed model (automatically) for a > guest domain, we suddenly cannot run X anymore. >
>>>> Does this have any implications when probing with the 'none' machine? >>> >>> I'm not sure. In your case, I guess the cpu bit would still show up >>> as before, so it would tell you base feature availability, but not >>> whether you can use the new configuration option. >>> >>> Since the HTL option is generic, you could still set it on the "none" >>> machine, though it wouldn't really have any effect. That is, if you >>> could create a suitable object to point it at, which would depend on >>> ... details. >>> >> >> The important point is that we never want the (expanded) host cpu model >> look different when either specifying or not specifying the HTL >> property. > > Ah, yes, I see your point. So my current suggestion will satisfy > that, basically it is: > > cpu has unpack (inc. by default) && htl specified > => works (allowing secure), as expected ack > > !cpu has unpack && htl specified > => bails out with an error ack > > !cpu has unpack && !htl specified > => works for a non-secure guest, as expected > => guest will fail if it attempts to go secure ack, behavior just like running on older hw without unpack > > cpu has unpack && !htl specified > => works as expected for a non-secure guest (unpack feature is > present, but unused) > => secure guest may work "by accident", but only if all virtio > properties have the right values, which is the user's > problem > > That last case is kinda ugly, but I think it's tolerable. Right, we must not affect non-secure guests, and existing secure setups (e.g., older qemu machines). Will have to think about this some more, but does not sound too crazy. Thanks!
On 6/26/20 8:53 AM, David Hildenbrand wrote: >>>>> Does this have any implications when probing with the 'none' machine? >>>> >>>> I'm not sure. In your case, I guess the cpu bit would still show up >>>> as before, so it would tell you base feature availability, but not >>>> whether you can use the new configuration option. >>>> >>>> Since the HTL option is generic, you could still set it on the "none" >>>> machine, though it wouldn't really have any effect. That is, if you >>>> could create a suitable object to point it at, which would depend on >>>> ... details. >>>> >>> >>> The important point is that we never want the (expanded) host cpu model >>> look different when either specifying or not specifying the HTL >>> property. >> >> Ah, yes, I see your point. So my current suggestion will satisfy >> that, basically it is: >> >> cpu has unpack (inc. by default) && htl specified >> => works (allowing secure), as expected > > ack > >> >> !cpu has unpack && htl specified >> => bails out with an error > > ack > >> >> !cpu has unpack && !htl specified >> => works for a non-secure guest, as expected >> => guest will fail if it attempts to go secure > > ack, behavior just like running on older hw without unpack > >> >> cpu has unpack && !htl specified >> => works as expected for a non-secure guest (unpack feature is >> present, but unused) >> => secure guest may work "by accident", but only if all virtio >> properties have the right values, which is the user's >> problem >> >> That last case is kinda ugly, but I think it's tolerable. > > Right, we must not affect non-secure guests, and existing secure setups > (e.g., older qemu machines). Will have to think about this some more, > but does not sound too crazy. I severely dislike having to specify things to make PV work. The IOMMU is already a thorn in our side and we're working on making the whole ordeal completely transparent so the only requirement to make this work is the right machine, kernel, qemu and kernel cmd line option "prot_virt=1". That's why we do the reboot into PV mode in the first place. I.e. the goal is that if customers convert compatible guests into protected ones and start them up on a z15 on a distro with PV support they can just use the guest without having to change XML or command line parameters. Internal customers have already created bugs because they did not follow the documentation and the more cmd options we bring the more bugzillas we'll get. PV is already in the field/GA and can be ordered, as is our documentation. @Christian: Please chime in here > > Thanks! >
On Fri, Jun 26, 2020 at 11:01:58AM +0200, Janosch Frank wrote: > On 6/26/20 8:53 AM, David Hildenbrand wrote: > >>>>> Does this have any implications when probing with the 'none' machine? > >>>> > >>>> I'm not sure. In your case, I guess the cpu bit would still show up > >>>> as before, so it would tell you base feature availability, but not > >>>> whether you can use the new configuration option. > >>>> > >>>> Since the HTL option is generic, you could still set it on the "none" > >>>> machine, though it wouldn't really have any effect. That is, if you > >>>> could create a suitable object to point it at, which would depend on > >>>> ... details. > >>>> > >>> > >>> The important point is that we never want the (expanded) host cpu model > >>> look different when either specifying or not specifying the HTL > >>> property. > >> > >> Ah, yes, I see your point. So my current suggestion will satisfy > >> that, basically it is: > >> > >> cpu has unpack (inc. by default) && htl specified > >> => works (allowing secure), as expected > > > > ack > > > >> > >> !cpu has unpack && htl specified > >> => bails out with an error > > > > ack > > > >> > >> !cpu has unpack && !htl specified > >> => works for a non-secure guest, as expected > >> => guest will fail if it attempts to go secure > > > > ack, behavior just like running on older hw without unpack > > > >> > >> cpu has unpack && !htl specified > >> => works as expected for a non-secure guest (unpack feature is > >> present, but unused) > >> => secure guest may work "by accident", but only if all virtio > >> properties have the right values, which is the user's > >> problem > >> > >> That last case is kinda ugly, but I think it's tolerable. > > > > Right, we must not affect non-secure guests, and existing secure setups > > (e.g., older qemu machines). Will have to think about this some more, > > but does not sound too crazy. > > I severely dislike having to specify things to make PV work. > The IOMMU is already a thorn in our side and we're working on making the > whole ordeal completely transparent so the only requirement to make this > work is the right machine, kernel, qemu and kernel cmd line option > "prot_virt=1". That's why we do the reboot into PV mode in the first place. > > I.e. the goal is that if customers convert compatible guests into > protected ones and start them up on a z15 on a distro with PV support > they can just use the guest without having to change XML or command line > parameters. If you're exposing new features to the guest machine, then it is usually to be expected that XML and QEMU command line will change. Some simple things might be hidable behind a new QEMU machine type or CPU model, but there's a limit to how much should be hidden that way while staying sane. I'd really expect the configuration to change when switching a guest to a new hardware platform and wanting major new functionality to be enabled. The XML / QEMU config is a low level instantiation of a particular feature set, optimized for a specific machine, rather than a high level description of ideal "best" config independent of host machine. Regards, Daniel
On 6/26/20 11:32 AM, Daniel P. Berrangé wrote: > On Fri, Jun 26, 2020 at 11:01:58AM +0200, Janosch Frank wrote: >> On 6/26/20 8:53 AM, David Hildenbrand wrote: >>>>>>> Does this have any implications when probing with the 'none' machine? >>>>>> >>>>>> I'm not sure. In your case, I guess the cpu bit would still show up >>>>>> as before, so it would tell you base feature availability, but not >>>>>> whether you can use the new configuration option. >>>>>> >>>>>> Since the HTL option is generic, you could still set it on the "none" >>>>>> machine, though it wouldn't really have any effect. That is, if you >>>>>> could create a suitable object to point it at, which would depend on >>>>>> ... details. >>>>>> >>>>> >>>>> The important point is that we never want the (expanded) host cpu model >>>>> look different when either specifying or not specifying the HTL >>>>> property. >>>> >>>> Ah, yes, I see your point. So my current suggestion will satisfy >>>> that, basically it is: >>>> >>>> cpu has unpack (inc. by default) && htl specified >>>> => works (allowing secure), as expected >>> >>> ack >>> >>>> >>>> !cpu has unpack && htl specified >>>> => bails out with an error >>> >>> ack >>> >>>> >>>> !cpu has unpack && !htl specified >>>> => works for a non-secure guest, as expected >>>> => guest will fail if it attempts to go secure >>> >>> ack, behavior just like running on older hw without unpack >>> >>>> >>>> cpu has unpack && !htl specified >>>> => works as expected for a non-secure guest (unpack feature is >>>> present, but unused) >>>> => secure guest may work "by accident", but only if all virtio >>>> properties have the right values, which is the user's >>>> problem >>>> >>>> That last case is kinda ugly, but I think it's tolerable. >>> >>> Right, we must not affect non-secure guests, and existing secure setups >>> (e.g., older qemu machines). Will have to think about this some more, >>> but does not sound too crazy. >> >> I severely dislike having to specify things to make PV work. >> The IOMMU is already a thorn in our side and we're working on making the >> whole ordeal completely transparent so the only requirement to make this >> work is the right machine, kernel, qemu and kernel cmd line option >> "prot_virt=1". That's why we do the reboot into PV mode in the first place. >> >> I.e. the goal is that if customers convert compatible guests into >> protected ones and start them up on a z15 on a distro with PV support >> they can just use the guest without having to change XML or command line >> parameters. > > If you're exposing new features to the guest machine, then it is usually > to be expected that XML and QEMU command line will change. Some simple > things might be hidable behind a new QEMU machine type or CPU model, but > there's a limit to how much should be hidden that way while staying sane. > > I'd really expect the configuration to change when switching a guest to > a new hardware platform and wanting major new functionality to be enabled. > The XML / QEMU config is a low level instantiation of a particular feature > set, optimized for a specific machine, rather than a high level description > of ideal "best" config independent of host machine. You still have to set the host command line and make sure that unpack is available. Currently you also have to specify the IOMMU which we like to drop as a requirement. Everything else is dependent on runtime information which tells us if we need to take a PV or non-PV branch. Having the unpack facility should be enough to use the unpack facility. Keep in mind that we have no real concept of a special protected VM to begin with. If the VM never boots into a protected kernel it will never be protected. On a reboot it drops from protected into unprotected mode to execute the bios and boot loader and then may or may not move back into a protected state. > > Regards, > Daniel >
* Janosch Frank (frankja@linux.ibm.com) wrote: > On 6/26/20 11:32 AM, Daniel P. Berrangé wrote: > > On Fri, Jun 26, 2020 at 11:01:58AM +0200, Janosch Frank wrote: > >> On 6/26/20 8:53 AM, David Hildenbrand wrote: > >>>>>>> Does this have any implications when probing with the 'none' machine? > >>>>>> > >>>>>> I'm not sure. In your case, I guess the cpu bit would still show up > >>>>>> as before, so it would tell you base feature availability, but not > >>>>>> whether you can use the new configuration option. > >>>>>> > >>>>>> Since the HTL option is generic, you could still set it on the "none" > >>>>>> machine, though it wouldn't really have any effect. That is, if you > >>>>>> could create a suitable object to point it at, which would depend on > >>>>>> ... details. > >>>>>> > >>>>> > >>>>> The important point is that we never want the (expanded) host cpu model > >>>>> look different when either specifying or not specifying the HTL > >>>>> property. > >>>> > >>>> Ah, yes, I see your point. So my current suggestion will satisfy > >>>> that, basically it is: > >>>> > >>>> cpu has unpack (inc. by default) && htl specified > >>>> => works (allowing secure), as expected > >>> > >>> ack > >>> > >>>> > >>>> !cpu has unpack && htl specified > >>>> => bails out with an error > >>> > >>> ack > >>> > >>>> > >>>> !cpu has unpack && !htl specified > >>>> => works for a non-secure guest, as expected > >>>> => guest will fail if it attempts to go secure > >>> > >>> ack, behavior just like running on older hw without unpack > >>> > >>>> > >>>> cpu has unpack && !htl specified > >>>> => works as expected for a non-secure guest (unpack feature is > >>>> present, but unused) > >>>> => secure guest may work "by accident", but only if all virtio > >>>> properties have the right values, which is the user's > >>>> problem > >>>> > >>>> That last case is kinda ugly, but I think it's tolerable. > >>> > >>> Right, we must not affect non-secure guests, and existing secure setups > >>> (e.g., older qemu machines). Will have to think about this some more, > >>> but does not sound too crazy. > >> > >> I severely dislike having to specify things to make PV work. > >> The IOMMU is already a thorn in our side and we're working on making the > >> whole ordeal completely transparent so the only requirement to make this > >> work is the right machine, kernel, qemu and kernel cmd line option > >> "prot_virt=1". That's why we do the reboot into PV mode in the first place. > >> > >> I.e. the goal is that if customers convert compatible guests into > >> protected ones and start them up on a z15 on a distro with PV support > >> they can just use the guest without having to change XML or command line > >> parameters. > > > > If you're exposing new features to the guest machine, then it is usually > > to be expected that XML and QEMU command line will change. Some simple > > things might be hidable behind a new QEMU machine type or CPU model, but > > there's a limit to how much should be hidden that way while staying sane. > > > > I'd really expect the configuration to change when switching a guest to > > a new hardware platform and wanting major new functionality to be enabled. > > The XML / QEMU config is a low level instantiation of a particular feature > > set, optimized for a specific machine, rather than a high level description > > of ideal "best" config independent of host machine. > > You still have to set the host command line and make sure that unpack is > available. Currently you also have to specify the IOMMU which we like to > drop as a requirement. Everything else is dependent on runtime > information which tells us if we need to take a PV or non-PV branch. > Having the unpack facility should be enough to use the unpack facility. > > Keep in mind that we have no real concept of a special protected VM to > begin with. If the VM never boots into a protected kernel it will never > be protected. On a reboot it drops from protected into unprotected mode > to execute the bios and boot loader and then may or may not move back > into a protected state. My worry isn't actually how painful adding all the iommu glue is, but what happens when users forget; especially if they forget for one device. I could appreciate having a machine option to cause iommu to then get turned on with all other devices; but I think also we could do with something that failed with a nice error if an iommu flag was missing. For SEV this could be done pretty early, but for power/s390 I guess you'd have to do this when someone tried to enable secure mode, but I'm not sure you can tell. Dave > > > > Regards, > > Daniel > > > > -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On Fri, Jun 26, 2020 at 11:29:03AM +0100, Dr. David Alan Gilbert wrote: > * Janosch Frank (frankja@linux.ibm.com) wrote: > > On 6/26/20 11:32 AM, Daniel P. Berrangé wrote: > > > On Fri, Jun 26, 2020 at 11:01:58AM +0200, Janosch Frank wrote: > > >> On 6/26/20 8:53 AM, David Hildenbrand wrote: > > >>>>>>> Does this have any implications when probing with the 'none' machine? > > >>>>>> > > >>>>>> I'm not sure. In your case, I guess the cpu bit would still show up > > >>>>>> as before, so it would tell you base feature availability, but not > > >>>>>> whether you can use the new configuration option. > > >>>>>> > > >>>>>> Since the HTL option is generic, you could still set it on the "none" > > >>>>>> machine, though it wouldn't really have any effect. That is, if you > > >>>>>> could create a suitable object to point it at, which would depend on > > >>>>>> ... details. > > >>>>>> > > >>>>> > > >>>>> The important point is that we never want the (expanded) host cpu model > > >>>>> look different when either specifying or not specifying the HTL > > >>>>> property. > > >>>> > > >>>> Ah, yes, I see your point. So my current suggestion will satisfy > > >>>> that, basically it is: > > >>>> > > >>>> cpu has unpack (inc. by default) && htl specified > > >>>> => works (allowing secure), as expected > > >>> > > >>> ack > > >>> > > >>>> > > >>>> !cpu has unpack && htl specified > > >>>> => bails out with an error > > >>> > > >>> ack > > >>> > > >>>> > > >>>> !cpu has unpack && !htl specified > > >>>> => works for a non-secure guest, as expected > > >>>> => guest will fail if it attempts to go secure > > >>> > > >>> ack, behavior just like running on older hw without unpack > > >>> > > >>>> > > >>>> cpu has unpack && !htl specified > > >>>> => works as expected for a non-secure guest (unpack feature is > > >>>> present, but unused) > > >>>> => secure guest may work "by accident", but only if all virtio > > >>>> properties have the right values, which is the user's > > >>>> problem > > >>>> > > >>>> That last case is kinda ugly, but I think it's tolerable. > > >>> > > >>> Right, we must not affect non-secure guests, and existing secure setups > > >>> (e.g., older qemu machines). Will have to think about this some more, > > >>> but does not sound too crazy. > > >> > > >> I severely dislike having to specify things to make PV work. > > >> The IOMMU is already a thorn in our side and we're working on making the > > >> whole ordeal completely transparent so the only requirement to make this > > >> work is the right machine, kernel, qemu and kernel cmd line option > > >> "prot_virt=1". That's why we do the reboot into PV mode in the first place. > > >> > > >> I.e. the goal is that if customers convert compatible guests into > > >> protected ones and start them up on a z15 on a distro with PV support > > >> they can just use the guest without having to change XML or command line > > >> parameters. > > > > > > If you're exposing new features to the guest machine, then it is usually > > > to be expected that XML and QEMU command line will change. Some simple > > > things might be hidable behind a new QEMU machine type or CPU model, but > > > there's a limit to how much should be hidden that way while staying sane. > > > > > > I'd really expect the configuration to change when switching a guest to > > > a new hardware platform and wanting major new functionality to be enabled. > > > The XML / QEMU config is a low level instantiation of a particular feature > > > set, optimized for a specific machine, rather than a high level description > > > of ideal "best" config independent of host machine. > > > > You still have to set the host command line and make sure that unpack is > > available. Currently you also have to specify the IOMMU which we like to > > drop as a requirement. Everything else is dependent on runtime > > information which tells us if we need to take a PV or non-PV branch. > > Having the unpack facility should be enough to use the unpack facility. > > > > Keep in mind that we have no real concept of a special protected VM to > > begin with. If the VM never boots into a protected kernel it will never > > be protected. On a reboot it drops from protected into unprotected mode > > to execute the bios and boot loader and then may or may not move back > > into a protected state. > > My worry isn't actually how painful adding all the iommu glue is, but > what happens when users forget; especially if they forget for one > device. > > I could appreciate having a machine option to cause iommu to then get > turned on with all other devices; but I think also we could do with > something that failed with a nice error if an iommu flag was missing. > For SEV this could be done pretty early, but for power/s390 I guess > you'd have to do this when someone tried to enable secure mode, but > I'm not sure you can tell. What is the cost / downside of turning on the iommu option for virtio devices ? Is it something that is reasonable for a mgmt app todo unconditionally, regardless of whether memory encryption is in use, or will that have a negative impact on things ? Regards, Daniel
On 6/26/20 12:58 PM, Daniel P. Berrangé wrote: > On Fri, Jun 26, 2020 at 11:29:03AM +0100, Dr. David Alan Gilbert wrote: >> * Janosch Frank (frankja@linux.ibm.com) wrote: >>> On 6/26/20 11:32 AM, Daniel P. Berrangé wrote: >>>> On Fri, Jun 26, 2020 at 11:01:58AM +0200, Janosch Frank wrote: >>>>> On 6/26/20 8:53 AM, David Hildenbrand wrote: >>>>>>>>>> Does this have any implications when probing with the 'none' machine? >>>>>>>>> >>>>>>>>> I'm not sure. In your case, I guess the cpu bit would still show up >>>>>>>>> as before, so it would tell you base feature availability, but not >>>>>>>>> whether you can use the new configuration option. >>>>>>>>> >>>>>>>>> Since the HTL option is generic, you could still set it on the "none" >>>>>>>>> machine, though it wouldn't really have any effect. That is, if you >>>>>>>>> could create a suitable object to point it at, which would depend on >>>>>>>>> ... details. >>>>>>>>> >>>>>>>> >>>>>>>> The important point is that we never want the (expanded) host cpu model >>>>>>>> look different when either specifying or not specifying the HTL >>>>>>>> property. >>>>>>> >>>>>>> Ah, yes, I see your point. So my current suggestion will satisfy >>>>>>> that, basically it is: >>>>>>> >>>>>>> cpu has unpack (inc. by default) && htl specified >>>>>>> => works (allowing secure), as expected >>>>>> >>>>>> ack >>>>>> >>>>>>> >>>>>>> !cpu has unpack && htl specified >>>>>>> => bails out with an error >>>>>> >>>>>> ack >>>>>> >>>>>>> >>>>>>> !cpu has unpack && !htl specified >>>>>>> => works for a non-secure guest, as expected >>>>>>> => guest will fail if it attempts to go secure >>>>>> >>>>>> ack, behavior just like running on older hw without unpack >>>>>> >>>>>>> >>>>>>> cpu has unpack && !htl specified >>>>>>> => works as expected for a non-secure guest (unpack feature is >>>>>>> present, but unused) >>>>>>> => secure guest may work "by accident", but only if all virtio >>>>>>> properties have the right values, which is the user's >>>>>>> problem >>>>>>> >>>>>>> That last case is kinda ugly, but I think it's tolerable. >>>>>> >>>>>> Right, we must not affect non-secure guests, and existing secure setups >>>>>> (e.g., older qemu machines). Will have to think about this some more, >>>>>> but does not sound too crazy. >>>>> >>>>> I severely dislike having to specify things to make PV work. >>>>> The IOMMU is already a thorn in our side and we're working on making the >>>>> whole ordeal completely transparent so the only requirement to make this >>>>> work is the right machine, kernel, qemu and kernel cmd line option >>>>> "prot_virt=1". That's why we do the reboot into PV mode in the first place. >>>>> >>>>> I.e. the goal is that if customers convert compatible guests into >>>>> protected ones and start them up on a z15 on a distro with PV support >>>>> they can just use the guest without having to change XML or command line >>>>> parameters. >>>> >>>> If you're exposing new features to the guest machine, then it is usually >>>> to be expected that XML and QEMU command line will change. Some simple >>>> things might be hidable behind a new QEMU machine type or CPU model, but >>>> there's a limit to how much should be hidden that way while staying sane. >>>> >>>> I'd really expect the configuration to change when switching a guest to >>>> a new hardware platform and wanting major new functionality to be enabled. >>>> The XML / QEMU config is a low level instantiation of a particular feature >>>> set, optimized for a specific machine, rather than a high level description >>>> of ideal "best" config independent of host machine. >>> >>> You still have to set the host command line and make sure that unpack is >>> available. Currently you also have to specify the IOMMU which we like to >>> drop as a requirement. Everything else is dependent on runtime >>> information which tells us if we need to take a PV or non-PV branch. >>> Having the unpack facility should be enough to use the unpack facility. >>> >>> Keep in mind that we have no real concept of a special protected VM to >>> begin with. If the VM never boots into a protected kernel it will never >>> be protected. On a reboot it drops from protected into unprotected mode >>> to execute the bios and boot loader and then may or may not move back >>> into a protected state. >> >> My worry isn't actually how painful adding all the iommu glue is, but >> what happens when users forget; especially if they forget for one >> device. >> >> I could appreciate having a machine option to cause iommu to then get >> turned on with all other devices; but I think also we could do with >> something that failed with a nice error if an iommu flag was missing. >> For SEV this could be done pretty early, but for power/s390 I guess >> you'd have to do this when someone tried to enable secure mode, but >> I'm not sure you can tell. > > What is the cost / downside of turning on the iommu option for virtio > devices ? Is it something that is reasonable for a mgmt app todo > unconditionally, regardless of whether memory encryption is in use, > or will that have a negative impact on things ? speed, memory usage and compatibility problems. There might also be a problem with s390 having to use <=2GB iommu areas in the guest, I need to check with Halil if this is still true. Also, if the default or specified IOMMU buffer size isn't big enough for your IO workload the guest is gonna have a very bad time. I.e. if somebody has an alternative implementation of bounce buffers we'd be happy to take it :) > > Regards, > Daniel >
On Fri, 26 Jun 2020 14:49:37 +0200 Janosch Frank <frankja@linux.ibm.com> wrote: > On 6/26/20 12:58 PM, Daniel P. Berrangé wrote: > > On Fri, Jun 26, 2020 at 11:29:03AM +0100, Dr. David Alan Gilbert wrote: > >> * Janosch Frank (frankja@linux.ibm.com) wrote: > >>> On 6/26/20 11:32 AM, Daniel P. Berrangé wrote: > >>>> On Fri, Jun 26, 2020 at 11:01:58AM +0200, Janosch Frank wrote: > >>>>> On 6/26/20 8:53 AM, David Hildenbrand wrote: > >>>>>>>>>> Does this have any implications when probing with the 'none' machine? > >>>>>>>>> > >>>>>>>>> I'm not sure. In your case, I guess the cpu bit would still show up > >>>>>>>>> as before, so it would tell you base feature availability, but not > >>>>>>>>> whether you can use the new configuration option. > >>>>>>>>> > >>>>>>>>> Since the HTL option is generic, you could still set it on the "none" > >>>>>>>>> machine, though it wouldn't really have any effect. That is, if you > >>>>>>>>> could create a suitable object to point it at, which would depend on > >>>>>>>>> ... details. > >>>>>>>>> > >>>>>>>> > >>>>>>>> The important point is that we never want the (expanded) host cpu model > >>>>>>>> look different when either specifying or not specifying the HTL > >>>>>>>> property. > >>>>>>> > >>>>>>> Ah, yes, I see your point. So my current suggestion will satisfy > >>>>>>> that, basically it is: > >>>>>>> > >>>>>>> cpu has unpack (inc. by default) && htl specified > >>>>>>> => works (allowing secure), as expected > >>>>>> > >>>>>> ack > >>>>>> > >>>>>>> > >>>>>>> !cpu has unpack && htl specified > >>>>>>> => bails out with an error > >>>>>> > >>>>>> ack > >>>>>> > >>>>>>> > >>>>>>> !cpu has unpack && !htl specified > >>>>>>> => works for a non-secure guest, as expected > >>>>>>> => guest will fail if it attempts to go secure > >>>>>> > >>>>>> ack, behavior just like running on older hw without unpack > >>>>>> > >>>>>>> > >>>>>>> cpu has unpack && !htl specified > >>>>>>> => works as expected for a non-secure guest (unpack feature is > >>>>>>> present, but unused) > >>>>>>> => secure guest may work "by accident", but only if all virtio > >>>>>>> properties have the right values, which is the user's > >>>>>>> problem > >>>>>>> > >>>>>>> That last case is kinda ugly, but I think it's tolerable. > >>>>>> > >>>>>> Right, we must not affect non-secure guests, and existing secure setups > >>>>>> (e.g., older qemu machines). Will have to think about this some more, > >>>>>> but does not sound too crazy. > >>>>> > >>>>> I severely dislike having to specify things to make PV work. > >>>>> The IOMMU is already a thorn in our side and we're working on making the > >>>>> whole ordeal completely transparent so the only requirement to make this > >>>>> work is the right machine, kernel, qemu and kernel cmd line option > >>>>> "prot_virt=1". That's why we do the reboot into PV mode in the first place. > >>>>> > >>>>> I.e. the goal is that if customers convert compatible guests into > >>>>> protected ones and start them up on a z15 on a distro with PV support > >>>>> they can just use the guest without having to change XML or command line > >>>>> parameters. > >>>> > >>>> If you're exposing new features to the guest machine, then it is usually > >>>> to be expected that XML and QEMU command line will change. Some simple > >>>> things might be hidable behind a new QEMU machine type or CPU model, but > >>>> there's a limit to how much should be hidden that way while staying sane. > >>>> > >>>> I'd really expect the configuration to change when switching a guest to > >>>> a new hardware platform and wanting major new functionality to be enabled. > >>>> The XML / QEMU config is a low level instantiation of a particular feature > >>>> set, optimized for a specific machine, rather than a high level description > >>>> of ideal "best" config independent of host machine. > >>> > >>> You still have to set the host command line and make sure that unpack is > >>> available. Currently you also have to specify the IOMMU which we like to > >>> drop as a requirement. Everything else is dependent on runtime > >>> information which tells us if we need to take a PV or non-PV branch. > >>> Having the unpack facility should be enough to use the unpack facility. > >>> > >>> Keep in mind that we have no real concept of a special protected VM to > >>> begin with. If the VM never boots into a protected kernel it will never > >>> be protected. On a reboot it drops from protected into unprotected mode > >>> to execute the bios and boot loader and then may or may not move back > >>> into a protected state. > >> > >> My worry isn't actually how painful adding all the iommu glue is, but > >> what happens when users forget; especially if they forget for one > >> device. > >> > >> I could appreciate having a machine option to cause iommu to then get > >> turned on with all other devices; but I think also we could do with > >> something that failed with a nice error if an iommu flag was missing. > >> For SEV this could be done pretty early, but for power/s390 I guess > >> you'd have to do this when someone tried to enable secure mode, but > >> I'm not sure you can tell. > > > > What is the cost / downside of turning on the iommu option for virtio > > devices ? Is it something that is reasonable for a mgmt app todo > > unconditionally, regardless of whether memory encryption is in use, > > or will that have a negative impact on things ? > > speed, memory usage and compatibility problems. > There might also be a problem with s390 having to use <=2GB iommu areas > in the guest, I need to check with Halil if this is still true. It is partially true. The coherent_dma_mask is 31 bit and the dma_mask is 64. That means if iommu=on but !PV the coherent stuff will use <= 2GB (that stuff allocated by virtio core, like virtqueues, CCWs, etc.) but there will be no bounce buffering. We don't even initialize swiotlb if !PV. I agree with Janosch, we want iommu='on' only when really needed. I've tried to make that point several times. Regards, Halil > > Also, if the default or specified IOMMU buffer size isn't big enough for > your IO workload the guest is gonna have a very bad time. I.e. if > somebody has an alternative implementation of bounce buffers we'd be > happy to take it :) > > > > > Regards, > > Daniel > > > >