Message ID | 20241218113255.232356-1-thuth@redhat.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | tests/functional: Convert the kvm_xen_guest avocado test | expand |
On 18 December 2024 12:32:49 CET, Thomas Huth <thuth@redhat.com> wrote: >Use the serial console to execute the commands in the guest instead >of using ssh since we don't have ssh support in the functional >framework yet. > >Signed-off-by: Thomas Huth <thuth@redhat.com> Hm, but serial is lossy and experience shows that it leads to flaky tests if the guest (or host) misses bytes. While SSH would just go slower.
On Wed, Dec 18, 2024 at 12:48:01PM +0100, David Woodhouse wrote: > On 18 December 2024 12:32:49 CET, Thomas Huth <thuth@redhat.com> wrote: > >Use the serial console to execute the commands in the guest instead > >of using ssh since we don't have ssh support in the functional > >framework yet. > > > >Signed-off-by: Thomas Huth <thuth@redhat.com> > > Hm, but serial is lossy and experience shows that it leads to > flaky tests if the guest (or host) misses bytes. While SSH would just go slower. Practically all of our tests are using the serial console for interaction. QEMU serial port emulation is generally written to stall if the fifo is full, and not throwaway data. With regards, Daniel
On 18/12/2024 12.48, David Woodhouse wrote: > On 18 December 2024 12:32:49 CET, Thomas Huth <thuth@redhat.com> wrote: >> Use the serial console to execute the commands in the guest instead >> of using ssh since we don't have ssh support in the functional >> framework yet. >> >> Signed-off-by: Thomas Huth <thuth@redhat.com> > > Hm, but serial is lossy and experience shows that it leads to flaky tests if the guest (or host) misses bytes. While SSH would just go slower. The issue with the serial console should be fixed since: https://gitlab.com/qemu-project/qemu/-/commit/cdad03b74f759857d784e074755 We didn't see any more issues with all the other tests since that has been merged. Thomas
On 18/12/2024 13.40, Thomas Huth wrote: > On 18/12/2024 12.48, David Woodhouse wrote: >> On 18 December 2024 12:32:49 CET, Thomas Huth <thuth@redhat.com> wrote: >>> Use the serial console to execute the commands in the guest instead >>> of using ssh since we don't have ssh support in the functional >>> framework yet. >>> >>> Signed-off-by: Thomas Huth <thuth@redhat.com> >> >> Hm, but serial is lossy and experience shows that it leads to flaky tests >> if the guest (or host) misses bytes. While SSH would just go slower. > > The issue with the serial console should be fixed since: > > https://gitlab.com/qemu-project/qemu/-/commit/cdad03b74f759857d784e074755 > > We didn't see any more issues with all the other tests since that has been > merged. But FWIW, there seems to be another issue with this test. While running it multiple times, I sometimes see test_kvm_xen_guest_novector_noapic hanging. According to the console output, the guest waits in vain for a device: 2024-12-18 14:32:58,606: Initializing XFRM netlink socket 2024-12-18 14:32:58,607: NET: Registered PF_INET6 protocol family 2024-12-18 14:32:58,609: Segment Routing with IPv6 2024-12-18 14:32:58,609: In-situ OAM (IOAM) with IPv6 2024-12-18 14:32:58,610: NET: Registered PF_PACKET protocol family 2024-12-18 14:32:58,610: 8021q: 802.1Q VLAN Support v1.8 2024-12-18 14:32:58,611: 9pnet: Installing 9P2000 support 2024-12-18 14:32:58,613: NET: Registered PF_VSOCK protocol family 2024-12-18 14:32:58,614: IPI shorthand broadcast: enabled 2024-12-18 14:32:58,619: sched_clock: Marking stable (551147059, -6778955)->(590359530, -45991426) 2024-12-18 14:32:59,507: tsc: Refined TSC clocksource calibration: 2495.952 MHz 2024-12-18 14:32:59,508: clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x23fa49fc138, max_idle_ns: 440795295059 ns 2024-12-18 14:32:59,509: clocksource: Switched to clocksource tsc 2024-12-18 14:33:28,667: xenbus_probe_frontend: Waiting for devices to initialise: 25s...20s...15s...10s...5s...0s... Have you seen this problem before? Thomas
On Wed, 2024-12-18 at 14:38 +0100, Thomas Huth wrote: > On 18/12/2024 13.40, Thomas Huth wrote: > > On 18/12/2024 12.48, David Woodhouse wrote: > > > On 18 December 2024 12:32:49 CET, Thomas Huth <thuth@redhat.com> wrote: > > > > Use the serial console to execute the commands in the guest instead > > > > of using ssh since we don't have ssh support in the functional > > > > framework yet. > > > > > > > > Signed-off-by: Thomas Huth <thuth@redhat.com> > > > > > > Hm, but serial is lossy and experience shows that it leads to flaky tests > > > if the guest (or host) misses bytes. While SSH would just go slower. > > > > The issue with the serial console should be fixed since: > > > > https://gitlab.com/qemu-project/qemu/-/commit/cdad03b74f759857d784e074755 > > > > We didn't see any more issues with all the other tests since that has been > > merged. Fair enough, thanks. In that case: Acked-by: David Woodhouse <dwmw@amazon.co.uk> > But FWIW, there seems to be another issue with this test. While running it > multiple times, I sometimes see test_kvm_xen_guest_novector_noapic hanging. > According to the console output, the guest waits in vain for a device: > > 2024-12-18 14:32:58,606: Initializing XFRM netlink socket > 2024-12-18 14:32:58,607: NET: Registered PF_INET6 protocol family > 2024-12-18 14:32:58,609: Segment Routing with IPv6 > 2024-12-18 14:32:58,609: In-situ OAM (IOAM) with IPv6 > 2024-12-18 14:32:58,610: NET: Registered PF_PACKET protocol family > 2024-12-18 14:32:58,610: 8021q: 802.1Q VLAN Support v1.8 > 2024-12-18 14:32:58,611: 9pnet: Installing 9P2000 support > 2024-12-18 14:32:58,613: NET: Registered PF_VSOCK protocol family > 2024-12-18 14:32:58,614: IPI shorthand broadcast: enabled > 2024-12-18 14:32:58,619: sched_clock: Marking stable (551147059, -6778955)->(590359530, -45991426) > 2024-12-18 14:32:59,507: tsc: Refined TSC clocksource calibration: 2495.952 MHz > 2024-12-18 14:32:59,508: clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x23fa49fc138, max_idle_ns: 440795295059 ns > 2024-12-18 14:32:59,509: clocksource: Switched to clocksource tsc > 2024-12-18 14:33:28,667: xenbus_probe_frontend: Waiting for devices to initialise: 25s...20s...15s...10s...5s...0s... > > Have you seen this problem before? That seems like event channel interrupts aren't being routed to the legacy i8259 PIC. I've certainly seen that kind of thing before, especially when asserted level-triggered interrupts weren't correctly being asserted. But I don't expect that of QEMU. I'll see if I can reproduce; thanks. How often does it happen?
On 18/12/2024 12.48, David Woodhouse wrote: > On 18 December 2024 12:32:49 CET, Thomas Huth <thuth@redhat.com> wrote: >> Use the serial console to execute the commands in the guest instead >> of using ssh since we don't have ssh support in the functional >> framework yet. >> >> Signed-off-by: Thomas Huth <thuth@redhat.com> > > Hm, but serial is lossy and experience shows that it leads to flaky tests if the guest (or host) misses bytes. While SSH would just go slower. I now noticed some issue with the serial console in this test, too. Looks like the "Starting dropbear sshd: OK" is not print in an atomic way by the guest, sometimes there are other kernel messages between the ":" and the "OK". It works reliable when removing the "OK" from the string. Thomas
On 18/12/2024 15.11, David Woodhouse wrote: > On Wed, 2024-12-18 at 14:38 +0100, Thomas Huth wrote: ... >> But FWIW, there seems to be another issue with this test. While running it >> multiple times, I sometimes see test_kvm_xen_guest_novector_noapic hanging. >> According to the console output, the guest waits in vain for a device: >> >> 2024-12-18 14:32:58,606: Initializing XFRM netlink socket >> 2024-12-18 14:32:58,607: NET: Registered PF_INET6 protocol family >> 2024-12-18 14:32:58,609: Segment Routing with IPv6 >> 2024-12-18 14:32:58,609: In-situ OAM (IOAM) with IPv6 >> 2024-12-18 14:32:58,610: NET: Registered PF_PACKET protocol family >> 2024-12-18 14:32:58,610: 8021q: 802.1Q VLAN Support v1.8 >> 2024-12-18 14:32:58,611: 9pnet: Installing 9P2000 support >> 2024-12-18 14:32:58,613: NET: Registered PF_VSOCK protocol family >> 2024-12-18 14:32:58,614: IPI shorthand broadcast: enabled >> 2024-12-18 14:32:58,619: sched_clock: Marking stable (551147059, -6778955)->(590359530, -45991426) >> 2024-12-18 14:32:59,507: tsc: Refined TSC clocksource calibration: 2495.952 MHz >> 2024-12-18 14:32:59,508: clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x23fa49fc138, max_idle_ns: 440795295059 ns >> 2024-12-18 14:32:59,509: clocksource: Switched to clocksource tsc >> 2024-12-18 14:33:28,667: xenbus_probe_frontend: Waiting for devices to initialise: 25s...20s...15s...10s...5s...0s... >> >> Have you seen this problem before? > > That seems like event channel interrupts aren't being routed to the > legacy i8259 PIC. I've certainly seen that kind of thing before, > especially when asserted level-triggered interrupts weren't correctly > being asserted. But I don't expect that of QEMU. I'll see if I can > reproduce; thanks. > > How often does it happen? With the new functional test, it happens maybe 2 times out of 100 test runs. I wasn't able to reproduce it with the avocado version yet, but that also runs 10x slower, so it takes a longer time to get to that many runs... Thomas
On 18/12/2024 17.19, Thomas Huth wrote: > On 18/12/2024 15.11, David Woodhouse wrote: >> On Wed, 2024-12-18 at 14:38 +0100, Thomas Huth wrote: > ... >>> But FWIW, there seems to be another issue with this test. While running it >>> multiple times, I sometimes see test_kvm_xen_guest_novector_noapic hanging. >>> According to the console output, the guest waits in vain for a device: >>> >>> 2024-12-18 14:32:58,606: Initializing XFRM netlink socket >>> 2024-12-18 14:32:58,607: NET: Registered PF_INET6 protocol family >>> 2024-12-18 14:32:58,609: Segment Routing with IPv6 >>> 2024-12-18 14:32:58,609: In-situ OAM (IOAM) with IPv6 >>> 2024-12-18 14:32:58,610: NET: Registered PF_PACKET protocol family >>> 2024-12-18 14:32:58,610: 8021q: 802.1Q VLAN Support v1.8 >>> 2024-12-18 14:32:58,611: 9pnet: Installing 9P2000 support >>> 2024-12-18 14:32:58,613: NET: Registered PF_VSOCK protocol family >>> 2024-12-18 14:32:58,614: IPI shorthand broadcast: enabled >>> 2024-12-18 14:32:58,619: sched_clock: Marking stable (551147059, >>> -6778955)->(590359530, -45991426) >>> 2024-12-18 14:32:59,507: tsc: Refined TSC clocksource calibration: >>> 2495.952 MHz >>> 2024-12-18 14:32:59,508: clocksource: tsc: mask: 0xffffffffffffffff >>> max_cycles: 0x23fa49fc138, max_idle_ns: 440795295059 ns >>> 2024-12-18 14:32:59,509: clocksource: Switched to clocksource tsc >>> 2024-12-18 14:33:28,667: xenbus_probe_frontend: Waiting for devices to >>> initialise: 25s...20s...15s...10s...5s...0s... >>> >>> Have you seen this problem before? >> >> That seems like event channel interrupts aren't being routed to the >> legacy i8259 PIC. I've certainly seen that kind of thing before, >> especially when asserted level-triggered interrupts weren't correctly >> being asserted. But I don't expect that of QEMU. I'll see if I can >> reproduce; thanks. >> >> How often does it happen? > > With the new functional test, it happens maybe 2 times out of 100 test runs. > > I wasn't able to reproduce it with the avocado version yet, but that also > runs 10x slower, so it takes a longer time to get to that many runs... Ok, FWIW, I've now also seen the problem with the old avocado version of the test, so it's nothing that has been introduced by my patch. I just had to downgrade to Avocado v88 again since the current version v103 does not seem to correctly output the console anymore :-/ (which is another good indicator that we really need to get the stuff moved over to the functional framework now). Thomas
On 18 December 2024 18:16:13 CET, Thomas Huth <thuth@redhat.com> wrote: >On 18/12/2024 17.19, Thomas Huth wrote: >> On 18/12/2024 15.11, David Woodhouse wrote: >>> On Wed, 2024-12-18 at 14:38 +0100, Thomas Huth wrote: >> ... >>>> But FWIW, there seems to be another issue with this test. While running it >>>> multiple times, I sometimes see test_kvm_xen_guest_novector_noapic hanging. >>>> According to the console output, the guest waits in vain for a device: >>>> >>>> 2024-12-18 14:32:58,606: Initializing XFRM netlink socket >>>> 2024-12-18 14:32:58,607: NET: Registered PF_INET6 protocol family >>>> 2024-12-18 14:32:58,609: Segment Routing with IPv6 >>>> 2024-12-18 14:32:58,609: In-situ OAM (IOAM) with IPv6 >>>> 2024-12-18 14:32:58,610: NET: Registered PF_PACKET protocol family >>>> 2024-12-18 14:32:58,610: 8021q: 802.1Q VLAN Support v1.8 >>>> 2024-12-18 14:32:58,611: 9pnet: Installing 9P2000 support >>>> 2024-12-18 14:32:58,613: NET: Registered PF_VSOCK protocol family >>>> 2024-12-18 14:32:58,614: IPI shorthand broadcast: enabled >>>> 2024-12-18 14:32:58,619: sched_clock: Marking stable (551147059, -6778955)->(590359530, -45991426) >>>> 2024-12-18 14:32:59,507: tsc: Refined TSC clocksource calibration: 2495.952 MHz >>>> 2024-12-18 14:32:59,508: clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x23fa49fc138, max_idle_ns: 440795295059 ns >>>> 2024-12-18 14:32:59,509: clocksource: Switched to clocksource tsc >>>> 2024-12-18 14:33:28,667: xenbus_probe_frontend: Waiting for devices to initialise: 25s...20s...15s...10s...5s...0s... >>>> >>>> Have you seen this problem before? >>> >>> That seems like event channel interrupts aren't being routed to the >>> legacy i8259 PIC. I've certainly seen that kind of thing before, >>> especially when asserted level-triggered interrupts weren't correctly >>> being asserted. But I don't expect that of QEMU. I'll see if I can >>> reproduce; thanks. >>> >>> How often does it happen? >> >> With the new functional test, it happens maybe 2 times out of 100 test runs. >> >> I wasn't able to reproduce it with the avocado version yet, but that also runs 10x slower, so it takes a longer time to get to that many runs... > >Ok, FWIW, I've now also seen the problem with the old avocado version of the test, so it's nothing that has been introduced by my patch. I just had to downgrade to Avocado v88 again since the current version v103 does not seem to correctly output the console anymore :-/ (which is another good indicator that we really need to get the stuff moved over to the functional framework now). > > Thomas > I have reproduced it, will look into it. I'm fairly sure this was all working reliably at the time the Xen support was merged; that's why I wrote these test cases after all.
On Wed, 2024-12-18 at 17:19 +0100, Thomas Huth wrote: > On 18/12/2024 15.11, David Woodhouse wrote: > > On Wed, 2024-12-18 at 14:38 +0100, Thomas Huth wrote: > ... > > > But FWIW, there seems to be another issue with this test. While running it > > > multiple times, I sometimes see test_kvm_xen_guest_novector_noapic hanging. > > > According to the console output, the guest waits in vain for a device: > > > > > > 2024-12-18 14:32:58,606: Initializing XFRM netlink socket > > > 2024-12-18 14:32:58,607: NET: Registered PF_INET6 protocol family > > > 2024-12-18 14:32:58,609: Segment Routing with IPv6 > > > 2024-12-18 14:32:58,609: In-situ OAM (IOAM) with IPv6 > > > 2024-12-18 14:32:58,610: NET: Registered PF_PACKET protocol family > > > 2024-12-18 14:32:58,610: 8021q: 802.1Q VLAN Support v1.8 > > > 2024-12-18 14:32:58,611: 9pnet: Installing 9P2000 support > > > 2024-12-18 14:32:58,613: NET: Registered PF_VSOCK protocol family > > > 2024-12-18 14:32:58,614: IPI shorthand broadcast: enabled > > > 2024-12-18 14:32:58,619: sched_clock: Marking stable (551147059, -6778955)->(590359530, -45991426) > > > 2024-12-18 14:32:59,507: tsc: Refined TSC clocksource calibration: 2495.952 MHz > > > 2024-12-18 14:32:59,508: clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x23fa49fc138, max_idle_ns: 440795295059 ns > > > 2024-12-18 14:32:59,509: clocksource: Switched to clocksource tsc > > > 2024-12-18 14:33:28,667: xenbus_probe_frontend: Waiting for devices to initialise: 25s...20s...15s...10s...5s...0s... > > > > > > Have you seen this problem before? > > > > That seems like event channel interrupts aren't being routed to the > > legacy i8259 PIC. I've certainly seen that kind of thing before, > > especially when asserted level-triggered interrupts weren't correctly > > being asserted. But I don't expect that of QEMU. I'll see if I can > > reproduce; thanks. > > > > How often does it happen? > > With the new functional test, it happens maybe 2 times out of 100 test runs. > > I wasn't able to reproduce it with the avocado version yet, but that also > runs 10x slower, so it takes a longer time to get to that many runs... I can reproduce it probably about one in ten attempts. It seems like it's because of the way QEMU handles shared level- triggered interrupts. We kind of work around it with PCI INTx demultiplexing, but the Xen guest explicitly asks for the interrupt to be delivered to INT10. So it asserts INT10, then I suspect something on the PCI deasserts it, and the Xen interrupt is lost. The guest configures the Xen event channel IRQ to be delivered on IRQ10, but IRQ10 is *also* a PCI INTX of some device. So if the PCI device *clears* IRQ10 at the wrong moment, we miss a Xen interrupt... which means we end up waiting for ever. We *really* ought to do this with a callback when the interrupt is acked in the PIC/IOAPIC, and any interrupt source which wants it to be still asserted can reassert it from that callback, which is precisely how the vfio eventfd 'resampler' works. Or *should* work, if QEMU was actually capable of working that way. I'll see if I can come up with a workaround... or whether I fall into the rabbithole of fixing the overall level interrupt stuff.
On Wed, 2024-12-18 at 22:42 +0100, David Woodhouse wrote: > > It seems like it's because of the way QEMU handles shared level- > triggered interrupts. Yeah, this hack seems to confirm it. As I said, PCI INTx manages to demux correctly, but any time you have non-PCI interrupt sharing, it's hosed because they all just set/clear the GSI as if they own it, and there's no OR gate in sight. Now I have to decide if this is going to provoke me into attempting to fix it for the general case with callbacks and fixing VFIO resampling too, or whether I paper over it for QEMU with something *slightly* less icky than this (which ideally would not lose levels from PCI devices either)... diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index 07bd0c9ab8..4c2e8876e5 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -301,7 +301,20 @@ static void gsi_assert_bh(void *opaque) xen_evtchn_set_callback_level(!!vi->evtchn_upcall_pending); } } - +int xen_evtchn_check_gsi(int n, int level) +{ + struct vcpu_info *vi = kvm_xen_get_vcpu_info_hva(0); + XenEvtchnState *s = xen_evtchn_singleton; + if (!s || n != s->callback_gsi || !vi) { + return level; + } + if (vi->evtchn_upcall_pending && !level) { + printf("Refusing to deassert GSI#%d which is asserted by Xen\n", + n); + return 1; + } + return level; +} void xen_evtchn_create(unsigned int nr_gsis, qemu_irq *system_gsis) { XenEvtchnState *s = XEN_EVTCHN(sysbus_create_simple(TYPE_XEN_EVTCHN, diff --git a/hw/i386/kvm/xen_evtchn.h b/hw/i386/kvm/xen_evtchn.h index b740acfc0d..c1f56869b3 100644 --- a/hw/i386/kvm/xen_evtchn.h +++ b/hw/i386/kvm/xen_evtchn.h @@ -31,6 +31,7 @@ struct kvm_irq_routing_entry; int xen_evtchn_translate_pirq_msi(struct kvm_irq_routing_entry *route, uint64_t address, uint32_t data); bool xen_evtchn_deliver_pirq_msi(uint64_t address, uint32_t data); +int xen_evtchn_check_gsi(int n, int level); /* diff --git a/hw/i386/x86-common.c b/hw/i386/x86-common.c index dc031af662..4185f467ee 100644 --- a/hw/i386/x86-common.c +++ b/hw/i386/x86-common.c @@ -452,6 +452,7 @@ void gsi_handler(void *opaque, int n, int level) GSIState *s = opaque; trace_x86_gsi_interrupt(n, level); + level = xen_evtchn_check_gsi(n, level); switch (n) { case 0 ... ISA_NUM_IRQS - 1: if (s->i8259_irq[n]) {
On Wed, 2024-12-18 at 16:54 +0100, Thomas Huth wrote: > On 18/12/2024 12.48, David Woodhouse wrote: > > On 18 December 2024 12:32:49 CET, Thomas Huth <thuth@redhat.com> wrote: > > > Use the serial console to execute the commands in the guest instead > > > of using ssh since we don't have ssh support in the functional > > > framework yet. > > > > > > Signed-off-by: Thomas Huth <thuth@redhat.com> > > > > Hm, but serial is lossy and experience shows that it leads to flaky tests if the guest (or host) misses bytes. While SSH would just go slower. > > I now noticed some issue with the serial console in this test, too. > Looks like the "Starting dropbear sshd: OK" is not print in an atomic way by > the guest, sometimes there are other kernel messages between the ":" and the > "OK". It works reliable when removing the "OK" from the string. Nah, that still isn't atomic; you just got lucky because the race window is smaller. It's not like serial ports are at a premium; can't you have a separate port for kernel vs. userspace messages?
On 18/12/2024 23.14, David Woodhouse wrote: > On Wed, 2024-12-18 at 16:54 +0100, Thomas Huth wrote: >> On 18/12/2024 12.48, David Woodhouse wrote: >>> On 18 December 2024 12:32:49 CET, Thomas Huth <thuth@redhat.com> wrote: >>>> Use the serial console to execute the commands in the guest instead >>>> of using ssh since we don't have ssh support in the functional >>>> framework yet. >>>> >>>> Signed-off-by: Thomas Huth <thuth@redhat.com> >>> >>> Hm, but serial is lossy and experience shows that it leads to flaky tests if the guest (or host) misses bytes. While SSH would just go slower. >> >> I now noticed some issue with the serial console in this test, too. >> Looks like the "Starting dropbear sshd: OK" is not print in an atomic way by >> the guest, sometimes there are other kernel messages between the ":" and the >> "OK". It works reliable when removing the "OK" from the string. > > Nah, that still isn't atomic; you just got lucky because the race > window is smaller. It's not like serial ports are at a premium; can't > you have a separate port for kernel vs. userspace messages? Maybe easiest solution: Simply add "quiet" to the kernel command line, then it does not write the kernel messages to the serial console anymore. Thomas
On 19 December 2024 09:35:13 CET, Thomas Huth <thuth@redhat.com> wrote: >On 18/12/2024 23.14, David Woodhouse wrote: >> On Wed, 2024-12-18 at 16:54 +0100, Thomas Huth wrote: >>> On 18/12/2024 12.48, David Woodhouse wrote: >>>> On 18 December 2024 12:32:49 CET, Thomas Huth <thuth@redhat.com> wrote: >>>>> Use the serial console to execute the commands in the guest instead >>>>> of using ssh since we don't have ssh support in the functional >>>>> framework yet. >>>>> >>>>> Signed-off-by: Thomas Huth <thuth@redhat.com> >>>> >>>> Hm, but serial is lossy and experience shows that it leads to flaky tests if the guest (or host) misses bytes. While SSH would just go slower. >>> >>> I now noticed some issue with the serial console in this test, too. >>> Looks like the "Starting dropbear sshd: OK" is not print in an atomic way by >>> the guest, sometimes there are other kernel messages between the ":" and the >>> "OK". It works reliable when removing the "OK" from the string. >> >> Nah, that still isn't atomic; you just got lucky because the race >> window is smaller. It's not like serial ports are at a premium; can't >> you have a separate port for kernel vs. userspace messages? > >Maybe easiest solution: Simply add "quiet" to the kernel command line, then it does not write the kernel messages to the serial console anymore. Want to resend the bug report about that test failing again? But without the kernel messages this time... :)
On 19/12/2024 09.49, David Woodhouse wrote: > On 19 December 2024 09:35:13 CET, Thomas Huth <thuth@redhat.com> wrote: >> On 18/12/2024 23.14, David Woodhouse wrote: >>> On Wed, 2024-12-18 at 16:54 +0100, Thomas Huth wrote: >>>> On 18/12/2024 12.48, David Woodhouse wrote: >>>>> On 18 December 2024 12:32:49 CET, Thomas Huth <thuth@redhat.com> wrote: >>>>>> Use the serial console to execute the commands in the guest instead >>>>>> of using ssh since we don't have ssh support in the functional >>>>>> framework yet. >>>>>> >>>>>> Signed-off-by: Thomas Huth <thuth@redhat.com> >>>>> >>>>> Hm, but serial is lossy and experience shows that it leads to flaky tests if the guest (or host) misses bytes. While SSH would just go slower. >>>> >>>> I now noticed some issue with the serial console in this test, too. >>>> Looks like the "Starting dropbear sshd: OK" is not print in an atomic way by >>>> the guest, sometimes there are other kernel messages between the ":" and the >>>> "OK". It works reliable when removing the "OK" from the string. >>> >>> Nah, that still isn't atomic; you just got lucky because the race >>> window is smaller. It's not like serial ports are at a premium; can't >>> you have a separate port for kernel vs. userspace messages? >> >> Maybe easiest solution: Simply add "quiet" to the kernel command line, then it does not write the kernel messages to the serial console anymore. > > Want to resend the bug report about that test failing again? But without the kernel messages this time... :) With "quiet", the output just looks like this when it hangs: Cannot get hvm parameter CONSOLE_EVTCHN (18): -22! Spectre V2 : Kernel not compiled with retpoline; no mitigation available! kvm_intel: VMX not supported by CPU 0 Cannot get hvm parameter CONSOLE_EVTCHN (18): -22! fail to initialize ptp_kvm Anyway, to properly track this, I've now created a ticket with the full log: https://gitlab.com/qemu-project/qemu/-/issues/2731 Thomas
On Thu, 2024-12-19 at 13:24 +0100, Thomas Huth wrote: > On 19/12/2024 09.49, David Woodhouse wrote: > > On 19 December 2024 09:35:13 CET, Thomas Huth <thuth@redhat.com> wrote: > > > On 18/12/2024 23.14, David Woodhouse wrote: > > > > On Wed, 2024-12-18 at 16:54 +0100, Thomas Huth wrote: > > > > > On 18/12/2024 12.48, David Woodhouse wrote: > > > > > > On 18 December 2024 12:32:49 CET, Thomas Huth <thuth@redhat.com> wrote: > > > > > > > Use the serial console to execute the commands in the guest instead > > > > > > > of using ssh since we don't have ssh support in the functional > > > > > > > framework yet. > > > > > > > > > > > > > > Signed-off-by: Thomas Huth <thuth@redhat.com> > > > > > > > > > > > > Hm, but serial is lossy and experience shows that it leads to flaky tests if the guest (or host) misses bytes. While SSH would just go slower. > > > > > > > > > > I now noticed some issue with the serial console in this test, too. > > > > > Looks like the "Starting dropbear sshd: OK" is not print in an atomic way by > > > > > the guest, sometimes there are other kernel messages between the ":" and the > > > > > "OK". It works reliable when removing the "OK" from the string. > > > > > > > > Nah, that still isn't atomic; you just got lucky because the race > > > > window is smaller. It's not like serial ports are at a premium; can't > > > > you have a separate port for kernel vs. userspace messages? > > > > > > Maybe easiest solution: Simply add "quiet" to the kernel command line, then it does not write the kernel messages to the serial console anymore. > > > > Want to resend the bug report about that test failing again? But without the kernel messages this time... :) > > With "quiet", the output just looks like this when it hangs: > > Cannot get hvm parameter CONSOLE_EVTCHN (18): -22! > Spectre V2 : Kernel not compiled with retpoline; no mitigation available! > kvm_intel: VMX not supported by CPU 0 > Cannot get hvm parameter CONSOLE_EVTCHN (18): -22! > fail to initialize ptp_kvm Yeah, that request was rhetorical. That output is useless for understanding anything about what happened. > Anyway, to properly track this, I've now created a ticket with the full log: > > https://gitlab.com/qemu-project/qemu/-/issues/2731 The patch below should fix it. I don't like it very much; it's very much papering over a much bigger generic problem with QEMU's handling of shared interrupts. Basically, *nothing* should just directly set the system GSIs to "their" desired level with qemu_set_irq(). Each device should feed into a multiplexer which is essentially an OR gate, and the *output* of that mux goes into the actual GSI. Alternatively, when the interrupt line is acked at the interrupt controller, we should do a callback to query whether that line should still be asserted (which is exactly how VFIO's resampler should work, but we can't use it correctly from QEMU and currently just invoke it for *every* MMIO access to a VFIO device!). This just implements the logical OR for the Xen event channel callback GSI, since the Xen code *already* has a hook in the x86 gsi_handler() function to capture external interrupts being routed to event channel PIRQs. So we just shuffle that a little and let it 'adjust' the level that's being set. diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index 07bd0c9ab8..e3bd48d954 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -140,6 +140,8 @@ struct XenEvtchnState { uint64_t callback_param; bool evtchn_in_kernel; + bool setting_callback_gsi; + int extern_gsi_level; uint32_t callback_gsi; QEMUBH *gsi_bh; @@ -431,7 +433,9 @@ void xen_evtchn_set_callback_level(int level) } if (s->callback_gsi && s->callback_gsi < s->nr_callback_gsis) { - qemu_set_irq(s->callback_gsis[s->callback_gsi], level); + s->setting_callback_gsi = true; + qemu_set_irq(s->callback_gsis[s->callback_gsi], level || s->extern_gsi_level); + s->setting_callback_gsi = false; if (level) { /* Ensure the vCPU polls for deassertion */ kvm_xen_set_callback_asserted(); @@ -1596,7 +1600,7 @@ static int allocate_pirq(XenEvtchnState *s, int type, int gsi) return pirq; } -bool xen_evtchn_set_gsi(int gsi, int level) +bool xen_evtchn_set_gsi(int gsi, int *level) { XenEvtchnState *s = xen_evtchn_singleton; int pirq; @@ -1608,16 +1612,29 @@ bool xen_evtchn_set_gsi(int gsi, int level) } /* - * Check that that it *isn't* the event channel GSI, and thus - * that we are not recursing and it's safe to take s->port_lock. + * For the callback_gsi we need to implement a logical OR of the event + * channel GSI and the external input (e.g. from PCI INTx), because + * QEMU itself doesn't support shared level interrupts via demux or + * resamplers. * - * Locking aside, it's perfectly sane to bail out early for that - * special case, as it would make no sense for the event channel - * GSI to be routed back to event channels, when the delivery - * method is to raise the GSI... that recursion wouldn't *just* - * be a locking issue. + * Remember the level which is being set by *other* callers. + * + * The event channel GSI cannot be routed to PIRQ, as that would make + * no sense. It would also be a locking problem so bail out early and + * don't potentially deadlock on s->port_lock. */ if (gsi && gsi == s->callback_gsi) { + /* Remember the external state of the GSI pin (e.g. from PCI INTx) */ + if (!s->setting_callback_gsi) { + s->extern_gsi_level = *level; + if (!s->extern_gsi_level) { + struct vcpu_info *vi = kvm_xen_get_vcpu_info_hva(0); + if (vi && vi->evtchn_upcall_pending) { + *level = 1; + } + } + } + return false; } diff --git a/hw/i386/kvm/xen_evtchn.h b/hw/i386/kvm/xen_evtchn.h index b740acfc0d..0521ebc092 100644 --- a/hw/i386/kvm/xen_evtchn.h +++ b/hw/i386/kvm/xen_evtchn.h @@ -23,7 +23,7 @@ void xen_evtchn_set_callback_level(int level); int xen_evtchn_set_port(uint16_t port); -bool xen_evtchn_set_gsi(int gsi, int level); +bool xen_evtchn_set_gsi(int gsi, int *level); void xen_evtchn_snoop_msi(PCIDevice *dev, bool is_msix, unsigned int vector, uint64_t addr, uint32_t data, bool is_masked); void xen_evtchn_remove_pci_device(PCIDevice *dev); diff --git a/hw/i386/x86-common.c b/hw/i386/x86-common.c index dc031af662..77acd331ee 100644 --- a/hw/i386/x86-common.c +++ b/hw/i386/x86-common.c @@ -450,28 +450,36 @@ static long get_file_size(FILE *f) void gsi_handler(void *opaque, int n, int level) { GSIState *s = opaque; + bool bypass_ioapic = false; trace_x86_gsi_interrupt(n, level); - switch (n) { - case 0 ... ISA_NUM_IRQS - 1: - if (s->i8259_irq[n]) { - /* Under KVM, Kernel will forward to both PIC and IOAPIC */ - qemu_set_irq(s->i8259_irq[n], level); - } - /* fall through */ - case ISA_NUM_IRQS ... IOAPIC_NUM_PINS - 1: + #ifdef CONFIG_XEN_EMU /* * Xen delivers the GSI to the Legacy PIC (not that Legacy PIC * routing actually works properly under Xen). And then to * *either* the PIRQ handling or the I/OAPIC depending on * whether the former wants it. + * + * Additionally, this hook allows the Xen event channel GSI to + * work around QEMU's lack of support for shared level interrupts, + * by keeping track of the externally driven state of the pin and + * implementing a logical OR with the state of the evtchn GSI. */ - if (xen_mode == XEN_EMULATE && xen_evtchn_set_gsi(n, level)) { - break; - } + if (xen_mode == XEN_EMULATE) + bypass_ioapic = xen_evtchn_set_gsi(n, &level); #endif - qemu_set_irq(s->ioapic_irq[n], level); + + switch (n) { + case 0 ... ISA_NUM_IRQS - 1: + if (s->i8259_irq[n]) { + /* Under KVM, Kernel will forward to both PIC and IOAPIC */ + qemu_set_irq(s->i8259_irq[n], level); + } + /* fall through */ + case ISA_NUM_IRQS ... IOAPIC_NUM_PINS - 1: + if (!bypass_ioapic) + qemu_set_irq(s->ioapic_irq[n], level); break; case IO_APIC_SECONDARY_IRQBASE ... IO_APIC_SECONDARY_IRQBASE + IOAPIC_NUM_PINS - 1:
On 12/19/24 04:56, David Woodhouse wrote: > On Thu, 2024-12-19 at 13:24 +0100, Thomas Huth wrote: >> On 19/12/2024 09.49, David Woodhouse wrote: >>> On 19 December 2024 09:35:13 CET, Thomas Huth <thuth@redhat.com> wrote: >>>> On 18/12/2024 23.14, David Woodhouse wrote: >>>>> On Wed, 2024-12-18 at 16:54 +0100, Thomas Huth wrote: >>>>>> On 18/12/2024 12.48, David Woodhouse wrote: >>>>>>> On 18 December 2024 12:32:49 CET, Thomas Huth <thuth@redhat.com> wrote: >>>>>>>> Use the serial console to execute the commands in the guest instead >>>>>>>> of using ssh since we don't have ssh support in the functional >>>>>>>> framework yet. >>>>>>>> >>>>>>>> Signed-off-by: Thomas Huth <thuth@redhat.com> >>>>>>> >>>>>>> Hm, but serial is lossy and experience shows that it leads to flaky tests if the guest (or host) misses bytes. While SSH would just go slower. >>>>>> >>>>>> I now noticed some issue with the serial console in this test, too. >>>>>> Looks like the "Starting dropbear sshd: OK" is not print in an atomic way by >>>>>> the guest, sometimes there are other kernel messages between the ":" and the >>>>>> "OK". It works reliable when removing the "OK" from the string. >>>>> >>>>> Nah, that still isn't atomic; you just got lucky because the race >>>>> window is smaller. It's not like serial ports are at a premium; can't >>>>> you have a separate port for kernel vs. userspace messages? >>>> >>>> Maybe easiest solution: Simply add "quiet" to the kernel command line, then it does not write the kernel messages to the serial console anymore. >>> >>> Want to resend the bug report about that test failing again? But without the kernel messages this time... :) >> >> With "quiet", the output just looks like this when it hangs: >> >> Cannot get hvm parameter CONSOLE_EVTCHN (18): -22! >> Spectre V2 : Kernel not compiled with retpoline; no mitigation available! >> kvm_intel: VMX not supported by CPU 0 >> Cannot get hvm parameter CONSOLE_EVTCHN (18): -22! >> fail to initialize ptp_kvm > > Yeah, that request was rhetorical. That output is useless for > understanding anything about what happened. > >> Anyway, to properly track this, I've now created a ticket with the full log: >> >> https://gitlab.com/qemu-project/qemu/-/issues/2731 > > The patch below should fix it. I don't like it very much; it's very > much papering over a much bigger generic problem with QEMU's handling > of shared interrupts. > > Basically, *nothing* should just directly set the system GSIs to > "their" desired level with qemu_set_irq(). Each device should feed into > a multiplexer which is essentially an OR gate, and the *output* of that > mux goes into the actual GSI. We have such a device: include/hw/or-irq.h. How simple it is to wire that into this machine model is left unexplored. r~
On Thu, 2024-12-19 at 10:05 -0800, Richard Henderson wrote: > On 12/19/24 04:56, David Woodhouse wrote: > > On Thu, 2024-12-19 at 13:24 +0100, Thomas Huth wrote: > > > > > Anyway, to properly track this, I've now created a ticket with the full log: > > > > > > https://gitlab.com/qemu-project/qemu/-/issues/2731 > > > > The patch below should fix it. I don't like it very much; it's very > > much papering over a much bigger generic problem with QEMU's handling > > of shared interrupts. > > > > Basically, *nothing* should just directly set the system GSIs to > > "their" desired level with qemu_set_irq(). Each device should feed into > > a multiplexer which is essentially an OR gate, and the *output* of that > > mux goes into the actual GSI. > > We have such a device: include/hw/or-irq.h. > How simple it is to wire that into this machine model is left > unexplored. It's not trivial; I think every source feeding interrupts into x86ms->gsi[] would need to have its own input into one of these OrIRQ devices. At which point parts of the PCI bus INTx routing start to look a bit pointless since it's mostly just implementing that function (and the SCI IRQ for ICH9 is also manually OR'd in). I wouldn't be entirely averse to embarking on that journey, but it *still* wouldn't fix VFIO. For that we really do want the other mode, of invoking callbacks to reassert the IRQ if the device still wants attention. That would actually be a lot nicer for the Xen GSI case too, as we'd just resample when the IRQ is acked at the APIC, rather than constantly having to check whether we should deassert it. And doing *that* one is even more yak shaving. And would also lead to fixing the whole MSI routing mess to get cached translations right... For now, given that the Xen code already *had* a hook in gsi_handler, I think I can live with the patch I posted.
diff --git a/MAINTAINERS b/MAINTAINERS index 430a0f4f8c..389b390de1 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -484,7 +484,7 @@ S: Supported F: include/sysemu/kvm_xen.h F: target/i386/kvm/xen* F: hw/i386/kvm/xen* -F: tests/avocado/kvm_xen_guest.py +F: tests/functional/test_x86_64_kvm_xen.py Guest CPU Cores (other accelerators) ------------------------------------ diff --git a/tests/functional/meson.build b/tests/functional/meson.build index 16d4bd903f..8c3d1c26da 100644 --- a/tests/functional/meson.build +++ b/tests/functional/meson.build @@ -42,6 +42,7 @@ test_timeouts = { 'riscv64_tuxrun' : 120, 's390x_ccw_virtio' : 420, 'sh4_tuxrun' : 240, + 'x86_64_kvm_xen' : 180, } tests_generic_system = [ @@ -254,6 +255,7 @@ tests_x86_64_system_thorough = [ 'netdev_ethtool', 'virtio_gpu', 'x86_64_hotplug_cpu', + 'x86_64_kvm_xen', 'x86_64_tuxrun', ] diff --git a/tests/avocado/kvm_xen_guest.py b/tests/functional/test_x86_64_kvm_xen.py old mode 100644 new mode 100755 similarity index 64% rename from tests/avocado/kvm_xen_guest.py rename to tests/functional/test_x86_64_kvm_xen.py index f8cb458d5d..849b4ecfc2 --- a/tests/avocado/kvm_xen_guest.py +++ b/tests/functional/test_x86_64_kvm_xen.py @@ -1,3 +1,5 @@ +#!/usr/bin/env python3 +# # KVM Xen guest functional tests # # Copyright © 2021 Red Hat, Inc. @@ -13,17 +15,10 @@ from qemu.machine import machine -from avocado_qemu import LinuxSSHMixIn -from avocado_qemu import QemuSystemTest -from avocado_qemu import wait_for_console_pattern +from qemu_test import QemuSystemTest, Asset, exec_command_and_wait_for_pattern +from qemu_test import wait_for_console_pattern -class KVMXenGuest(QemuSystemTest, LinuxSSHMixIn): - """ - :avocado: tags=arch:x86_64 - :avocado: tags=machine:q35 - :avocado: tags=accel:kvm - :avocado: tags=kvm_xen_guest - """ +class KVMXenGuest(QemuSystemTest): KERNEL_DEFAULT = 'printk.time=0 root=/dev/xvda console=ttyS0' @@ -33,14 +28,15 @@ class KVMXenGuest(QemuSystemTest, LinuxSSHMixIn): # Fetch assets from the kvm-xen-guest subdir of my shared test # images directory on fileserver.linaro.org where you can find # build instructions for how they where assembled. - def get_asset(self, name, sha1): - base_url = ('https://fileserver.linaro.org/s/' - 'kE4nCFLdQcoBF9t/download?' - 'path=%2Fkvm-xen-guest&files=' ) - url = base_url + name - # use explicit name rather than failing to neatly parse the - # URL into a unique one - return self.fetch_asset(name=name, locations=(url), asset_hash=sha1) + ASSET_KERNEL = Asset( + ('https://fileserver.linaro.org/s/kE4nCFLdQcoBF9t/download?' + 'path=%2Fkvm-xen-guest&files=bzImage'), + 'ec0ad7bb8c33c5982baee0a75505fe7dbf29d3ff5d44258204d6307c6fe0132a') + + ASSET_ROOTFS = Asset( + ('https://fileserver.linaro.org/s/kE4nCFLdQcoBF9t/download?' + 'path=%2Fkvm-xen-guest&files=rootfs.ext4'), + 'b11045d649006c649c184e93339aaa41a8fe20a1a86620af70323252eb29e40b') def common_vm_setup(self): # We also catch lack of KVM_XEN support if we fail to launch @@ -51,10 +47,8 @@ def common_vm_setup(self): self.vm.add_args("-accel", "kvm,xen-version=0x4000a,kernel-irqchip=split") self.vm.add_args("-smp", "2") - self.kernel_path = self.get_asset("bzImage", - "367962983d0d32109998a70b45dcee4672d0b045") - self.rootfs = self.get_asset("rootfs.ext4", - "f1478401ea4b3fa2ea196396be44315bab2bb5e4") + self.kernel_path = self.ASSET_KERNEL.fetch() + self.rootfs = self.ASSET_ROOTFS.fetch() def run_and_check(self): self.vm.add_args('-kernel', self.kernel_path, @@ -79,10 +73,11 @@ def run_and_check(self): console_pattern = 'Starting dropbear sshd: OK' wait_for_console_pattern(self, console_pattern, 'Oops') self.log.info('sshd ready') - self.ssh_connect('root', '', False) - self.ssh_command('cat /proc/cmdline') - self.ssh_command('dmesg | grep -e "Grant table initialized"') + exec_command_and_wait_for_pattern(self, 'cat /proc/cmdline', 'xen') + exec_command_and_wait_for_pattern(self, 'dmesg | grep "Grant table"', + 'Grant table initialized') + wait_for_console_pattern(self, '#', 'Oops') def test_kvm_xen_guest(self): """ @@ -94,7 +89,9 @@ def test_kvm_xen_guest(self): self.kernel_params = (self.KERNEL_DEFAULT + ' xen_emul_unplug=ide-disks') self.run_and_check() - self.ssh_command('grep xen-pirq.*msi /proc/interrupts') + exec_command_and_wait_for_pattern(self, + 'grep xen-pirq.*msi /proc/interrupts', + 'virtio0-output') def test_kvm_xen_guest_nomsi(self): """ @@ -106,7 +103,9 @@ def test_kvm_xen_guest_nomsi(self): self.kernel_params = (self.KERNEL_DEFAULT + ' xen_emul_unplug=ide-disks pci=nomsi') self.run_and_check() - self.ssh_command('grep xen-pirq.* /proc/interrupts') + exec_command_and_wait_for_pattern(self, + 'grep xen-pirq.* /proc/interrupts', + 'virtio0') def test_kvm_xen_guest_noapic_nomsi(self): """ @@ -118,7 +117,9 @@ def test_kvm_xen_guest_noapic_nomsi(self): self.kernel_params = (self.KERNEL_DEFAULT + ' xen_emul_unplug=ide-disks noapic pci=nomsi') self.run_and_check() - self.ssh_command('grep xen-pirq /proc/interrupts') + exec_command_and_wait_for_pattern(self, + 'grep xen-pirq /proc/interrupts', + 'virtio0') def test_kvm_xen_guest_vapic(self): """ @@ -130,8 +131,13 @@ def test_kvm_xen_guest_vapic(self): self.kernel_params = (self.KERNEL_DEFAULT + ' xen_emul_unplug=ide-disks') self.run_and_check() - self.ssh_command('grep xen-pirq /proc/interrupts') - self.ssh_command('grep PCI-MSI /proc/interrupts') + exec_command_and_wait_for_pattern(self, + 'grep xen-pirq /proc/interrupts', + 'acpi') + wait_for_console_pattern(self, '#') + exec_command_and_wait_for_pattern(self, + 'grep PCI-MSI /proc/interrupts', + 'virtio0-output') def test_kvm_xen_guest_novector(self): """ @@ -143,7 +149,9 @@ def test_kvm_xen_guest_novector(self): ' xen_emul_unplug=ide-disks' + ' xen_no_vector_callback') self.run_and_check() - self.ssh_command('grep xen-platform-pci /proc/interrupts') + exec_command_and_wait_for_pattern(self, + 'grep xen-platform-pci /proc/interrupts', + 'fasteoi') def test_kvm_xen_guest_novector_nomsi(self): """ @@ -156,7 +164,9 @@ def test_kvm_xen_guest_novector_nomsi(self): ' xen_emul_unplug=ide-disks pci=nomsi' + ' xen_no_vector_callback') self.run_and_check() - self.ssh_command('grep xen-platform-pci /proc/interrupts') + exec_command_and_wait_for_pattern(self, + 'grep xen-platform-pci /proc/interrupts', + 'IO-APIC') def test_kvm_xen_guest_novector_noapic(self): """ @@ -168,4 +178,9 @@ def test_kvm_xen_guest_novector_noapic(self): ' xen_emul_unplug=ide-disks' + ' xen_no_vector_callback noapic') self.run_and_check() - self.ssh_command('grep xen-platform-pci /proc/interrupts') + exec_command_and_wait_for_pattern(self, + 'grep xen-platform-pci /proc/interrupts', + 'XT-PIC') + +if __name__ == '__main__': + QemuSystemTest.main()
Use the serial console to execute the commands in the guest instead of using ssh since we don't have ssh support in the functional framework yet. Signed-off-by: Thomas Huth <thuth@redhat.com> --- MAINTAINERS | 2 +- tests/functional/meson.build | 2 + .../test_x86_64_kvm_xen.py} | 81 +++++++++++-------- 3 files changed, 51 insertions(+), 34 deletions(-) rename tests/{avocado/kvm_xen_guest.py => functional/test_x86_64_kvm_xen.py} (64%) mode change 100644 => 100755