Message ID | 20231101192058.3419310-2-andrew.cooper3@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86/vmx: Multiple fixes | expand |
On 01.11.2023 20:20, Andrew Cooper wrote: > --- a/xen/arch/x86/hvm/vmx/vmx.c > +++ b/xen/arch/x86/hvm/vmx/vmx.c > @@ -4097,10 +4097,6 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs) > case EXIT_REASON_MCE_DURING_VMENTRY: > do_machine_check(regs); > break; > - > - case EXIT_REASON_INIT: > - printk(XENLOG_ERR "Error: INIT received - ignoring\n"); > - return; /* Renter the guest without further processing */ > } Wouldn't the printk() better remain where it was, and just the "return" be purged? Otherwise between here and ... > @@ -4390,6 +4386,12 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs) > case EXIT_REASON_TRIPLE_FAULT: > hvm_triple_fault(); > break; > + > + case EXIT_REASON_INIT: > + /* TODO: Turn into graceful shutdown. */ > + printk(XENLOG_ERR "Error: INIT received - ignoring\n"); > + break; ... here there are various paths which bypass the main switch(), and hence there would be no trace left of the unexpected INIT in certain cases. Jan
On 02/11/2023 8:57 am, Jan Beulich wrote: > On 01.11.2023 20:20, Andrew Cooper wrote: >> --- a/xen/arch/x86/hvm/vmx/vmx.c >> +++ b/xen/arch/x86/hvm/vmx/vmx.c >> @@ -4097,10 +4097,6 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs) >> case EXIT_REASON_MCE_DURING_VMENTRY: >> do_machine_check(regs); >> break; >> - >> - case EXIT_REASON_INIT: >> - printk(XENLOG_ERR "Error: INIT received - ignoring\n"); >> - return; /* Renter the guest without further processing */ >> } > Wouldn't the printk() better remain where it was, and just the "return" be > purged? Not really... that would hit the unknown vmexit path in the second. We actually have a variety of empty cases in the second. We could add another. ~Andrew
On 10.01.2024 20:11, Andrew Cooper wrote: > On 02/11/2023 8:57 am, Jan Beulich wrote: >> On 01.11.2023 20:20, Andrew Cooper wrote: >>> --- a/xen/arch/x86/hvm/vmx/vmx.c >>> +++ b/xen/arch/x86/hvm/vmx/vmx.c >>> @@ -4097,10 +4097,6 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs) >>> case EXIT_REASON_MCE_DURING_VMENTRY: >>> do_machine_check(regs); >>> break; >>> - >>> - case EXIT_REASON_INIT: >>> - printk(XENLOG_ERR "Error: INIT received - ignoring\n"); >>> - return; /* Renter the guest without further processing */ >>> } >> Wouldn't the printk() better remain where it was, and just the "return" be >> purged? > > Not really... that would hit the unknown vmexit path in the second. Well, I didn't mean to suggest to purge the other hunk. Instead I meant ... > We actually have a variety of empty cases in the second. We could add > another. ... something along these lines - do nothing but "break;" there. Jan
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c index 1edc7f1e919f..d26920d03bbc 100644 --- a/xen/arch/x86/hvm/vmx/vmx.c +++ b/xen/arch/x86/hvm/vmx/vmx.c @@ -4097,10 +4097,6 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs) case EXIT_REASON_MCE_DURING_VMENTRY: do_machine_check(regs); break; - - case EXIT_REASON_INIT: - printk(XENLOG_ERR "Error: INIT received - ignoring\n"); - return; /* Renter the guest without further processing */ } /* Now enable interrupts so it's safe to take locks. */ @@ -4390,6 +4386,12 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs) case EXIT_REASON_TRIPLE_FAULT: hvm_triple_fault(); break; + + case EXIT_REASON_INIT: + /* TODO: Turn into graceful shutdown. */ + printk(XENLOG_ERR "Error: INIT received - ignoring\n"); + break; + case EXIT_REASON_PENDING_VIRT_INTR: /* Disable the interrupt window. */ v->arch.hvm.vmx.exec_control &= ~CPU_BASED_VIRTUAL_INTR_PENDING;
When receiving an INIT, a prior bugfix tried to ignore the INIT and continue onwards. Unfortunately it's not safe to return at that point in vmx_vmexit_handler(). Just out of context in the first hunk is a local_irqs_enabled() which is depended-upon by the return-to-guest path, causing the following checklock failure in debug builds: (XEN) Error: INIT received - ignoring (XEN) CHECKLOCK FAILURE: prev irqsafe: 0, curr irqsafe 1 (XEN) Xen BUG at common/spinlock.c:132 (XEN) ----[ Xen-4.19-unstable x86_64 debug=y Tainted: H ]---- ... (XEN) Xen call trace: (XEN) [<ffff82d040238e10>] R check_lock+0xcd/0xe1 (XEN) [<ffff82d040238fe3>] F _spin_lock+0x1b/0x60 (XEN) [<ffff82d0402ed6a8>] F pt_update_irq+0x32/0x3bb (XEN) [<ffff82d0402b9632>] F vmx_intr_assist+0x3b/0x51d (XEN) [<ffff82d040206447>] F vmx_asm_vmexit_handler+0xf7/0x210 Luckily, this is benign in release builds. Accidentally having IRQs disabled when trying to take an IRQs-on lock isn't a deadlock-vulnerable pattern. Move the INIT handling into the main switch statement. In hindsight, it's wrong to skip other normal VMExit steps. Fixes: b1f11273d5a7 ("x86/vmx: Don't spuriously crash the domain when INIT is received") Reported-by: Reima ISHII <ishiir@g.ecc.u-tokyo.ac.jp> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- CC: Jan Beulich <JBeulich@suse.com> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Wei Liu <wl@xen.org> CC: Jun Nakajima <jun.nakajima@intel.com> CC: Kevin Tian <kevin.tian@intel.com> CC: Tamas K Lengyel <tamas@tklengyel.com> CC: Reima ISHII <ishiir@g.ecc.u-tokyo.ac.jp> CC: Takahiro Shinagawa <shina@ecc.u-tokyo.ac.jp> With this patch in place, the INIT is ignored and the guest continues: (XEN) HVM1 restore: CPU 0 (d1) --- Xen Test Framework --- (d1) Environment: HVM 64bit (Long mode 4 levels) (XEN) Error: INIT received - ignoring (d1) Test result: SUCCESS --- xen/arch/x86/hvm/vmx/vmx.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)