Message ID | 20230419143155.36864-1-roger.pau@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3] xen/vcpu: ignore VCPU_SSHOTTMR_future | expand |
On 19.04.2023 16:31, Roger Pau Monne wrote: > The usage of VCPU_SSHOTTMR_future in Linux prior to 4.7 is bogus. > When the hypervisor returns -ETIME (timeout in the past) Linux keeps > retrying to setup the timer with a higher timeout instead of > self-injecting a timer interrupt. > > On boxes without any hardware assistance for logdirty we have seen HVM > Linux guests < 4.7 with 32vCPUs give up trying to setup the timer when > logdirty is enabled: > > CE: Reprogramming failure. Giving up > CE: xen increased min_delta_ns to 1000000 nsec > CE: Reprogramming failure. Giving up > CE: Reprogramming failure. Giving up > CE: xen increased min_delta_ns to 506250 nsec > CE: xen increased min_delta_ns to 759375 nsec > CE: xen increased min_delta_ns to 1000000 nsec > CE: Reprogramming failure. Giving up > CE: Reprogramming failure. Giving up > CE: Reprogramming failure. Giving up > Freezing user space processes ... > INFO: rcu_sched detected stalls on CPUs/tasks: { 14} (detected by 10, t=60002 jiffies, g=4006, c=4005, q=14130) > Task dump for CPU 14: > swapper/14 R running task 0 0 1 0x00000000 > Call Trace: > [<ffffffff90160f5d>] ? rcu_eqs_enter_common.isra.30+0x3d/0xf0 > [<ffffffff907b9bde>] ? default_idle+0x1e/0xd0 > [<ffffffff90039570>] ? arch_cpu_idle+0x20/0xc0 > [<ffffffff9010820a>] ? cpu_startup_entry+0x14a/0x1e0 > [<ffffffff9005d3a7>] ? start_secondary+0x1f7/0x270 > [<ffffffff900000d5>] ? start_cpu+0x5/0x14 > INFO: rcu_sched detected stalls on CPUs/tasks: { 26} (detected by 24, t=60002 jiffies, g=6922, c=6921, q=7013) > Task dump for CPU 26: > swapper/26 R running task 0 0 1 0x00000000 > Call Trace: > [<ffffffff90160f5d>] ? rcu_eqs_enter_common.isra.30+0x3d/0xf0 > [<ffffffff907b9bde>] ? default_idle+0x1e/0xd0 > [<ffffffff90039570>] ? arch_cpu_idle+0x20/0xc0 > [<ffffffff9010820a>] ? cpu_startup_entry+0x14a/0x1e0 > [<ffffffff9005d3a7>] ? start_secondary+0x1f7/0x270 > [<ffffffff900000d5>] ? start_cpu+0x5/0x14 > INFO: rcu_sched detected stalls on CPUs/tasks: { 26} (detected by 24, t=60002 jiffies, g=8499, c=8498, q=7664) > Task dump for CPU 26: > swapper/26 R running task 0 0 1 0x00000000 > Call Trace: > [<ffffffff90160f5d>] ? rcu_eqs_enter_common.isra.30+0x3d/0xf0 > [<ffffffff907b9bde>] ? default_idle+0x1e/0xd0 > [<ffffffff90039570>] ? arch_cpu_idle+0x20/0xc0 > [<ffffffff9010820a>] ? cpu_startup_entry+0x14a/0x1e0 > [<ffffffff9005d3a7>] ? start_secondary+0x1f7/0x270 > [<ffffffff900000d5>] ? start_cpu+0x5/0x14 > > Thus leading to CPU stalls and a broken system as a result. > > Workaround this bogus usage by ignoring the VCPU_SSHOTTMR_future in > the hypervisor. Old Linux versions are the only ones known to have > (wrongly) attempted to use the flag, and ignoring it is compatible > with the behavior expected by any guests setting that flag. > > Note the usage of the flag has been removed from Linux by commit: > > c06b6d70feb3 xen/x86: don't lose event interrupts > > Which landed in Linux 4.7. > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > Acked-by: Henry Wang <Henry.Wang@arm.com> # CHANGELOG A little hesitantly, but since no-one else appears to show any interest: Acked-by: Jan Beulich <jbeulich@suse.com> Jan
On Wed, May 03, 2023 at 12:38:59PM +0200, Jan Beulich wrote: > On 19.04.2023 16:31, Roger Pau Monne wrote: > > The usage of VCPU_SSHOTTMR_future in Linux prior to 4.7 is bogus. > > When the hypervisor returns -ETIME (timeout in the past) Linux keeps > > retrying to setup the timer with a higher timeout instead of > > self-injecting a timer interrupt. > > > > On boxes without any hardware assistance for logdirty we have seen HVM > > Linux guests < 4.7 with 32vCPUs give up trying to setup the timer when > > logdirty is enabled: > > > > CE: Reprogramming failure. Giving up > > CE: xen increased min_delta_ns to 1000000 nsec > > CE: Reprogramming failure. Giving up > > CE: Reprogramming failure. Giving up > > CE: xen increased min_delta_ns to 506250 nsec > > CE: xen increased min_delta_ns to 759375 nsec > > CE: xen increased min_delta_ns to 1000000 nsec > > CE: Reprogramming failure. Giving up > > CE: Reprogramming failure. Giving up > > CE: Reprogramming failure. Giving up > > Freezing user space processes ... > > INFO: rcu_sched detected stalls on CPUs/tasks: { 14} (detected by 10, t=60002 jiffies, g=4006, c=4005, q=14130) > > Task dump for CPU 14: > > swapper/14 R running task 0 0 1 0x00000000 > > Call Trace: > > [<ffffffff90160f5d>] ? rcu_eqs_enter_common.isra.30+0x3d/0xf0 > > [<ffffffff907b9bde>] ? default_idle+0x1e/0xd0 > > [<ffffffff90039570>] ? arch_cpu_idle+0x20/0xc0 > > [<ffffffff9010820a>] ? cpu_startup_entry+0x14a/0x1e0 > > [<ffffffff9005d3a7>] ? start_secondary+0x1f7/0x270 > > [<ffffffff900000d5>] ? start_cpu+0x5/0x14 > > INFO: rcu_sched detected stalls on CPUs/tasks: { 26} (detected by 24, t=60002 jiffies, g=6922, c=6921, q=7013) > > Task dump for CPU 26: > > swapper/26 R running task 0 0 1 0x00000000 > > Call Trace: > > [<ffffffff90160f5d>] ? rcu_eqs_enter_common.isra.30+0x3d/0xf0 > > [<ffffffff907b9bde>] ? default_idle+0x1e/0xd0 > > [<ffffffff90039570>] ? arch_cpu_idle+0x20/0xc0 > > [<ffffffff9010820a>] ? cpu_startup_entry+0x14a/0x1e0 > > [<ffffffff9005d3a7>] ? start_secondary+0x1f7/0x270 > > [<ffffffff900000d5>] ? start_cpu+0x5/0x14 > > INFO: rcu_sched detected stalls on CPUs/tasks: { 26} (detected by 24, t=60002 jiffies, g=8499, c=8498, q=7664) > > Task dump for CPU 26: > > swapper/26 R running task 0 0 1 0x00000000 > > Call Trace: > > [<ffffffff90160f5d>] ? rcu_eqs_enter_common.isra.30+0x3d/0xf0 > > [<ffffffff907b9bde>] ? default_idle+0x1e/0xd0 > > [<ffffffff90039570>] ? arch_cpu_idle+0x20/0xc0 > > [<ffffffff9010820a>] ? cpu_startup_entry+0x14a/0x1e0 > > [<ffffffff9005d3a7>] ? start_secondary+0x1f7/0x270 > > [<ffffffff900000d5>] ? start_cpu+0x5/0x14 > > > > Thus leading to CPU stalls and a broken system as a result. > > > > Workaround this bogus usage by ignoring the VCPU_SSHOTTMR_future in > > the hypervisor. Old Linux versions are the only ones known to have > > (wrongly) attempted to use the flag, and ignoring it is compatible > > with the behavior expected by any guests setting that flag. > > > > Note the usage of the flag has been removed from Linux by commit: > > > > c06b6d70feb3 xen/x86: don't lose event interrupts > > > > Which landed in Linux 4.7. > > > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > > Acked-by: Henry Wang <Henry.Wang@arm.com> # CHANGELOG > > A little hesitantly, but since no-one else appears to show any interest: > Acked-by: Jan Beulich <jbeulich@suse.com> Thanks. Roger.
diff --git a/CHANGELOG.md b/CHANGELOG.md index 5dbf8b06d72c..5bfd3aa5c0d5 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,6 +9,8 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) ### Changed - Repurpose command line gnttab_max_{maptrack_,}frames options so they don't cap toolstack provided values. + - Ignore VCPUOP_set_singleshot_timer's VCPU_SSHOTTMR_future flag. The only + known user doesn't use it properly, leading to in-guest breakage. ### Added - On x86, support for features new in Intel Sapphire Rapids CPUs: diff --git a/xen/common/domain.c b/xen/common/domain.c index 626debbae095..6a440590fe2a 100644 --- a/xen/common/domain.c +++ b/xen/common/domain.c @@ -1762,9 +1762,16 @@ long common_vcpu_op(int cmd, struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg) if ( copy_from_guest(&set, arg, 1) ) return -EFAULT; - if ( (set.flags & VCPU_SSHOTTMR_future) && - (set.timeout_abs_ns < NOW()) ) - return -ETIME; + if ( set.timeout_abs_ns < NOW() ) + { + /* + * Simplify the logic if the timeout has already expired and just + * inject the event. + */ + stop_timer(&v->singleshot_timer); + send_timer_event(v); + break; + } migrate_timer(&v->singleshot_timer, smp_processor_id()); set_timer(&v->singleshot_timer, set.timeout_abs_ns); diff --git a/xen/include/public/vcpu.h b/xen/include/public/vcpu.h index 81a3b3a7438c..a836b264a911 100644 --- a/xen/include/public/vcpu.h +++ b/xen/include/public/vcpu.h @@ -150,7 +150,10 @@ typedef struct vcpu_set_singleshot_timer vcpu_set_singleshot_timer_t; DEFINE_XEN_GUEST_HANDLE(vcpu_set_singleshot_timer_t); /* Flags to VCPUOP_set_singleshot_timer. */ - /* Require the timeout to be in the future (return -ETIME if it's passed). */ + /* + * Request the timeout to be in the future (return -ETIME if it's passed) + * but can be ignored by the hypervisor. + */ #define _VCPU_SSHOTTMR_future (0) #define VCPU_SSHOTTMR_future (1U << _VCPU_SSHOTTMR_future)