Message ID | 20230828151519.2187418-1-mimu@linux.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3] KVM: s390: fix gisa destroy operation might lead to cpu stalls | expand |
On 8/28/23 11:15 AM, Michael Mueller wrote: > A GISA cannot be destroyed as long it is linked in the GIB alert list > as this would break the alert list. Just waiting for its removal from > the list triggered by another vm is not sufficient as it might be the > only vm. The below shown cpu stall situation might occur when GIB alerts > are delayed and is fixed by calling process_gib_alert_list() instead of > waiting. > > At this time the vcpus of the vm are already destroyed and thus > no vcpu can be kicked to enter the SIE again if for some reason an > interrupt is pending for that vm. > > Additionally the IAM restore value is set to 0x00. That would be a bug > introduced by incomplete device de-registration, i.e. missing > kvm_s390_gisc_unregister() call. > > Setting this value and the IAM in the GISA to 0x00 guarantees that late > interrupts don't bring the GISA back into the alert list. > > CPU stall caused by kvm_s390_gisa_destroy(): > > [ 4915.311372] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 14-.... } 24533 jiffies s: 5269 root: 0x1/. > [ 4915.311390] rcu: blocking rcu_node structures (internal RCU debug): l=1:0-15:0x4000/. > [ 4915.311394] Task dump for CPU 14: > [ 4915.311395] task:qemu-system-s39 state:R running task stack:0 pid:217198 ppid:1 flags:0x00000045 > [ 4915.311399] Call Trace: > [ 4915.311401] [<0000038003a33a10>] 0x38003a33a10 > [ 4933.861321] rcu: INFO: rcu_sched self-detected stall on CPU > [ 4933.861332] rcu: 14-....: (42008 ticks this GP) idle=53f4/1/0x4000000000000000 softirq=61530/61530 fqs=14031 > [ 4933.861353] rcu: (t=42008 jiffies g=238109 q=100360 ncpus=18) > [ 4933.861357] CPU: 14 PID: 217198 Comm: qemu-system-s39 Not tainted 6.5.0-20230816.rc6.git26.a9d17c5d8813.300.fc38.s390x #1 > [ 4933.861360] Hardware name: IBM 8561 T01 703 (LPAR) > [ 4933.861361] Krnl PSW : 0704e00180000000 000003ff804bfc66 (kvm_s390_gisa_destroy+0x3e/0xe0 [kvm]) > [ 4933.861414] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3 > [ 4933.861416] Krnl GPRS: 0000000000000000 00000372000000fc 00000002134f8000 000000000d5f5900 > [ 4933.861419] 00000002f5ea1d18 00000002f5ea1d18 0000000000000000 0000000000000000 > [ 4933.861420] 00000002134fa890 00000002134f8958 000000000d5f5900 00000002134f8000 > [ 4933.861422] 000003ffa06acf98 000003ffa06858b0 0000038003a33c20 0000038003a33bc8 > [ 4933.861430] Krnl Code: 000003ff804bfc58: ec66002b007e cij %r6,0,6,000003ff804bfcae > 000003ff804bfc5e: b904003a lgr %r3,%r10 > #000003ff804bfc62: a7f40005 brc 15,000003ff804bfc6c > >000003ff804bfc66: e330b7300204 lg %r3,10032(%r11) > 000003ff804bfc6c: 58003000 l %r0,0(%r3) > 000003ff804bfc70: ec03fffb6076 crj %r0,%r3,6,000003ff804bfc66 > 000003ff804bfc76: e320b7600271 lay %r2,10080(%r11) > 000003ff804bfc7c: c0e5fffea339 brasl %r14,000003ff804942ee > [ 4933.861444] Call Trace: > [ 4933.861445] [<000003ff804bfc66>] kvm_s390_gisa_destroy+0x3e/0xe0 [kvm] > [ 4933.861460] ([<00000002623523de>] free_unref_page+0xee/0x148) > [ 4933.861507] [<000003ff804aea98>] kvm_arch_destroy_vm+0x50/0x120 [kvm] > [ 4933.861521] [<000003ff8049d374>] kvm_destroy_vm+0x174/0x288 [kvm] > [ 4933.861532] [<000003ff8049d4fe>] kvm_vm_release+0x36/0x48 [kvm] > [ 4933.861542] [<00000002623cd04a>] __fput+0xea/0x2a8 > [ 4933.861547] [<00000002620d5bf8>] task_work_run+0x88/0xf0 > [ 4933.861551] [<00000002620b0aa6>] do_exit+0x2c6/0x528 > [ 4933.861556] [<00000002620b0f00>] do_group_exit+0x40/0xb8 > [ 4933.861557] [<00000002620b0fa6>] __s390x_sys_exit_group+0x2e/0x30 > [ 4933.861559] [<0000000262d481f4>] __do_syscall+0x1d4/0x200 > [ 4933.861563] [<0000000262d59028>] system_call+0x70/0x98 > [ 4933.861565] Last Breaking-Event-Address: > [ 4933.861566] [<0000038003a33b60>] 0x38003a33b60 > > Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()") > Signed-off-by: Michael Mueller <mimu@linux.ibm.com> > --- > arch/s390/kvm/interrupt.c | 11 ++++++----- > 1 file changed, 6 insertions(+), 5 deletions(-) > > diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c > index 85e39f472bb4..75e200bd1030 100644 > --- a/arch/s390/kvm/interrupt.c > +++ b/arch/s390/kvm/interrupt.c > @@ -3216,11 +3216,12 @@ void kvm_s390_gisa_destroy(struct kvm *kvm) > > if (!gi->origin) > return; > - if (gi->alert.mask) > - KVM_EVENT(3, "vm 0x%pK has unexpected iam 0x%02x", > - kvm, gi->alert.mask); > - while (gisa_in_alert_list(gi->origin)) > - cpu_relax(); > + WARN(gi->alert.mask != 0x00, > + "unexpected non zero alert.mask 0x%02x", > + gi->alert.mask); > + gi->alert.mask = 0x00; > + if (gisa_set_iam(gi->origin, gi->alert.mask)) > + process_gib_alert_list(); > hrtimer_cancel(&gi->timer); Thanks for the prior explanations. This looks pretty good to me now, I think the subtlety that I was missing is that we are kicking off the callback (gisa_vcpu_kicker) via hrtimer_start with an immediate expiry (0) and relying on the fact that this hrtimer_cancel here will wait until that callback has finished. AFAIU that means that now we will either set the IAM immediately here via gisa_set_iam or via the callback after handling the alert; in both cases this will prevent further alerts and we won't clear gi->origin until after that point. Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> > gi->origin = NULL; > VM_EVENT(kvm, 3, "gisa 0x%pK destroyed", gisa);
On 30.08.23 15:13, Matthew Rosato wrote: > On 8/28/23 11:15 AM, Michael Mueller wrote: >> A GISA cannot be destroyed as long it is linked in the GIB alert list >> as this would break the alert list. Just waiting for its removal from >> the list triggered by another vm is not sufficient as it might be the >> only vm. The below shown cpu stall situation might occur when GIB alerts >> are delayed and is fixed by calling process_gib_alert_list() instead of >> waiting. >> >> At this time the vcpus of the vm are already destroyed and thus >> no vcpu can be kicked to enter the SIE again if for some reason an >> interrupt is pending for that vm. >> >> Additionally the IAM restore value is set to 0x00. That would be a bug >> introduced by incomplete device de-registration, i.e. missing >> kvm_s390_gisc_unregister() call. >> >> Setting this value and the IAM in the GISA to 0x00 guarantees that late >> interrupts don't bring the GISA back into the alert list. >> >> CPU stall caused by kvm_s390_gisa_destroy(): >> >> [ 4915.311372] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 14-.... } 24533 jiffies s: 5269 root: 0x1/. >> [ 4915.311390] rcu: blocking rcu_node structures (internal RCU debug): l=1:0-15:0x4000/. >> [ 4915.311394] Task dump for CPU 14: >> [ 4915.311395] task:qemu-system-s39 state:R running task stack:0 pid:217198 ppid:1 flags:0x00000045 >> [ 4915.311399] Call Trace: >> [ 4915.311401] [<0000038003a33a10>] 0x38003a33a10 >> [ 4933.861321] rcu: INFO: rcu_sched self-detected stall on CPU >> [ 4933.861332] rcu: 14-....: (42008 ticks this GP) idle=53f4/1/0x4000000000000000 softirq=61530/61530 fqs=14031 >> [ 4933.861353] rcu: (t=42008 jiffies g=238109 q=100360 ncpus=18) >> [ 4933.861357] CPU: 14 PID: 217198 Comm: qemu-system-s39 Not tainted 6.5.0-20230816.rc6.git26.a9d17c5d8813.300.fc38.s390x #1 >> [ 4933.861360] Hardware name: IBM 8561 T01 703 (LPAR) >> [ 4933.861361] Krnl PSW : 0704e00180000000 000003ff804bfc66 (kvm_s390_gisa_destroy+0x3e/0xe0 [kvm]) >> [ 4933.861414] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3 >> [ 4933.861416] Krnl GPRS: 0000000000000000 00000372000000fc 00000002134f8000 000000000d5f5900 >> [ 4933.861419] 00000002f5ea1d18 00000002f5ea1d18 0000000000000000 0000000000000000 >> [ 4933.861420] 00000002134fa890 00000002134f8958 000000000d5f5900 00000002134f8000 >> [ 4933.861422] 000003ffa06acf98 000003ffa06858b0 0000038003a33c20 0000038003a33bc8 >> [ 4933.861430] Krnl Code: 000003ff804bfc58: ec66002b007e cij %r6,0,6,000003ff804bfcae >> 000003ff804bfc5e: b904003a lgr %r3,%r10 >> #000003ff804bfc62: a7f40005 brc 15,000003ff804bfc6c >> >000003ff804bfc66: e330b7300204 lg %r3,10032(%r11) >> 000003ff804bfc6c: 58003000 l %r0,0(%r3) >> 000003ff804bfc70: ec03fffb6076 crj %r0,%r3,6,000003ff804bfc66 >> 000003ff804bfc76: e320b7600271 lay %r2,10080(%r11) >> 000003ff804bfc7c: c0e5fffea339 brasl %r14,000003ff804942ee >> [ 4933.861444] Call Trace: >> [ 4933.861445] [<000003ff804bfc66>] kvm_s390_gisa_destroy+0x3e/0xe0 [kvm] >> [ 4933.861460] ([<00000002623523de>] free_unref_page+0xee/0x148) >> [ 4933.861507] [<000003ff804aea98>] kvm_arch_destroy_vm+0x50/0x120 [kvm] >> [ 4933.861521] [<000003ff8049d374>] kvm_destroy_vm+0x174/0x288 [kvm] >> [ 4933.861532] [<000003ff8049d4fe>] kvm_vm_release+0x36/0x48 [kvm] >> [ 4933.861542] [<00000002623cd04a>] __fput+0xea/0x2a8 >> [ 4933.861547] [<00000002620d5bf8>] task_work_run+0x88/0xf0 >> [ 4933.861551] [<00000002620b0aa6>] do_exit+0x2c6/0x528 >> [ 4933.861556] [<00000002620b0f00>] do_group_exit+0x40/0xb8 >> [ 4933.861557] [<00000002620b0fa6>] __s390x_sys_exit_group+0x2e/0x30 >> [ 4933.861559] [<0000000262d481f4>] __do_syscall+0x1d4/0x200 >> [ 4933.861563] [<0000000262d59028>] system_call+0x70/0x98 >> [ 4933.861565] Last Breaking-Event-Address: >> [ 4933.861566] [<0000038003a33b60>] 0x38003a33b60 >> >> Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()") >> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> >> --- >> arch/s390/kvm/interrupt.c | 11 ++++++----- >> 1 file changed, 6 insertions(+), 5 deletions(-) >> >> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c >> index 85e39f472bb4..75e200bd1030 100644 >> --- a/arch/s390/kvm/interrupt.c >> +++ b/arch/s390/kvm/interrupt.c >> @@ -3216,11 +3216,12 @@ void kvm_s390_gisa_destroy(struct kvm *kvm) >> >> if (!gi->origin) >> return; >> - if (gi->alert.mask) >> - KVM_EVENT(3, "vm 0x%pK has unexpected iam 0x%02x", >> - kvm, gi->alert.mask); >> - while (gisa_in_alert_list(gi->origin)) >> - cpu_relax(); >> + WARN(gi->alert.mask != 0x00, >> + "unexpected non zero alert.mask 0x%02x", >> + gi->alert.mask); >> + gi->alert.mask = 0x00; >> + if (gisa_set_iam(gi->origin, gi->alert.mask)) >> + process_gib_alert_list(); >> hrtimer_cancel(&gi->timer); > > Thanks for the prior explanations. This looks pretty good to me now, I think the subtlety that I was missing is that we are kicking off the callback (gisa_vcpu_kicker) via hrtimer_start with an immediate expiry (0) and relying on the fact that this hrtimer_cancel here will wait until that callback has finished. AFAIU that means that now we will either set the IAM immediately here via gisa_set_iam or via the callback after handling the alert; in both cases this will prevent further alerts and we won't clear gi->origin until after that point. > > Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> > Thanks Matt! Janosch will pick the patch taday. > >> gi->origin = NULL; >> VM_EVENT(kvm, 3, "gisa 0x%pK destroyed", gisa); >
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c index 85e39f472bb4..75e200bd1030 100644 --- a/arch/s390/kvm/interrupt.c +++ b/arch/s390/kvm/interrupt.c @@ -3216,11 +3216,12 @@ void kvm_s390_gisa_destroy(struct kvm *kvm) if (!gi->origin) return; - if (gi->alert.mask) - KVM_EVENT(3, "vm 0x%pK has unexpected iam 0x%02x", - kvm, gi->alert.mask); - while (gisa_in_alert_list(gi->origin)) - cpu_relax(); + WARN(gi->alert.mask != 0x00, + "unexpected non zero alert.mask 0x%02x", + gi->alert.mask); + gi->alert.mask = 0x00; + if (gisa_set_iam(gi->origin, gi->alert.mask)) + process_gib_alert_list(); hrtimer_cancel(&gi->timer); gi->origin = NULL; VM_EVENT(kvm, 3, "gisa 0x%pK destroyed", gisa);