diff mbox series

[v3] KVM: s390: fix gisa destroy operation might lead to cpu stalls

Message ID 20230828151519.2187418-1-mimu@linux.ibm.com (mailing list archive)
State New, archived
Headers show
Series [v3] KVM: s390: fix gisa destroy operation might lead to cpu stalls | expand

Commit Message

Michael Mueller Aug. 28, 2023, 3:15 p.m. UTC
A GISA cannot be destroyed as long it is linked in the GIB alert list
as this would break the alert list. Just waiting for its removal from
the list triggered by another vm is not sufficient as it might be the
only vm. The below shown cpu stall situation might occur when GIB alerts
are delayed and is fixed by calling process_gib_alert_list() instead of
waiting.

At this time the vcpus of the vm are already destroyed and thus
no vcpu can be kicked to enter the SIE again if for some reason an
interrupt is pending for that vm.

Additionally the IAM restore value is set to 0x00. That would be a bug
introduced by incomplete device de-registration, i.e. missing
kvm_s390_gisc_unregister() call.

Setting this value and the IAM in the GISA to 0x00 guarantees that late
interrupts don't bring the GISA back into the alert list.

CPU stall caused by kvm_s390_gisa_destroy():

 [ 4915.311372] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 14-.... } 24533 jiffies s: 5269 root: 0x1/.
 [ 4915.311390] rcu: blocking rcu_node structures (internal RCU debug): l=1:0-15:0x4000/.
 [ 4915.311394] Task dump for CPU 14:
 [ 4915.311395] task:qemu-system-s39 state:R  running task     stack:0     pid:217198 ppid:1      flags:0x00000045
 [ 4915.311399] Call Trace:
 [ 4915.311401]  [<0000038003a33a10>] 0x38003a33a10
 [ 4933.861321] rcu: INFO: rcu_sched self-detected stall on CPU
 [ 4933.861332] rcu: 	14-....: (42008 ticks this GP) idle=53f4/1/0x4000000000000000 softirq=61530/61530 fqs=14031
 [ 4933.861353] rcu: 	(t=42008 jiffies g=238109 q=100360 ncpus=18)
 [ 4933.861357] CPU: 14 PID: 217198 Comm: qemu-system-s39 Not tainted 6.5.0-20230816.rc6.git26.a9d17c5d8813.300.fc38.s390x #1
 [ 4933.861360] Hardware name: IBM 8561 T01 703 (LPAR)
 [ 4933.861361] Krnl PSW : 0704e00180000000 000003ff804bfc66 (kvm_s390_gisa_destroy+0x3e/0xe0 [kvm])
 [ 4933.861414]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
 [ 4933.861416] Krnl GPRS: 0000000000000000 00000372000000fc 00000002134f8000 000000000d5f5900
 [ 4933.861419]            00000002f5ea1d18 00000002f5ea1d18 0000000000000000 0000000000000000
 [ 4933.861420]            00000002134fa890 00000002134f8958 000000000d5f5900 00000002134f8000
 [ 4933.861422]            000003ffa06acf98 000003ffa06858b0 0000038003a33c20 0000038003a33bc8
 [ 4933.861430] Krnl Code: 000003ff804bfc58: ec66002b007e	cij	%r6,0,6,000003ff804bfcae
                           000003ff804bfc5e: b904003a		lgr	%r3,%r10
                          #000003ff804bfc62: a7f40005		brc	15,000003ff804bfc6c
                          >000003ff804bfc66: e330b7300204	lg	%r3,10032(%r11)
                           000003ff804bfc6c: 58003000		l	%r0,0(%r3)
                           000003ff804bfc70: ec03fffb6076	crj	%r0,%r3,6,000003ff804bfc66
                           000003ff804bfc76: e320b7600271	lay	%r2,10080(%r11)
                           000003ff804bfc7c: c0e5fffea339	brasl	%r14,000003ff804942ee
 [ 4933.861444] Call Trace:
 [ 4933.861445]  [<000003ff804bfc66>] kvm_s390_gisa_destroy+0x3e/0xe0 [kvm]
 [ 4933.861460] ([<00000002623523de>] free_unref_page+0xee/0x148)
 [ 4933.861507]  [<000003ff804aea98>] kvm_arch_destroy_vm+0x50/0x120 [kvm]
 [ 4933.861521]  [<000003ff8049d374>] kvm_destroy_vm+0x174/0x288 [kvm]
 [ 4933.861532]  [<000003ff8049d4fe>] kvm_vm_release+0x36/0x48 [kvm]
 [ 4933.861542]  [<00000002623cd04a>] __fput+0xea/0x2a8
 [ 4933.861547]  [<00000002620d5bf8>] task_work_run+0x88/0xf0
 [ 4933.861551]  [<00000002620b0aa6>] do_exit+0x2c6/0x528
 [ 4933.861556]  [<00000002620b0f00>] do_group_exit+0x40/0xb8
 [ 4933.861557]  [<00000002620b0fa6>] __s390x_sys_exit_group+0x2e/0x30
 [ 4933.861559]  [<0000000262d481f4>] __do_syscall+0x1d4/0x200
 [ 4933.861563]  [<0000000262d59028>] system_call+0x70/0x98
 [ 4933.861565] Last Breaking-Event-Address:
 [ 4933.861566]  [<0000038003a33b60>] 0x38003a33b60

Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")
Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
---
 arch/s390/kvm/interrupt.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

Comments

Matthew Rosato Aug. 30, 2023, 1:13 p.m. UTC | #1
On 8/28/23 11:15 AM, Michael Mueller wrote:
> A GISA cannot be destroyed as long it is linked in the GIB alert list
> as this would break the alert list. Just waiting for its removal from
> the list triggered by another vm is not sufficient as it might be the
> only vm. The below shown cpu stall situation might occur when GIB alerts
> are delayed and is fixed by calling process_gib_alert_list() instead of
> waiting.
> 
> At this time the vcpus of the vm are already destroyed and thus
> no vcpu can be kicked to enter the SIE again if for some reason an
> interrupt is pending for that vm.
> 
> Additionally the IAM restore value is set to 0x00. That would be a bug
> introduced by incomplete device de-registration, i.e. missing
> kvm_s390_gisc_unregister() call.
> 
> Setting this value and the IAM in the GISA to 0x00 guarantees that late
> interrupts don't bring the GISA back into the alert list.
> 
> CPU stall caused by kvm_s390_gisa_destroy():
> 
>  [ 4915.311372] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 14-.... } 24533 jiffies s: 5269 root: 0x1/.
>  [ 4915.311390] rcu: blocking rcu_node structures (internal RCU debug): l=1:0-15:0x4000/.
>  [ 4915.311394] Task dump for CPU 14:
>  [ 4915.311395] task:qemu-system-s39 state:R  running task     stack:0     pid:217198 ppid:1      flags:0x00000045
>  [ 4915.311399] Call Trace:
>  [ 4915.311401]  [<0000038003a33a10>] 0x38003a33a10
>  [ 4933.861321] rcu: INFO: rcu_sched self-detected stall on CPU
>  [ 4933.861332] rcu: 	14-....: (42008 ticks this GP) idle=53f4/1/0x4000000000000000 softirq=61530/61530 fqs=14031
>  [ 4933.861353] rcu: 	(t=42008 jiffies g=238109 q=100360 ncpus=18)
>  [ 4933.861357] CPU: 14 PID: 217198 Comm: qemu-system-s39 Not tainted 6.5.0-20230816.rc6.git26.a9d17c5d8813.300.fc38.s390x #1
>  [ 4933.861360] Hardware name: IBM 8561 T01 703 (LPAR)
>  [ 4933.861361] Krnl PSW : 0704e00180000000 000003ff804bfc66 (kvm_s390_gisa_destroy+0x3e/0xe0 [kvm])
>  [ 4933.861414]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
>  [ 4933.861416] Krnl GPRS: 0000000000000000 00000372000000fc 00000002134f8000 000000000d5f5900
>  [ 4933.861419]            00000002f5ea1d18 00000002f5ea1d18 0000000000000000 0000000000000000
>  [ 4933.861420]            00000002134fa890 00000002134f8958 000000000d5f5900 00000002134f8000
>  [ 4933.861422]            000003ffa06acf98 000003ffa06858b0 0000038003a33c20 0000038003a33bc8
>  [ 4933.861430] Krnl Code: 000003ff804bfc58: ec66002b007e	cij	%r6,0,6,000003ff804bfcae
>                            000003ff804bfc5e: b904003a		lgr	%r3,%r10
>                           #000003ff804bfc62: a7f40005		brc	15,000003ff804bfc6c
>                           >000003ff804bfc66: e330b7300204	lg	%r3,10032(%r11)
>                            000003ff804bfc6c: 58003000		l	%r0,0(%r3)
>                            000003ff804bfc70: ec03fffb6076	crj	%r0,%r3,6,000003ff804bfc66
>                            000003ff804bfc76: e320b7600271	lay	%r2,10080(%r11)
>                            000003ff804bfc7c: c0e5fffea339	brasl	%r14,000003ff804942ee
>  [ 4933.861444] Call Trace:
>  [ 4933.861445]  [<000003ff804bfc66>] kvm_s390_gisa_destroy+0x3e/0xe0 [kvm]
>  [ 4933.861460] ([<00000002623523de>] free_unref_page+0xee/0x148)
>  [ 4933.861507]  [<000003ff804aea98>] kvm_arch_destroy_vm+0x50/0x120 [kvm]
>  [ 4933.861521]  [<000003ff8049d374>] kvm_destroy_vm+0x174/0x288 [kvm]
>  [ 4933.861532]  [<000003ff8049d4fe>] kvm_vm_release+0x36/0x48 [kvm]
>  [ 4933.861542]  [<00000002623cd04a>] __fput+0xea/0x2a8
>  [ 4933.861547]  [<00000002620d5bf8>] task_work_run+0x88/0xf0
>  [ 4933.861551]  [<00000002620b0aa6>] do_exit+0x2c6/0x528
>  [ 4933.861556]  [<00000002620b0f00>] do_group_exit+0x40/0xb8
>  [ 4933.861557]  [<00000002620b0fa6>] __s390x_sys_exit_group+0x2e/0x30
>  [ 4933.861559]  [<0000000262d481f4>] __do_syscall+0x1d4/0x200
>  [ 4933.861563]  [<0000000262d59028>] system_call+0x70/0x98
>  [ 4933.861565] Last Breaking-Event-Address:
>  [ 4933.861566]  [<0000038003a33b60>] 0x38003a33b60
> 
> Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")
> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
> ---
>  arch/s390/kvm/interrupt.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index 85e39f472bb4..75e200bd1030 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -3216,11 +3216,12 @@ void kvm_s390_gisa_destroy(struct kvm *kvm)
>  
>  	if (!gi->origin)
>  		return;
> -	if (gi->alert.mask)
> -		KVM_EVENT(3, "vm 0x%pK has unexpected iam 0x%02x",
> -			  kvm, gi->alert.mask);
> -	while (gisa_in_alert_list(gi->origin))
> -		cpu_relax();
> +	WARN(gi->alert.mask != 0x00,
> +	     "unexpected non zero alert.mask 0x%02x",
> +	     gi->alert.mask);
> +	gi->alert.mask = 0x00;
> +	if (gisa_set_iam(gi->origin, gi->alert.mask))
> +		process_gib_alert_list();
>  	hrtimer_cancel(&gi->timer);

Thanks for the prior explanations.  This looks pretty good to me now, I think the subtlety that I was missing is that we are kicking off the callback (gisa_vcpu_kicker) via hrtimer_start with an immediate expiry (0) and relying on the fact that this hrtimer_cancel here will wait until that callback has finished.  AFAIU that means that now we will either set the IAM immediately here via gisa_set_iam or via the callback after handling the alert; in both cases this will prevent further alerts and we won't clear gi->origin until after that point.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>


>  	gi->origin = NULL;
>  	VM_EVENT(kvm, 3, "gisa 0x%pK destroyed", gisa);
Michael Mueller Aug. 31, 2023, 9:18 a.m. UTC | #2
On 30.08.23 15:13, Matthew Rosato wrote:
> On 8/28/23 11:15 AM, Michael Mueller wrote:
>> A GISA cannot be destroyed as long it is linked in the GIB alert list
>> as this would break the alert list. Just waiting for its removal from
>> the list triggered by another vm is not sufficient as it might be the
>> only vm. The below shown cpu stall situation might occur when GIB alerts
>> are delayed and is fixed by calling process_gib_alert_list() instead of
>> waiting.
>>
>> At this time the vcpus of the vm are already destroyed and thus
>> no vcpu can be kicked to enter the SIE again if for some reason an
>> interrupt is pending for that vm.
>>
>> Additionally the IAM restore value is set to 0x00. That would be a bug
>> introduced by incomplete device de-registration, i.e. missing
>> kvm_s390_gisc_unregister() call.
>>
>> Setting this value and the IAM in the GISA to 0x00 guarantees that late
>> interrupts don't bring the GISA back into the alert list.
>>
>> CPU stall caused by kvm_s390_gisa_destroy():
>>
>>   [ 4915.311372] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 14-.... } 24533 jiffies s: 5269 root: 0x1/.
>>   [ 4915.311390] rcu: blocking rcu_node structures (internal RCU debug): l=1:0-15:0x4000/.
>>   [ 4915.311394] Task dump for CPU 14:
>>   [ 4915.311395] task:qemu-system-s39 state:R  running task     stack:0     pid:217198 ppid:1      flags:0x00000045
>>   [ 4915.311399] Call Trace:
>>   [ 4915.311401]  [<0000038003a33a10>] 0x38003a33a10
>>   [ 4933.861321] rcu: INFO: rcu_sched self-detected stall on CPU
>>   [ 4933.861332] rcu: 	14-....: (42008 ticks this GP) idle=53f4/1/0x4000000000000000 softirq=61530/61530 fqs=14031
>>   [ 4933.861353] rcu: 	(t=42008 jiffies g=238109 q=100360 ncpus=18)
>>   [ 4933.861357] CPU: 14 PID: 217198 Comm: qemu-system-s39 Not tainted 6.5.0-20230816.rc6.git26.a9d17c5d8813.300.fc38.s390x #1
>>   [ 4933.861360] Hardware name: IBM 8561 T01 703 (LPAR)
>>   [ 4933.861361] Krnl PSW : 0704e00180000000 000003ff804bfc66 (kvm_s390_gisa_destroy+0x3e/0xe0 [kvm])
>>   [ 4933.861414]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
>>   [ 4933.861416] Krnl GPRS: 0000000000000000 00000372000000fc 00000002134f8000 000000000d5f5900
>>   [ 4933.861419]            00000002f5ea1d18 00000002f5ea1d18 0000000000000000 0000000000000000
>>   [ 4933.861420]            00000002134fa890 00000002134f8958 000000000d5f5900 00000002134f8000
>>   [ 4933.861422]            000003ffa06acf98 000003ffa06858b0 0000038003a33c20 0000038003a33bc8
>>   [ 4933.861430] Krnl Code: 000003ff804bfc58: ec66002b007e	cij	%r6,0,6,000003ff804bfcae
>>                             000003ff804bfc5e: b904003a		lgr	%r3,%r10
>>                            #000003ff804bfc62: a7f40005		brc	15,000003ff804bfc6c
>>                            >000003ff804bfc66: e330b7300204	lg	%r3,10032(%r11)
>>                             000003ff804bfc6c: 58003000		l	%r0,0(%r3)
>>                             000003ff804bfc70: ec03fffb6076	crj	%r0,%r3,6,000003ff804bfc66
>>                             000003ff804bfc76: e320b7600271	lay	%r2,10080(%r11)
>>                             000003ff804bfc7c: c0e5fffea339	brasl	%r14,000003ff804942ee
>>   [ 4933.861444] Call Trace:
>>   [ 4933.861445]  [<000003ff804bfc66>] kvm_s390_gisa_destroy+0x3e/0xe0 [kvm]
>>   [ 4933.861460] ([<00000002623523de>] free_unref_page+0xee/0x148)
>>   [ 4933.861507]  [<000003ff804aea98>] kvm_arch_destroy_vm+0x50/0x120 [kvm]
>>   [ 4933.861521]  [<000003ff8049d374>] kvm_destroy_vm+0x174/0x288 [kvm]
>>   [ 4933.861532]  [<000003ff8049d4fe>] kvm_vm_release+0x36/0x48 [kvm]
>>   [ 4933.861542]  [<00000002623cd04a>] __fput+0xea/0x2a8
>>   [ 4933.861547]  [<00000002620d5bf8>] task_work_run+0x88/0xf0
>>   [ 4933.861551]  [<00000002620b0aa6>] do_exit+0x2c6/0x528
>>   [ 4933.861556]  [<00000002620b0f00>] do_group_exit+0x40/0xb8
>>   [ 4933.861557]  [<00000002620b0fa6>] __s390x_sys_exit_group+0x2e/0x30
>>   [ 4933.861559]  [<0000000262d481f4>] __do_syscall+0x1d4/0x200
>>   [ 4933.861563]  [<0000000262d59028>] system_call+0x70/0x98
>>   [ 4933.861565] Last Breaking-Event-Address:
>>   [ 4933.861566]  [<0000038003a33b60>] 0x38003a33b60
>>
>> Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")
>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
>> ---
>>   arch/s390/kvm/interrupt.c | 11 ++++++-----
>>   1 file changed, 6 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
>> index 85e39f472bb4..75e200bd1030 100644
>> --- a/arch/s390/kvm/interrupt.c
>> +++ b/arch/s390/kvm/interrupt.c
>> @@ -3216,11 +3216,12 @@ void kvm_s390_gisa_destroy(struct kvm *kvm)
>>   
>>   	if (!gi->origin)
>>   		return;
>> -	if (gi->alert.mask)
>> -		KVM_EVENT(3, "vm 0x%pK has unexpected iam 0x%02x",
>> -			  kvm, gi->alert.mask);
>> -	while (gisa_in_alert_list(gi->origin))
>> -		cpu_relax();
>> +	WARN(gi->alert.mask != 0x00,
>> +	     "unexpected non zero alert.mask 0x%02x",
>> +	     gi->alert.mask);
>> +	gi->alert.mask = 0x00;
>> +	if (gisa_set_iam(gi->origin, gi->alert.mask))
>> +		process_gib_alert_list();
>>   	hrtimer_cancel(&gi->timer);
> 
> Thanks for the prior explanations.  This looks pretty good to me now, I think the subtlety that I was missing is that we are kicking off the callback (gisa_vcpu_kicker) via hrtimer_start with an immediate expiry (0) and relying on the fact that this hrtimer_cancel here will wait until that callback has finished.  AFAIU that means that now we will either set the IAM immediately here via gisa_set_iam or via the callback after handling the alert; in both cases this will prevent further alerts and we won't clear gi->origin until after that point.
> 
> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
> 

Thanks Matt!

Janosch will pick the patch taday.

> 
>>   	gi->origin = NULL;
>>   	VM_EVENT(kvm, 3, "gisa 0x%pK destroyed", gisa);
>
diff mbox series

Patch

diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 85e39f472bb4..75e200bd1030 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -3216,11 +3216,12 @@  void kvm_s390_gisa_destroy(struct kvm *kvm)
 
 	if (!gi->origin)
 		return;
-	if (gi->alert.mask)
-		KVM_EVENT(3, "vm 0x%pK has unexpected iam 0x%02x",
-			  kvm, gi->alert.mask);
-	while (gisa_in_alert_list(gi->origin))
-		cpu_relax();
+	WARN(gi->alert.mask != 0x00,
+	     "unexpected non zero alert.mask 0x%02x",
+	     gi->alert.mask);
+	gi->alert.mask = 0x00;
+	if (gisa_set_iam(gi->origin, gi->alert.mask))
+		process_gib_alert_list();
 	hrtimer_cancel(&gi->timer);
 	gi->origin = NULL;
 	VM_EVENT(kvm, 3, "gisa 0x%pK destroyed", gisa);