Message ID | 20210406153122.22874-1-akrowiak@linux.ibm.com (mailing list archive) |
---|---|
Series | s390/vfio-ap: dynamic configuration support |
On Tue, 6 Apr 2021 11:31:09 -0400
Tony Krowiak <akrowiak@linux.ibm.com> wrote:

> Tony Krowiak (13):
>   s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks

The subsequent patches reintroduce this circular locking dependency
problem. See my kernel messages for the details. The link we sever
in the above patch is reintroduced in several places. One of them is
assign_adapter_store().

Regards,
Halil

[ +0.000236] vfio_ap matrix: MDEV: Registered
[ +0.037919] vfio_mdev 4f77ad87-1e62-4959-8b7a-c677c98d2194: Adding to iommu group 1
[ +0.000092] vfio_mdev 4f77ad87-1e62-4959-8b7a-c677c98d2194: MDEV: group_id = 1

[Apr 8 22:31] ======================================================
[ +0.000002] WARNING: possible circular locking dependency detected
[ +0.000002] 5.12.0-rc6-00016-g5bea90816c56 #57 Not tainted
[ +0.000002] ------------------------------------------------------
[ +0.000002] CPU 1/KVM/6651 is trying to acquire lock:
[ +0.000002] 00000000cef9d508 (&matrix_dev->lock){+.+.}-{3:3}, at: handle_pqap+0x56/0x1c8 [vfio_ap]
[ +0.000011]
             but task is already holding lock:
[ +0.000001] 00000000d41f4308 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x90/0x898 [kvm]
[ +0.000038]
             which lock already depends on the new lock.

[ +0.000002]
             the existing dependency chain (in reverse order) is:
[ +0.000001]
             -> #2 (&vcpu->mutex){+.+.}-{3:3}:
[ +0.000004]        validate_chain+0x796/0xa20
[ +0.000006]        __lock_acquire+0x420/0x7c8
[ +0.000003]        lock_acquire.part.0+0xec/0x1e8
[ +0.000002]        lock_acquire+0xb8/0x208
[ +0.000002]        __mutex_lock+0xa2/0x928
[ +0.000005]        mutex_lock_nested+0x32/0x40
[ +0.000002]        kvm_s390_cpus_to_pv+0x4e/0xf8 [kvm]
[ +0.000019]        kvm_s390_handle_pv+0x1ce/0x6b0 [kvm]
[ +0.000018]        kvm_arch_vm_ioctl+0x3ec/0x550 [kvm]
[ +0.000019]        kvm_vm_ioctl+0x40e/0x4a8 [kvm]
[ +0.000018]        __s390x_sys_ioctl+0xc0/0x100
[ +0.000004]        do_syscall+0x7e/0xd0
[ +0.000043]        __do_syscall+0xc0/0xd8
[ +0.000004]        system_call+0x72/0x98
[ +0.000004]
             -> #1 (&kvm->lock){+.+.}-{3:3}:
[ +0.000004]        validate_chain+0x796/0xa20
[ +0.000002]        __lock_acquire+0x420/0x7c8
[ +0.000002]        lock_acquire.part.0+0xec/0x1e8
[ +0.000002]        lock_acquire+0xb8/0x208
[ +0.000003]        __mutex_lock+0xa2/0x928
[ +0.000002]        mutex_lock_nested+0x32/0x40
[ +0.000002]        kvm_arch_crypto_set_masks+0x4a/0x2b8 [kvm]
[ +0.000018]        vfio_ap_mdev_refresh_apcb+0xd0/0xe0 [vfio_ap]
[ +0.000003]        assign_adapter_store+0x1f2/0x240 [vfio_ap]
[ +0.000003]        kernfs_fop_write_iter+0x13e/0x1e0
[ +0.000003]        new_sync_write+0x10a/0x198
[ +0.000003]        vfs_write.part.0+0x196/0x290
[ +0.000002]        ksys_write+0x6c/0xf8
[ +0.000003]        do_syscall+0x7e/0xd0
[ +0.000002]        __do_syscall+0xc0/0xd8
[ +0.000003]        system_call+0x72/0x98
[ +0.000002]
             -> #0 (&matrix_dev->lock){+.+.}-{3:3}:
[ +0.000004]        check_noncircular+0x16e/0x190
[ +0.000002]        check_prev_add+0xec/0xf38
[ +0.000002]        validate_chain+0x796/0xa20
[ +0.000002]        __lock_acquire+0x420/0x7c8
[ +0.000002]        lock_acquire.part.0+0xec/0x1e8
[ +0.000002]        lock_acquire+0xb8/0x208
[ +0.000002]        __mutex_lock+0xa2/0x928
[ +0.000002]        mutex_lock_nested+0x32/0x40
[ +0.000003]        handle_pqap+0x56/0x1c8 [vfio_ap]
[ +0.000002]        handle_pqap+0xe2/0x1d8 [kvm]
[ +0.000019]        kvm_handle_sie_intercept+0x134/0x248 [kvm]
[ +0.000019]        vcpu_post_run+0x2b6/0x580 [kvm]
[ +0.000018]        __vcpu_run+0x27e/0x388 [kvm]
[ +0.000019]        kvm_arch_vcpu_ioctl_run+0x10a/0x278 [kvm]
[ +0.000018]        kvm_vcpu_ioctl+0x2cc/0x898 [kvm]
[ +0.000018]        __s390x_sys_ioctl+0xc0/0x100
[ +0.000003]        do_syscall+0x7e/0xd0
[ +0.000002]        __do_syscall+0xc0/0xd8
[ +0.000002]        system_call+0x72/0x98
[ +0.000003]
             other info that might help us debug this:

[ +0.000001] Chain exists of:
               &matrix_dev->lock --> &kvm->lock --> &vcpu->mutex

[ +0.000005]  Possible unsafe locking scenario:

[ +0.000001]        CPU0                    CPU1
[ +0.000001]        ----                    ----
[ +0.000002]   lock(&vcpu->mutex);
[ +0.000002]                                lock(&kvm->lock);
[ +0.000002]                                lock(&vcpu->mutex);
[ +0.000002]   lock(&matrix_dev->lock);
[ +0.000002]
              *** DEADLOCK ***

[ +0.000002] 2 locks held by CPU 1/KVM/6651:
[ +0.000002]  #0: 00000000d41f4308 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x90/0x898 [kvm]
[ +0.000023]  #1: 00000000da2fc508 (&kvm->srcu){....}-{0:0}, at: __vcpu_run+0x1ec/0x388 [kvm]
[ +0.000021]
             stack backtrace:
[ +0.000002] CPU: 6 PID: 6651 Comm: CPU 1/KVM Not tainted 5.12.0-rc6-00016-g5bea90816c56 #57
[ +0.000004] Hardware name: IBM 8561 T01 701 (LPAR)
[ +0.000001] Call Trace:
[ +0.000002]  [<00000002010e7ef0>] show_stack+0x90/0xf8
[ +0.000007]  [<00000002010fb5b2>] dump_stack+0xba/0x108
[ +0.000002]  [<000000020053feb6>] check_noncircular+0x16e/0x190
[ +0.000003]  [<0000000200541424>] check_prev_add+0xec/0xf38
[ +0.000002]  [<0000000200542a06>] validate_chain+0x796/0xa20
[ +0.000003]  [<0000000200545430>] __lock_acquire+0x420/0x7c8
[ +0.000002]  [<00000002005441a4>] lock_acquire.part.0+0xec/0x1e8
[ +0.000002]  [<0000000200544358>] lock_acquire+0xb8/0x208
[ +0.000003]  [<000000020110aeea>] __mutex_lock+0xa2/0x928
[ +0.000002]  [<000000020110b7a2>] mutex_lock_nested+0x32/0x40
[ +0.000003]  [<000003ff8060fb5e>] handle_pqap+0x56/0x1c8 [vfio_ap]
[ +0.000003]  [<000003ff80597412>] handle_pqap+0xe2/0x1d8 [kvm]
[ +0.000018]  [<000003ff8058c924>] kvm_handle_sie_intercept+0x134/0x248 [kvm]
[ +0.000020]  [<000003ff80588e96>] vcpu_post_run+0x2b6/0x580 [kvm]
[ +0.000019]  [<000003ff805893de>] __vcpu_run+0x27e/0x388 [kvm]
[ +0.000018]  [<000003ff80589d0a>] kvm_arch_vcpu_ioctl_run+0x10a/0x278 [kvm]
[ +0.000019]  [<000003ff805704d4>] kvm_vcpu_ioctl+0x2cc/0x898 [kvm]
[ +0.000019]  [<0000000200801ee8>] __s390x_sys_ioctl+0xc0/0x100
[ +0.000003]  [<000000020046e7ae>] do_syscall+0x7e/0xd0
[ +0.000003]  [<00000002010ffc20>] __do_syscall+0xc0/0xd8
[ +0.000002]  [<0000000201110c42>] system_call+0x72/0x98
[ +0.000003] INFO: lockdep is turned off.
[ +6.846296] vfio_mdev 4f77ad87-1e62-4959-8b7a-c677c98d2194: Removing from iommu group 1
[ +0.000028] vfio_mdev 4f77ad87-1e62-4959-8b7a-c677c98d2194: MDEV: detaching iommu
[ +0.007677] vfio_ap matrix: MDEV: Unregistering

>   s390/vfio-ap: use new AP bus interface to search for queue devices
>   s390/vfio-ap: move probe and remove callbacks to vfio_ap_ops.c
>   s390/vfio-ap: manage link between queue struct and matrix mdev
>   s390/vfio-ap: introduce shadow APCB
>   s390/vfio-ap: refresh guest's APCB by filtering APQNs assigned to mdev
>   s390/vfio-ap: allow assignment of unavailable AP queues to mdev device
>   s390/vfio-ap: allow hot plug/unplug of AP resources using mdev device
>   s390/zcrypt: driver callback to indicate resource in use
>   s390/vfio-ap: implement in-use callback for vfio_ap driver
>   s390/vfio-ap: sysfs attribute to display the guest's matrix
>   s390/zcrypt: notify drivers on config changed and scan complete
>     callbacks
>   s390/vfio-ap: update docs to include dynamic config support
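
For readers following the report, the dependency chain in the splat can be modelled as a small directed graph in which an edge A -> B means "some path takes B while holding A". The sketch below is only an illustration in Python, not kernel code and not how lockdep itself is implemented; the names EDGES, SEVERED and find_cycle are hypothetical, and the three edges are read off the "Chain exists of" line and the stack traces in the log. It shows that the three paths together form a cycle, and that removing the matrix_dev->lock -> kvm->lock link, which is the link the first patch is said to sever, leaves the graph acyclic.

```python
from typing import Dict, List, Optional, Set, Tuple

# Edge (held, wanted): some code path acquires `wanted` while holding `held`.
# These three edges are taken from the lockdep report; the comments name the
# functions that appear in the corresponding stack traces.
EDGES: List[Tuple[str, str]] = [
    ("kvm->lock", "vcpu->mutex"),         # kvm_s390_handle_pv -> kvm_s390_cpus_to_pv
    ("matrix_dev->lock", "kvm->lock"),    # assign_adapter_store -> kvm_arch_crypto_set_masks
    ("vcpu->mutex", "matrix_dev->lock"),  # kvm_vcpu_ioctl -> handle_pqap
]

# The same graph with the matrix_dev->lock -> kvm->lock link removed.
SEVERED = [e for e in EDGES if e != ("matrix_dev->lock", "kvm->lock")]


def find_cycle(edges: List[Tuple[str, str]]) -> Optional[List[str]]:
    """Return one lock-order cycle as a list of lock names, or None."""
    graph: Dict[str, List[str]] = {}
    for held, wanted in edges:
        graph.setdefault(held, []).append(wanted)
        graph.setdefault(wanted, [])

    visiting: Set[str] = set()  # locks on the current DFS path
    done: Set[str] = set()      # locks fully explored without finding a cycle
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        if node in done:
            return None
        if node in visiting:
            # Back edge: the cycle is the path from the first visit of `node`.
            return path[path.index(node):] + [node]
        visiting.add(node)
        path.append(node)
        for nxt in graph[node]:
            cycle = visit(nxt)
            if cycle:
                return cycle
        path.pop()
        visiting.discard(node)
        done.add(node)
        return None

    for start in graph:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    print("with all three paths:", find_cycle(EDGES))
    print("with the link severed:", find_cycle(SEVERED))
```

Any sysfs handler that calls into kvm_arch_crypto_set_masks() while holding matrix_dev->lock would add the same edge back, which is consistent with the observation that the link reappears in several places.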
On 4/8/21 4:38 PM, Halil Pasic wrote:
> On Tue, 6 Apr 2021 11:31:09 -0400
> Tony Krowiak <akrowiak@linux.ibm.com> wrote:
>
>> Tony Krowiak (13):
>>   s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks
> The subsequent patches reintroduce this circular locking dependency
> problem. See my kernel messages for the details. The link we sever
> in the above patch is reintroduced in several places. One of them is
> assign_adapter_store().

As in the patch referenced above, the lockdep splat occurs when the
APCB masks are set, which requires acquiring the kvm lock. Patch 08/13,
allow hot plug/unplug of AP resources using mdev, introduces code that
updates the APCB masks whenever an adapter, domain or control domain is
assigned or unassigned, as well as when a queue device is probed or
removed. I think the solution from the patch above can be implemented
here to resolve this problem.