Message ID | 20191024234910.GA9038@ziepe.ca (mailing list archive) |
---|---|
State | Mainlined |
Commit | 1524b12a6e02a85264af4ed208b034a2239ef374 |
Delegated to: | Jason Gunthorpe |
Series | RDMA/mlx5: Use irq xarray locking for mkey_table |
On Fri, Oct 25, 2019 at 02:49:13AM +0300, Jason Gunthorpe wrote:
> The mkey_table xarray is touched by the reg_mr_callback() function which
> is called from a hard irq. Thus all other uses of xa_lock must use the
> _irq variants.
>
>   WARNING: inconsistent lock state
>   5.4.0-rc1 #12 Not tainted
>   --------------------------------
>   inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
>   python3/343 [HC0[0]:SC0[0]:HE1:SE1] takes:
>   ffff888182be1d40 (&(&xa->xa_lock)->rlock#3){?.-.}, at: xa_erase+0x12/0x30
>   {IN-HARDIRQ-W} state was registered at:
>     lock_acquire+0xe1/0x200
>     _raw_spin_lock_irqsave+0x35/0x50
>     reg_mr_callback+0x2dd/0x450 [mlx5_ib]
>     mlx5_cmd_exec_cb_handler+0x2c/0x70 [mlx5_core]
>     mlx5_cmd_comp_handler+0x355/0x840 [mlx5_core]
>    [..]
>
>    Possible unsafe locking scenario:
>
>          CPU0
>          ----
>     lock(&(&xa->xa_lock)->rlock#3);
>     <Interrupt>
>       lock(&(&xa->xa_lock)->rlock#3);
>
>    *** DEADLOCK ***
>
>   2 locks held by python3/343:
>    #0: ffff88818eb4bd38 (&uverbs_dev->disassociate_srcu){....}, at: ib_uverbs_ioctl+0xe5/0x1e0 [ib_uverbs]
>    #1: ffff888176c76d38 (&file->hw_destroy_rwsem){++++}, at: uobj_destroy+0x2d/0x90 [ib_uverbs]
>
>   stack backtrace:
>   CPU: 3 PID: 343 Comm: python3 Not tainted 5.4.0-rc1 #12
>   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
>   Call Trace:
>    dump_stack+0x86/0xca
>    print_usage_bug.cold.50+0x2e5/0x355
>    mark_lock+0x871/0xb50
>    ? match_held_lock+0x20/0x250
>    ? check_usage_forwards+0x240/0x240
>    __lock_acquire+0x7de/0x23a0
>    ? __kasan_check_read+0x11/0x20
>    ? mark_lock+0xae/0xb50
>    ? mark_held_locks+0xb0/0xb0
>    ? find_held_lock+0xca/0xf0
>    lock_acquire+0xe1/0x200
>    ? xa_erase+0x12/0x30
>    _raw_spin_lock+0x2a/0x40
>    ? xa_erase+0x12/0x30
>    xa_erase+0x12/0x30
>    mlx5_ib_dealloc_mw+0x55/0xa0 [mlx5_ib]
>    uverbs_dealloc_mw+0x3c/0x70 [ib_uverbs]
>    uverbs_free_mw+0x1a/0x20 [ib_uverbs]
>    destroy_hw_idr_uobject+0x49/0xa0 [ib_uverbs]
>    [..]
>
> Fixes: 0417791536ae ("RDMA/mlx5: Add missing synchronize_srcu() for MW cases")
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> ---
>  drivers/infiniband/hw/mlx5/mr.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>

Thanks,
Acked-by: Leon Romanovsky <leonro@mellanox.com>
On Thu, Oct 24, 2019 at 11:49:13PM +0000, Jason Gunthorpe wrote:
> The mkey_table xarray is touched by the reg_mr_callback() function which
> is called from a hard irq. Thus all other uses of xa_lock must use the
> _irq variants.
>
>   WARNING: inconsistent lock state
>   5.4.0-rc1 #12 Not tainted
>   --------------------------------
>   inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
>   python3/343 [HC0[0]:SC0[0]:HE1:SE1] takes:
>   ffff888182be1d40 (&(&xa->xa_lock)->rlock#3){?.-.}, at: xa_erase+0x12/0x30
>   {IN-HARDIRQ-W} state was registered at:
>     lock_acquire+0xe1/0x200
>     _raw_spin_lock_irqsave+0x35/0x50
>     reg_mr_callback+0x2dd/0x450 [mlx5_ib]
>     mlx5_cmd_exec_cb_handler+0x2c/0x70 [mlx5_core]
>     mlx5_cmd_comp_handler+0x355/0x840 [mlx5_core]
>    [..]
>
>    Possible unsafe locking scenario:
>
>          CPU0
>          ----
>     lock(&(&xa->xa_lock)->rlock#3);
>     <Interrupt>
>       lock(&(&xa->xa_lock)->rlock#3);
>
>    *** DEADLOCK ***
>
>   2 locks held by python3/343:
>    #0: ffff88818eb4bd38 (&uverbs_dev->disassociate_srcu){....}, at: ib_uverbs_ioctl+0xe5/0x1e0 [ib_uverbs]
>    #1: ffff888176c76d38 (&file->hw_destroy_rwsem){++++}, at: uobj_destroy+0x2d/0x90 [ib_uverbs]
>
>   stack backtrace:
>   CPU: 3 PID: 343 Comm: python3 Not tainted 5.4.0-rc1 #12
>   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
>   Call Trace:
>    dump_stack+0x86/0xca
>    print_usage_bug.cold.50+0x2e5/0x355
>    mark_lock+0x871/0xb50
>    ? match_held_lock+0x20/0x250
>    ? check_usage_forwards+0x240/0x240
>    __lock_acquire+0x7de/0x23a0
>    ? __kasan_check_read+0x11/0x20
>    ? mark_lock+0xae/0xb50
>    ? mark_held_locks+0xb0/0xb0
>    ? find_held_lock+0xca/0xf0
>    lock_acquire+0xe1/0x200
>    ? xa_erase+0x12/0x30
>    _raw_spin_lock+0x2a/0x40
>    ? xa_erase+0x12/0x30
>    xa_erase+0x12/0x30
>    mlx5_ib_dealloc_mw+0x55/0xa0 [mlx5_ib]
>    uverbs_dealloc_mw+0x3c/0x70 [ib_uverbs]
>    uverbs_free_mw+0x1a/0x20 [ib_uverbs]
>    destroy_hw_idr_uobject+0x49/0xa0 [ib_uverbs]
>    [..]
>
> Fixes: 0417791536ae ("RDMA/mlx5: Add missing synchronize_srcu() for MW cases")
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Acked-by: Leon Romanovsky <leonro@mellanox.com>
> ---
>  drivers/infiniband/hw/mlx5/mr.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Applied to for-rc

Jason
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 630599311586ec..7019c12005f4c1 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1967,8 +1967,8 @@ int mlx5_ib_dealloc_mw(struct ib_mw *mw)
 	int err;
 
 	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) {
-		xa_erase(&dev->mdev->priv.mkey_table,
-			 mlx5_base_mkey(mmw->mmkey.key));
+		xa_erase_irq(&dev->mdev->priv.mkey_table,
+			     mlx5_base_mkey(mmw->mmkey.key));
 		/*
 		 * pagefault_single_data_segment() may be accessing mmw under
 		 * SRCU if the user bound an ODP MR to this MW.
The mkey_table xarray is touched by the reg_mr_callback() function which
is called from a hard irq. Thus all other uses of xa_lock must use the
_irq variants.

  WARNING: inconsistent lock state
  5.4.0-rc1 #12 Not tainted
  --------------------------------
  inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
  python3/343 [HC0[0]:SC0[0]:HE1:SE1] takes:
  ffff888182be1d40 (&(&xa->xa_lock)->rlock#3){?.-.}, at: xa_erase+0x12/0x30
  {IN-HARDIRQ-W} state was registered at:
    lock_acquire+0xe1/0x200
    _raw_spin_lock_irqsave+0x35/0x50
    reg_mr_callback+0x2dd/0x450 [mlx5_ib]
    mlx5_cmd_exec_cb_handler+0x2c/0x70 [mlx5_core]
    mlx5_cmd_comp_handler+0x355/0x840 [mlx5_core]
   [..]

   Possible unsafe locking scenario:

         CPU0
         ----
    lock(&(&xa->xa_lock)->rlock#3);
    <Interrupt>
      lock(&(&xa->xa_lock)->rlock#3);

   *** DEADLOCK ***

  2 locks held by python3/343:
   #0: ffff88818eb4bd38 (&uverbs_dev->disassociate_srcu){....}, at: ib_uverbs_ioctl+0xe5/0x1e0 [ib_uverbs]
   #1: ffff888176c76d38 (&file->hw_destroy_rwsem){++++}, at: uobj_destroy+0x2d/0x90 [ib_uverbs]

  stack backtrace:
  CPU: 3 PID: 343 Comm: python3 Not tainted 5.4.0-rc1 #12
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x86/0xca
   print_usage_bug.cold.50+0x2e5/0x355
   mark_lock+0x871/0xb50
   ? match_held_lock+0x20/0x250
   ? check_usage_forwards+0x240/0x240
   __lock_acquire+0x7de/0x23a0
   ? __kasan_check_read+0x11/0x20
   ? mark_lock+0xae/0xb50
   ? mark_held_locks+0xb0/0xb0
   ? find_held_lock+0xca/0xf0
   lock_acquire+0xe1/0x200
   ? xa_erase+0x12/0x30
   _raw_spin_lock+0x2a/0x40
   ? xa_erase+0x12/0x30
   xa_erase+0x12/0x30
   mlx5_ib_dealloc_mw+0x55/0xa0 [mlx5_ib]
   uverbs_dealloc_mw+0x3c/0x70 [ib_uverbs]
   uverbs_free_mw+0x1a/0x20 [ib_uverbs]
   destroy_hw_idr_uobject+0x49/0xa0 [ib_uverbs]
   [..]

Fixes: 0417791536ae ("RDMA/mlx5: Add missing synchronize_srcu() for MW cases")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 drivers/infiniband/hw/mlx5/mr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)