RDMA/mlx5: Use irq xarray locking for mkey_table

Message ID: 20191024234910.GA9038@ziepe.ca (mailing list archive)
State: Mainlined
Commit: 1524b12a6e02a85264af4ed208b034a2239ef374
Delegated to: Jason Gunthorpe
Series: RDMA/mlx5: Use irq xarray locking for mkey_table

Commit Message

Jason Gunthorpe Oct. 24, 2019, 11:49 p.m. UTC
The mkey_table xarray is touched by the reg_mr_callback() function which
is called from a hard irq. Thus all other uses of xa_lock must use the
_irq variants.

  WARNING: inconsistent lock state
  5.4.0-rc1 #12 Not tainted
  --------------------------------
  inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
  python3/343 [HC0[0]:SC0[0]:HE1:SE1] takes:
  ffff888182be1d40 (&(&xa->xa_lock)->rlock#3){?.-.}, at: xa_erase+0x12/0x30
  {IN-HARDIRQ-W} state was registered at:
    lock_acquire+0xe1/0x200
    _raw_spin_lock_irqsave+0x35/0x50
    reg_mr_callback+0x2dd/0x450 [mlx5_ib]
    mlx5_cmd_exec_cb_handler+0x2c/0x70 [mlx5_core]
    mlx5_cmd_comp_handler+0x355/0x840 [mlx5_core]
   [..]

   Possible unsafe locking scenario:

         CPU0
         ----
    lock(&(&xa->xa_lock)->rlock#3);
    <Interrupt>
      lock(&(&xa->xa_lock)->rlock#3);

   *** DEADLOCK ***

  2 locks held by python3/343:
   #0: ffff88818eb4bd38 (&uverbs_dev->disassociate_srcu){....}, at: ib_uverbs_ioctl+0xe5/0x1e0 [ib_uverbs]
   #1: ffff888176c76d38 (&file->hw_destroy_rwsem){++++}, at: uobj_destroy+0x2d/0x90 [ib_uverbs]

  stack backtrace:
  CPU: 3 PID: 343 Comm: python3 Not tainted 5.4.0-rc1 #12
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x86/0xca
   print_usage_bug.cold.50+0x2e5/0x355
   mark_lock+0x871/0xb50
   ? match_held_lock+0x20/0x250
   ? check_usage_forwards+0x240/0x240
   __lock_acquire+0x7de/0x23a0
   ? __kasan_check_read+0x11/0x20
   ? mark_lock+0xae/0xb50
   ? mark_held_locks+0xb0/0xb0
   ? find_held_lock+0xca/0xf0
   lock_acquire+0xe1/0x200
   ? xa_erase+0x12/0x30
   _raw_spin_lock+0x2a/0x40
   ? xa_erase+0x12/0x30
   xa_erase+0x12/0x30
   mlx5_ib_dealloc_mw+0x55/0xa0 [mlx5_ib]
   uverbs_dealloc_mw+0x3c/0x70 [ib_uverbs]
   uverbs_free_mw+0x1a/0x20 [ib_uverbs]
   destroy_hw_idr_uobject+0x49/0xa0 [ib_uverbs]
   [..]

Fixes: 0417791536ae ("RDMA/mlx5: Add missing synchronize_srcu() for MW cases")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 drivers/infiniband/hw/mlx5/mr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Leon Romanovsky Oct. 27, 2019, 8:38 a.m. UTC | #1
On Fri, Oct 25, 2019 at 02:49:13AM +0300, Jason Gunthorpe wrote:
> [..]

Thanks,
Acked-by: Leon Romanovsky <leonro@mellanox.com>

Jason Gunthorpe Oct. 28, 2019, 5:10 p.m. UTC | #2
On Thu, Oct 24, 2019 at 11:49:13PM +0000, Jason Gunthorpe wrote:
> [..]
>
> Fixes: 0417791536ae ("RDMA/mlx5: Add missing synchronize_srcu() for MW cases")
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Acked-by: Leon Romanovsky <leonro@mellanox.com>
> ---
>  drivers/infiniband/hw/mlx5/mr.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Applied to for-rc

Jason

Patch

diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 630599311586ec..7019c12005f4c1 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1967,8 +1967,8 @@  int mlx5_ib_dealloc_mw(struct ib_mw *mw)
 	int err;
 
 	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) {
-		xa_erase(&dev->mdev->priv.mkey_table,
-			 mlx5_base_mkey(mmw->mmkey.key));
+		xa_erase_irq(&dev->mdev->priv.mkey_table,
+			     mlx5_base_mkey(mmw->mmkey.key));
 		/*
 		 * pagefault_single_data_segment() may be accessing mmw under
 		 * SRCU if the user bound an ODP MR to this MW.
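
The locking rule from the commit message can be illustrated with a toy userspace model. This is only a sketch under stated assumptions, not lockdep's actual implementation: struct toy_lock_class and toy_mark_acquire() are hypothetical names invented here. The model records, per lock class, whether the lock was ever taken from hard-irq context and whether it was ever taken with hard irqs enabled. Seeing both corresponds to the {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} report above.

```c
#include <stdbool.h>

/* Hypothetical toy model of the usage bits tracked per lock class. */
struct toy_lock_class {
	bool used_in_hardirq;      /* taken at least once from hard-irq context */
	bool used_hardirq_enabled; /* taken at least once with hard irqs enabled */
};

/*
 * Record one acquisition of the lock class.
 * Returns true while usage is consistent, and false once the class has been
 * taken both inside a hard irq and with hard irqs enabled, which is the
 * combination that can self-deadlock on a single CPU.
 */
static bool toy_mark_acquire(struct toy_lock_class *cls,
			     bool in_hardirq, bool hardirqs_enabled)
{
	if (in_hardirq)
		cls->used_in_hardirq = true;
	else if (hardirqs_enabled)
		cls->used_hardirq_enabled = true;

	return !(cls->used_in_hardirq && cls->used_hardirq_enabled);
}
```

In this model, the reg_mr_callback() acquisition is (in_hardirq = true, hardirqs_enabled = false). The old xa_erase() call from process context is (false, true) and trips the check, whereas xa_erase_irq() disables interrupts before taking xa_lock, so it is (false, false) and the class stays consistent.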