mbox series

[for-next,v13,00/10] Fix race conditions in rxe_pool

Message ID 20220404215059.39819-1-rpearsonhpe@gmail.com (mailing list archive)
Headers show
Series Fix race conditions in rxe_pool | expand

Message

Bob Pearson April 4, 2022, 9:50 p.m. UTC
There are several race conditions discovered in the current rdma_rxe
driver.  They mostly relate to races between normal operations and
destroying objects.  This patch series
 - Makes several minor cleanups in rxe_pool.[ch]
 - Adds wait for completions to the paths in verbs APIs which destroy
   objects.
 - Changes read side locking to rcu.
 - Moves object cleanup code to after ref count is zero

Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
v13
  Rebased to current for-next
  Addressed Jason's comments on patch 1, 8 and 9. 8 and 9 are
  combined into one patch.
v12
  Rebased to current wip/jgg-for-next
  Dropped patches already accepted from v11.
  Moved all object cleanup code to type specific cleanup routines.
  Renamed to match Jason's requests.
  Addressed some other issued raised.
  Kept the contentious rxe_hide() function but renamed to
  rxe_disable_lookup() which says what it does. I am still convinced
  this cleaner than other alternatives like moving xa_erase to the
  top of destroy routines but just for indexed objects.
v11
  Rebased to current for-next.
  Reordered patches and made other changes to respond to issues
  reported by Jason Gunthorpe.
v10
  Rebased to current wip/jgg-for-next.
  Split some patches into smaller ones.
v9
  Corrected issues reported by Jason Gunthorpe,
  Converted locking in rxe_mcast.c and rxe_pool.c to use RCU
  Split up the patches into smaller changes
v8
  Fixed an additional race in 3/8 which was not handled correctly.
v7
  Corrected issues reported by Jason Gunthorpe
Link: https://lore.kernel.org/linux-rdma/20211207190947.GH6385@nvidia.com/
Link: https://lore.kernel.org/linux-rdma/20211207191857.GI6385@nvidia.com/
Link: https://lore.kernel.org/linux-rdma/20211207192824.GJ6385@nvidia.com/
v6
  Fixed a kzalloc flags bug.
  Fixed comment bug reported by 'Kernel Test Robot'.
  Changed type of rxe_pool.c in __rxe_fini().
v5
  Removed patches already accepted into for-next and addressed comments
  from Jason Gunthorpe.
v4
  Restructured patch series to change to xarray earlier which
  greatly simplified the changes.
  Rebased to current for-next
v3
  Changed rxe_alloc to use GFP_KERNEL
  Addressed other comments by Jason Gunthorp
  Merged the previous 06/10 and 07/10 patches into one since they overlapped
  Added some minor cleanups as 10/10
v2
  Rebased to current for-next.
  Added 4 additional patches

Bob Pearson (10):
  RDMA/rxe: Remove IB_SRQ_INIT_MASK
  RDMA/rxe: Add rxe_srq_cleanup()
  RDMA/rxe: Check rxe_get() return value
  RDMA/rxe: Move qp cleanup code to rxe_qp_do_cleanup()
  RDMA/rxe: Move mr cleanup code to rxe_mr_cleanup()
  RDMA/rxe: Move mw cleanup code to rxe_mw_cleanup()
  RDMA/rxe: Enforce IBA C11-17
  RDMA/rxe: Stop lookup of partially built objects
  RDMA/rxe: Convert read side locking to rcu
  RDMA/rxe: Cleanup rxe_pool.c

 drivers/infiniband/sw/rxe/rxe_comp.c  |   3 +-
 drivers/infiniband/sw/rxe/rxe_loc.h   |  17 ++-
 drivers/infiniband/sw/rxe/rxe_mr.c    |  12 +-
 drivers/infiniband/sw/rxe/rxe_mw.c    |  61 +++++-----
 drivers/infiniband/sw/rxe/rxe_pool.c  | 157 ++++++++++++++++++++++----
 drivers/infiniband/sw/rxe/rxe_pool.h  |  11 +-
 drivers/infiniband/sw/rxe/rxe_qp.c    |  22 ++--
 drivers/infiniband/sw/rxe/rxe_req.c   |   3 +-
 drivers/infiniband/sw/rxe/rxe_resp.c  |   3 +-
 drivers/infiniband/sw/rxe/rxe_srq.c   | 129 +++++++++++++--------
 drivers/infiniband/sw/rxe/rxe_verbs.c |  68 ++++++-----
 drivers/infiniband/sw/rxe/rxe_verbs.h |   1 +
 12 files changed, 325 insertions(+), 162 deletions(-)

base-commit: 3123109284176b1532874591f7c81f3837bbdc17

Comments

Jason Gunthorpe April 8, 2022, 6:06 p.m. UTC | #1
On Mon, Apr 04, 2022 at 04:50:50PM -0500, Bob Pearson wrote:
> There are several race conditions discovered in the current rdma_rxe
> driver.  They mostly relate to races between normal operations and
> destroying objects.  This patch series
>  - Makes several minor cleanups in rxe_pool.[ch]
>  - Adds wait for completions to the paths in verbs APIs which destroy
>    objects.
>  - Changes read side locking to rcu.
>  - Moves object cleanup code to after ref count is zero

This all seems fine to me now, except for the question about the
tasklets

Thanks,
Jason
Bob Pearson April 8, 2022, 6:15 p.m. UTC | #2
On 4/8/22 13:06, Jason Gunthorpe wrote:
> On Mon, Apr 04, 2022 at 04:50:50PM -0500, Bob Pearson wrote:
>> There are several race conditions discovered in the current rdma_rxe
>> driver.  They mostly relate to races between normal operations and
>> destroying objects.  This patch series
>>  - Makes several minor cleanups in rxe_pool.[ch]
>>  - Adds wait for completions to the paths in verbs APIs which destroy
>>    objects.
>>  - Changes read side locking to rcu.
>>  - Moves object cleanup code to after ref count is zero
> 
> This all seems fine to me now, except for the question about the
> tasklets
> 
> Thanks,
> Jason
Thanks. At the moment the first priority is to figure out how
blktest regressed in 5.18-rc1+. Hopefully it will be easy to figure out.

Bob
Bob Pearson April 20, 2022, 11:04 p.m. UTC | #3
On 4/8/22 13:06, Jason Gunthorpe wrote:
> On Mon, Apr 04, 2022 at 04:50:50PM -0500, Bob Pearson wrote:
>> There are several race conditions discovered in the current rdma_rxe
>> driver.  They mostly relate to races between normal operations and
>> destroying objects.  This patch series
>>  - Makes several minor cleanups in rxe_pool.[ch]
>>  - Adds wait for completions to the paths in verbs APIs which destroy
>>    objects.
>>  - Changes read side locking to rcu.
>>  - Moves object cleanup code to after ref count is zero
> 
> This all seems fine to me now, except for the question about the
> tasklets
> 
> Thanks,
> Jason

There has been a long delay because of the mr = NULL bug and the locking
problems. With the following patches applied (last to first) I do not
see any lockdep warnings, seg faults or anything else in dmesg for
long runs of

	pyverbs
	perftests (ib_xxx_bw, ib_xxx_lat)
	rping (node to node)
	blktests (srp)

These patches were in v13 of the "Fix race conditions" patch. I will send v14 today.
8d342cb8d7ce RDMA/rxe: Cleanup rxe_pool.c

6e4c52e04bc9 RDMA/rxe: Convert read side locking to rcu

e3e46d864b98 RDMA/rxe: Stop lookup of partially built objects

e1fb6b7225d0 RDMA/rxe: Enforce IBA C11-17

2607d042376f RDMA/rxe: Move mw cleanup code to rxe_mw_cleanup()

ca082913b915 RDMA/rxe: Move mr cleanup code to rxe_mr_cleanup()

394f24ebc81b RDMA/rxe: Move qp cleanup code to rxe_qp_do_cleanup()


3fb445b66e5c RDMA/rxe: Add rxe_srq_cleanup()

4730b0ed751a RDMA/rxe: Remove IB_SRQ_INIT_MASK


These patches are already submitted
d02e7a7266cf RDMA/rxe: Fix "RDMA/rxe: Cleanup rxe_mcast.c"

569aba28f67c RDMA/rxe: Fix "Replace mr by rkey in responder resources(2)"
 or whatever you called it.
5e74a5ecfb53 RDMA/rxe: Fix "Replace mr by rkey in responder resources"

007493744865 RDMA/rxe: Fix typo: replace paylen by payload


This patch was submitted to scsi by Bart and addressed long timeouts that
were not rxe related (same issue also happens with siw)
cdd844a1ba45 Revert "scsi: scsi_debug: Address races following module load"

If Zhu is not OK with this let know what bugs remain that need fixing.

Bob
Zhu Yanjun April 21, 2022, 2:13 a.m. UTC | #4
On Thu, Apr 21, 2022 at 7:04 AM Bob Pearson <rpearsonhpe@gmail.com> wrote:
>
> On 4/8/22 13:06, Jason Gunthorpe wrote:
> > On Mon, Apr 04, 2022 at 04:50:50PM -0500, Bob Pearson wrote:
> >> There are several race conditions discovered in the current rdma_rxe
> >> driver.  They mostly relate to races between normal operations and
> >> destroying objects.  This patch series
> >>  - Makes several minor cleanups in rxe_pool.[ch]
> >>  - Adds wait for completions to the paths in verbs APIs which destroy
> >>    objects.
> >>  - Changes read side locking to rcu.
> >>  - Moves object cleanup code to after ref count is zero
> >
> > This all seems fine to me now, except for the question about the
> > tasklets
> >
> > Thanks,
> > Jason
>
> There has been a long delay because of the mr = NULL bug and the locking
> problems. With the following patches applied (last to first) I do not
> see any lockdep warnings, seg faults or anything else in dmesg for
> long runs of
>
>         pyverbs
>         perftests (ib_xxx_bw, ib_xxx_lat)
>         rping (node to node)
>         blktests (srp)
>
> These patches were in v13 of the "Fix race conditions" patch. I will send v14 today.
> 8d342cb8d7ce RDMA/rxe: Cleanup rxe_pool.c
>
> 6e4c52e04bc9 RDMA/rxe: Convert read side locking to rcu
>
> e3e46d864b98 RDMA/rxe: Stop lookup of partially built objects
>
> e1fb6b7225d0 RDMA/rxe: Enforce IBA C11-17
>
> 2607d042376f RDMA/rxe: Move mw cleanup code to rxe_mw_cleanup()
>
> ca082913b915 RDMA/rxe: Move mr cleanup code to rxe_mr_cleanup()
>
> 394f24ebc81b RDMA/rxe: Move qp cleanup code to rxe_qp_do_cleanup()
>
>
> 3fb445b66e5c RDMA/rxe: Add rxe_srq_cleanup()
>
> 4730b0ed751a RDMA/rxe: Remove IB_SRQ_INIT_MASK
>
>
> These patches are already submitted
> d02e7a7266cf RDMA/rxe: Fix "RDMA/rxe: Cleanup rxe_mcast.c"
>
> 569aba28f67c RDMA/rxe: Fix "Replace mr by rkey in responder resources(2)"
>  or whatever you called it.
> 5e74a5ecfb53 RDMA/rxe: Fix "Replace mr by rkey in responder resources"
>
> 007493744865 RDMA/rxe: Fix typo: replace paylen by payload
>
>
> This patch was submitted to scsi by Bart and addressed long timeouts that
> were not rxe related (same issue also happens with siw)
> cdd844a1ba45 Revert "scsi: scsi_debug: Address races following module load"
>
> If Zhu is not OK with this let know what bugs remain that need fixing.

How do you get this conclusion that I am not OK with this?

Zhu Yanjun

>
> Bob
Pearson, Robert B April 21, 2022, 4:09 p.m. UTC | #5
Zhu,

Sorry. I am not trying to imply you are against this. Just that you are more aware of the
current outstanding bugs reported.

Bob

-----Original Message-----
From: Zhu Yanjun <zyjzyj2000@gmail.com> 
Sent: Wednesday, April 20, 2022 9:13 PM
To: Bob Pearson <rpearsonhpe@gmail.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>; RDMA mailing list <linux-rdma@vger.kernel.org>
Subject: Re: [PATCH for-next v13 00/10] Fix race conditions in rxe_pool

On Thu, Apr 21, 2022 at 7:04 AM Bob Pearson <rpearsonhpe@gmail.com> wrote:
>
> On 4/8/22 13:06, Jason Gunthorpe wrote:
> > On Mon, Apr 04, 2022 at 04:50:50PM -0500, Bob Pearson wrote:
> >> There are several race conditions discovered in the current 
> >> rdma_rxe driver.  They mostly relate to races between normal 
> >> operations and destroying objects.  This patch series
> >>  - Makes several minor cleanups in rxe_pool.[ch]
> >>  - Adds wait for completions to the paths in verbs APIs which destroy
> >>    objects.
> >>  - Changes read side locking to rcu.
> >>  - Moves object cleanup code to after ref count is zero
> >
> > This all seems fine to me now, except for the question about the 
> > tasklets
> >
> > Thanks,
> > Jason
>
> There has been a long delay because of the mr = NULL bug and the 
> locking problems. With the following patches applied (last to first) I 
> do not see any lockdep warnings, seg faults or anything else in dmesg 
> for long runs of
>
>         pyverbs
>         perftests (ib_xxx_bw, ib_xxx_lat)
>         rping (node to node)
>         blktests (srp)
>
> These patches were in v13 of the "Fix race conditions" patch. I will send v14 today.
> 8d342cb8d7ce RDMA/rxe: Cleanup rxe_pool.c
>
> 6e4c52e04bc9 RDMA/rxe: Convert read side locking to rcu
>
> e3e46d864b98 RDMA/rxe: Stop lookup of partially built objects
>
> e1fb6b7225d0 RDMA/rxe: Enforce IBA C11-17
>
> 2607d042376f RDMA/rxe: Move mw cleanup code to rxe_mw_cleanup()
>
> ca082913b915 RDMA/rxe: Move mr cleanup code to rxe_mr_cleanup()
>
> 394f24ebc81b RDMA/rxe: Move qp cleanup code to rxe_qp_do_cleanup()
>
>
> 3fb445b66e5c RDMA/rxe: Add rxe_srq_cleanup()
>
> 4730b0ed751a RDMA/rxe: Remove IB_SRQ_INIT_MASK
>
>
> These patches are already submitted
> d02e7a7266cf RDMA/rxe: Fix "RDMA/rxe: Cleanup rxe_mcast.c"
>
> 569aba28f67c RDMA/rxe: Fix "Replace mr by rkey in responder resources(2)"
>  or whatever you called it.
> 5e74a5ecfb53 RDMA/rxe: Fix "Replace mr by rkey in responder resources"
>
> 007493744865 RDMA/rxe: Fix typo: replace paylen by payload
>
>
> This patch was submitted to scsi by Bart and addressed long timeouts 
> that were not rxe related (same issue also happens with siw)
> cdd844a1ba45 Revert "scsi: scsi_debug: Address races following module load"
>
> If Zhu is not OK with this let know what bugs remain that need fixing.

How do you get this conclusion that I am not OK with this?

Zhu Yanjun

>
> Bob