mbox series

[v2,00/18] Implement work queues for rdma_rxe

Message ID 20221021200118.2163-1-rpearsonhpe@gmail.com (mailing list archive)
Headers show
Series Implement work queues for rdma_rxe | expand

Message

Bob Pearson Oct. 21, 2022, 8:01 p.m. UTC
This patch series implements work queues as an alternative for
the main tasklets in the rdma_rxe driver. The first few patches
perform some cleanups of the current tasklet code followed by a
patch that makes the internal API for task execution pluggable and
implements an inline and a tasklet based set of functions.
The remaining patches cleanup the qp reset and error code in the
three tasklets and modify the locking logic to prevent making
multiple calls to the tasklet scheduling routine. Finally after
this preparation the work queue equivalent set of functions is
added and module parameters are implemented to allow tuning the
task types.

The advantages of the work queue version of deferred task execution
is mainly that the work queue variant has much better scalability
and overall performance than the tasklet variant. The tasklet
performance saturates with one connected queue pair and stays constant.
The work queue performance is slightly better for one queue pair but
scales up with the number of connected queue pairs. The perftest
microbenchmarks in local loopback mode (not a very realistic test
case) can reach approximately 100Gb/sec with work queues compared to
about 16Gb/sec for tasklets.

It is expected that the work queue version will take the place of
the tasklet version once this patch series is accepted and moved
upstream. However, for now it is convenient to keep the option of
easily switching between the versions to help benchmarking and
debugging.

This patch series is derived from an earlier patch set developed by
Ian Ziemba at HPE which is used in some Lustre storage clients attached
to Lustre servers with hard RoCE v2 NICs.

This v2 version of the patch set has some minor changes that address
comments from Leon Romanovsky regarding locking of the valid parameter
and the setup parameters for alloc_workqueue. It also has one
additional cleanup patch.

Bob Pearson (18):
  RDMA/rxe: Remove redundant header files
  RDMA/rxe: Remove init of task locks from rxe_qp.c
  RDMA/rxe: Removed unused name from rxe_task struct
  RDMA/rxe: Split rxe_run_task() into two subroutines
  RDMA/rxe: Make rxe_do_task static
  RDMA/rxe: Rename task->state_lock to task->lock
  RDMA/rxe: Make task interface pluggable
  RDMA/rxe: Split rxe_drain_resp_pkts()
  RDMA/rxe: Simplify reset state handling in rxe_resp.c
  RDMA/rxe: Handle qp error in rxe_resp.c
  RDMA/rxe: Cleanup comp tasks in rxe_qp.c
  RDMA/rxe: Remove __rxe_do_task()
  RDMA/rxe: Make tasks schedule each other
  RDMA/rxe: Implement disable/enable_task()
  RDMA/rxe: Replace TASK_STATE_START by TASK_STATE_IDLE
  RDMA/rxe: Replace task->destroyed by task state INVALID.
  RDMA/rxe: Add workqueue support for tasks
  RDMA/rxe: Add parameters to control task type

 drivers/infiniband/sw/rxe/rxe.c       |   9 +-
 drivers/infiniband/sw/rxe/rxe_comp.c  |  35 ++-
 drivers/infiniband/sw/rxe/rxe_net.c   |   4 +-
 drivers/infiniband/sw/rxe/rxe_qp.c    |  87 +++----
 drivers/infiniband/sw/rxe/rxe_req.c   |  10 +-
 drivers/infiniband/sw/rxe/rxe_resp.c  |  75 ++++--
 drivers/infiniband/sw/rxe/rxe_task.c  | 339 +++++++++++++++++++-------
 drivers/infiniband/sw/rxe/rxe_task.h  |  77 +++---
 drivers/infiniband/sw/rxe/rxe_verbs.c |   8 +-
 9 files changed, 427 insertions(+), 217 deletions(-)


base-commit: 9abf2313adc1ca1b6180c508c25f22f9395cc780

Comments

Jason Gunthorpe Oct. 28, 2022, 5:04 p.m. UTC | #1
> Bob Pearson (18):
>   RDMA/rxe: Remove redundant header files
>   RDMA/rxe: Remove init of task locks from rxe_qp.c
>   RDMA/rxe: Removed unused name from rxe_task struct
>   RDMA/rxe: Split rxe_run_task() into two subroutines
>   RDMA/rxe: Make rxe_do_task static

I took these patches into for-next, the rest will need reposting to
address the 0-day and decide if we should strip out the work queue
entirely

Jason
Bob Pearson Oct. 28, 2022, 6:16 p.m. UTC | #2
On 10/28/22 12:04, Jason Gunthorpe wrote:
>> Bob Pearson (18):
>>   RDMA/rxe: Remove redundant header files
>>   RDMA/rxe: Remove init of task locks from rxe_qp.c
>>   RDMA/rxe: Removed unused name from rxe_task struct
>>   RDMA/rxe: Split rxe_run_task() into two subroutines
>>   RDMA/rxe: Make rxe_do_task static
> 
> I took these patches into for-next, the rest will need reposting to
> address the 0-day and decide if we should strip out the work queue
> entirely
> 
> Jason

I'm guessing you meant tasklet??
Jason Gunthorpe Oct. 28, 2022, 6:17 p.m. UTC | #3
On Fri, Oct 28, 2022 at 01:16:11PM -0500, Bob Pearson wrote:
> On 10/28/22 12:04, Jason Gunthorpe wrote:
> >> Bob Pearson (18):
> >>   RDMA/rxe: Remove redundant header files
> >>   RDMA/rxe: Remove init of task locks from rxe_qp.c
> >>   RDMA/rxe: Removed unused name from rxe_task struct
> >>   RDMA/rxe: Split rxe_run_task() into two subroutines
> >>   RDMA/rxe: Make rxe_do_task static
> > 
> > I took these patches into for-next, the rest will need reposting to
> > address the 0-day and decide if we should strip out the work queue
> > entirely
> > 
> > Jason
> 
> I'm guessing you meant tasklet??

yes

Jason
Bob Pearson Oct. 28, 2022, 6:58 p.m. UTC | #4
On 10/28/22 13:17, Jason Gunthorpe wrote:
> On Fri, Oct 28, 2022 at 01:16:11PM -0500, Bob Pearson wrote:
>> On 10/28/22 12:04, Jason Gunthorpe wrote:
>>>> Bob Pearson (18):
>>>>   RDMA/rxe: Remove redundant header files
>>>>   RDMA/rxe: Remove init of task locks from rxe_qp.c
>>>>   RDMA/rxe: Removed unused name from rxe_task struct
>>>>   RDMA/rxe: Split rxe_run_task() into two subroutines
>>>>   RDMA/rxe: Make rxe_do_task static
>>>
>>> I took these patches into for-next, the rest will need reposting to
>>> address the 0-day and decide if we should strip out the work queue
>>> entirely
>>>
>>> Jason
>>
>> I'm guessing you meant tasklet??
> 
> yes
> 
> Jason

What do you mean by 0-day?? Sounds like a cpu bug we used to talk about. But not sure what it
has to do with work queue patches.

Bob
Jason Gunthorpe Oct. 28, 2022, 7:16 p.m. UTC | #5
On Fri, Oct 28, 2022 at 01:58:08PM -0500, Bob Pearson wrote:
> On 10/28/22 13:17, Jason Gunthorpe wrote:
> > On Fri, Oct 28, 2022 at 01:16:11PM -0500, Bob Pearson wrote:
> >> On 10/28/22 12:04, Jason Gunthorpe wrote:
> >>>> Bob Pearson (18):
> >>>>   RDMA/rxe: Remove redundant header files
> >>>>   RDMA/rxe: Remove init of task locks from rxe_qp.c
> >>>>   RDMA/rxe: Removed unused name from rxe_task struct
> >>>>   RDMA/rxe: Split rxe_run_task() into two subroutines
> >>>>   RDMA/rxe: Make rxe_do_task static
> >>>
> >>> I took these patches into for-next, the rest will need reposting to
> >>> address the 0-day and decide if we should strip out the work queue
> >>> entirely
> >>>
> >>> Jason
> >>
> >> I'm guessing you meant tasklet??
> > 
> > yes
> > 
> > Jason
> 
> What do you mean by 0-day?? Sounds like a cpu bug we used to talk
> about. But not sure what it has to do with work queue patches.

https://lore.kernel.org/linux-rdma/202210220559.f7taTL8S-lkp@intel.com/

Jason
Pearson, Robert B Oct. 28, 2022, 7:29 p.m. UTC | #6
Yes. I saw that. I missed the 0-day connection. Yes I'll add the 'static' declaration.

Bob

-----Original Message-----
From: Jason Gunthorpe <jgg@nvidia.com> 
Sent: Friday, October 28, 2022 2:16 PM
To: Bob Pearson <rpearsonhpe@gmail.com>
Cc: zyjzyj2000@gmail.com; leon@kernel.org; Hack, Jenny (Ft. Collins) <jhack@hpe.com>; Ziemba, Ian <ian.ziemba@hpe.com>; matsuda-daisuke@fujitsu.com; lizhijian@fujitsu.com; haris.phnx@gmail.com; linux-rdma@vger.kernel.org
Subject: Re: [PATCH v2 00/18] Implement work queues for rdma_rxe

On Fri, Oct 28, 2022 at 01:58:08PM -0500, Bob Pearson wrote:
> On 10/28/22 13:17, Jason Gunthorpe wrote:
> > On Fri, Oct 28, 2022 at 01:16:11PM -0500, Bob Pearson wrote:
> >> On 10/28/22 12:04, Jason Gunthorpe wrote:
> >>>> Bob Pearson (18):
> >>>>   RDMA/rxe: Remove redundant header files
> >>>>   RDMA/rxe: Remove init of task locks from rxe_qp.c
> >>>>   RDMA/rxe: Removed unused name from rxe_task struct
> >>>>   RDMA/rxe: Split rxe_run_task() into two subroutines
> >>>>   RDMA/rxe: Make rxe_do_task static
> >>>
> >>> I took these patches into for-next, the rest will need reposting 
> >>> to address the 0-day and decide if we should strip out the work 
> >>> queue entirely
> >>>
> >>> Jason
> >>
> >> I'm guessing you meant tasklet??
> > 
> > yes
> > 
> > Jason
> 
> What do you mean by 0-day?? Sounds like a cpu bug we used to talk 
> about. But not sure what it has to do with work queue patches.

https://lore.kernel.org/linux-rdma/202210220559.f7taTL8S-lkp@intel.com/

Jason