Message ID | 20221018043345.4033-1-rpearsonhpe@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Implement work queues for rdma_rxe | expand |
On Tue, Oct 18, 2022 at 6:39 AM Bob Pearson <rpearsonhpe@gmail.com> wrote: > > This patch series implements work queues as an alternative for > the main tasklets in the rdma_rxe driver. The first few patches > perform some cleanups of the current tasklet code followed by a > patch that makes the internal API for task execution pluggable and > implements an inline and a tasklet based set of functions. > The remaining patches cleanup the qp reset and error code in the > three tasklets and modify the locking logic to prevent making > multiple calls to the tasklet scheduling routine. Finally after > this preparation the work queue equivalent set of functions is > added and module parameters are implemented to allow tuning the > task types. > > The advantages of the work queue version of deferred task execution > is mainly that the work queue variant has much better scalability > and overall performance than the tasklet variant. The tasklet > performance saturates with one connected queue pair and stays constant. > The work queue performance is slightly better for one queue pair but > scales up with the number of connected queue pairs. The perftest > microbenchmarks in local loopback mode (not a very realistic test > case) can reach approximately 100Gb/sec with work queues compared to > about 16Gb/sec for tasklets. > > This patch series is derived from an earlier patch set developed by > Ian Ziemba at HPE which is used in some Lustre storage clients attached > to Lustre servers with hard RoCE v2 NICs. > > Bob Pearson (16): > RDMA/rxe: Remove init of task locks from rxe_qp.c > RDMA/rxe: Removed unused name from rxe_task struct > RDMA/rxe: Split rxe_run_task() into two subroutines > RDMA/rxe: Make rxe_do_task static > RDMA/rxe: Rename task->state_lock to task->lock > RDMA/rxe: Make task interface pluggable > RDMA/rxe: Simplify reset state handling in rxe_resp.c > RDMA/rxe: Split rxe_drain_resp_pkts() > RDMA/rxe: Handle qp error in rxe_resp.c > RDMA/rxe: Cleanup comp tasks in rxe_qp.c > RDMA/rxe: Remove __rxe_do_task() > RDMA/rxe: Make tasks schedule each other > RDMA/rxe: Implement disable/enable_task() > RDMA/rxe: Replace TASK_STATE_START by TASK_STATE_IDLE > RDMA/rxe: Add workqueue support for tasks > RDMA/rxe: Add parameters to control task type > > drivers/infiniband/sw/rxe/rxe.c | 9 +- > drivers/infiniband/sw/rxe/rxe_comp.c | 35 ++- > drivers/infiniband/sw/rxe/rxe_net.c | 4 +- > drivers/infiniband/sw/rxe/rxe_qp.c | 87 +++---- > drivers/infiniband/sw/rxe/rxe_req.c | 10 +- > drivers/infiniband/sw/rxe/rxe_resp.c | 75 ++++-- > drivers/infiniband/sw/rxe/rxe_task.c | 354 ++++++++++++++++++++------ > drivers/infiniband/sw/rxe/rxe_task.h | 76 +++--- > drivers/infiniband/sw/rxe/rxe_verbs.c | 8 +- > 9 files changed, 451 insertions(+), 207 deletions(-) > > > base-commit: 9abf2313adc1ca1b6180c508c25f22f9395cc780 The patch series is not applying cleanly over the mentioned commit for me. Patch 'PATCH for-next 05/16] RDMA/rxe: Rename task->state_lock to task->lock.' fails at "drivers/infiniband/sw/rxe/rxe_task.c:103". I corrected that manually, then it fails in the next commit. Didn't check after that. Is it the same for others or is it just me? > -- > 2.34.1 >
On Fri, Oct 21, 2022 12:02 AM haris Iqbal wrote: > > On Tue, Oct 18, 2022 at 6:39 AM Bob Pearson <rpearsonhpe@gmail.com> wrote: > > > > This patch series implements work queues as an alternative for > > the main tasklets in the rdma_rxe driver. The first few patches > > perform some cleanups of the current tasklet code followed by a > > patch that makes the internal API for task execution pluggable and > > implements an inline and a tasklet based set of functions. > > The remaining patches cleanup the qp reset and error code in the > > three tasklets and modify the locking logic to prevent making > > multiple calls to the tasklet scheduling routine. Finally after > > this preparation the work queue equivalent set of functions is > > added and module parameters are implemented to allow tuning the > > task types. > > > > The advantages of the work queue version of deferred task execution > > is mainly that the work queue variant has much better scalability > > and overall performance than the tasklet variant. The tasklet > > performance saturates with one connected queue pair and stays constant. > > The work queue performance is slightly better for one queue pair but > > scales up with the number of connected queue pairs. The perftest > > microbenchmarks in local loopback mode (not a very realistic test > > case) can reach approximately 100Gb/sec with work queues compared to > > about 16Gb/sec for tasklets. > > > > This patch series is derived from an earlier patch set developed by > > Ian Ziemba at HPE which is used in some Lustre storage clients attached > > to Lustre servers with hard RoCE v2 NICs. > > > > Bob Pearson (16): > > RDMA/rxe: Remove init of task locks from rxe_qp.c > > RDMA/rxe: Removed unused name from rxe_task struct > > RDMA/rxe: Split rxe_run_task() into two subroutines > > RDMA/rxe: Make rxe_do_task static > > RDMA/rxe: Rename task->state_lock to task->lock > > RDMA/rxe: Make task interface pluggable > > RDMA/rxe: Simplify reset state handling in rxe_resp.c > > RDMA/rxe: Split rxe_drain_resp_pkts() > > RDMA/rxe: Handle qp error in rxe_resp.c > > RDMA/rxe: Cleanup comp tasks in rxe_qp.c > > RDMA/rxe: Remove __rxe_do_task() > > RDMA/rxe: Make tasks schedule each other > > RDMA/rxe: Implement disable/enable_task() > > RDMA/rxe: Replace TASK_STATE_START by TASK_STATE_IDLE > > RDMA/rxe: Add workqueue support for tasks > > RDMA/rxe: Add parameters to control task type > > > > drivers/infiniband/sw/rxe/rxe.c | 9 +- > > drivers/infiniband/sw/rxe/rxe_comp.c | 35 ++- > > drivers/infiniband/sw/rxe/rxe_net.c | 4 +- > > drivers/infiniband/sw/rxe/rxe_qp.c | 87 +++---- > > drivers/infiniband/sw/rxe/rxe_req.c | 10 +- > > drivers/infiniband/sw/rxe/rxe_resp.c | 75 ++++-- > > drivers/infiniband/sw/rxe/rxe_task.c | 354 ++++++++++++++++++++------ > > drivers/infiniband/sw/rxe/rxe_task.h | 76 +++--- > > drivers/infiniband/sw/rxe/rxe_verbs.c | 8 +- > > 9 files changed, 451 insertions(+), 207 deletions(-) > > > > > > base-commit: 9abf2313adc1ca1b6180c508c25f22f9395cc780 > > The patch series is not applying cleanly over the mentioned commit for > me. Patch 'PATCH for-next 05/16] RDMA/rxe: Rename task->state_lock to > task->lock.' fails at "drivers/infiniband/sw/rxe/rxe_task.c:103". > I corrected that manually, then it fails in the next commit. Didn't > check after that. Is it the same for others or is it just me? There is a problem with the 4th patch. Its subject is different from other patches, so probably it was not generated at the same time with them. I could apply the rest after adding the following change to the 4th: === diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c index 097ddb16c230..a7203b93e5cc 100644 --- a/drivers/infiniband/sw/rxe/rxe_task.c +++ b/drivers/infiniband/sw/rxe/rxe_task.c @@ -28,7 +28,7 @@ int __rxe_do_task(struct rxe_task *task) * a second caller finds the task already running * but looks just after the last call to func */ -static void rxe_do_task(struct tasklet_struct *t) +static void do_task(struct tasklet_struct *t) { int cont; int ret; @@ -100,7 +100,7 @@ int rxe_init_task(struct rxe_task *task, void *arg, int (*func)(void *)) task->func = func; task->destroyed = false; - tasklet_setup(&task->tasklet, rxe_do_task); + tasklet_setup(&task->tasklet, do_task); task->state = TASK_STATE_START; spin_lock_init(&task->state_lock); @@ -132,7 +132,7 @@ void rxe_run_task(struct rxe_task *task) if (task->destroyed) return; - rxe_do_task(&task->tasklet); + do_task(&task->tasklet); } void rxe_sched_task(struct rxe_task *task) === Daisuke > > > -- > > 2.34.1 > >
On 10/20/22 21:46, matsuda-daisuke@fujitsu.com wrote: > On Fri, Oct 21, 2022 12:02 AM haris Iqbal wrote: >> >> On Tue, Oct 18, 2022 at 6:39 AM Bob Pearson <rpearsonhpe@gmail.com> wrote: >>> >>> This patch series implements work queues as an alternative for >>> the main tasklets in the rdma_rxe driver. The first few patches >>> perform some cleanups of the current tasklet code followed by a >>> patch that makes the internal API for task execution pluggable and >>> implements an inline and a tasklet based set of functions. >>> The remaining patches cleanup the qp reset and error code in the >>> three tasklets and modify the locking logic to prevent making >>> multiple calls to the tasklet scheduling routine. Finally after >>> this preparation the work queue equivalent set of functions is >>> added and module parameters are implemented to allow tuning the >>> task types. >>> >>> The advantages of the work queue version of deferred task execution >>> is mainly that the work queue variant has much better scalability >>> and overall performance than the tasklet variant. The tasklet >>> performance saturates with one connected queue pair and stays constant. >>> The work queue performance is slightly better for one queue pair but >>> scales up with the number of connected queue pairs. The perftest >>> microbenchmarks in local loopback mode (not a very realistic test >>> case) can reach approximately 100Gb/sec with work queues compared to >>> about 16Gb/sec for tasklets. >>> >>> This patch series is derived from an earlier patch set developed by >>> Ian Ziemba at HPE which is used in some Lustre storage clients attached >>> to Lustre servers with hard RoCE v2 NICs. >>> >>> Bob Pearson (16): >>> RDMA/rxe: Remove init of task locks from rxe_qp.c >>> RDMA/rxe: Removed unused name from rxe_task struct >>> RDMA/rxe: Split rxe_run_task() into two subroutines >>> RDMA/rxe: Make rxe_do_task static >>> RDMA/rxe: Rename task->state_lock to task->lock >>> RDMA/rxe: Make task interface pluggable >>> RDMA/rxe: Simplify reset state handling in rxe_resp.c >>> RDMA/rxe: Split rxe_drain_resp_pkts() >>> RDMA/rxe: Handle qp error in rxe_resp.c >>> RDMA/rxe: Cleanup comp tasks in rxe_qp.c >>> RDMA/rxe: Remove __rxe_do_task() >>> RDMA/rxe: Make tasks schedule each other >>> RDMA/rxe: Implement disable/enable_task() >>> RDMA/rxe: Replace TASK_STATE_START by TASK_STATE_IDLE >>> RDMA/rxe: Add workqueue support for tasks >>> RDMA/rxe: Add parameters to control task type >>> >>> drivers/infiniband/sw/rxe/rxe.c | 9 +- >>> drivers/infiniband/sw/rxe/rxe_comp.c | 35 ++- >>> drivers/infiniband/sw/rxe/rxe_net.c | 4 +- >>> drivers/infiniband/sw/rxe/rxe_qp.c | 87 +++---- >>> drivers/infiniband/sw/rxe/rxe_req.c | 10 +- >>> drivers/infiniband/sw/rxe/rxe_resp.c | 75 ++++-- >>> drivers/infiniband/sw/rxe/rxe_task.c | 354 ++++++++++++++++++++------ >>> drivers/infiniband/sw/rxe/rxe_task.h | 76 +++--- >>> drivers/infiniband/sw/rxe/rxe_verbs.c | 8 +- >>> 9 files changed, 451 insertions(+), 207 deletions(-) >>> >>> >>> base-commit: 9abf2313adc1ca1b6180c508c25f22f9395cc780 >> >> The patch series is not applying cleanly over the mentioned commit for >> me. Patch 'PATCH for-next 05/16] RDMA/rxe: Rename task->state_lock to >> task->lock.' fails at "drivers/infiniband/sw/rxe/rxe_task.c:103". >> I corrected that manually, then it fails in the next commit. Didn't >> check after that. Is it the same for others or is it just me? > > There is a problem with the 4th patch. Its subject is different from other patches, > so probably it was not generated at the same time with them. I could apply the rest > after adding the following change to the 4th: > === I messed up the subject and resent it just after the other ones. > diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c > index 097ddb16c230..a7203b93e5cc 100644 > --- a/drivers/infiniband/sw/rxe/rxe_task.c > +++ b/drivers/infiniband/sw/rxe/rxe_task.c > @@ -28,7 +28,7 @@ int __rxe_do_task(struct rxe_task *task) > * a second caller finds the task already running > * but looks just after the last call to func > */ > -static void rxe_do_task(struct tasklet_struct *t) > +static void do_task(struct tasklet_struct *t) > { > int cont; > int ret; > @@ -100,7 +100,7 @@ int rxe_init_task(struct rxe_task *task, void *arg, int (*func)(void *)) > task->func = func; > task->destroyed = false; > > - tasklet_setup(&task->tasklet, rxe_do_task); > + tasklet_setup(&task->tasklet, do_task); > > task->state = TASK_STATE_START; > spin_lock_init(&task->state_lock); > @@ -132,7 +132,7 @@ void rxe_run_task(struct rxe_task *task) > if (task->destroyed) > return; > > - rxe_do_task(&task->tasklet); > + do_task(&task->tasklet); > } > > void rxe_sched_task(struct rxe_task *task) > === > > Daisuke > >> >>> -- >>> 2.34.1 >>>
On 10/20/22 10:02, haris iqbal wrote: > On Tue, Oct 18, 2022 at 6:39 AM Bob Pearson <rpearsonhpe@gmail.com> wrote: >> >> This patch series implements work queues as an alternative for >> the main tasklets in the rdma_rxe driver. The first few patches >> perform some cleanups of the current tasklet code followed by a >> patch that makes the internal API for task execution pluggable and >> implements an inline and a tasklet based set of functions. >> The remaining patches cleanup the qp reset and error code in the >> three tasklets and modify the locking logic to prevent making >> multiple calls to the tasklet scheduling routine. Finally after >> this preparation the work queue equivalent set of functions is >> added and module parameters are implemented to allow tuning the >> task types. >> >> The advantages of the work queue version of deferred task execution >> is mainly that the work queue variant has much better scalability >> and overall performance than the tasklet variant. The tasklet >> performance saturates with one connected queue pair and stays constant. >> The work queue performance is slightly better for one queue pair but >> scales up with the number of connected queue pairs. The perftest >> microbenchmarks in local loopback mode (not a very realistic test >> case) can reach approximately 100Gb/sec with work queues compared to >> about 16Gb/sec for tasklets. >> >> This patch series is derived from an earlier patch set developed by >> Ian Ziemba at HPE which is used in some Lustre storage clients attached >> to Lustre servers with hard RoCE v2 NICs. >> >> Bob Pearson (16): >> RDMA/rxe: Remove init of task locks from rxe_qp.c >> RDMA/rxe: Removed unused name from rxe_task struct >> RDMA/rxe: Split rxe_run_task() into two subroutines >> RDMA/rxe: Make rxe_do_task static >> RDMA/rxe: Rename task->state_lock to task->lock >> RDMA/rxe: Make task interface pluggable >> RDMA/rxe: Simplify reset state handling in rxe_resp.c >> RDMA/rxe: Split rxe_drain_resp_pkts() >> RDMA/rxe: Handle qp error in rxe_resp.c >> RDMA/rxe: Cleanup comp tasks in rxe_qp.c >> RDMA/rxe: Remove __rxe_do_task() >> RDMA/rxe: Make tasks schedule each other >> RDMA/rxe: Implement disable/enable_task() >> RDMA/rxe: Replace TASK_STATE_START by TASK_STATE_IDLE >> RDMA/rxe: Add workqueue support for tasks >> RDMA/rxe: Add parameters to control task type >> >> drivers/infiniband/sw/rxe/rxe.c | 9 +- >> drivers/infiniband/sw/rxe/rxe_comp.c | 35 ++- >> drivers/infiniband/sw/rxe/rxe_net.c | 4 +- >> drivers/infiniband/sw/rxe/rxe_qp.c | 87 +++---- >> drivers/infiniband/sw/rxe/rxe_req.c | 10 +- >> drivers/infiniband/sw/rxe/rxe_resp.c | 75 ++++-- >> drivers/infiniband/sw/rxe/rxe_task.c | 354 ++++++++++++++++++++------ >> drivers/infiniband/sw/rxe/rxe_task.h | 76 +++--- >> drivers/infiniband/sw/rxe/rxe_verbs.c | 8 +- >> 9 files changed, 451 insertions(+), 207 deletions(-) >> >> >> base-commit: 9abf2313adc1ca1b6180c508c25f22f9395cc780 > > The patch series is not applying cleanly over the mentioned commit for > me. Patch 'PATCH for-next 05/16] RDMA/rxe: Rename task->state_lock to > task->lock.' fails at "drivers/infiniband/sw/rxe/rxe_task.c:103". > I corrected that manually, then it fails in the next commit. Didn't > check after that. Is it the same for others or is it just me? > >> -- >> 2.34.1 >> This worked for me. There was the botched 4/16 which I resent just after the other ones. You may need to delete the first 4/16 and use the second one. I am going to resend it tomorrow. There are a couple of things folks have pointed out that I want to address. Bob