diff mbox series

[for-next,v2,18/18] RDMA/rxe: Add parameters to control task type

Message ID 20221021200118.2163-19-rpearsonhpe@gmail.com (mailing list archive)
State Superseded
Delegated to: Jason Gunthorpe
Headers show
Series Implement work queues for rdma_rxe | expand

Commit Message

Bob Pearson Oct. 21, 2022, 8:01 p.m. UTC
Add modparams to control the task types for req, comp, and resp
tasks.

It is expected that the work queue version will take the place of
the tasklet version once this patch series is accepted and moved
upstream. However, for now it is convenient to keep the option of
easily switching between the versions to help benchmarking and
debugging.

Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_qp.c   | 6 +++---
 drivers/infiniband/sw/rxe/rxe_task.c | 8 ++++++++
 drivers/infiniband/sw/rxe/rxe_task.h | 4 ++++
 3 files changed, 15 insertions(+), 3 deletions(-)

Comments

Daisuke Matsuda (Fujitsu) Oct. 25, 2022, 9:35 a.m. UTC | #1
On Sat, Oct 22, 2022 5:01 AM Bob Pearson:
> 
> Add modparams to control the task types for req, comp, and resp
> tasks.
> 
> It is expected that the work queue version will take the place of
> the tasklet version once this patch series is accepted and moved
> upstream. However, for now it is convenient to keep the option of
> easily switching between the versions to help benchmarking and
> debugging.
> 
> Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_qp.c   | 6 +++---
>  drivers/infiniband/sw/rxe/rxe_task.c | 8 ++++++++
>  drivers/infiniband/sw/rxe/rxe_task.h | 4 ++++
>  3 files changed, 15 insertions(+), 3 deletions(-)

<...>

> diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
> index 9b8c9d28ee46..4568c4a05e85 100644
> --- a/drivers/infiniband/sw/rxe/rxe_task.c
> +++ b/drivers/infiniband/sw/rxe/rxe_task.c
> @@ -6,6 +6,14 @@
> 
>  #include "rxe.h"
> 
> +int rxe_req_task_type = RXE_TASK_TYPE_TASKLET;
> +int rxe_comp_task_type = RXE_TASK_TYPE_TASKLET;
> +int rxe_resp_task_type = RXE_TASK_TYPE_TASKLET;

As the tasklet version is to be eliminated in near future, why
don't we make the workqueue version default now?


> +
> +module_param_named(req_task_type, rxe_req_task_type, int, 0664);
> +module_param_named(comp_task_type, rxe_comp_task_type, int, 0664);
> +module_param_named(resp_task_type, rxe_resp_task_type, int, 0664);

As I have commented to the 7th patch, users would not benefit from
specifying the 'inline' type directly. It is OK to have the mode internally
to keep the source code simple, but RXE_TASK_TYPE_INLINE should
not be exposed to users.

I suggest that these module parameters be bool type and task types
be like this:
=== rxe_task.h===
enum rxe_task_type {
        RXE_TASK_TYPE_WORKQUEUE = 0,
        RXE_TASK_TYPE_TASKLET   = 1,
        RXE_TASK_TYPE_INLINE    = 2,
};
=============
In this case, while we can preserve the 'inline' type internally,
we can prohibit users from specifying any value other than
'workqueue' or 'tasklet'; modprobe fails if non-boolean values
are passed.

If you still want to keep the parameters int type, then you need
to add some code to perform value check. We can specify an 
arbitrary int value with the current implementation.

Thanks,
Daisuke

> +
>  static struct workqueue_struct *rxe_wq;
> 
>  int rxe_alloc_wq(void)
> diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h
> index d1156b935635..5a2ac7ada05b 100644
> --- a/drivers/infiniband/sw/rxe/rxe_task.h
> +++ b/drivers/infiniband/sw/rxe/rxe_task.h
> @@ -7,6 +7,10 @@
>  #ifndef RXE_TASK_H
>  #define RXE_TASK_H
> 
> +extern int rxe_req_task_type;
> +extern int rxe_comp_task_type;
> +extern int rxe_resp_task_type;
> +
>  struct rxe_task;
> 
>  struct rxe_task_ops {
> --
> 2.34.1
Jason Gunthorpe Oct. 25, 2022, 11:18 a.m. UTC | #2
On Tue, Oct 25, 2022 at 09:35:01AM +0000, Daisuke Matsuda (Fujitsu) wrote:
> On Sat, Oct 22, 2022 5:01 AM Bob Pearson:
> > 
> > Add modparams to control the task types for req, comp, and resp
> > tasks.
> > 
> > It is expected that the work queue version will take the place of
> > the tasklet version once this patch series is accepted and moved
> > upstream. However, for now it is convenient to keep the option of
> > easily switching between the versions to help benchmarking and
> > debugging.
> > 
> > Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
> > Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> > ---
> >  drivers/infiniband/sw/rxe/rxe_qp.c   | 6 +++---
> >  drivers/infiniband/sw/rxe/rxe_task.c | 8 ++++++++
> >  drivers/infiniband/sw/rxe/rxe_task.h | 4 ++++
> >  3 files changed, 15 insertions(+), 3 deletions(-)
> 
> <...>
> 
> > diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
> > index 9b8c9d28ee46..4568c4a05e85 100644
> > --- a/drivers/infiniband/sw/rxe/rxe_task.c
> > +++ b/drivers/infiniband/sw/rxe/rxe_task.c
> > @@ -6,6 +6,14 @@
> > 
> >  #include "rxe.h"
> > 
> > +int rxe_req_task_type = RXE_TASK_TYPE_TASKLET;
> > +int rxe_comp_task_type = RXE_TASK_TYPE_TASKLET;
> > +int rxe_resp_task_type = RXE_TASK_TYPE_TASKLET;
> 
> As the tasklet version is to be eliminated in near future, why
> don't we make the workqueue version default now?

Why don't we just get rid of the tasklet version right away?

Jason
Bob Pearson Oct. 25, 2022, 2:31 p.m. UTC | #3
On 10/25/22 06:18, Jason Gunthorpe wrote:
> On Tue, Oct 25, 2022 at 09:35:01AM +0000, Daisuke Matsuda (Fujitsu) wrote:
>> On Sat, Oct 22, 2022 5:01 AM Bob Pearson:
>>>
>>> Add modparams to control the task types for req, comp, and resp
>>> tasks.
>>>
>>> It is expected that the work queue version will take the place of
>>> the tasklet version once this patch series is accepted and moved
>>> upstream. However, for now it is convenient to keep the option of
>>> easily switching between the versions to help benchmarking and
>>> debugging.
>>>
>>> Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
>>> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
>>> ---
>>>  drivers/infiniband/sw/rxe/rxe_qp.c   | 6 +++---
>>>  drivers/infiniband/sw/rxe/rxe_task.c | 8 ++++++++
>>>  drivers/infiniband/sw/rxe/rxe_task.h | 4 ++++
>>>  3 files changed, 15 insertions(+), 3 deletions(-)
>>
>> <...>
>>
>>> diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
>>> index 9b8c9d28ee46..4568c4a05e85 100644
>>> --- a/drivers/infiniband/sw/rxe/rxe_task.c
>>> +++ b/drivers/infiniband/sw/rxe/rxe_task.c
>>> @@ -6,6 +6,14 @@
>>>
>>>  #include "rxe.h"
>>>
>>> +int rxe_req_task_type = RXE_TASK_TYPE_TASKLET;
>>> +int rxe_comp_task_type = RXE_TASK_TYPE_TASKLET;
>>> +int rxe_resp_task_type = RXE_TASK_TYPE_TASKLET;
>>
>> As the tasklet version is to be eliminated in near future, why
>> don't we make the workqueue version default now?
> 
> Why don't we just get rid of the tasklet version right away?
> 
> Jason

What time zone are you in? I thought you were out west.

There are several users out there who use rxe as a dev tool for other subsystems.
I don't want to make a big change that they can't control. By letting them decide
if and when to move that is avoided. I could back out the modparam and just let
people recompile but I'd end up putting it back in for our use because it is easier.

No matter what we decide to do here, there is a question I have about how to
surface tuning parameters to users. Not everyone can just recompile to make changes.
What is the preferred way to do this?

Bob
Bob Pearson Oct. 25, 2022, 2:49 p.m. UTC | #4
On 10/25/22 04:35, Daisuke Matsuda (Fujitsu) wrote:
> On Sat, Oct 22, 2022 5:01 AM Bob Pearson:
>>
>> Add modparams to control the task types for req, comp, and resp
>> tasks.
>>
>> It is expected that the work queue version will take the place of
>> the tasklet version once this patch series is accepted and moved
>> upstream. However, for now it is convenient to keep the option of
>> easily switching between the versions to help benchmarking and
>> debugging.
>>
>> Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
>> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
>> ---
>>  drivers/infiniband/sw/rxe/rxe_qp.c   | 6 +++---
>>  drivers/infiniband/sw/rxe/rxe_task.c | 8 ++++++++
>>  drivers/infiniband/sw/rxe/rxe_task.h | 4 ++++
>>  3 files changed, 15 insertions(+), 3 deletions(-)
> 
> <...>
> 
>> diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
>> index 9b8c9d28ee46..4568c4a05e85 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_task.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_task.c
>> @@ -6,6 +6,14 @@
>>
>>  #include "rxe.h"
>>
>> +int rxe_req_task_type = RXE_TASK_TYPE_TASKLET;
>> +int rxe_comp_task_type = RXE_TASK_TYPE_TASKLET;
>> +int rxe_resp_task_type = RXE_TASK_TYPE_TASKLET;
> 
> As the tasklet version is to be eliminated in near future, why
> don't we make the workqueue version default now?
> 
> 
>> +
>> +module_param_named(req_task_type, rxe_req_task_type, int, 0664);
>> +module_param_named(comp_task_type, rxe_comp_task_type, int, 0664);
>> +module_param_named(resp_task_type, rxe_resp_task_type, int, 0664);
> 
> As I have commented to the 7th patch, users would not benefit from
> specifying the 'inline' type directly. It is OK to have the mode internally
> to keep the source code simple, but RXE_TASK_TYPE_INLINE should
> not be exposed to users.
> 
> I suggest that these module parameters be bool type and task types
> be like this:
> === rxe_task.h===
> enum rxe_task_type {
>         RXE_TASK_TYPE_WORKQUEUE = 0,
>         RXE_TASK_TYPE_TASKLET   = 1,
>         RXE_TASK_TYPE_INLINE    = 2,
> };
> =============
> In this case, while we can preserve the 'inline' type internally,
> we can prohibit users from specifying any value other than
> 'workqueue' or 'tasklet'; modprobe fails if non-boolean values
> are passed.

I don't know if you have noticed this but the tasks that handle incoming packets
already process them inline if the queues are empty which is most of the time.
The difference between this and inline always is not major. The NAPI thread is
already deferred once so we're not running at hw interrupt level.

What you say makes sense in a multi-user environment but that is not always the case.
HPC jobs typically have the node dedicated to a single user and it seems best to let
them figure out what works best. In any case it takes root to make this change.

Bob
> 
> If you still want to keep the parameters int type, then you need
> to add some code to perform value check. We can specify an 
> arbitrary int value with the current implementation.
> 
> Thanks,
> Daisuke
> 
>> +
>>  static struct workqueue_struct *rxe_wq;
>>
>>  int rxe_alloc_wq(void)
>> diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h
>> index d1156b935635..5a2ac7ada05b 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_task.h
>> +++ b/drivers/infiniband/sw/rxe/rxe_task.h
>> @@ -7,6 +7,10 @@
>>  #ifndef RXE_TASK_H
>>  #define RXE_TASK_H
>>
>> +extern int rxe_req_task_type;
>> +extern int rxe_comp_task_type;
>> +extern int rxe_resp_task_type;
>> +
>>  struct rxe_task;
>>
>>  struct rxe_task_ops {
>> --
>> 2.34.1
>
Jason Gunthorpe Oct. 25, 2022, 4:27 p.m. UTC | #5
On Tue, Oct 25, 2022 at 09:31:25AM -0500, Bob Pearson wrote:
> On 10/25/22 06:18, Jason Gunthorpe wrote:
> > On Tue, Oct 25, 2022 at 09:35:01AM +0000, Daisuke Matsuda (Fujitsu) wrote:
> >> On Sat, Oct 22, 2022 5:01 AM Bob Pearson:
> >>>
> >>> Add modparams to control the task types for req, comp, and resp
> >>> tasks.
> >>>
> >>> It is expected that the work queue version will take the place of
> >>> the tasklet version once this patch series is accepted and moved
> >>> upstream. However, for now it is convenient to keep the option of
> >>> easily switching between the versions to help benchmarking and
> >>> debugging.
> >>>
> >>> Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
> >>> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> >>> ---
> >>>  drivers/infiniband/sw/rxe/rxe_qp.c   | 6 +++---
> >>>  drivers/infiniband/sw/rxe/rxe_task.c | 8 ++++++++
> >>>  drivers/infiniband/sw/rxe/rxe_task.h | 4 ++++
> >>>  3 files changed, 15 insertions(+), 3 deletions(-)
> >>
> >> <...>
> >>
> >>> diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
> >>> index 9b8c9d28ee46..4568c4a05e85 100644
> >>> --- a/drivers/infiniband/sw/rxe/rxe_task.c
> >>> +++ b/drivers/infiniband/sw/rxe/rxe_task.c
> >>> @@ -6,6 +6,14 @@
> >>>
> >>>  #include "rxe.h"
> >>>
> >>> +int rxe_req_task_type = RXE_TASK_TYPE_TASKLET;
> >>> +int rxe_comp_task_type = RXE_TASK_TYPE_TASKLET;
> >>> +int rxe_resp_task_type = RXE_TASK_TYPE_TASKLET;
> >>
> >> As the tasklet version is to be eliminated in near future, why
> >> don't we make the workqueue version default now?
> > 
> > Why don't we just get rid of the tasklet version right away?
> > 
> > Jason
> 
> What time zone are you in? I thought you were out west.
> 
> There are several users out there who use rxe as a dev tool for other subsystems.
> I don't want to make a big change that they can't control. By letting them decide
> if and when to move that is avoided. I could back out the modparam and just let
> people recompile but I'd end up putting it back in for our use because it is easier.

As long as it is functionally the same I'm not worried about
this. Small performance variations are not going to effect dev work.

> No matter what we decide to do here, there is a question I have about how to
> surface tuning parameters to users. Not everyone can just recompile to make changes.
> What is the preferred way to do this?

I don't know that we have much in this area pre-existing. Most of the
devices do not seem to have tuning?

sysfs and rdma netlink are the usual answers, but I don't know about
exposing rxe specific tunables in netlink..

Jason
Daisuke Matsuda (Fujitsu) Oct. 26, 2022, 7:07 a.m. UTC | #6
On Tue, Oct 25, 2022 11:50 PM Bob Pearson wrote:
> On 10/25/22 04:35, Daisuke Matsuda (Fujitsu) wrote:
> > On Sat, Oct 22, 2022 5:01 AM Bob Pearson:
> >>

<...>

> >> +
> >> +module_param_named(req_task_type, rxe_req_task_type, int, 0664);
> >> +module_param_named(comp_task_type, rxe_comp_task_type, int, 0664);
> >> +module_param_named(resp_task_type, rxe_resp_task_type, int, 0664);
> >
> > As I have commented to the 7th patch, users would not benefit from
> > specifying the 'inline' type directly. It is OK to have the mode internally
> > to keep the source code simple, but RXE_TASK_TYPE_INLINE should
> > not be exposed to users.
> >
> > I suggest that these module parameters be bool type and task types
> > be like this:
> > === rxe_task.h===
> > enum rxe_task_type {
> >         RXE_TASK_TYPE_WORKQUEUE = 0,
> >         RXE_TASK_TYPE_TASKLET   = 1,
> >         RXE_TASK_TYPE_INLINE    = 2,
> > };
> > =============
> > In this case, while we can preserve the 'inline' type internally,
> > we can prohibit users from specifying any value other than
> > 'workqueue' or 'tasklet'; modprobe fails if non-boolean values
> > are passed.
> 
> I don't know if you have noticed this but the tasks that handle incoming packets
> already process them inline if the queues are empty which is most of the time.
> The difference between this and inline always is not major. The NAPI thread is
> already deferred once so we're not running at hw interrupt level.

I know rxe_resp_queue_pkt() and rxe_comp_queue_pkt() usually call the procedures
directly. That made me wonder why we need the additional type for users when
there are virtually no difference after all. I now understand how you use this mode for
HPC, but I am not sure we should add this new type for the very specific purpose.

> 
> What you say makes sense in a multi-user environment but that is not always the case.
> HPC jobs typically have the node dedicated to a single user and it seems best to let
> them figure out what works best. In any case it takes root to make this change.

This sounds correct, but it seems we cannot delve into this issue further until the
discussion of whether to eliminate the tasklets or not settles. If the tasklets will be
removed, then I think we should give up the inline type along with the module
parameters in order to keep the things simple, which is much more friendly to
ordinary end-users.

Thanks,
Daisuke

> 
> Bob
> >
> > If you still want to keep the parameters int type, then you need
> > to add some code to perform value check. We can specify an
> > arbitrary int value with the current implementation.
> >
> > Thanks,
> > Daisuke
> >
> >> +
> >>  static struct workqueue_struct *rxe_wq;
> >>
> >>  int rxe_alloc_wq(void)
> >> diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h
> >> index d1156b935635..5a2ac7ada05b 100644
> >> --- a/drivers/infiniband/sw/rxe/rxe_task.h
> >> +++ b/drivers/infiniband/sw/rxe/rxe_task.h
> >> @@ -7,6 +7,10 @@
> >>  #ifndef RXE_TASK_H
> >>  #define RXE_TASK_H
> >>
> >> +extern int rxe_req_task_type;
> >> +extern int rxe_comp_task_type;
> >> +extern int rxe_resp_task_type;
> >> +
> >>  struct rxe_task;
> >>
> >>  struct rxe_task_ops {
> >> --
> >> 2.34.1
> >
diff mbox series

Patch

diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
index 50f6b8b8ad9d..673d52271062 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -238,9 +238,9 @@  static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
 
 	skb_queue_head_init(&qp->req_pkts);
 
-	rxe_init_task(&qp->req.task, qp, rxe_requester, RXE_TASK_TYPE_TASKLET);
+	rxe_init_task(&qp->req.task, qp, rxe_requester, rxe_req_task_type);
 	rxe_init_task(&qp->comp.task, qp, rxe_completer,
-			(qp_type(qp) == IB_QPT_RC) ? RXE_TASK_TYPE_TASKLET :
+			(qp_type(qp) == IB_QPT_RC) ? rxe_comp_task_type :
 						     RXE_TASK_TYPE_INLINE);
 
 	qp->qp_timeout_jiffies = 0; /* Can't be set for UD/UC in modify_qp */
@@ -288,7 +288,7 @@  static int rxe_qp_init_resp(struct rxe_dev *rxe, struct rxe_qp *qp,
 
 	skb_queue_head_init(&qp->resp_pkts);
 
-	rxe_init_task(&qp->resp.task, qp, rxe_responder, RXE_TASK_TYPE_TASKLET);
+	rxe_init_task(&qp->resp.task, qp, rxe_responder, rxe_resp_task_type);
 
 	qp->resp.opcode		= OPCODE_NONE;
 	qp->resp.msn		= 0;
diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index 9b8c9d28ee46..4568c4a05e85 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -6,6 +6,14 @@ 
 
 #include "rxe.h"
 
+int rxe_req_task_type = RXE_TASK_TYPE_TASKLET;
+int rxe_comp_task_type = RXE_TASK_TYPE_TASKLET;
+int rxe_resp_task_type = RXE_TASK_TYPE_TASKLET;
+
+module_param_named(req_task_type, rxe_req_task_type, int, 0664);
+module_param_named(comp_task_type, rxe_comp_task_type, int, 0664);
+module_param_named(resp_task_type, rxe_resp_task_type, int, 0664);
+
 static struct workqueue_struct *rxe_wq;
 
 int rxe_alloc_wq(void)
diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h
index d1156b935635..5a2ac7ada05b 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.h
+++ b/drivers/infiniband/sw/rxe/rxe_task.h
@@ -7,6 +7,10 @@ 
 #ifndef RXE_TASK_H
 #define RXE_TASK_H
 
+extern int rxe_req_task_type;
+extern int rxe_comp_task_type;
+extern int rxe_resp_task_type;
+
 struct rxe_task;
 
 struct rxe_task_ops {