Message ID | 20221021200118.2163-19-rpearsonhpe@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Jason Gunthorpe |
Headers | show |
Series | Implement work queues for rdma_rxe | expand |
On Sat, Oct 22, 2022 5:01 AM Bob Pearson: > > Add modparams to control the task types for req, comp, and resp > tasks. > > It is expected that the work queue version will take the place of > the tasklet version once this patch series is accepted and moved > upstream. However, for now it is convenient to keep the option of > easily switching between the versions to help benchmarking and > debugging. > > Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com> > Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> > --- > drivers/infiniband/sw/rxe/rxe_qp.c | 6 +++--- > drivers/infiniband/sw/rxe/rxe_task.c | 8 ++++++++ > drivers/infiniband/sw/rxe/rxe_task.h | 4 ++++ > 3 files changed, 15 insertions(+), 3 deletions(-) <...> > diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c > index 9b8c9d28ee46..4568c4a05e85 100644 > --- a/drivers/infiniband/sw/rxe/rxe_task.c > +++ b/drivers/infiniband/sw/rxe/rxe_task.c > @@ -6,6 +6,14 @@ > > #include "rxe.h" > > +int rxe_req_task_type = RXE_TASK_TYPE_TASKLET; > +int rxe_comp_task_type = RXE_TASK_TYPE_TASKLET; > +int rxe_resp_task_type = RXE_TASK_TYPE_TASKLET; As the tasklet version is to be eliminated in near future, why don't we make the workqueue version default now? > + > +module_param_named(req_task_type, rxe_req_task_type, int, 0664); > +module_param_named(comp_task_type, rxe_comp_task_type, int, 0664); > +module_param_named(resp_task_type, rxe_resp_task_type, int, 0664); As I have commented to the 7th patch, users would not benefit from specifying the 'inline' type directly. It is OK to have the mode internally to keep the source code simple, but RXE_TASK_TYPE_INLINE should not be exposed to users. I suggest that these module parameters be bool type and task types be like this: === rxe_task.h=== enum rxe_task_type { RXE_TASK_TYPE_WORKQUEUE = 0, RXE_TASK_TYPE_TASKLET = 1, RXE_TASK_TYPE_INLINE = 2, }; ============= In this case, while we can preserve the 'inline' type internally, we can prohibit users from specifying any value other than 'workqueue' or 'tasklet'; modprobe fails if non-boolean values are passed. If you still want to keep the parameters int type, then you need to add some code to perform value check. We can specify an arbitrary int value with the current implementation. Thanks, Daisuke > + > static struct workqueue_struct *rxe_wq; > > int rxe_alloc_wq(void) > diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h > index d1156b935635..5a2ac7ada05b 100644 > --- a/drivers/infiniband/sw/rxe/rxe_task.h > +++ b/drivers/infiniband/sw/rxe/rxe_task.h > @@ -7,6 +7,10 @@ > #ifndef RXE_TASK_H > #define RXE_TASK_H > > +extern int rxe_req_task_type; > +extern int rxe_comp_task_type; > +extern int rxe_resp_task_type; > + > struct rxe_task; > > struct rxe_task_ops { > -- > 2.34.1
On Tue, Oct 25, 2022 at 09:35:01AM +0000, Daisuke Matsuda (Fujitsu) wrote: > On Sat, Oct 22, 2022 5:01 AM Bob Pearson: > > > > Add modparams to control the task types for req, comp, and resp > > tasks. > > > > It is expected that the work queue version will take the place of > > the tasklet version once this patch series is accepted and moved > > upstream. However, for now it is convenient to keep the option of > > easily switching between the versions to help benchmarking and > > debugging. > > > > Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com> > > Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> > > --- > > drivers/infiniband/sw/rxe/rxe_qp.c | 6 +++--- > > drivers/infiniband/sw/rxe/rxe_task.c | 8 ++++++++ > > drivers/infiniband/sw/rxe/rxe_task.h | 4 ++++ > > 3 files changed, 15 insertions(+), 3 deletions(-) > > <...> > > > diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c > > index 9b8c9d28ee46..4568c4a05e85 100644 > > --- a/drivers/infiniband/sw/rxe/rxe_task.c > > +++ b/drivers/infiniband/sw/rxe/rxe_task.c > > @@ -6,6 +6,14 @@ > > > > #include "rxe.h" > > > > +int rxe_req_task_type = RXE_TASK_TYPE_TASKLET; > > +int rxe_comp_task_type = RXE_TASK_TYPE_TASKLET; > > +int rxe_resp_task_type = RXE_TASK_TYPE_TASKLET; > > As the tasklet version is to be eliminated in near future, why > don't we make the workqueue version default now? Why don't we just get rid of the tasklet version right away? Jason
On 10/25/22 06:18, Jason Gunthorpe wrote: > On Tue, Oct 25, 2022 at 09:35:01AM +0000, Daisuke Matsuda (Fujitsu) wrote: >> On Sat, Oct 22, 2022 5:01 AM Bob Pearson: >>> >>> Add modparams to control the task types for req, comp, and resp >>> tasks. >>> >>> It is expected that the work queue version will take the place of >>> the tasklet version once this patch series is accepted and moved >>> upstream. However, for now it is convenient to keep the option of >>> easily switching between the versions to help benchmarking and >>> debugging. >>> >>> Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com> >>> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> >>> --- >>> drivers/infiniband/sw/rxe/rxe_qp.c | 6 +++--- >>> drivers/infiniband/sw/rxe/rxe_task.c | 8 ++++++++ >>> drivers/infiniband/sw/rxe/rxe_task.h | 4 ++++ >>> 3 files changed, 15 insertions(+), 3 deletions(-) >> >> <...> >> >>> diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c >>> index 9b8c9d28ee46..4568c4a05e85 100644 >>> --- a/drivers/infiniband/sw/rxe/rxe_task.c >>> +++ b/drivers/infiniband/sw/rxe/rxe_task.c >>> @@ -6,6 +6,14 @@ >>> >>> #include "rxe.h" >>> >>> +int rxe_req_task_type = RXE_TASK_TYPE_TASKLET; >>> +int rxe_comp_task_type = RXE_TASK_TYPE_TASKLET; >>> +int rxe_resp_task_type = RXE_TASK_TYPE_TASKLET; >> >> As the tasklet version is to be eliminated in near future, why >> don't we make the workqueue version default now? > > Why don't we just get rid of the tasklet version right away? > > Jason What time zone are you in? I thought you were out west. There are several users out there who use rxe as a dev tool for other subsystems. I don't want to make a big change that they can't control. By letting them decide if and when to move that is avoided. I could back out the modparam and just let people recompile but I'd end up putting it back in for our use because it is easier. No matter what we decide to do here, there is a question I have about how to surface tuning parameters to users. Not everyone can just recompile to make changes. What is the preferred way to do this? Bob
On 10/25/22 04:35, Daisuke Matsuda (Fujitsu) wrote: > On Sat, Oct 22, 2022 5:01 AM Bob Pearson: >> >> Add modparams to control the task types for req, comp, and resp >> tasks. >> >> It is expected that the work queue version will take the place of >> the tasklet version once this patch series is accepted and moved >> upstream. However, for now it is convenient to keep the option of >> easily switching between the versions to help benchmarking and >> debugging. >> >> Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com> >> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> >> --- >> drivers/infiniband/sw/rxe/rxe_qp.c | 6 +++--- >> drivers/infiniband/sw/rxe/rxe_task.c | 8 ++++++++ >> drivers/infiniband/sw/rxe/rxe_task.h | 4 ++++ >> 3 files changed, 15 insertions(+), 3 deletions(-) > > <...> > >> diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c >> index 9b8c9d28ee46..4568c4a05e85 100644 >> --- a/drivers/infiniband/sw/rxe/rxe_task.c >> +++ b/drivers/infiniband/sw/rxe/rxe_task.c >> @@ -6,6 +6,14 @@ >> >> #include "rxe.h" >> >> +int rxe_req_task_type = RXE_TASK_TYPE_TASKLET; >> +int rxe_comp_task_type = RXE_TASK_TYPE_TASKLET; >> +int rxe_resp_task_type = RXE_TASK_TYPE_TASKLET; > > As the tasklet version is to be eliminated in near future, why > don't we make the workqueue version default now? > > >> + >> +module_param_named(req_task_type, rxe_req_task_type, int, 0664); >> +module_param_named(comp_task_type, rxe_comp_task_type, int, 0664); >> +module_param_named(resp_task_type, rxe_resp_task_type, int, 0664); > > As I have commented to the 7th patch, users would not benefit from > specifying the 'inline' type directly. It is OK to have the mode internally > to keep the source code simple, but RXE_TASK_TYPE_INLINE should > not be exposed to users. > > I suggest that these module parameters be bool type and task types > be like this: > === rxe_task.h=== > enum rxe_task_type { > RXE_TASK_TYPE_WORKQUEUE = 0, > RXE_TASK_TYPE_TASKLET = 1, > RXE_TASK_TYPE_INLINE = 2, > }; > ============= > In this case, while we can preserve the 'inline' type internally, > we can prohibit users from specifying any value other than > 'workqueue' or 'tasklet'; modprobe fails if non-boolean values > are passed. I don't know if you have noticed this but the tasks that handle incoming packets already process them inline if the queues are empty which is most of the time. The difference between this and inline always is not major. The NAPI thread is already deferred once so we're not running at hw interrupt level. What you say makes sense in a multi-user environment but that is not always the case. HPC jobs typically have the node dedicated to a single user and it seems best to let them figure out what works best. In any case it takes root to make this change. Bob > > If you still want to keep the parameters int type, then you need > to add some code to perform value check. We can specify an > arbitrary int value with the current implementation. > > Thanks, > Daisuke > >> + >> static struct workqueue_struct *rxe_wq; >> >> int rxe_alloc_wq(void) >> diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h >> index d1156b935635..5a2ac7ada05b 100644 >> --- a/drivers/infiniband/sw/rxe/rxe_task.h >> +++ b/drivers/infiniband/sw/rxe/rxe_task.h >> @@ -7,6 +7,10 @@ >> #ifndef RXE_TASK_H >> #define RXE_TASK_H >> >> +extern int rxe_req_task_type; >> +extern int rxe_comp_task_type; >> +extern int rxe_resp_task_type; >> + >> struct rxe_task; >> >> struct rxe_task_ops { >> -- >> 2.34.1 >
On Tue, Oct 25, 2022 at 09:31:25AM -0500, Bob Pearson wrote: > On 10/25/22 06:18, Jason Gunthorpe wrote: > > On Tue, Oct 25, 2022 at 09:35:01AM +0000, Daisuke Matsuda (Fujitsu) wrote: > >> On Sat, Oct 22, 2022 5:01 AM Bob Pearson: > >>> > >>> Add modparams to control the task types for req, comp, and resp > >>> tasks. > >>> > >>> It is expected that the work queue version will take the place of > >>> the tasklet version once this patch series is accepted and moved > >>> upstream. However, for now it is convenient to keep the option of > >>> easily switching between the versions to help benchmarking and > >>> debugging. > >>> > >>> Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com> > >>> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> > >>> --- > >>> drivers/infiniband/sw/rxe/rxe_qp.c | 6 +++--- > >>> drivers/infiniband/sw/rxe/rxe_task.c | 8 ++++++++ > >>> drivers/infiniband/sw/rxe/rxe_task.h | 4 ++++ > >>> 3 files changed, 15 insertions(+), 3 deletions(-) > >> > >> <...> > >> > >>> diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c > >>> index 9b8c9d28ee46..4568c4a05e85 100644 > >>> --- a/drivers/infiniband/sw/rxe/rxe_task.c > >>> +++ b/drivers/infiniband/sw/rxe/rxe_task.c > >>> @@ -6,6 +6,14 @@ > >>> > >>> #include "rxe.h" > >>> > >>> +int rxe_req_task_type = RXE_TASK_TYPE_TASKLET; > >>> +int rxe_comp_task_type = RXE_TASK_TYPE_TASKLET; > >>> +int rxe_resp_task_type = RXE_TASK_TYPE_TASKLET; > >> > >> As the tasklet version is to be eliminated in near future, why > >> don't we make the workqueue version default now? > > > > Why don't we just get rid of the tasklet version right away? > > > > Jason > > What time zone are you in? I thought you were out west. > > There are several users out there who use rxe as a dev tool for other subsystems. > I don't want to make a big change that they can't control. By letting them decide > if and when to move that is avoided. I could back out the modparam and just let > people recompile but I'd end up putting it back in for our use because it is easier. As long as it is functionally the same I'm not worried about this. Small performance variations are not going to effect dev work. > No matter what we decide to do here, there is a question I have about how to > surface tuning parameters to users. Not everyone can just recompile to make changes. > What is the preferred way to do this? I don't know that we have much in this area pre-existing. Most of the devices do not seem to have tuning? sysfs and rdma netlink are the usual answers, but I don't know about exposing rxe specific tunables in netlink.. Jason
On Tue, Oct 25, 2022 11:50 PM Bob Pearson wrote: > On 10/25/22 04:35, Daisuke Matsuda (Fujitsu) wrote: > > On Sat, Oct 22, 2022 5:01 AM Bob Pearson: > >> <...> > >> + > >> +module_param_named(req_task_type, rxe_req_task_type, int, 0664); > >> +module_param_named(comp_task_type, rxe_comp_task_type, int, 0664); > >> +module_param_named(resp_task_type, rxe_resp_task_type, int, 0664); > > > > As I have commented to the 7th patch, users would not benefit from > > specifying the 'inline' type directly. It is OK to have the mode internally > > to keep the source code simple, but RXE_TASK_TYPE_INLINE should > > not be exposed to users. > > > > I suggest that these module parameters be bool type and task types > > be like this: > > === rxe_task.h=== > > enum rxe_task_type { > > RXE_TASK_TYPE_WORKQUEUE = 0, > > RXE_TASK_TYPE_TASKLET = 1, > > RXE_TASK_TYPE_INLINE = 2, > > }; > > ============= > > In this case, while we can preserve the 'inline' type internally, > > we can prohibit users from specifying any value other than > > 'workqueue' or 'tasklet'; modprobe fails if non-boolean values > > are passed. > > I don't know if you have noticed this but the tasks that handle incoming packets > already process them inline if the queues are empty which is most of the time. > The difference between this and inline always is not major. The NAPI thread is > already deferred once so we're not running at hw interrupt level. I know rxe_resp_queue_pkt() and rxe_comp_queue_pkt() usually call the procedures directly. That made me wonder why we need the additional type for users when there are virtually no difference after all. I now understand how you use this mode for HPC, but I am not sure we should add this new type for the very specific purpose. > > What you say makes sense in a multi-user environment but that is not always the case. > HPC jobs typically have the node dedicated to a single user and it seems best to let > them figure out what works best. In any case it takes root to make this change. This sounds correct, but it seems we cannot delve into this issue further until the discussion of whether to eliminate the tasklets or not settles. If the tasklets will be removed, then I think we should give up the inline type along with the module parameters in order to keep the things simple, which is much more friendly to ordinary end-users. Thanks, Daisuke > > Bob > > > > If you still want to keep the parameters int type, then you need > > to add some code to perform value check. We can specify an > > arbitrary int value with the current implementation. > > > > Thanks, > > Daisuke > > > >> + > >> static struct workqueue_struct *rxe_wq; > >> > >> int rxe_alloc_wq(void) > >> diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h > >> index d1156b935635..5a2ac7ada05b 100644 > >> --- a/drivers/infiniband/sw/rxe/rxe_task.h > >> +++ b/drivers/infiniband/sw/rxe/rxe_task.h > >> @@ -7,6 +7,10 @@ > >> #ifndef RXE_TASK_H > >> #define RXE_TASK_H > >> > >> +extern int rxe_req_task_type; > >> +extern int rxe_comp_task_type; > >> +extern int rxe_resp_task_type; > >> + > >> struct rxe_task; > >> > >> struct rxe_task_ops { > >> -- > >> 2.34.1 > >
diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c index 50f6b8b8ad9d..673d52271062 100644 --- a/drivers/infiniband/sw/rxe/rxe_qp.c +++ b/drivers/infiniband/sw/rxe/rxe_qp.c @@ -238,9 +238,9 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp, skb_queue_head_init(&qp->req_pkts); - rxe_init_task(&qp->req.task, qp, rxe_requester, RXE_TASK_TYPE_TASKLET); + rxe_init_task(&qp->req.task, qp, rxe_requester, rxe_req_task_type); rxe_init_task(&qp->comp.task, qp, rxe_completer, - (qp_type(qp) == IB_QPT_RC) ? RXE_TASK_TYPE_TASKLET : + (qp_type(qp) == IB_QPT_RC) ? rxe_comp_task_type : RXE_TASK_TYPE_INLINE); qp->qp_timeout_jiffies = 0; /* Can't be set for UD/UC in modify_qp */ @@ -288,7 +288,7 @@ static int rxe_qp_init_resp(struct rxe_dev *rxe, struct rxe_qp *qp, skb_queue_head_init(&qp->resp_pkts); - rxe_init_task(&qp->resp.task, qp, rxe_responder, RXE_TASK_TYPE_TASKLET); + rxe_init_task(&qp->resp.task, qp, rxe_responder, rxe_resp_task_type); qp->resp.opcode = OPCODE_NONE; qp->resp.msn = 0; diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c index 9b8c9d28ee46..4568c4a05e85 100644 --- a/drivers/infiniband/sw/rxe/rxe_task.c +++ b/drivers/infiniband/sw/rxe/rxe_task.c @@ -6,6 +6,14 @@ #include "rxe.h" +int rxe_req_task_type = RXE_TASK_TYPE_TASKLET; +int rxe_comp_task_type = RXE_TASK_TYPE_TASKLET; +int rxe_resp_task_type = RXE_TASK_TYPE_TASKLET; + +module_param_named(req_task_type, rxe_req_task_type, int, 0664); +module_param_named(comp_task_type, rxe_comp_task_type, int, 0664); +module_param_named(resp_task_type, rxe_resp_task_type, int, 0664); + static struct workqueue_struct *rxe_wq; int rxe_alloc_wq(void) diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h index d1156b935635..5a2ac7ada05b 100644 --- a/drivers/infiniband/sw/rxe/rxe_task.h +++ b/drivers/infiniband/sw/rxe/rxe_task.h @@ -7,6 +7,10 @@ #ifndef RXE_TASK_H #define RXE_TASK_H +extern int rxe_req_task_type; +extern int rxe_comp_task_type; +extern int rxe_resp_task_type; + struct rxe_task; struct rxe_task_ops {