Message ID | 20240711095908.1604235-2-dan.aloni@vastdata.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [1/2] rpcrdma: fix handling for RDMA_CM_EVENT_DISCONNECTED due to address change | expand |
On 11/07/2024 12:59, Dan Aloni wrote: > It would be beneficial to implement handling similar to > RDMA_CM_EVENT_DEVICE_REMOVAL in the case where RDMA_CM_EVENT_ADDR_CHANGE > is issued for an unconnected CM. I think Chuck has a pending series for handling device removals with a dedicated ib_client. So one would have to rebase against the other... See: https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.gittopic-device-removal Other than that, looks good, Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
On 2024-07-11 13:11:11, Sagi Grimberg wrote: > > > On 11/07/2024 12:59, Dan Aloni wrote: > > It would be beneficial to implement handling similar to > > RDMA_CM_EVENT_DEVICE_REMOVAL in the case where RDMA_CM_EVENT_ADDR_CHANGE > > is issued for an unconnected CM. > > I think Chuck has a pending series for handling device removals with a > dedicated > ib_client. So one would have to rebase against the other... > > See: https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.gittopic-device-removal > > Other than that, looks good, > Reviewed-by: Sagi Grimberg <sagi@grimberg.me> I ran some testing with `device-removal` branch (merge fe4cef916d91 v6.10-rc7), and found that the `RDMA_CM_EVENT_ADDR_CHANGE` fix is still required and independent of the branch's content. And for the second patch I sent, which fixes only a theoretical issue, I think we need the `case RDMA_CM_EVENT_ADDR_CHANGE:` branch to implement the same logic like the removed `case RDMA_CM_EVENT_DEVICE_REMOVAL:` branch.
On 11/07/2024 15:26, Dan Aloni wrote: > On 2024-07-11 13:11:11, Sagi Grimberg wrote: >> >> On 11/07/2024 12:59, Dan Aloni wrote: >>> It would be beneficial to implement handling similar to >>> RDMA_CM_EVENT_DEVICE_REMOVAL in the case where RDMA_CM_EVENT_ADDR_CHANGE >>> is issued for an unconnected CM. >> I think Chuck has a pending series for handling device removals with a >> dedicated >> ib_client. So one would have to rebase against the other... >> >> See: https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.gittopic-device-removal >> >> Other than that, looks good, >> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> > I ran some testing with `device-removal` branch (merge fe4cef916d91 > v6.10-rc7), and found that the `RDMA_CM_EVENT_ADDR_CHANGE` fix is still > required and independent of the branch's content. Yes it isn't related, was just mentioning that Chuck is touching this exact area in the code, so perhaps these patches should be against this branch (if indeed it is heading upstream)? Otherwise Chuck can rebase on top of these fixes, which I agree are needed.
> On Jul 11, 2024, at 6:11 AM, Sagi Grimberg <sagi@grimberg.me> wrote: > > On 11/07/2024 12:59, Dan Aloni wrote: >> It would be beneficial to implement handling similar to >> RDMA_CM_EVENT_DEVICE_REMOVAL in the case where RDMA_CM_EVENT_ADDR_CHANGE >> is issued for an unconnected CM. > > I think Chuck has a pending series for handling device removals with a dedicated > ib_client. So one would have to rebase against the other... > > See: https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.gittopic-device-removal > > Other than that, looks good, > Reviewed-by: Sagi Grimberg <sagi@grimberg.me> FYI Anna has taken my "rpcrdma ib_client" series: http://git.linux-nfs.org/?p=anna/linux-nfs.git;a=shortlog;h=refs/heads/linux-next PATCH 1/2 "rpcrdma: fix handling for RDMA_CM_EVENT_DISCONNECTED due to address change" is a fix that you probably want to make available to the LTS kernels, so it would have to be applied before my patches in order to make that possible. I don't know what Anna's policy is regarding rearranging the patches in her linux-next branch. Quite possibly she'll just have to rip out my patches so we can give her a fresh series with your work and mine properly ordered. -- Chuck Lever
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c index e42f5664ecaf..0342a0527733 100644 --- a/net/sunrpc/xprtrdma/verbs.c +++ b/net/sunrpc/xprtrdma/verbs.c @@ -244,14 +244,13 @@ rpcrdma_cm_event_handler(struct rdma_cm_id *id, struct rdma_cm_event *event) case RDMA_CM_EVENT_DEVICE_REMOVAL: pr_info("rpcrdma: removing device %s for %pISpc\n", ep->re_id->device->name, sap); + fallthrough; + case RDMA_CM_EVENT_ADDR_CHANGE: switch (xchg(&ep->re_connect_status, -ENODEV)) { case 0: goto wake_connect_worker; case 1: goto disconnected; } return 0; - case RDMA_CM_EVENT_ADDR_CHANGE: - ep->re_connect_status = -ENODEV; - goto disconnected; case RDMA_CM_EVENT_ESTABLISHED: rpcrdma_ep_get(ep); ep->re_connect_status = 1;
It would be beneficial to implement handling similar to RDMA_CM_EVENT_DEVICE_REMOVAL in the case where RDMA_CM_EVENT_ADDR_CHANGE is issued for an unconnected CM. Signed-off-by: Dan Aloni <dan.aloni@vastdata.com> --- net/sunrpc/xprtrdma/verbs.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)