Message ID | 20240711095908.1604235-1-dan.aloni@vastdata.com (mailing list archive) |
---|---|
State | New |
Series | [1/2] rpcrdma: fix handling for RDMA_CM_EVENT_DISCONNECTED due to address change |
On 11/07/2024 12:59, Dan Aloni wrote:
> We observed a scenario in IB bonding where RDMA_CM_EVENT_ADDR_CHANGE is
> followed by RDMA_CM_EVENT_DISCONNECTED on a connected endpoint. This
> sequence causes a negative reference splat and subsequent tear-down
> issues due to a duplication in the disconnection path.
>
> This fix aligns with the approach taken in a previous change
> 4836da219781 ("rpcrdma: fix handling for RDMA_CM_EVENT_DEVICE_REMOVAL"),
> addressing a similar issue.

I think a code comment will help here. This whole handler is not very
intuitive (but that may be a result of the rdma_cm state machine; the
picture in other ULPs does not look materially different).
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 432557a553e7..e42f5664ecaf 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -273,7 +273,8 @@ rpcrdma_cm_event_handler(struct rdma_cm_id *id, struct rdma_cm_event *event)
 		wake_up_all(&ep->re_connect_wait);
 		return 0;
 	case RDMA_CM_EVENT_DISCONNECTED:
-		ep->re_connect_status = -ECONNABORTED;
+		if (xchg(&ep->re_connect_status, -ECONNABORTED) != 1)
+			break;
 disconnected:
 		rpcrdma_force_disconnect(ep);
 		return rpcrdma_ep_put(ep);
We observed a scenario in IB bonding where RDMA_CM_EVENT_ADDR_CHANGE is
followed by RDMA_CM_EVENT_DISCONNECTED on a connected endpoint. This
sequence causes a negative reference splat and subsequent tear-down
issues due to a duplication in the disconnection path.

This fix aligns with the approach taken in a previous change
4836da219781 ("rpcrdma: fix handling for RDMA_CM_EVENT_DEVICE_REMOVAL"),
addressing a similar issue.

Signed-off-by: Dan Aloni <dan.aloni@vastdata.com>
---
 net/sunrpc/xprtrdma/verbs.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
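
For readers who find the handler hard to follow, here is a minimal userspace
sketch of the guard this patch adds. It is not kernel code: struct toy_ep and
every toy_* function are hypothetical stand-ins, and C11 atomic_exchange()
stands in for the kernel's xchg(). Reading status 1 as "connected" is inferred
from the "!= 1" check in the patch. The modelled ADDR_CHANGE behaviour (store
an error status, then tear down and drop a reference) is an assumption made
for illustration, not quoted from verbs.c.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the endpoint; only what the sketch needs. */
struct toy_ep {
	atomic_int connect_status;	/* assumed: 1 = connected, < 0 = -errno */
	int refcount;			/* plain int: the sketch is single-threaded */
	int teardowns;			/* how many times the teardown path ran */
};

/* Models the shared "disconnected:" tail: tear down, drop one reference. */
void toy_teardown(struct toy_ep *ep)
{
	ep->teardowns++;
	ep->refcount--;
}

/* Assumed model of ADDR_CHANGE: record an error status, then tear down. */
void toy_addr_change(struct toy_ep *ep)
{
	atomic_store(&ep->connect_status, -ENODEV);
	toy_teardown(ep);
}

/* DISCONNECTED, without (guarded == 0) and with (guarded != 0) the check. */
void toy_disconnected(struct toy_ep *ep, int guarded)
{
	/* atomic_exchange() returns the previous value, as xchg() does. */
	int old = atomic_exchange(&ep->connect_status, -ECONNABORTED);

	if (guarded && old != 1)
		return;		/* endpoint was not connected: nothing to undo */
	toy_teardown(ep);
}

/* Replay ADDR_CHANGE followed by DISCONNECTED; return the final refcount. */
int toy_replay(int guarded)
{
	struct toy_ep ep;

	atomic_init(&ep.connect_status, 1);	/* start out connected */
	ep.refcount = 1;			/* the connection's reference */
	ep.teardowns = 0;

	toy_addr_change(&ep);
	toy_disconnected(&ep, guarded);
	return ep.refcount;
}
```

In this model, toy_replay(0) drops the single reference twice and ends with a
negative count, which mirrors the splat described above, while toy_replay(1)
leaves the count at zero because the second event no longer sees status 1. A
DISCONNECTED event that arrives while the status is still 1 takes the teardown
path exactly as before.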