[1/2] rpcrdma: fix handling for RDMA_CM_EVENT_DISCONNECTED due to address change

Message ID 20240711095908.1604235-1-dan.aloni@vastdata.com (mailing list archive)
State New

Commit Message

Dan Aloni July 11, 2024, 9:59 a.m. UTC
We observed a scenario in IB bonding where RDMA_CM_EVENT_ADDR_CHANGE is
followed by RDMA_CM_EVENT_DISCONNECTED on a connected endpoint. This
sequence causes a negative reference splat and subsequent tear-down
issues due to a duplication in the disconnection path.

This fix aligns with the approach taken in a previous change
4836da219781 ("rpcrdma: fix handling for RDMA_CM_EVENT_DEVICE_REMOVAL"),
addressing a similar issue.

Signed-off-by: Dan Aloni <dan.aloni@vastdata.com>
---
 net/sunrpc/xprtrdma/verbs.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Sagi Grimberg July 11, 2024, 10:08 a.m. UTC | #1
On 11/07/2024 12:59, Dan Aloni wrote:
> We observed a scenario in IB bonding where RDMA_CM_EVENT_ADDR_CHANGE is
> followed by RDMA_CM_EVENT_DISCONNECTED on a connected endpoint. This
> sequence causes a negative reference splat and subsequent tear-down
> issues due to a duplication in the disconnection path.
>
> This fix aligns with the approach taken in a previous change
> 4836da219781 ("rpcrdma: fix handling for RDMA_CM_EVENT_DEVICE_REMOVAL"),
> addressing a similar issue.

I think a code comment will help here. This whole handler is not very intuitive
(but that may be a result of the rdma_cm state machine; the picture in other
ULPs does not look materially different).

Patch

diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 432557a553e7..e42f5664ecaf 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -273,7 +273,8 @@  rpcrdma_cm_event_handler(struct rdma_cm_id *id, struct rdma_cm_event *event)
 		wake_up_all(&ep->re_connect_wait);
 		return 0;
 	case RDMA_CM_EVENT_DISCONNECTED:
-		ep->re_connect_status = -ECONNABORTED;
+		if (xchg(&ep->re_connect_status, -ECONNABORTED) != 1)
+			break;
 disconnected:
 		rpcrdma_force_disconnect(ep);
 		return rpcrdma_ep_put(ep);
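
The guard in the hunk above can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct model_ep` and the `model_*` helpers are hypothetical names, C11 `atomic_exchange()` stands in for the kernel's `xchg()`, and it assumes that a status of 1 means "established" (as the `!= 1` check suggests) and that the address-change path also reaches the `disconnected:` label and drops a reference.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the endpoint; not the kernel struct. */
struct model_ep {
	atomic_int connect_status;	/* assumed: 1 while established, else -errno */
	int refs;			/* models the reference held by the connection */
};

static void model_ep_init(struct model_ep *ep)
{
	atomic_init(&ep->connect_status, 1);
	ep->refs = 1;
}

/* Assumed ADDR_CHANGE behaviour: record the error, then drop the reference. */
static void model_addr_change(struct model_ep *ep)
{
	atomic_store(&ep->connect_status, -ENODEV);
	ep->refs--;
}

/* Pre-patch DISCONNECTED: unconditional store, so a second put can happen. */
static void model_disconnected_old(struct model_ep *ep)
{
	atomic_store(&ep->connect_status, -ECONNABORTED);
	ep->refs--;
}

/*
 * Patched DISCONNECTED: swap in the new status and proceed only if the
 * endpoint was still marked established. Returns true if the reference
 * was dropped by this call.
 */
static bool model_disconnected_new(struct model_ep *ep)
{
	if (atomic_exchange(&ep->connect_status, -ECONNABORTED) != 1)
		return false;	/* an earlier event already tore the endpoint down */
	ep->refs--;
	return true;
}
```

In this model, `model_addr_change()` followed by `model_disconnected_old()` leaves `refs` at -1, which corresponds to the negative reference splat described in the commit message. With `model_disconnected_new()` the second event is a no-op and `refs` stays at 0. Note that the exchange still overwrites the stored status even when the guard declines to disconnect.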