IB/ipoib: fix race when handling IPOIB_CM_RX_DRAIN_WRID
From: Ralph Campbell <ralph.campbell@qlogic.com>
ipoib_cm_start_rx_drain() calls ib_post_send() and *then* moves the
struct ipoib_cm_rx onto the rx_drain_list. The ib_post_send() triggers
a completion callback to ipoib_cm_handle_rx_wc(), which tries to move
the contents of rx_drain_list to rx_reap_list. If the callback runs
before ipoib_cm_start_rx_drain() has moved the struct, the struct is
not yet on rx_drain_list, so the handler does not move it and it is
left in limbo. The fix is to change ipoib_cm_start_rx_drain() to put
the struct on the rx_drain_list first and then call ib_post_send().
Also, only move one struct from rx_flush_list to rx_drain_list since
concurrent IPOIB_CM_RX_DRAIN_WRID events on different QPs could put
multiple ipoib_cm_rx structs on rx_flush_list.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
---
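Note, not part of the commit message: the ordering problem described
above can be sketched in user space. This is only a simplified model,
assuming the worst case in which the completion callback runs before
the post call returns. The list helpers and the names start_drain(),
post_drain_wr(), on_drain_completion() and entry_is_reaped() are
hypothetical stand-ins, not the driver's code, and the model ignores
locking and everything else the real handler does.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static bool list_is_empty(const struct list_head *h) { return h->next == h; }

/* Unlink an entry from whatever list it is on and push it onto h
 * (same effect as the kernel's list_move()). */
static void list_move_entry(struct list_head *e, struct list_head *h)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = h->next;
	e->prev = h;
	h->next->prev = e;
	h->next = e;
}

/* Move every entry of 'from' onto 'to', leaving 'from' empty. Same end
 * state as list_splice_init(), although entry order is not preserved. */
static void list_move_all(struct list_head *from, struct list_head *to)
{
	while (!list_is_empty(from))
		list_move_entry(from->next, to);
}

/* Hypothetical model of the three lists named in the patch. */
struct model { struct list_head flush, drain, reap; };

/* Stand-in for the drain completion handler: whatever is on the drain
 * list at this moment goes to the reap list. */
static void on_drain_completion(struct model *m)
{
	list_move_all(&m->drain, &m->reap);
}

/* Stand-in for ib_post_send() in the assumed worst case: the completion
 * callback fires before the post call returns. */
static void post_drain_wr(struct model *m)
{
	on_drain_completion(m);
}

static void start_drain(struct model *m, bool fixed_order)
{
	struct list_head *p;

	if (list_is_empty(&m->flush) || !list_is_empty(&m->drain))
		return;

	p = m->flush.next;
	if (fixed_order) {
		/* Patched order: queue one entry first, then post. */
		list_move_entry(p, &m->drain);
		post_drain_wr(m);
	} else {
		/* Original order: post first, then move the whole flush list. */
		post_drain_wr(m);
		list_move_all(&m->flush, &m->drain);
	}
}

/* Returns true if the single test entry ends up on the reap list. */
bool entry_is_reaped(bool fixed_order)
{
	struct model m;
	struct list_head entry;

	list_init(&m.flush);
	list_init(&m.drain);
	list_init(&m.reap);
	list_init(&entry);
	list_move_entry(&entry, &m.flush);

	start_drain(&m, fixed_order);

	assert(list_is_empty(&m.flush));	/* entry left the flush list either way */
	return m.reap.next == &entry;
}
```

In this model entry_is_reaped(false) returns false, because the entry
is stranded on the drain list after the completion has already run,
while entry_is_reaped(true) returns true.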
drivers/infiniband/ulp/ipoib/ipoib_cm.c | 12 +++++++++---
1 files changed, 9 insertions(+), 3 deletions(-)
@@ -216,15 +216,21 @@ static void ipoib_cm_start_rx_drain(struct ipoib_dev_priv *priv)
 	    !list_empty(&priv->cm.rx_drain_list))
 		return;
+	p = list_entry(priv->cm.rx_flush_list.next, typeof(*p), list);
+
+	/*
+	 * Put p on rx_drain_list before calling ib_post_send() or there
+	 * is a race with the ipoib_cm_handle_rx_wc() completion handler
+	 * trying to remove it from rx_drain_list.
+	 */
+	list_move(&p->list, &priv->cm.rx_drain_list);
+
 	/*
 	 * QPs on flush list are error state. This way, a "flush
 	 * error" WC will be immediately generated for each WR we post.
 	 */
-	p = list_entry(priv->cm.rx_flush_list.next, typeof(*p), list);
 	if (ib_post_send(p->qp, &ipoib_cm_rx_drain_wr, &bad_wr))
 		ipoib_warn(priv, "failed to post drain wr\n");
-
-	list_splice_init(&priv->cm.rx_flush_list, &priv->cm.rx_drain_list);
 }
static void ipoib_cm_rx_event_handler(struct ib_event *event, void *ctx)