From patchwork Fri Oct 1 00:35:48 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ralph Campbell
X-Patchwork-Id: 222042
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by demeter1.kernel.org (8.14.4/8.14.3) with ESMTP id o910ZpDr005016
	for ; Fri, 1 Oct 2010 00:35:53 GMT
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753264Ab0JAAfu (ORCPT ); Thu, 30 Sep 2010 20:35:50 -0400
Received: from avexcashub1.qlogic.com ([198.70.193.61]:8839 "EHLO
	avexcashub1.qlogic.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751784Ab0JAAfu (ORCPT );
	Thu, 30 Sep 2010 20:35:50 -0400
Received: from avexcashub2.qlogic.org (10.1.4.116) by
	avexcashub1.qlogic.org (10.1.4.161) with Microsoft SMTP Server (TLS)
	id 8.1.436.0; Thu, 30 Sep 2010 17:35:49 -0700
Received: from [10.29.2.82] (10.29.2.82) by avexcashub2.qlogic.org
	(10.1.4.162) with Microsoft SMTP Server id 8.1.436.0;
	Thu, 30 Sep 2010 17:35:49 -0700
Subject: Re: [PATCH v5] IB/ipoib: fix dangling pointer references to
	ipoib_neigh and ipoib_path
From: Ralph Campbell
To: Pradeep Satyanarayana
CC: Roland Dreier , "linux-rdma@vger.kernel.org"
In-Reply-To: <4C806E72.6030507@linux.vnet.ibm.com>
References: <20100817203619.22174.62871.stgit@chromite.mv.qlogic.com>
	<20100817203624.22174.69480.stgit@chromite.mv.qlogic.com>
	<4C806E72.6030507@linux.vnet.ibm.com>
Organization: QLogic
Date: Thu, 30 Sep 2010 17:35:48 -0700
Message-ID: <1285893348.22791.120.camel@chromite.mv.qlogic.com>
MIME-Version: 1.0
X-Mailer: Evolution 2.28.3 (2.28.3-1.fc12)
Sender: linux-rdma-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-rdma@vger.kernel.org
X-Greylist: IP, sender and recipient auto-whitelisted, not delayed by
	milter-greylist-4.2.3 (demeter1.kernel.org [140.211.167.41]);
	Fri, 01 Oct 2010 00:35:53 +0000 (UTC)

IB/ipoib: fix race when handling IPOIB_CM_RX_DRAIN_WRID

From: Ralph Campbell

ipoib_cm_start_rx_drain() calls ib_post_send() and *then* moves the
struct ipoib_cm_rx onto the rx_drain_list. The ib_post_send() will
trigger a completion callback to ipoib_cm_handle_rx_wc() which tries
to move the rx_drain_list to the rx_reap_list but if the callback
happens before ipoib_cm_start_rx_drain() has moved the structure,
it is left in limbo. The fix is to change ipoib_cm_start_rx_drain()
to put the struct on the rx_drain_list and then call ib_post_send().
Also, only move one struct from rx_flush_list to rx_drain_list since
concurrent IPOIB_CM_RX_DRAIN_WRID events on different QPs could put
multiple ipoib_cm_rx structs on rx_flush_list.

Signed-off-by: Ralph Campbell
---

 drivers/infiniband/ulp/ipoib/ipoib_cm.c |   12 +++++++++---
 1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index bb10041..dfff159 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -216,15 +216,21 @@ static void ipoib_cm_start_rx_drain(struct ipoib_dev_priv *priv)
 	    !list_empty(&priv->cm.rx_drain_list))
 		return;
 
+	p = list_entry(priv->cm.rx_flush_list.next, typeof(*p), list);
+
+	/*
+	 * Put p on rx_drain_list before calling ib_post_send() or there
+	 * is a race with the ipoib_cm_handle_rx_wc() completion handler
+	 * trying to remove it from rx_drain_list.
+	 */
+	list_move(&p->list, &priv->cm.rx_drain_list);
+
 	/*
 	 * QPs on flush list are error state. This way, a "flush
 	 * error" WC will be immediately generated for each WR we post.
 	 */
-	p = list_entry(priv->cm.rx_flush_list.next, typeof(*p), list);
 	if (ib_post_send(p->qp, &ipoib_cm_rx_drain_wr, &bad_wr))
 		ipoib_warn(priv, "failed to post drain wr\n");
-
-	list_splice_init(&priv->cm.rx_flush_list, &priv->cm.rx_drain_list);
 }
 
 static void ipoib_cm_rx_event_handler(struct ib_event *event, void *ctx)
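
The ordering problem described in the commit message can be modeled in
user space. What follows is a minimal single-threaded sketch, not driver
code. It assumes the worst case, in which the drain completion runs
before the post call returns. Every name in it (struct demo_priv,
demo_post_drain_wr(), demo_handle_drain_wc(), drain_demo(), and the
small list helpers) is hypothetical and only stands in for the kernel
list API, ib_post_send() and ipoib_cm_handle_rx_wc(). Locking is left
out entirely.

```c
#include <assert.h>

/* Hypothetical user-space stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static int list_is_empty(const struct list_head *h) { return h->next == h; }

static void list_push_front(struct list_head *e, struct list_head *h)
{
	e->next = h->next;
	e->prev = h;
	h->next->prev = e;
	h->next = e;
}

/* Same effect as the kernel's list_move(): unlink, then add at the head. */
static void list_move_to(struct list_head *e, struct list_head *h)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	list_push_front(e, h);
}

/* Models only the three lists named in the commit message. */
struct demo_priv { struct list_head flush, drain, reap; };

/* Stand-in for the drain completion: everything on drain goes to reap. */
static void demo_handle_drain_wc(struct demo_priv *priv)
{
	while (!list_is_empty(&priv->drain))
		list_move_to(priv->drain.next, &priv->reap);
}

/*
 * Stand-in for ib_post_send(). Assumption: the completion fires
 * before the post returns, which is the worst case for the race.
 */
static void demo_post_drain_wr(struct demo_priv *priv)
{
	demo_handle_drain_wc(priv);
}

/* Returns 1 if the entry reached the reap list, 0 if it was stranded. */
int drain_demo(int queue_before_post)
{
	struct demo_priv priv;
	struct list_head rx;	/* stands in for the list member of ipoib_cm_rx */

	list_init(&priv.flush);
	list_init(&priv.drain);
	list_init(&priv.reap);
	list_push_front(&rx, &priv.flush);
	assert(!list_is_empty(&priv.flush));

	if (queue_before_post) {
		/* Patched order: queue on drain first, then post. */
		list_move_to(priv.flush.next, &priv.drain);
		demo_post_drain_wr(&priv);
	} else {
		/* Old order: post first, queue on drain afterwards. */
		demo_post_drain_wr(&priv);
		list_move_to(priv.flush.next, &priv.drain);
	}
	return !list_is_empty(&priv.reap) && list_is_empty(&priv.drain);
}
```

In this model drain_demo(0) returns 0, because the completion ran while
the drain list was still empty and the entry stays on the drain list
with nothing left to move it. drain_demo(1) returns 1, because the entry
was already queued when the completion ran.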