From patchwork Mon Feb 24 15:24:52 2014
X-Patchwork-Submitter: Bart Van Assche
X-Patchwork-Id: 3709801
Message-ID: <530B6444.1000805@acm.org>
Date: Mon,
 24 Feb 2014 16:24:52 +0100
From: Bart Van Assche
To: Sagi Grimberg
CC: roland@kernel.org, oren@mellanox.com, linux-rdma@vger.kernel.org
Subject: Re: [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop
References: <1393252218-30638-1-git-send-email-sagig@mellanox.com>
 <1393252218-30638-2-git-send-email-sagig@mellanox.com>
In-Reply-To: <1393252218-30638-2-git-send-email-sagig@mellanox.com>
X-Mailing-List: linux-rdma@vger.kernel.org

On 02/24/14 15:30, Sagi Grimberg wrote:
> When unmapping request data, it is unsafe to unconditionally
> decrement req->nfmr regardless of its value: the I/O and
> reconnect flows may run concurrently, leaving req->nfmr at -1
> and falsely calling ib_fmr_pool_unmap.
>
> Fix the loop condition to require a value greater than zero
> (which explicitly means that FMRs were used for this request)
> and only decrement when needed.
>
> This crash is easily reproducible with ConnectX VFs or
> Connect-IB (where FMRs are not supported).
>
> Signed-off-by: Sagi Grimberg
> ---
>  drivers/infiniband/ulp/srp/ib_srp.c | 5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 529b6bc..0e20bfb 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -766,8 +766,11 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
>  		return;
>
>  	pfmr = req->fmr_list;
> -	while (req->nfmr--)
> +
> +	while (req->nfmr > 0) {
>  		ib_fmr_pool_unmap(*pfmr++);
> +		req->nfmr--;
> +	}
>
>  	ib_dma_unmap_sg(ibdev, scsi_sglist(scmnd), scsi_sg_count(scmnd),
>  			scmnd->sc_data_direction);

Hello Sagi,

If srp_unmap_data() gets invoked twice for the same request, that means
a race condition has been hit. I would like to see the root cause of that
race condition fixed instead of making it safe to invoke srp_unmap_data()
multiple times. It would be appreciated if you could verify whether the
following patch is a valid alternative to the patch above:

Thanks,

Bart.
---
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index b753a7a..2c75f95 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -2141,8 +2141,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 	cmd->tag = req->index;
 	memcpy(cmd->cdb, scmnd->cmnd, scmnd->cmd_len);
 
-	req->scmnd = scmnd;
-	req->cmd = iu;
+	req->cmd = iu;
 
 	len = srp_map_data(scmnd, ch, req);
 	if (len < 0) {
@@ -2160,7 +2159,12 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 	ib_dma_sync_single_for_device(dev, iu->dma, target->max_iu_len,
 				      DMA_TO_DEVICE);
 
-	if (srp_post_send(ch, iu, len)) {
+	spin_lock_irqsave(&ch->lock, flags);
+	req->scmnd = scmnd;
+	ret = srp_post_send(ch, iu, len) ? SCSI_MLQUEUE_HOST_BUSY : 0;
+	spin_unlock_irqrestore(&ch->lock, flags);
+
+	if (ret) {
 		shost_printk(KERN_ERR, target->scsi_host, PFX "Send failed\n");
 		goto err_unmap;
 	}