From patchwork Tue Jun 20 12:02:17 2017
X-Patchwork-Submitter: Sagi Grimberg
X-Patchwork-Id: 9799317
Subject: Re: Unexpected issues with 2 NVME initiators using the same target
From: Sagi Grimberg
To: Leon Romanovsky
Cc: Robert LeBlanc, Marta Rybczynska, Max Gurtovoy, Christoph Hellwig,
 "Gruher, Joseph R", "shahar.salzman", Laurence Oberman,
 "Riches Jr, Robert M", linux-rdma, linux-nvme@lists.infradead.org,
 Jason Gunthorpe, Liran Liss, Bart Van Assche, Chuck Lever
References: <9465cd0c-83db-b058-7615-5626ef60dbb0@grimberg.me>
 <20170515143632.GH3616@mtr-leonro.local>
 <20170515145952.GA7871@infradead.org>
 <20170515170506.GK3616@mtr-leonro.local>
 <779753075.36035391.1495025796237.JavaMail.zimbra@kalray.eu>
 <20170518133439.GD3616@mtr-leonro.local>
 <6073e553-e8c2-6d14-ba5d-c2bd5aff15eb@grimberg.me>
 <20170620074639.GP17846@mtr-leonro.local>
 <1c706958-992e-b104-6bae-4a6616c0a9f9@grimberg.me>
 <20170620083309.GQ17846@mtr-leonro.local>
Message-ID: <614481c7-22dd-d93b-e97e-52f868727ec3@grimberg.me>
Date: Tue, 20 Jun 2017 15:02:17 +0300
X-Mailing-List: linux-rdma@vger.kernel.org

>>> Can you share the check that correlates to the vendor+hw syndrome?
>>
>> mkey.free == 1
>
> Hmm, the way I understand it is that the HW is trying to access
> (locally via send) a MR which was already invalidated.
>
> Thinking of this further, this can happen in a case where the target
> already completed the transaction, sent SEND_WITH_INVALIDATE but the
> original send ack was lost somewhere causing the device to retransmit
> from the MR (which was already invalidated). This is highly unlikely
> though.
>
> Shouldn't this be protected somehow by the device?
> Can someone explain why the above cannot happen? Jason? Liran? Anyone?
>
> Say host register MR (a) and send (1) from that MR to a target,
> send (1) ack got lost, and the target issues SEND_WITH_INVALIDATE
> on MR (a) and the host HCA process it, then host HCA timeout on send (1)
> so it retries, but ehh, its already invalidated.

Well, this entire flow is broken. Why should the host send the MR rkey
to the target if it is not using it for remote access? The target
should never have a chance to remotely invalidate something it did not
access.

I think we have a bug in the iSER code: we should not send the key for
remote invalidation if we do an inline data send...

Robert, can you try the following:
---
                         "VA:%#llX + unsol:%d\n",
--

Although, I still don't think it's enough. We need to delay the local
invalidate until we have received a send completion (which guarantees
that the ack was received)...

If this is indeed the case, _all_ ULP initiator drivers share it,
because we never condition on a send completion in order to complete
an I/O, and in the case of a lost ack on a send, it looks like we need
to... It *will* hurt performance.

What do other folks think?

CC'ing Bart, Chuck, Christoph.

Guys, to summarize, I think we might have broken behavior in the
initiator mode drivers. We never condition on send completions (for
requests) before we complete an I/O.
The issue is that the ack for those sends might get lost, which means
that the HCA will retry them (the retries are dropped by the peer HCA).
But if we happen to complete the I/O before that, we either unmap the
request area or, for inline data, invalidate it (so the HCA will try
to access an MR which was already invalidated).

Signalling all send completions and also finishing I/Os only after we
got them will add latency, and that sucks...

diff --git a/drivers/infiniband/ulp/iser/iser_initiator.c b/drivers/infiniband/ulp/iser/iser_initiator.c
index 12ed62ce9ff7..2a07692007bd 100644
--- a/drivers/infiniband/ulp/iser/iser_initiator.c
+++ b/drivers/infiniband/ulp/iser/iser_initiator.c
@@ -137,8 +137,10 @@ iser_prepare_write_cmd(struct iscsi_task *task,
 
 	if (unsol_sz < edtl) {
 		hdr->flags |= ISER_WSV;
-		hdr->write_stag = cpu_to_be32(mem_reg->rkey);
-		hdr->write_va = cpu_to_be64(mem_reg->sge.addr + unsol_sz);
+		if (buf_out->data_len > imm_sz) {
+			hdr->write_stag = cpu_to_be32(mem_reg->rkey);
+			hdr->write_va = cpu_to_be64(mem_reg->sge.addr + unsol_sz);
+		}
 
 		iser_dbg("Cmd itt:%d, WRITE tags, RKEY:%#.4X "