From patchwork Tue Aug 29 19:29:09 2017
X-Patchwork-Submitter: Long Li
X-Patchwork-Id: 9928095
From: Long Li
To: Steve French, linux-cifs@vger.kernel.org, samba-technical@lists.samba.org,
	linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
	Christoph Hellwig, Tom Talpey, Matthew Wilcox
Cc: Long Li
Subject: [Patch v3 13/19] CIFS: SMBD: Implement SMB write via RDMA read
	through memory registration
Date: Tue, 29 Aug 2017 12:29:09 -0700
Message-Id: <20170829192915.26251-14-longli@exchange.microsoft.com>
X-Mailer: git-send-email 2.14.1
In-Reply-To: <20170829192915.26251-1-longli@exchange.microsoft.com>
References: <20170829192915.26251-1-longli@exchange.microsoft.com>
X-Mailing-List: linux-cifs@vger.kernel.org

From: Long Li

When sending I/O, if its size is larger than rdma_readwrite_threshold, prepare
to send the SMB write packet for an RDMA read via memory registration. The
actual I/O is then done by the remote peer through the local RDMA hardware.
Modify the relevant fields in the packet accordingly, and append a
smbd_buffer_descriptor_v1 to the end of the SMB write packet.
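For reference, the descriptor appended to the write request carries the
registered memory's address, steering token, and length, which the fields
filled in by the patch below correspond to. A minimal sketch of its layout,
assuming the MS-SMBD Buffer Descriptor V1 format (the real definition lives
in smbdirect.h introduced earlier in this series):

/*
 * Sketch of the SMB Direct Buffer Descriptor V1 appended after the
 * smb2_write_req. Assumed layout based on MS-SMBD; not the authoritative
 * kernel definition.
 */
struct smbd_buffer_descriptor_v1 {
	__le64 offset;	/* remote virtual address (mr->iova) the server reads from */
	__le32 token;	/* remote key (mr->rkey) authorizing the server's RDMA read */
	__le32 length;	/* number of bytes the server should read */
} __packed;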
Signed-off-by: Long Li
---
 fs/cifs/smb2pdu.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
index 5cc5f6c..5581afd 100644
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -48,6 +48,7 @@
 #include "smb2glob.h"
 #include "cifspdu.h"
 #include "cifs_spnego.h"
+#include "smbdirect.h"
 
 /*
  * The following table defines the expected "StructureSize" of SMB2 requests
@@ -2716,6 +2717,41 @@ smb2_async_writev(struct cifs_writedata *wdata,
 		    offsetof(struct smb2_write_req, Buffer) - 4);
 	req->RemainingBytes = 0;
 
+	/*
+	 * If we want to do a server RDMA read, fill in and append
+	 * smbd_buffer_descriptor_v1 to the end of write request
+	 */
+	if (server->rdma && wdata->bytes >
+		server->smbd_conn->rdma_readwrite_threshold) {
+
+		struct smbd_buffer_descriptor_v1 *v1;
+		bool need_invalidate = server->dialect == SMB30_PROT_ID;
+
+		wdata->mr = smbd_register_mr(
+				server->smbd_conn, wdata->pages,
+				wdata->nr_pages, wdata->tailsz,
+				false, need_invalidate);
+		if (!wdata->mr) {
+			rc = -ENOBUFS;
+			goto async_writev_out;
+		}
+		req->Length = 0;
+		req->DataOffset = 0;
+		req->RemainingBytes =
+			(wdata->nr_pages-1)*PAGE_SIZE + wdata->tailsz;
+		req->Channel = SMB2_CHANNEL_RDMA_V1_INVALIDATE;
+		if (need_invalidate)
+			req->Channel = SMB2_CHANNEL_RDMA_V1;
+		req->WriteChannelInfoOffset =
+			offsetof(struct smb2_write_req, Buffer) - 4;
+		req->WriteChannelInfoLength =
+			sizeof(struct smbd_buffer_descriptor_v1);
+		v1 = (struct smbd_buffer_descriptor_v1 *) &req->Buffer[0];
+		v1->offset = wdata->mr->mr->iova;
+		v1->token = wdata->mr->mr->rkey;
+		v1->length = wdata->mr->mr->length;
+	}
+
 	/* 4 for rfc1002 length field and 1 for Buffer */
 	iov[0].iov_len = 4;
 	iov[0].iov_base = req;
@@ -2729,10 +2765,17 @@ smb2_async_writev(struct cifs_writedata *wdata,
 	rqst.rq_pagesz = wdata->pagesz;
 	rqst.rq_tailsz = wdata->tailsz;
 
+	if (wdata->mr) {
+		iov[1].iov_len += sizeof(struct smbd_buffer_descriptor_v1);
+		rqst.rq_npages = 0;
+	}
+
 	cifs_dbg(FYI, "async write at %llu %u bytes\n",
 		 wdata->offset, wdata->bytes);
 
-	req->Length = cpu_to_le32(wdata->bytes);
+	/* For RDMA read, I/O size is in RemainingBytes not in Length */
+	if (!wdata->mr)
+		req->Length = cpu_to_le32(wdata->bytes);
 
 	inc_rfc1001_len(&req->hdr, wdata->bytes - 1 /* Buffer */);
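To make the RemainingBytes arithmetic in the hunk above concrete, here is a
small stand-alone sketch of how the RDMA payload size is derived from the page
list. The values and the fixed 4096-byte page size are hypothetical and for
illustration only; the kernel uses its own PAGE_SIZE and the wdata fields.

#include <stdio.h>

#define EXAMPLE_PAGE_SIZE 4096	/* illustrative page size */

static unsigned int rdma_payload_len(unsigned int nr_pages, unsigned int tailsz)
{
	/* All pages are full except the last, which holds tailsz bytes. */
	return (nr_pages - 1) * EXAMPLE_PAGE_SIZE + tailsz;
}

int main(void)
{
	/* e.g. a 9-page write whose last page holds 512 bytes */
	printf("RemainingBytes = %u\n", rdma_payload_len(9, 512));	/* prints 33280 */
	return 0;
}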