From patchwork Fri Jan 26 20:13:48 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Olga Kornievskaia
X-Patchwork-Id: 10186977
References: <0B3C835B-3E30-43F3-B53C-BFB4462543D3@oracle.com>
From: Olga Kornievskaia
Date: Fri, 26 Jan 2018 15:13:48 -0500
Subject: Re: problem on nfsd doing RDMA write
To: Chuck Lever
Cc: Bruce Fields, Linux NFS Mailing List
Sender: linux-nfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-nfs@vger.kernel.org

On Fri, Jan 26, 2018 at 2:29 PM, Chuck Lever wrote:
>
>
>> On Jan 26, 2018, at 11:05 AM, Olga Kornievskaia wrote:
>>
>> On Fri, Jan 26, 2018 at 1:23 PM, Chuck Lever wrote:
>>>
>>>
>>>> On Jan 26, 2018, at 9:24 AM, Olga Kornievskaia wrote:
>>>>
>>>> Hi Bruce/Chuck,
>>>>
>>>> There is a problem with nfsd doing a WRITE of size 4093,4094,4095. To
>>>> reproduce, mount with RDMA and do a simple dd "dd if=/dev/zero
>>>> of=/mnt/testfile bs=4093 count=1". What happens is that nfsd fails to
>>>> parse GETATTR after the WRITE in the compound. It fails the operation
>>>> with ERR_OP_ILLEGAL.
>>>>
>>>> The problem happens for the values where XDR round up ends up rounding
>>>> up to the page size. I don't know if my fix is the appropriate way to
>>>> fix this but with it I don't get the error:
>>>>
>>>> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
>>>> index 2c61c6b..a8489c3 100644
>>>> --- a/fs/nfsd/nfs4xdr.c
>>>> +++ b/fs/nfsd/nfs4xdr.c
>>>> @@ -1289,11 +1289,12 @@ static __be32 nfsd4_decode_opaque(struct nfsd4_compounda
>>>>
>>>> len = XDR_QUADLEN(write->wr_buflen) << 2;
>>>> if (len >= avail) {
>>>> - int pages;
>>>> + int pages = 0;
>>>>
>>>> len -= avail;
>>>>
>>>> - pages = len >> PAGE_SHIFT;
>>>> + if (write->wr_buflen >= PAGE_SIZE)
>>>> + pages = len >> PAGE_SHIFT;
>>>> argp->pagelist += pages;
>>>> argp->pagelen -= pages * PAGE_SIZE;
>>>> len -= pages * PAGE_SIZE;
>>>>
>>>> So the problem is the using "len" instead of "write->wr_buflen" leads
>>>> for the values 4093,4094,4095 that are rounded up to 4096, it makes
>>>> pages=1 and the argp->pagelen ends up being a negative value leading
>>>> to problems of parsing the GETATTR.
>>>
>>> Would this also be a problem near any page boundary, say, a
>>> write length of 8191 bytes?
>>>
>>> Instead of using the rounded up "len", why not try this:
>>>
>>> - pages = len >> PAGE_SHIFT;
>>> + pages = write->wr_buflen >> PAGE_SHIFT;
>>
>> You are right. It needs to be that. Otherwise 8191 fails the same way.
>>
>>> And be sure to test with TCP as well.
>>
>> Sigh. It breaks normal (non-RDMA) mounts. I'll figure out why.
>
> OK.
>
> Remember that a Read chunk's length does not have to be
> rounded up. Maybe the transport needs to round up the
> length of the unmarshaled data content on behalf of the
> NFSv4 write decoder.
>

The problem with simply taking write->wr_buflen was that, on non-RDMA
mounts, len may already have been reduced by avail before that point.
Again, I'm not sure if this is the right fix, but this one works for
both non-RDMA and RDMA mounts.

>
>>
>>>> If this looks OK, I can send a patch.
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>
>>> --
>>> Chuck Lever
>>>
>>>
>>>
>
> --
> Chuck Lever
>
>
>

Signed-off-by: Chuck Lever
Reviewed-by: Sagi Grimberg
Signed-off-by: J. Bruce Fields
---
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 2c61c6b..3178997 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1293,7 +1293,10 @@ static __be32 nfsd4_decode_opaque(struct nfsd4_compoundargs *argp, struct xdr_ne
 
 		len -= avail;
 
-		pages = len >> PAGE_SHIFT;
+		if (!avail)
+			pages = write->wr_buflen >> PAGE_SHIFT;
+		else
+			pages = len >> PAGE_SHIFT;
 		argp->pagelist += pages;
 		argp->pagelen -= pages * PAGE_SIZE;
 		len -= pages * PAGE_SIZE;
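
To make the arithmetic discussed in this thread concrete, here is a small
user-space C sketch (not kernel code) of the page-count computation before
and after the patch. It assumes 4 KiB pages, and it assumes avail is 0 on
the RDMA path, as the patch's "!avail" test suggests. The helpers
xdr_padded_len(), pages_old(), pages_new(), pagelen_after() and show() are
hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption for illustration: 4 KiB pages. */
#define PAGE_SHIFT 12
#define PAGE_SIZE (1u << PAGE_SHIFT)

/* Number of 4-byte XDR words needed to hold l bytes. */
#define XDR_QUADLEN(l) (((l) + 3) >> 2)

/* Hypothetical helper: payload length rounded up to a 4-byte XDR boundary. */
unsigned int xdr_padded_len(unsigned int wr_buflen)
{
	unsigned int len = XDR_QUADLEN(wr_buflen) << 2;

	assert(len >= wr_buflen && (len & 3) == 0);
	return len;
}

/* Page count as computed before the patch: always from the rounded-up len. */
unsigned int pages_old(unsigned int wr_buflen, unsigned int avail)
{
	unsigned int len = xdr_padded_len(wr_buflen);

	if (len < avail)
		return 0;	/* payload fits in the head buffer */
	len -= avail;
	return len >> PAGE_SHIFT;
}

/* Page count as computed by the patch: unrounded length when avail == 0. */
unsigned int pages_new(unsigned int wr_buflen, unsigned int avail)
{
	unsigned int len = xdr_padded_len(wr_buflen);

	if (len < avail)
		return 0;	/* payload fits in the head buffer */
	len -= avail;
	if (!avail)
		return wr_buflen >> PAGE_SHIFT;
	return len >> PAGE_SHIFT;
}

/* Bytes left in the page list after skipping 'pages' whole pages. */
long pagelen_after(unsigned int pagelen, unsigned int pages)
{
	return (long)pagelen - (long)pages * (long)PAGE_SIZE;
}

/* Print old and new page counts for one write size, with avail == 0. */
void show(unsigned int wr_buflen)
{
	printf("wr_buflen=%u old=%u new=%u\n", wr_buflen,
	       pages_old(wr_buflen, 0), pages_new(wr_buflen, 0));
}
```

With these helpers, show(4093) prints "wr_buflen=4093 old=1 new=0". If the
page list holds exactly the 4093 unrounded payload bytes (an assumption
based on the remark that a Read chunk's length does not have to be rounded
up), pagelen_after(4093, 1) is -3, which corresponds to the negative
argp->pagelen described above.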