X-Patchwork-Submitter: "J. Bruce Fields"
X-Patchwork-Id: 3143591
Date: Tue, 5 Nov 2013 14:58:10 -0500
From: "J. Bruce Fields"
To: Shyam Kaushik
Cc: linux-nfs@vger.kernel.org
Subject: Re: Need help with NFS Server SUNRPC performance issue
Message-ID: <20131105195810.GC23329@fieldses.org>
References: <20131031141538.GA621@fieldses.org> <20131104230244.GD8828@fieldses.org>

On Tue, Nov 05, 2013 at 07:14:50PM +0530, Shyam Kaushik wrote:
> Hi Bruce,
>
> You are spot on this issue. I did a quicker option of just fixing
>
> fs/nfsd/nfs4proc.c
>
> nfsd_procedures4[]
>
> NFSPROC4_COMPOUND
> instead of
> .pc_xdrressize = NFSD_BUFSIZE/4
>
> I made it by /8 & I got double the IOPs. I moved it /16 & now I see
> that 30 NFSD threads out of 32 that I have configured are doing the
> nfsd_write() job. So yes this is the exact problematic area.

Yes, that looks like good evidence we're on the right track, thanks very
much for the testing.

> Now for a permanent fixture for this issue, what do you suggest? Is it
> that before processing the compound we adjust svc_reserve()?

I think decode_compound() needs to do some estimate of the maximum total
reply size and call svc_reserve() with that new estimate.

And for the current code I think it really could be as simple as
checking whether the compound includes a READ op.

That's because that's all the current xdr encoding handles. We need to
fix that: people need to be able to fetch ACLs larger than 4k, and
READDIR would be faster if it could return more than 4k of data at a go.

After we do that, we'll need to know more than just the list of ops,
we'll need to e.g. know which attributes exactly a GETATTR requested.
And we don't have any automatic way to figure that out so it'll all be a
lot of manual arithmetic. On the other hand the good news is we only
need a rough upper bound, so this may be doable.

Beyond that it would also be good to think about whether using
worst-case reply sizes to decide when to accept requests is really
right.

Anyway here's the slightly improved hack--totally untested except to fix
some compile errors.

--b.

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index d9454fe..947f268 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1617,6 +1617,7 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp)
 	struct nfsd4_op *op;
 	struct nfsd4_minorversion_ops *ops;
 	bool cachethis = false;
+	bool foundread = false;
 	int i;
 
 	READ_BUF(4);
@@ -1667,10 +1668,15 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp)
 		 * op in the compound wants to be cached:
 		 */
 		cachethis |= nfsd4_cache_this_op(op);
+
+		foundread |= op->opnum == OP_READ;
 	}
 	/* Sessions make the DRC unnecessary: */
 	if (argp->minorversion)
 		cachethis = false;
+	if (!foundread)
+		/* XXX: use tighter estimates, and svc_reserve_auth: */
+		svc_reserve(argp->rqstp, PAGE_SIZE);
 	argp->rqstp->rq_cachetype = cachethis ? RC_REPLBUFF : RC_NOCACHE;
 
 	DECODE_TAIL;
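
For readers who want to play with the heuristic outside the kernel, here is a minimal userspace sketch of the same idea: scan the ops in a compound, and unless one of them is a READ, keep only about a page of reply space reserved. It is not kernel code. The names sketch_op, compound_has_read(), sketch_reply_reserve() and SKETCH_PAGE_SIZE are hypothetical, the 4096-byte page is an assumption matching the "4k" figure in the message, and the worst_case argument merely stands in for whatever reservation the request started with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for NFSv4 op numbers; only READ matters here. */
enum sketch_op { SKETCH_OP_GETATTR, SKETCH_OP_PUTFH, SKETCH_OP_READ, SKETCH_OP_WRITE };

/* Assumption: 4k pages, matching the "4k" mentioned in the message. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/* Mirrors the foundread accumulation in the patch's decode loop. */
bool compound_has_read(const enum sketch_op *ops, size_t nops)
{
	bool foundread = false;
	size_t i;

	assert(ops != NULL || nops == 0);
	for (i = 0; i < nops; i++)
		foundread |= (ops[i] == SKETCH_OP_READ);
	return foundread;
}

/*
 * Reply bytes to keep reserved for a compound.  With a READ present the
 * worst-case reservation is left alone; otherwise it is cut to one page.
 * The result never exceeds worst_case, on the assumption that the
 * reservation is only ever reduced, never grown.
 */
size_t sketch_reply_reserve(const enum sketch_op *ops, size_t nops,
			    size_t worst_case)
{
	if (compound_has_read(ops, nops))
		return worst_case;
	return worst_case < SKETCH_PAGE_SIZE ? worst_case : SKETCH_PAGE_SIZE;
}
```

With a PUTFH, WRITE, GETATTR compound and a hypothetical 65536-byte worst case, sketch_reply_reserve() returns 4096, while a compound containing READ keeps the full 65536. A tighter per-op estimate, as the XXX comment in the patch suggests, would replace the single page with a sum over the ops.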