From patchwork Fri Feb 18 21:24:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Trond Myklebust X-Patchwork-Id: 12751938 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 10BBEC433EF for ; Fri, 18 Feb 2022 21:30:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239780AbiBRVbH (ORCPT ); Fri, 18 Feb 2022 16:31:07 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:47612 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233540AbiBRVbH (ORCPT ); Fri, 18 Feb 2022 16:31:07 -0500 Received: from mail-qv1-xf30.google.com (mail-qv1-xf30.google.com [IPv6:2607:f8b0:4864:20::f30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D8C1F178958 for ; Fri, 18 Feb 2022 13:30:49 -0800 (PST) Received: by mail-qv1-xf30.google.com with SMTP id p7so17244765qvk.11 for ; Fri, 18 Feb 2022 13:30:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=ug+CvjIUCep1DGuaKq+iPrJGVw6RMMz4iggvbhHn/M0=; b=keL2ho5WOeY3PdjVQlQRqrZDf5Z3DpLzO+N/44hZozmIkciY7bXsPrWG1VcF//pp6F 8EN9X0nopHcviEz62i4/dXM5WC2BrtXr0Hy9ka57ySKZisXXcw4O6BlcHaVf3+uUS5bf a/uG7Do5TeQK7UiQSLE03ao2C5Ft8JKdNvwZc3SFZGQVkoEZhS3G/Ovr8pnszRzpW2Du qhl6F202xhU5Nx7X47hjUAV1Ym/+PvpcoZT8qe1Y0Hs8uyk0irkmp1ZhfPQf4vqPGaTQ hhpKYKl5Nn0fWNs5bkpJG1/mP7diM+r+nJj82fbiD/7R6nVm5bPFCYWqFJ53MG8JLDY0 ptTQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ug+CvjIUCep1DGuaKq+iPrJGVw6RMMz4iggvbhHn/M0=; 
        b=QSlInsKy2aFKJliWzkhYNS+EFUUOe4cQ9p+pFz+naCQ4pYckXAOxUvh8jf6BKwlfUw
         4UmYANMvn1gW9SMlyK9ItcRcABYWuuj7LFBIBgOLFWvxXf81BBdkUEqtMcrslq0F+KOC
         v/5CyjtKssQ1AQSB42KGEMg0jPh6wU+SDFCBt0JVWIT4w1Aa4BUTw/3EssfZLRraHy7+
         wqw547F2hFFecH7h6jJmdb2P25rhOCQpY6/suyP38bHXI0w/aCgWR49WlcAJbgZ06zqs
         Kaw31ADqV4AVgZ6sQ99DaNlTT5JxyF9maofwkp9uoHPmadJxWG3OfoRL1R7SP5m+Rlqh
         wOjA==
X-Gm-Message-State: AOAM531vESla+TgrmqExN+czbXKdg3V2UOWUwPaQa9OZrvDJ6BYT7Eli
	LiaInFApx3FUWLnjjavGNqfH9gNIfg==
X-Google-Smtp-Source: ABdhPJwz/+FGNPLOusDDO3gEylmjEbjhgHYE6DFEtK6V8qs2xqLbzvAlV2+auux3lAQ9kSei4KVcTg==
X-Received: by 2002:a05:622a:1786:b0:2ca:9f6c:221e with SMTP id
 s6-20020a05622a178600b002ca9f6c221emr8607009qtk.478.1645219848313;
        Fri, 18 Feb 2022 13:30:48 -0800 (PST)
Received: from localhost.localdomain
 (c-68-56-145-227.hsd1.mi.comcast.net. [68.56.145.227])
        by smtp.gmail.com with ESMTPSA id
 w22sm26928656qtk.7.2022.02.18.13.30.46
        for
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Fri, 18 Feb 2022 13:30:47 -0800 (PST)
From: trondmy@gmail.com
X-Google-Original-From: trond.myklebust@hammerspace.com
To: linux-nfs@vger.kernel.org
Subject: [PATCH v5 1/6] NFS: Adjust the amount of readahead performed by NFS readdir
Date: Fri, 18 Feb 2022 16:24:19 -0500
Message-Id: <20220218212424.1840077-2-trond.myklebust@hammerspace.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220218212424.1840077-1-trond.myklebust@hammerspace.com>
References: <20220218212424.1840077-1-trond.myklebust@hammerspace.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-nfs@vger.kernel.org

From: Trond Myklebust

The current NFS readdir code will always try to maximise the amount of
readahead it performs on the assumption that we can cache anything that
isn't immediately read by the process.
There are several cases where this assumption breaks down, including
when the 'ls -l' heuristic kicks in to try to force use of readdirplus
as a batch replacement for lookup/getattr.

This patch therefore tries to tone down the amount of readahead we
perform, and adjust it to try to match the amount of data being
requested by user space.

Signed-off-by: Trond Myklebust
---
 fs/nfs/dir.c           | 55 +++++++++++++++++++++++++++++++++++++++++-
 include/linux/nfs_fs.h |  1 +
 2 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 8b190c8e4a45..b0ee3a0e0f81 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -69,6 +69,8 @@ const struct address_space_operations nfs_dir_aops = {
 	.freepage = nfs_readdir_clear_array,
 };
 
+#define NFS_INIT_DTSIZE PAGE_SIZE
+
 static struct nfs_open_dir_context *alloc_nfs_open_dir_context(struct inode *dir)
 {
 	struct nfs_inode *nfsi = NFS_I(dir);
@@ -80,6 +82,7 @@ static struct nfs_open_dir_context *alloc_nfs_open_dir_context(struct inode *dir
 		ctx->dir_cookie = 0;
 		ctx->dup_cookie = 0;
 		ctx->page_index = 0;
+		ctx->dtsize = NFS_INIT_DTSIZE;
 		ctx->eof = false;
 		spin_lock(&dir->i_lock);
 		if (list_empty(&nfsi->open_files) &&
@@ -155,6 +158,7 @@ struct nfs_readdir_descriptor {
 	struct page	*page;
 	struct dir_context *ctx;
 	pgoff_t		page_index;
+	pgoff_t		page_index_max;
 	u64		dir_cookie;
 	u64		last_cookie;
 	u64		dup_cookie;
@@ -167,12 +171,36 @@ struct nfs_readdir_descriptor {
 	unsigned long	gencount;
 	unsigned long	attr_gencount;
 	unsigned int	cache_entry_index;
+	unsigned int	buffer_fills;
+	unsigned int	dtsize;
 	signed char duped;
 	bool plus;
 	bool eob;
 	bool eof;
 };
 
+static void nfs_set_dtsize(struct nfs_readdir_descriptor *desc, unsigned int sz)
+{
+	struct nfs_server *server = NFS_SERVER(file_inode(desc->file));
+	unsigned int maxsize = server->dtsize;
+
+	if (sz > maxsize)
+		sz = maxsize;
+	if (sz < NFS_MIN_FILE_IO_SIZE)
+		sz = NFS_MIN_FILE_IO_SIZE;
+	desc->dtsize = sz;
+}
+
+static void nfs_shrink_dtsize(struct nfs_readdir_descriptor *desc)
+{
+	nfs_set_dtsize(desc, desc->dtsize >> 1);
+}
+
+static void nfs_grow_dtsize(struct nfs_readdir_descriptor *desc)
+{
+	nfs_set_dtsize(desc, desc->dtsize << 1);
+}
+
 static void nfs_readdir_array_init(struct nfs_cache_array *array)
 {
 	memset(array, 0, sizeof(struct nfs_cache_array));
@@ -759,6 +787,7 @@ static int nfs_readdir_page_filler(struct nfs_readdir_descriptor *desc,
 					break;
 				arrays++;
 				*arrays = page = new;
+				desc->page_index_max++;
 			} else {
 				new = nfs_readdir_page_get_next(mapping,
 								page->index + 1,
@@ -768,6 +797,7 @@ static int nfs_readdir_page_filler(struct nfs_readdir_descriptor *desc,
 				if (page != *arrays)
 					nfs_readdir_page_unlock_and_put(page);
 				page = new;
+				desc->page_index_max = new->index;
 			}
 			status = nfs_readdir_add_to_array(entry, page);
 		} while (!status && !entry->eof);
@@ -833,7 +863,7 @@ static int nfs_readdir_xdr_to_array(struct nfs_readdir_descriptor *desc,
 	struct nfs_entry *entry;
 	size_t array_size;
 	struct inode *inode = file_inode(desc->file);
-	size_t dtsize = NFS_SERVER(inode)->dtsize;
+	unsigned int dtsize = desc->dtsize;
 	int status = -ENOMEM;
 
 	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
@@ -869,6 +899,7 @@ static int nfs_readdir_xdr_to_array(struct nfs_readdir_descriptor *desc,
 
 		status = nfs_readdir_page_filler(desc, entry, pages, pglen,
 						 arrays, narrays);
+		desc->buffer_fills++;
 	} while (!status && nfs_readdir_page_needs_filling(page) &&
 		page_mapping(page));
 
@@ -916,6 +947,7 @@ static int find_and_lock_cache_page(struct nfs_readdir_descriptor *desc)
 	if (!desc->page)
 		return -ENOMEM;
 	if (nfs_readdir_page_needs_filling(desc->page)) {
+		desc->page_index_max = desc->page_index;
 		res = nfs_readdir_xdr_to_array(desc, nfsi->cookieverf, verf,
 					       &desc->page, 1);
 		if (res < 0) {
@@ -1047,6 +1079,7 @@ static int uncached_readdir(struct nfs_readdir_descriptor *desc)
 	desc->cache_entry_index = 0;
 	desc->last_cookie = desc->dir_cookie;
 	desc->duped = 0;
+	desc->page_index_max = 0;
 
 	status = nfs_readdir_xdr_to_array(desc, desc->verf, verf, arrays, sz);
 
@@ -1056,10 +1089,22 @@ static int uncached_readdir(struct nfs_readdir_descriptor *desc)
 	}
 	desc->page = NULL;
 
+	/*
+	 * Grow the dtsize if we have to go back for more pages,
+	 * or shrink it if we're reading too many.
+	 */
+	if (!desc->eof) {
+		if (!desc->eob)
+			nfs_grow_dtsize(desc);
+		else if (desc->buffer_fills == 1 &&
+			 i < (desc->page_index_max >> 1))
+			nfs_shrink_dtsize(desc);
+	}
 
 	for (i = 0; i < sz && arrays[i]; i++)
 		nfs_readdir_page_array_free(arrays[i]);
 out:
+	desc->page_index_max = -1;
 	kfree(arrays);
 	dfprintk(DIRCACHE, "NFS: %s: returns %d\n", __func__, status);
 	return status;
@@ -1102,6 +1147,7 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
 	desc->file = file;
 	desc->ctx = ctx;
 	desc->plus = nfs_use_readdirplus(inode, ctx);
+	desc->page_index_max = -1;
 
 	spin_lock(&file->f_lock);
 	desc->dir_cookie = dir_ctx->dir_cookie;
@@ -1110,6 +1156,7 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
 	page_index = dir_ctx->page_index;
 	desc->attr_gencount = dir_ctx->attr_gencount;
 	desc->eof = dir_ctx->eof;
+	nfs_set_dtsize(desc, dir_ctx->dtsize);
 	memcpy(desc->verf, dir_ctx->verf, sizeof(desc->verf));
 	spin_unlock(&file->f_lock);
 
@@ -1151,6 +1198,11 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
 
 		nfs_do_filldir(desc, nfsi->cookieverf);
 		nfs_readdir_page_unlock_and_put_cached(desc);
+		if (desc->eob || desc->eof)
+			break;
+		/* Grow the dtsize if we have to go back for more pages */
+		if (desc->page_index == desc->page_index_max)
+			nfs_grow_dtsize(desc);
 	} while (!desc->eob && !desc->eof);
 
 	spin_lock(&file->f_lock);
@@ -1160,6 +1212,7 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
 	dir_ctx->attr_gencount = desc->attr_gencount;
 	dir_ctx->page_index = desc->page_index;
 	dir_ctx->eof = desc->eof;
+	dir_ctx->dtsize = desc->dtsize;
 	memcpy(dir_ctx->verf, desc->verf, sizeof(dir_ctx->verf));
 	spin_unlock(&file->f_lock);
 out_free:
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 6e10725887d1..d27f7e788624 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -106,6 +106,7 @@ struct nfs_open_dir_context {
 	__u64 dir_cookie;
 	__u64 dup_cookie;
 	pgoff_t page_index;
+	unsigned int dtsize;
 	signed char duped;
 	bool eof;
 };

From patchwork Fri Feb 18 21:24:20 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Trond Myklebust
X-Patchwork-Id: 12751939
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id C41A6C433F5
	for ; Fri, 18 Feb 2022 21:30:53 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S239882AbiBRVbJ (ORCPT );
        Fri, 18 Feb 2022 16:31:09 -0500
Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:47748 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S239869AbiBRVbI (ORCPT
        ); Fri, 18 Feb 2022 16:31:08 -0500
Received: from mail-qv1-xf36.google.com (mail-qv1-xf36.google.com
 [IPv6:2607:f8b0:4864:20::f36])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 25ECA178958
        for ; Fri, 18 Feb 2022 13:30:51 -0800 (PST)
Received: by mail-qv1-xf36.google.com with SMTP id v10so17283153qvk.7
        for ; Fri, 18 Feb 2022 13:30:51 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20210112;
        h=from:to:subject:date:message-id:in-reply-to:references:mime-version
         :content-transfer-encoding;
        bh=rJbknmMMQc7P/Rh46YEuD8kUrCfShuy/PfuRnvuL8Ic=;
        b=OqY2s8p7lgRhY4i/gouPpBzq9DOUwny2Y+Y1euCihwHmC+KVAonVNaHygqzCT4RzP7
         o7SP7mXwrCJ3Xnp2x/ksvxWdyF7GM+SAO1ZIOUIfxJGCP6ZKTA3NJnvdsjpUSdfFPtBZ
         AIdKcbP8U12L+HSjkj+EpQuwH+bvyoeT9yWyqASkGK319IZ+6MFmqHO59k0MgCPgsTwh
         NEWLStjFHw3ZvdwNrrM2w3YgC2n90M2CqA/dStyRmikvsusuV9bVUHeR5eQJl9cczvxo
         jbv4Zqm7D6Sz85qpwQ5pGAw6VATCfXofKhsWvrGlMD3+PrYcnk0vupQXRuesE+4CvM4j
         ClVA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20210112;
        h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to
         :references:mime-version:content-transfer-encoding;
        bh=rJbknmMMQc7P/Rh46YEuD8kUrCfShuy/PfuRnvuL8Ic=;
        b=X6cbJXUvFbhPBqSbeKNsJ4R5p3LQjeHQlcXIJXNpm/WTqCBvMK8GJ7i3vvHzTQFcNb
         iBUCMTlet60owEPRL3TzO5sgz1SbYsBzisvDx2bWcI2FyXQZC67du+Qsn/nEGJX8bPb+
         cIm44H/3XeOLuomRqF7eDefkLDFdb5kiQFWLpTZLFHyIbRBkrPgTEQNoXx0RMjJ0vp+W
         4Y+RFa2f3Uqli4EDq3L1GSV7Vke390vm5zlcdg6HOdgDDFUABaZoY1mc/qKeT6D2RjGo
         SVmDtb4j3Ehp+wdADVTg0NExn7vx8gB4I86uE9RuMReQgvrPjy6KINfEZ1tzGuMviPv9
         uRNA==
X-Gm-Message-State: AOAM530t0kVwFK9GRzUX8RRg+cW/WOjUauI0nUkWKZU1E2xkekH2+JlV
	XadWwWjwqDy6bGkXuUFHwMdXdEZsQA==
X-Google-Smtp-Source: ABdhPJynE1kZJXwarIXDmB6ogt15QCmL9tWO+RgOOCj+qrI9SHr4VXEE94jG42OYPmqD/ywKEIbsYw==
X-Received: by 2002:a05:622a:1c2:b0:2d4:b318:6949 with SMTP id
 t2-20020a05622a01c200b002d4b3186949mr8452881qtw.405.1645219849675;
        Fri, 18 Feb 2022 13:30:49 -0800 (PST)
Received: from localhost.localdomain
 (c-68-56-145-227.hsd1.mi.comcast.net. [68.56.145.227])
        by smtp.gmail.com with ESMTPSA id
 w22sm26928656qtk.7.2022.02.18.13.30.48
        for
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Fri, 18 Feb 2022 13:30:48 -0800 (PST)
From: trondmy@gmail.com
X-Google-Original-From: trond.myklebust@hammerspace.com
To: linux-nfs@vger.kernel.org
Subject: [PATCH v5 2/6] NFS: Simplify nfs_readdir_xdr_to_array()
Date: Fri, 18 Feb 2022 16:24:20 -0500
Message-Id: <20220218212424.1840077-3-trond.myklebust@hammerspace.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220218212424.1840077-2-trond.myklebust@hammerspace.com>
References: <20220218212424.1840077-1-trond.myklebust@hammerspace.com>
 <20220218212424.1840077-2-trond.myklebust@hammerspace.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-nfs@vger.kernel.org

From: Trond Myklebust

Recent changes to readdir mean that we can cope with partially filled
page cache entries, so we no longer need to rely on looping in
nfs_readdir_xdr_to_array().

Signed-off-by: Trond Myklebust
---
 fs/nfs/dir.c | 29 +++++++++++------------------
 1 file changed, 11 insertions(+), 18 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index b0ee3a0e0f81..10421b5331ca 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -864,6 +864,7 @@ static int nfs_readdir_xdr_to_array(struct nfs_readdir_descriptor *desc,
 	size_t array_size;
 	struct inode *inode = file_inode(desc->file);
 	unsigned int dtsize = desc->dtsize;
+	unsigned int pglen;
 	int status = -ENOMEM;
 
 	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
@@ -881,28 +882,20 @@ static int nfs_readdir_xdr_to_array(struct nfs_readdir_descriptor *desc,
 	if (!pages)
 		goto out;
 
-	do {
-		unsigned int pglen;
-		status = nfs_readdir_xdr_filler(desc, verf_arg, entry->cookie,
-						pages, dtsize,
-						verf_res);
-		if (status < 0)
-			break;
-
-		pglen = status;
-		if (pglen == 0) {
-			nfs_readdir_page_set_eof(page);
-			break;
-		}
-
-		verf_arg = verf_res;
+	status = nfs_readdir_xdr_filler(desc, verf_arg, entry->cookie, pages,
+					dtsize, verf_res);
+	if (status < 0)
+		goto free_pages;
 
+	pglen = status;
+	if (pglen != 0)
 		status = nfs_readdir_page_filler(desc, entry, pages, pglen,
 						 arrays, narrays);
-		desc->buffer_fills++;
-	} while (!status && nfs_readdir_page_needs_filling(page) &&
-		page_mapping(page));
+	else
+		nfs_readdir_page_set_eof(page);
+	desc->buffer_fills++;
 
+free_pages:
 	nfs_readdir_free_pages(pages, array_size);
 out:
 	nfs_free_fattr(entry->fattr);

From patchwork Fri Feb 18 21:24:21 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Trond Myklebust
X-Patchwork-Id: 12751940
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 0B1BAC433FE
	for ; Fri, 18 Feb 2022 21:30:55 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S233540AbiBRVbK
(ORCPT ); Fri, 18 Feb 2022 16:31:10 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:47880 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239869AbiBRVbJ (ORCPT ); Fri, 18 Feb 2022 16:31:09 -0500 Received: from mail-qk1-x72e.google.com (mail-qk1-x72e.google.com [IPv6:2607:f8b0:4864:20::72e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 88DB4178958 for ; Fri, 18 Feb 2022 13:30:52 -0800 (PST) Received: by mail-qk1-x72e.google.com with SMTP id v5so6896105qkj.4 for ; Fri, 18 Feb 2022 13:30:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=8CqIC3rEvBzQgJVGzgW7Ii250S7Vlb4dINZzcr3ePAc=; b=ZzqWDQ3V4I4HtuKaC9tOqgxGAJ/WuY84bn4wUkUrmyGJZn605a5DShESKW4tplWKuk hbSq4zDNvQRsMWPINmXDv7j8OlhsFc21ELbFX0Zz1pnGm2lfNIccUvJcpJQ9wLgop9eE iocaBOSvmC7oSqJMGwbWBra7xFoOejNyN7sefaLgfEZPOFKCzHqxze8Dvg00UuEGxA0X p1C2Sq4M/7ggDb22aIa7Ed2unzq4SM7vYTWbXY3BkTKmVBGy0ZqzldwBKlGVNb7Z/biS /61rlk1Q2+xcgqbUlJs3YTmzAfYPs2DWwNQ7VBgrc6yRZH6srGYzlENDoTkpZeBNTQft VEIg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=8CqIC3rEvBzQgJVGzgW7Ii250S7Vlb4dINZzcr3ePAc=; b=0vKgS23B02/ExEsivyZnK5rfPIQRzxgvNevqjW5wJxVIsFg8surcpsQ3WI5bIZ8YlU TaW53B5j94RhJuyTZ7faRZBVu40ZLvIxqY2HnByGYrf0bbcXp/QZW+ZiCyTc/5hmAOac Ht+HpG7xRDB8FMjGY6BahGCg/T6aP4JmWy7F9H2rwqw+VZZuRgi7F623vlld9nP3rniT aL0ZNhlWL6asK4ZQmmbwbgASZnxN44f0LgTcC+9JctJZYUkPd3DMM9LrPCysqHLf7Qlu +UyxoMtL+OQw1Ye8a+4vj3V/tDg2g1sRA/NG93bu42qiibZylkw3V9k0RSpTNBdrDQwq 1H9w== X-Gm-Message-State: AOAM531T4yEAsrHdO+XffErR0lu2mx0x4CaShkwuy7FdLRRyEm279ig8 cV0n3mBO1Xpr4Xuv26seYg5JpK/91w== X-Google-Smtp-Source: ABdhPJzUki3plQCAxOHwVzTbRJO1O0GLzgqJtO6yvGCic+x8itx2ww80YefdglEUxQxZblM1z/8p/w== X-Received: by 
 2002:a37:747:0:b0:60d:d709:2e20 with SMTP id
 68-20020a370747000000b0060dd7092e20mr5709500qkh.579.1645219851012;
        Fri, 18 Feb 2022 13:30:51 -0800 (PST)
Received: from localhost.localdomain
 (c-68-56-145-227.hsd1.mi.comcast.net. [68.56.145.227])
        by smtp.gmail.com with ESMTPSA id
 w22sm26928656qtk.7.2022.02.18.13.30.49
        for
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Fri, 18 Feb 2022 13:30:50 -0800 (PST)
From: trondmy@gmail.com
X-Google-Original-From: trond.myklebust@hammerspace.com
To: linux-nfs@vger.kernel.org
Subject: [PATCH v5 3/6] NFS: Improve algorithm for falling back to uncached readdir
Date: Fri, 18 Feb 2022 16:24:21 -0500
Message-Id: <20220218212424.1840077-4-trond.myklebust@hammerspace.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220218212424.1840077-3-trond.myklebust@hammerspace.com>
References: <20220218212424.1840077-1-trond.myklebust@hammerspace.com>
 <20220218212424.1840077-2-trond.myklebust@hammerspace.com>
 <20220218212424.1840077-3-trond.myklebust@hammerspace.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-nfs@vger.kernel.org

From: Trond Myklebust

When reading a very large directory, we want to try to keep the page
cache up to date if doing so is inexpensive. Right now, we will try to
refill the page cache if it is non-empty, irrespective of whether or not
doing so is going to take a long time.

Replace that algorithm with something that looks at how many times we've
refilled the page cache without seeing a cache hit.

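As a reading aid, the miss-counting rule described above can be modelled in user space. This is only a sketch under stated assumptions: `struct readdir_state` and `simulate_refills()` are hypothetical stand-ins invented for illustration, not kernel code; only the limit of 5 and the shape of the `may_fill_pagecache()` test are taken from the diff that follows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as NFS_READDIR_PAGE_FILL_MISS_MAX in the patch. */
#define PAGE_FILL_MISS_MAX 5

/* Hypothetical, pared-down stand-in for struct nfs_readdir_descriptor. */
struct readdir_state {
	uint64_t dir_cookie;           /* cookie being searched for; 0 = start */
	unsigned int page_fill_misses; /* refills since the last cache hit */
};

/* Same test as nfs_readdir_may_fill_pagecache() in the patch. */
static bool may_fill_pagecache(const struct readdir_state *st)
{
	return st->dir_cookie == 0 ||
	       st->page_fill_misses < PAGE_FILL_MISS_MAX;
}

/*
 * Hypothetical driver: keep refilling while the rule allows it.
 * hit_on_fill is the 1-based refill on which the cookie is found
 * (0 means it is never found); max_fills bounds the simulation.
 * Returns the number of refills performed.
 */
static unsigned int simulate_refills(struct readdir_state *st,
				     unsigned int hit_on_fill,
				     unsigned int max_fills)
{
	unsigned int fills = 0;

	assert(st != NULL);
	while (fills < max_fills && may_fill_pagecache(st)) {
		fills++;
		st->page_fill_misses++;           /* counted after each refill */
		if (fills == hit_on_fill) {
			st->page_fill_misses = 0; /* a cache hit resets the count */
			break;
		}
	}
	return fills;
}
```

In this model a reader with a non-zero cookie that never hits the cache stops refilling after five attempts, which is the point at which the real code returns -EBADCOOKIE and falls back to uncached readdir; a reader starting from cookie 0 is never cut off by the counter.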
Signed-off-by: Trond Myklebust
---
 fs/nfs/dir.c           | 51 +++++++++++++++++++++---------------------
 include/linux/nfs_fs.h |  1 +
 2 files changed, 27 insertions(+), 25 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 10421b5331ca..43a559b34f4a 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -71,19 +71,16 @@ const struct address_space_operations nfs_dir_aops = {
 
 #define NFS_INIT_DTSIZE PAGE_SIZE
 
-static struct nfs_open_dir_context *alloc_nfs_open_dir_context(struct inode *dir)
+static struct nfs_open_dir_context *
+alloc_nfs_open_dir_context(struct inode *dir)
 {
 	struct nfs_inode *nfsi = NFS_I(dir);
 	struct nfs_open_dir_context *ctx;
-	ctx = kmalloc(sizeof(*ctx), GFP_KERNEL_ACCOUNT);
+
+	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL_ACCOUNT);
 	if (ctx != NULL) {
-		ctx->duped = 0;
 		ctx->attr_gencount = nfsi->attr_gencount;
-		ctx->dir_cookie = 0;
-		ctx->dup_cookie = 0;
-		ctx->page_index = 0;
 		ctx->dtsize = NFS_INIT_DTSIZE;
-		ctx->eof = false;
 		spin_lock(&dir->i_lock);
 		if (list_empty(&nfsi->open_files) &&
 		    (nfsi->cache_validity & NFS_INO_DATA_INVAL_DEFER))
@@ -170,6 +167,7 @@ struct nfs_readdir_descriptor {
 	unsigned long	timestamp;
 	unsigned long	gencount;
 	unsigned long	attr_gencount;
+	unsigned int	page_fill_misses;
 	unsigned int	cache_entry_index;
 	unsigned int	buffer_fills;
 	unsigned int	dtsize;
@@ -925,6 +923,18 @@ nfs_readdir_page_get_cached(struct nfs_readdir_descriptor *desc)
 					   desc->last_cookie);
 }
 
+#define NFS_READDIR_PAGE_FILL_MISS_MAX 5
+/*
+ * If we've tried to refill the page cache more than 5 times, and
+ * still not found our cookie, then we should stop and fall back
+ * to uncached readdir
+ */
+static bool nfs_readdir_may_fill_pagecache(struct nfs_readdir_descriptor *desc)
+{
+	return desc->dir_cookie == 0 ||
+	       desc->page_fill_misses < NFS_READDIR_PAGE_FILL_MISS_MAX;
+}
+
 /*
  * Returns 0 if desc->dir_cookie was found on page desc->page_index
  * and locks the page to prevent removal from the page cache.
@@ -940,6 +950,8 @@ static int find_and_lock_cache_page(struct nfs_readdir_descriptor *desc)
 	if (!desc->page)
 		return -ENOMEM;
 	if (nfs_readdir_page_needs_filling(desc->page)) {
+		if (!nfs_readdir_may_fill_pagecache(desc))
+			return -EBADCOOKIE;
 		desc->page_index_max = desc->page_index;
 		res = nfs_readdir_xdr_to_array(desc, nfsi->cookieverf, verf,
 					       &desc->page, 1);
@@ -958,36 +970,22 @@ static int find_and_lock_cache_page(struct nfs_readdir_descriptor *desc)
 		if (desc->page_index == 0)
 			memcpy(nfsi->cookieverf, verf,
 			       sizeof(nfsi->cookieverf));
+		desc->page_fill_misses++;
 	}
 	res = nfs_readdir_search_array(desc);
-	if (res == 0)
+	if (res == 0) {
+		desc->page_fill_misses = 0;
 		return 0;
+	}
 	nfs_readdir_page_unlock_and_put_cached(desc);
 	return res;
 }
 
-static bool nfs_readdir_dont_search_cache(struct nfs_readdir_descriptor *desc)
-{
-	struct address_space *mapping = desc->file->f_mapping;
-	struct inode *dir = file_inode(desc->file);
-	unsigned int dtsize = NFS_SERVER(dir)->dtsize;
-	loff_t size = i_size_read(dir);
-
-	/*
-	 * Default to uncached readdir if the page cache is empty, and
-	 * we're looking for a non-zero cookie in a large directory.
-	 */
-	return desc->dir_cookie != 0 && mapping->nrpages == 0 && size > dtsize;
-}
-
 /* Search for desc->dir_cookie from the beginning of the page cache */
 static int readdir_search_pagecache(struct nfs_readdir_descriptor *desc)
 {
 	int res;
 
-	if (nfs_readdir_dont_search_cache(desc))
-		return -EBADCOOKIE;
-
 	do {
 		if (desc->page_index == 0) {
 			desc->current_index = 0;
@@ -1149,6 +1147,7 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
 	page_index = dir_ctx->page_index;
 	desc->attr_gencount = dir_ctx->attr_gencount;
 	desc->eof = dir_ctx->eof;
+	desc->page_fill_misses = dir_ctx->page_fill_misses;
 	nfs_set_dtsize(desc, dir_ctx->dtsize);
 	memcpy(desc->verf, dir_ctx->verf, sizeof(desc->verf));
 	spin_unlock(&file->f_lock);
@@ -1204,6 +1203,7 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
 	dir_ctx->duped = desc->duped;
 	dir_ctx->attr_gencount = desc->attr_gencount;
 	dir_ctx->page_index = desc->page_index;
+	dir_ctx->page_fill_misses = desc->page_fill_misses;
 	dir_ctx->eof = desc->eof;
 	dir_ctx->dtsize = desc->dtsize;
 	memcpy(dir_ctx->verf, desc->verf, sizeof(dir_ctx->verf));
@@ -1247,6 +1247,7 @@ static loff_t nfs_llseek_dir(struct file *filp, loff_t offset, int whence)
 			dir_ctx->dir_cookie = offset;
 		else
 			dir_ctx->dir_cookie = 0;
+		dir_ctx->page_fill_misses = 0;
 		if (offset == 0)
 			memset(dir_ctx->verf, 0, sizeof(dir_ctx->verf));
 		dir_ctx->duped = 0;
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index d27f7e788624..3165927048e4 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -106,6 +106,7 @@ struct nfs_open_dir_context {
 	__u64 dir_cookie;
 	__u64 dup_cookie;
 	pgoff_t page_index;
+	unsigned int page_fill_misses;
 	unsigned int dtsize;
 	signed char duped;
 	bool eof;

From patchwork Fri Feb 18 21:24:22 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Trond Myklebust
X-Patchwork-Id: 12751941
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07)
on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9E5F3C433EF for ; Fri, 18 Feb 2022 21:30:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239891AbiBRVbM (ORCPT ); Fri, 18 Feb 2022 16:31:12 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:48038 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239869AbiBRVbL (ORCPT ); Fri, 18 Feb 2022 16:31:11 -0500 Received: from mail-qv1-xf36.google.com (mail-qv1-xf36.google.com [IPv6:2607:f8b0:4864:20::f36]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 22D74377C0 for ; Fri, 18 Feb 2022 13:30:54 -0800 (PST) Received: by mail-qv1-xf36.google.com with SMTP id p7so17245092qvk.11 for ; Fri, 18 Feb 2022 13:30:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=hexoUNGBl+D9siZmjR6DxlCYmvSY1q6HeckNUc7Ha8g=; b=apCMrApeTpAxNPI6wo9bwaQna0TsdwIb2lDRVqAuXKIuz925BffMy260yTTVircE3M DE267tAIODLNXCF/qhwCm8Vj+J1pRMmyRHThnKWJeDdWNVnr8QbKMBZwagz6LImQnL6h Txzv/jUOKKc7EoKmJMcPj974D6mTZRcHDYZaS1wZRulSlCQZElTzlt9KgPoqHFzzn3D4 9sKT7fZV/C0H6W18tAUnjxQrnolxakIyM3qQ6zw3NI5vI8gxMaI97R5bDEKSQLhrBJtF 3XSsFHhHmVTO8Gpr865byNcKTh2oXu4r/jCpptX0v61XsdJ3yQyHDxcFDkY4rK5U6gMh Mjlg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=hexoUNGBl+D9siZmjR6DxlCYmvSY1q6HeckNUc7Ha8g=; b=cgYm092kumNXZYc+goEwOMrAkZ05dQvQAG5yCI2kuXlF7v4O+aDDOrpgpGZKQaOeNd I+cFJDM5AuLQgtQL/R4QyLQXMEGYsg07QFBCypdoZdZmOeLK4X6m2K0NbC3+s6oAAOMy POHB4KC+2haWjRbsAsAlrAn1fStq+q1k3AhMFIptCvEvHnvGlvP3GeO5HCCk8lKJy2/f ZxGKXNNwf5Hy/hAcThZ4++gkdQp6gSL6kfxj367f+yYlGp60iQX/J4k+Nx3kOiLvKVIM 
         mGS23TK/MN/3xK8VSzl5+TJpeQeVvTwAfzTOXZcCf0hvzeXp/f9GRwXHFBu0l+aiCSeb
         VJ3A==
X-Gm-Message-State: AOAM531kgJDtLJVhMwjQN8h9CYywlxxGrxLDd53SrAWyva7+d4Be335H
	AKEhOemGWmTGiD/XvU4k1wyFw79NwA==
X-Google-Smtp-Source: ABdhPJzqkmfS7MNuyVc3NZFC0IAUp1DTdoY/uylBxXRzdYihuh03PEMHuj7GggqtbpnhtTw/t2KDeA==
X-Received: by 2002:ac8:7c4b:0:b0:2dc:a139:52c7 with SMTP id
 o11-20020ac87c4b000000b002dca13952c7mr8596811qtv.188.1645219852403;
        Fri, 18 Feb 2022 13:30:52 -0800 (PST)
Received: from localhost.localdomain
 (c-68-56-145-227.hsd1.mi.comcast.net. [68.56.145.227])
        by smtp.gmail.com with ESMTPSA id
 w22sm26928656qtk.7.2022.02.18.13.30.51
        for
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Fri, 18 Feb 2022 13:30:51 -0800 (PST)
From: trondmy@gmail.com
X-Google-Original-From: trond.myklebust@hammerspace.com
To: linux-nfs@vger.kernel.org
Subject: [PATCH v5 4/6] NFS: Improve heuristic for readdirplus
Date: Fri, 18 Feb 2022 16:24:22 -0500
Message-Id: <20220218212424.1840077-5-trond.myklebust@hammerspace.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220218212424.1840077-4-trond.myklebust@hammerspace.com>
References: <20220218212424.1840077-1-trond.myklebust@hammerspace.com>
 <20220218212424.1840077-2-trond.myklebust@hammerspace.com>
 <20220218212424.1840077-3-trond.myklebust@hammerspace.com>
 <20220218212424.1840077-4-trond.myklebust@hammerspace.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-nfs@vger.kernel.org

From: Trond Myklebust

The heuristic for readdirplus is designed to try to detect 'ls -l' and
similar patterns. It does so by looking for cache hit/miss patterns in
both the attribute cache and in the dcache of the files in a given
directory, and then sets a flag for the readdirplus code to interpret.

The problem with this approach is that a single attribute or dcache miss
can cause the NFS code to force a refresh of the attributes for the
entire set of files contained in the directory.

To be able to make a more nuanced decision, let's sample the number of
hits and misses in the set of open directory descriptors. That allows us
to set thresholds at which we start preferring READDIRPLUS over regular
READDIR, or at which we start to force a re-read of the remaining
readdir cache using READDIRPLUS.

Signed-off-by: Trond Myklebust
---
 fs/nfs/dir.c           | 81 ++++++++++++++++++++++++++----------------
 fs/nfs/inode.c         |  4 +--
 fs/nfs/internal.h      |  4 +--
 fs/nfs/nfstrace.h      |  1 -
 include/linux/nfs_fs.h |  5 +--
 5 files changed, 57 insertions(+), 38 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 43a559b34f4a..ba4e94b2a007 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -87,8 +87,7 @@ alloc_nfs_open_dir_context(struct inode *dir)
 			nfs_set_cache_invalid(dir,
 					      NFS_INO_INVALID_DATA |
 						      NFS_INO_REVAL_FORCED);
-		list_add(&ctx->list, &nfsi->open_files);
-		clear_bit(NFS_INO_FORCE_READDIR, &nfsi->flags);
+		list_add_tail_rcu(&ctx->list, &nfsi->open_files);
 		spin_unlock(&dir->i_lock);
 		return ctx;
 	}
@@ -98,9 +97,9 @@ alloc_nfs_open_dir_context(struct inode *dir)
 static void put_nfs_open_dir_context(struct inode *dir, struct nfs_open_dir_context *ctx)
 {
 	spin_lock(&dir->i_lock);
-	list_del(&ctx->list);
+	list_del_rcu(&ctx->list);
 	spin_unlock(&dir->i_lock);
-	kfree(ctx);
+	kfree_rcu(ctx, rcu_head);
 }
 
 /*
@@ -567,7 +566,6 @@ static int nfs_readdir_xdr_filler(struct nfs_readdir_descriptor *desc,
 		/* We requested READDIRPLUS, but the server doesn't grok it */
 		if (error == -ENOTSUPP && desc->plus) {
 			NFS_SERVER(inode)->caps &= ~NFS_CAP_READDIRPLUS;
-			clear_bit(NFS_INO_ADVISE_RDPLUS, &NFS_I(inode)->flags);
 			desc->plus = arg.plus = false;
 			goto again;
 		}
@@ -617,51 +615,61 @@ int nfs_same_file(struct dentry *dentry, struct nfs_entry *entry)
 	return 1;
 }
 
-static
-bool nfs_use_readdirplus(struct inode *dir, struct dir_context *ctx)
+#define NFS_READDIR_CACHE_USAGE_THRESHOLD (8UL)
+
+static bool nfs_use_readdirplus(struct inode *dir, struct dir_context *ctx,
+				unsigned int cache_hits,
+				unsigned int cache_misses)
 {
 	if (!nfs_server_capable(dir, NFS_CAP_READDIRPLUS))
 		return false;
-	if (test_and_clear_bit(NFS_INO_ADVISE_RDPLUS, &NFS_I(dir)->flags))
-		return true;
-	if (ctx->pos == 0)
+	if (ctx->pos == 0 ||
+	    cache_hits + cache_misses > NFS_READDIR_CACHE_USAGE_THRESHOLD)
 		return true;
 	return false;
 }
 
 /*
- * This function is called by the lookup and getattr code to request the
+ * This function is called by the getattr code to request the
  * use of readdirplus to accelerate any future lookups in the same
  * directory.
  */
-void nfs_advise_use_readdirplus(struct inode *dir)
+void nfs_readdir_record_entry_cache_hit(struct inode *dir)
 {
 	struct nfs_inode *nfsi = NFS_I(dir);
+	struct nfs_open_dir_context *ctx;
 
-	if (nfs_server_capable(dir, NFS_CAP_READDIRPLUS) &&
-	    !list_empty(&nfsi->open_files))
-		set_bit(NFS_INO_ADVISE_RDPLUS, &nfsi->flags);
+	if (nfs_server_capable(dir, NFS_CAP_READDIRPLUS)) {
+		rcu_read_lock();
+		list_for_each_entry_rcu (ctx, &nfsi->open_files, list)
+			atomic_inc(&ctx->cache_hits);
+		rcu_read_unlock();
+	}
 }
 
 /*
  * This function is mainly for use by nfs_getattr().
  *
  * If this is an 'ls -l', we want to force use of readdirplus.
- * Do this by checking if there is an active file descriptor
- * and calling nfs_advise_use_readdirplus, then forcing a
- * cache flush.
  */
-void nfs_force_use_readdirplus(struct inode *dir)
+void nfs_readdir_record_entry_cache_miss(struct inode *dir)
 {
 	struct nfs_inode *nfsi = NFS_I(dir);
+	struct nfs_open_dir_context *ctx;
 
-	if (nfs_server_capable(dir, NFS_CAP_READDIRPLUS) &&
-	    !list_empty(&nfsi->open_files)) {
-		set_bit(NFS_INO_ADVISE_RDPLUS, &nfsi->flags);
-		set_bit(NFS_INO_FORCE_READDIR, &nfsi->flags);
+	if (nfs_server_capable(dir, NFS_CAP_READDIRPLUS)) {
+		rcu_read_lock();
+		list_for_each_entry_rcu (ctx, &nfsi->open_files, list)
+			atomic_inc(&ctx->cache_misses);
+		rcu_read_unlock();
 	}
 }
 
+static void nfs_readdir_record_dcache_miss(struct inode *dir)
+{
+	nfs_readdir_record_entry_cache_miss(dir);
+}
+
 static
 void nfs_prime_dcache(struct dentry *parent, struct nfs_entry *entry,
 		unsigned long dir_verifier)
@@ -1101,6 +1109,18 @@ static int uncached_readdir(struct nfs_readdir_descriptor *desc)
 	return status;
 }
 
+#define NFS_READDIR_CACHE_MISS_THRESHOLD (16UL)
+
+static void nfs_readdir_handle_cache_misses(struct inode *inode,
+					    struct nfs_readdir_descriptor *desc,
+					    pgoff_t page_index,
+					    unsigned int cache_misses)
+{
+	if (desc->ctx->pos != 0 &&
+	    cache_misses > NFS_READDIR_CACHE_MISS_THRESHOLD)
+		invalidate_mapping_pages(inode->i_mapping, page_index + 1, -1);
+}
+
 /* The file offset position represents the dirent entry number.  A
    last cookie cache takes care of the common case of reading the
    whole directory.
@@ -1112,6 +1132,7 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
 	struct nfs_inode *nfsi = NFS_I(inode);
 	struct nfs_open_dir_context *dir_ctx = file->private_data;
 	struct nfs_readdir_descriptor *desc;
+	unsigned int cache_hits, cache_misses;
 	pgoff_t page_index;
 	int res;
 
@@ -1137,7 +1158,6 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
 		goto out;
 	desc->file = file;
 	desc->ctx = ctx;
-	desc->plus = nfs_use_readdirplus(inode, ctx);
 	desc->page_index_max = -1;
 
 	spin_lock(&file->f_lock);
@@ -1150,6 +1170,8 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
 	desc->page_fill_misses = dir_ctx->page_fill_misses;
 	nfs_set_dtsize(desc, dir_ctx->dtsize);
 	memcpy(desc->verf, dir_ctx->verf, sizeof(desc->verf));
+	cache_hits = atomic_xchg(&dir_ctx->cache_hits, 0);
+	cache_misses = atomic_xchg(&dir_ctx->cache_misses, 0);
 	spin_unlock(&file->f_lock);
 
 	if (desc->eof) {
@@ -1157,9 +1179,8 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
 		goto out_free;
 	}
 
-	if (test_and_clear_bit(NFS_INO_FORCE_READDIR, &nfsi->flags) &&
-	    list_is_singular(&nfsi->open_files))
-		invalidate_mapping_pages(inode->i_mapping, page_index + 1, -1);
+	desc->plus = nfs_use_readdirplus(inode, ctx, cache_hits, cache_misses);
+	nfs_readdir_handle_cache_misses(inode, desc, page_index, cache_misses);
 
 	do {
 		res = readdir_search_pagecache(desc);
@@ -1178,7 +1199,6 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
 			break;
 		}
 		if (res == -ETOOSMALL && desc->plus) {
-			clear_bit(NFS_INO_ADVISE_RDPLUS, &nfsi->flags);
 			nfs_zap_caches(inode);
 			desc->page_index = 0;
 			desc->plus = false;
@@ -1602,7 +1622,7 @@ nfs_lookup_revalidate_dentry(struct inode *dir, struct dentry *dentry,
 	nfs_set_verifier(dentry, dir_verifier);
 
 	/* set a readdirplus hint that we had a cache miss */
-	nfs_force_use_readdirplus(dir);
+	nfs_readdir_record_dcache_miss(dir);
 	ret = 1;
 out:
 	nfs_free_fattr(fattr);
@@ -1659,7 +1679,6 @@ nfs_do_lookup_revalidate(struct inode *dir, struct dentry *dentry,
 			nfs_mark_dir_for_revalidate(dir);
 			goto out_bad;
 		}
-		nfs_advise_use_readdirplus(dir);
 		goto out_valid;
 	}
 
@@ -1866,7 +1885,7 @@ struct dentry *nfs_lookup(struct inode *dir, struct dentry * dentry, unsigned in
 		goto out;
 
 	/* Notify readdir to use READDIRPLUS */
-	nfs_force_use_readdirplus(dir);
+	nfs_readdir_record_dcache_miss(dir);
 
 no_entry:
 	res = d_splice_alias(inode, dentry);
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index f9fc506ebb29..1bef81f5373a 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -789,7 +789,7 @@ static void nfs_readdirplus_parent_cache_miss(struct dentry *dentry)
 	if (!nfs_server_capable(d_inode(dentry), NFS_CAP_READDIRPLUS))
 		return;
 	parent = dget_parent(dentry);
-	nfs_force_use_readdirplus(d_inode(parent));
+	nfs_readdir_record_entry_cache_miss(d_inode(parent));
 	dput(parent);
 }
 
@@ -800,7 +800,7 @@ static void nfs_readdirplus_parent_cache_hit(struct dentry *dentry)
 	if (!nfs_server_capable(d_inode(dentry), NFS_CAP_READDIRPLUS))
 		return;
 	parent = dget_parent(dentry);
-	nfs_advise_use_readdirplus(d_inode(parent));
+	nfs_readdir_record_entry_cache_hit(d_inode(parent));
 	dput(parent);
 }
 
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index 2de7c56a1fbe..46dc97b65661 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -366,8 +366,8 @@ extern struct nfs_client *nfs_init_client(struct nfs_client *clp,
 			   const struct nfs_client_initdata *);
 
 /* dir.c */
-extern void nfs_advise_use_readdirplus(struct inode *dir);
-extern void nfs_force_use_readdirplus(struct inode *dir);
+extern void nfs_readdir_record_entry_cache_hit(struct inode *dir);
+extern void nfs_readdir_record_entry_cache_miss(struct inode *dir);
 extern unsigned long nfs_access_cache_count(struct shrinker *shrink,
 					    struct shrink_control *sc);
 extern unsigned long nfs_access_cache_scan(struct shrinker *shrink,
diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h
index 45a310b586ce..3672f6703ee7 100644
--- a/fs/nfs/nfstrace.h
+++ b/fs/nfs/nfstrace.h
@@ -36,7 +36,6 @@
#define nfs_show_nfsi_flags(v) \ __print_flags(v, "|", \ - { BIT(NFS_INO_ADVISE_RDPLUS), "ADVISE_RDPLUS" }, \ { BIT(NFS_INO_STALE), "STALE" }, \ { BIT(NFS_INO_ACL_LRU_SET), "ACL_LRU_SET" }, \ { BIT(NFS_INO_INVALIDATING), "INVALIDATING" }, \ diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h index 3165927048e4..0a5425a58bbd 100644 --- a/include/linux/nfs_fs.h +++ b/include/linux/nfs_fs.h @@ -101,6 +101,8 @@ struct nfs_open_context { struct nfs_open_dir_context { struct list_head list; + atomic_t cache_hits; + atomic_t cache_misses; unsigned long attr_gencount; __be32 verf[NFS_DIR_VERIFIER_SIZE]; __u64 dir_cookie; @@ -110,6 +112,7 @@ struct nfs_open_dir_context { unsigned int dtsize; signed char duped; bool eof; + struct rcu_head rcu_head; }; /* @@ -274,13 +277,11 @@ struct nfs4_copy_state { /* * Bit offsets in flags field */ -#define NFS_INO_ADVISE_RDPLUS (0) /* advise readdirplus */ #define NFS_INO_STALE (1) /* possible stale inode */ #define NFS_INO_ACL_LRU_SET (2) /* Inode is on the LRU list */ #define NFS_INO_INVALIDATING (3) /* inode is being invalidated */ #define NFS_INO_PRESERVE_UNLINKED (4) /* preserve file if removed while open */ #define NFS_INO_FSCACHE (5) /* inode can be cached by FS-Cache */ -#define NFS_INO_FORCE_READDIR (7) /* force readdirplus */ #define NFS_INO_LAYOUTCOMMIT (9) /* layoutcommit required */ #define NFS_INO_LAYOUTCOMMITTING (10) /* layoutcommit inflight */ #define NFS_INO_LAYOUTSTATS (11) /* layoutstats inflight */ From patchwork Fri Feb 18 21:24:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Trond Myklebust X-Patchwork-Id: 12751942 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5D860C433FE for ; Fri, 18 Feb 2022 21:30:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) 
by vger.kernel.org via listexpand id S239869AbiBRVbN (ORCPT ); Fri, 18 Feb 2022 16:31:13 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:48100 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239885AbiBRVbM (ORCPT ); Fri, 18 Feb 2022 16:31:12 -0500 Received: from mail-qv1-xf2f.google.com (mail-qv1-xf2f.google.com [IPv6:2607:f8b0:4864:20::f2f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 74E7D178958 for ; Fri, 18 Feb 2022 13:30:55 -0800 (PST) Received: by mail-qv1-xf2f.google.com with SMTP id f19so17278068qvb.6 for ; Fri, 18 Feb 2022 13:30:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=bWajgR1t6VC8HzDua3e7SEt0eKJ8om5RB415PTgRaaQ=; b=GG03ikNj+uRIKmdlBfUbJ2CCwJ2vMwjakXj/aqu9cvguC7BThzMR1pdf+7KUWQIpDq HMZEt0VUIg2YQKFXJJABfvWan9FJ7sPkXAhWQKhn8pI1n0AxiOL3s+9u9SUgi9L1oRSO UGPEKDux4eTc7u0sonJa8pON0PSsYujNyy7qx5xSW69KDyz+5zmvCL/m5Tak/CJdqJ5V WfrsKFUp2vK9JLsa++LH6YcUMPAPxKQb5NalREnhZDEBozoTxnT4QGwNC3rAb7vafSx9 KW9ZL3knavDQDV6vWb78lnLh7E4+u/sdu5e4zEBGvAiqeOgCjEzzamaWlHMxpwk7R1Y+ EuTA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=bWajgR1t6VC8HzDua3e7SEt0eKJ8om5RB415PTgRaaQ=; b=HA9RdkZnDYZtg97HMGP9JQtRnhfV1Nt/kaDp8LlgonYqCPuvIqvvOwxZOh766XjHDp uKL5sTBKldhdOKDUu6v08hpqIsqGoMebPzU/bDZ6cGr/Fnl44SmLlO/znXojtm2zFV8v QRh8UdQqN7oQk2la1WnBKD7BHORbKmfh5ofm4JZRxgm14Jj9sD3EvRmjLTfJUeNQlkWp 5EiGTByZGy1E2pY6aW6vgHuvSyhIDOFgtN0i23EpcE0J1rILmU1IEyh6hcjcU0OrYh1H CHJ0An3GQjGK6tgdys9hfM4t+v0a/sZ4AGijtHq8eTITWKrROps7EhdbMnboIvboCBx8 aHvQ== X-Gm-Message-State: AOAM530E10xKbGf83syaUNVUjjUnGvynnKzXhzlzXe98g8990tafvvYU ooMEphydDBpbpMJoWs4eIbNXwVhqVw== X-Google-Smtp-Source: 
ABdhPJwH+aMomGHgKEIHtTRwD2pxn1sojMlNBOATuTrttlrBLGP7PmWK718ekA+GONT+eP6ggy6Y8A== X-Received: by 2002:a05:6214:5091:b0:42c:aa27:a206 with SMTP id kk17-20020a056214509100b0042caa27a206mr7396255qvb.18.1645219853937; Fri, 18 Feb 2022 13:30:53 -0800 (PST) Received: from localhost.localdomain (c-68-56-145-227.hsd1.mi.comcast.net. [68.56.145.227]) by smtp.gmail.com with ESMTPSA id w22sm26928656qtk.7.2022.02.18.13.30.52 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Feb 2022 13:30:52 -0800 (PST) From: trondmy@gmail.com X-Google-Original-From: trond.myklebust@hammerspace.com To: linux-nfs@vger.kernel.org Subject: [PATCH v5 5/6] NFS: Don't ask for readdirplus unless it can help nfs_getattr() Date: Fri, 18 Feb 2022 16:24:23 -0500 Message-Id: <20220218212424.1840077-6-trond.myklebust@hammerspace.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220218212424.1840077-5-trond.myklebust@hammerspace.com> References: <20220218212424.1840077-1-trond.myklebust@hammerspace.com> <20220218212424.1840077-2-trond.myklebust@hammerspace.com> <20220218212424.1840077-3-trond.myklebust@hammerspace.com> <20220218212424.1840077-4-trond.myklebust@hammerspace.com> <20220218212424.1840077-5-trond.myklebust@hammerspace.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org From: Trond Myklebust If attribute caching is turned off, then use of readdirplus is not going to help stat() performance. Readdirplus also doesn't help if a file is being written to, since we will have to flush those writes in order to sync the mtime/ctime. 
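For readers who want to see the gating condition in isolation, here is a minimal userspace sketch of the check this patch adds to nfs_getattr(). It is not kernel code. The struct model_inode type and its fields are hypothetical stand-ins for nfs_server_capable(), nfs_have_writebacks() and NFS_MAXATTRTIMEO(), and the timeout is modelled in seconds rather than jiffies. The sketch also assumes that a mount with attribute caching turned off shows up as a maximum attribute timeout of 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the per-inode state the patch consults. */
struct model_inode {
	bool server_has_readdirplus;   /* stands in for nfs_server_capable(inode, NFS_CAP_READDIRPLUS) */
	bool has_writebacks;           /* stands in for nfs_have_writebacks(inode) */
	unsigned int max_attr_timeo_s; /* stands in for NFS_MAXATTRTIMEO(inode); seconds here, jiffies in the kernel */
};

/*
 * Same shape as nfs_getattr_readdirplus_enable() in the diff: readdirplus
 * hints are only worth recording if the server supports readdirplus, the
 * file has no pending writes, and attributes are cached for more than
 * 5 seconds.
 */
bool model_getattr_readdirplus_enable(const struct model_inode *inode)
{
	return inode->server_has_readdirplus &&
	       !inode->has_writebacks &&
	       inode->max_attr_timeo_s > 5;
}

/* Example cases; each assert states the outcome the sketch produces. */
void model_selfcheck(void)
{
	struct model_inode cached = { .server_has_readdirplus = true,
				      .has_writebacks = false,
				      .max_attr_timeo_s = 60 };
	struct model_inode uncached = cached;
	struct model_inode dirty = cached;

	uncached.max_attr_timeo_s = 0; /* assumed model of attribute caching turned off */
	dirty.has_writebacks = true;   /* file is being written to */

	assert(model_getattr_readdirplus_enable(&cached));
	assert(!model_getattr_readdirplus_enable(&uncached));
	assert(!model_getattr_readdirplus_enable(&dirty));
}
```

In the patch itself the result is computed once per nfs_getattr() call and then guards every call to nfs_readdirplus_parent_cache_hit() and nfs_readdirplus_parent_cache_miss(), so a disabled inode records neither hits nor misses against its parent directory.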
Signed-off-by: Trond Myklebust --- fs/nfs/inode.c | 33 +++++++++++++++++---------------- 1 file changed, 17 insertions(+), 16 deletions(-) diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index 1bef81f5373a..9d2af9887715 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -782,24 +782,26 @@ void nfs_setattr_update_inode(struct inode *inode, struct iattr *attr, } EXPORT_SYMBOL_GPL(nfs_setattr_update_inode); -static void nfs_readdirplus_parent_cache_miss(struct dentry *dentry) +/* + * Don't request help from readdirplus if the file is being written to, + * or if attribute caching is turned off + */ +static bool nfs_getattr_readdirplus_enable(const struct inode *inode) { - struct dentry *parent; + return nfs_server_capable(inode, NFS_CAP_READDIRPLUS) && + !nfs_have_writebacks(inode) && NFS_MAXATTRTIMEO(inode) > 5 * HZ; +} - if (!nfs_server_capable(d_inode(dentry), NFS_CAP_READDIRPLUS)) - return; - parent = dget_parent(dentry); +static void nfs_readdirplus_parent_cache_miss(struct dentry *dentry) +{ + struct dentry *parent = dget_parent(dentry); nfs_readdir_record_entry_cache_miss(d_inode(parent)); dput(parent); } static void nfs_readdirplus_parent_cache_hit(struct dentry *dentry) { - struct dentry *parent; - - if (!nfs_server_capable(d_inode(dentry), NFS_CAP_READDIRPLUS)) - return; - parent = dget_parent(dentry); + struct dentry *parent = dget_parent(dentry); nfs_readdir_record_entry_cache_hit(d_inode(parent)); dput(parent); } @@ -837,6 +839,7 @@ int nfs_getattr(struct user_namespace *mnt_userns, const struct path *path, int err = 0; bool force_sync = query_flags & AT_STATX_FORCE_SYNC; bool do_update = false; + bool readdirplus_enabled = nfs_getattr_readdirplus_enable(inode); trace_nfs_getattr_enter(inode); @@ -845,7 +848,8 @@ int nfs_getattr(struct user_namespace *mnt_userns, const struct path *path, STATX_INO | STATX_SIZE | STATX_BLOCKS; if ((query_flags & AT_STATX_DONT_SYNC) && !force_sync) { - nfs_readdirplus_parent_cache_hit(path->dentry); + if (readdirplus_enabled) 
+ nfs_readdirplus_parent_cache_hit(path->dentry); goto out_no_revalidate; } @@ -895,15 +899,12 @@ int nfs_getattr(struct user_namespace *mnt_userns, const struct path *path, do_update |= cache_validity & NFS_INO_INVALID_BLOCKS; if (do_update) { - /* Update the attribute cache */ - if (!(server->flags & NFS_MOUNT_NOAC)) + if (readdirplus_enabled) nfs_readdirplus_parent_cache_miss(path->dentry); - else - nfs_readdirplus_parent_cache_hit(path->dentry); err = __nfs_revalidate_inode(server, inode); if (err) goto out; - } else + } else if (readdirplus_enabled) nfs_readdirplus_parent_cache_hit(path->dentry); out_no_revalidate: /* Only return attributes that were revalidated. */ From patchwork Fri Feb 18 21:24:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Trond Myklebust X-Patchwork-Id: 12751943 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9D80DC433EF for ; Fri, 18 Feb 2022 21:30:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239904AbiBRVbP (ORCPT ); Fri, 18 Feb 2022 16:31:15 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:48248 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239885AbiBRVbO (ORCPT ); Fri, 18 Feb 2022 16:31:14 -0500 Received: from mail-qv1-xf36.google.com (mail-qv1-xf36.google.com [IPv6:2607:f8b0:4864:20::f36]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BCBB9377C0 for ; Fri, 18 Feb 2022 13:30:56 -0800 (PST) Received: by mail-qv1-xf36.google.com with SMTP id c14so17244645qvl.12 for ; Fri, 18 Feb 2022 13:30:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:subject:date:message-id:in-reply-to:references:mime-version 
:content-transfer-encoding; bh=+EQAu6/JZZI4LlgaIWtI3TqryfLPflDV7aWi8ajn/D4=; b=LdXNJsWUDbL1H1Hz+FSNTmyMPU859xgGbBGks2/9WW5UVI/IVwF7/Q5LYvuziUmJ3p xUrDHK3J/i0pz1z/6kuGKUpy17IqiiZ0coUCZo0gWmu4T7uWuGIVZDJIiFVtswXMcDet oh11Te0m985E0F9wO1EX1+67E+wyloLbVDaZCw7ZxdsE6tcaVNBfs5r5I/dDjrVa6hfF YTa298ExqH42WhMjmeKMSQurl+YwC7morjv6EAGMlIT/wxngGxvj/jqGntc/+GPK2eU4 7E1rew1O+NwtADw6pHxBfFWs94Os0EBkTp2dlAFdRAC4SwpScNYwlOKMbHxrLrP+o5uA LOdQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=+EQAu6/JZZI4LlgaIWtI3TqryfLPflDV7aWi8ajn/D4=; b=Mskwt2GYWNLVXpQJdEatC0QTAIOhtWTvG2siShQrWDTBjzvWGkxnne/fnWCahdMbqI EOPz+1TzQomqDOL0LKRHs4v5xh77HqziEjdblTSAntQz7wZrCvOE01/suP+0X3LnD+vp IfAtqufj/PlPIkbADOsP6rNT8qc6T/xL72Q9pEQF4Vcnb1yoWrfnRgrBEJBbYs0JDeyv RWjg+49MFhWgOFtxKO3MXa4hKIueD8RYirEvzz/rgats0j8jUcI6bvhZFDvLn09Qji3p I7m/e9VaS3bQ7iV9Pvfke+xcfMCIeN9K+6GeJzjDABlAEJ8K+z7KSiFrhnzFrt1rHAa1 q2WA== X-Gm-Message-State: AOAM530+lYNqnIf6CIECuzqnJ6dqSMrTq2EmRZHpuBtcp32Rtz9KAco6 Ql5dJrbRtG1dFputDNaW+UcXbZCbyw== X-Google-Smtp-Source: ABdhPJxaGZT5obRGOYj6xU3dJcPbdEJdCVfikzzEPINPso6CFyBq+SMUNY/xQOmaSfMged8vgUUHmA== X-Received: by 2002:ad4:5943:0:b0:425:76d8:90cc with SMTP id eo3-20020ad45943000000b0042576d890ccmr7436605qvb.105.1645219855331; Fri, 18 Feb 2022 13:30:55 -0800 (PST) Received: from localhost.localdomain (c-68-56-145-227.hsd1.mi.comcast.net. 
[68.56.145.227]) by smtp.gmail.com with ESMTPSA id w22sm26928656qtk.7.2022.02.18.13.30.54 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Feb 2022 13:30:54 -0800 (PST) From: trondmy@gmail.com X-Google-Original-From: trond.myklebust@hammerspace.com To: linux-nfs@vger.kernel.org Subject: [PATCH v5 6/6] NFSv4: Ask for a full XDR buffer of readdir goodness Date: Fri, 18 Feb 2022 16:24:24 -0500 Message-Id: <20220218212424.1840077-7-trond.myklebust@hammerspace.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220218212424.1840077-6-trond.myklebust@hammerspace.com> References: <20220218212424.1840077-1-trond.myklebust@hammerspace.com> <20220218212424.1840077-2-trond.myklebust@hammerspace.com> <20220218212424.1840077-3-trond.myklebust@hammerspace.com> <20220218212424.1840077-4-trond.myklebust@hammerspace.com> <20220218212424.1840077-5-trond.myklebust@hammerspace.com> <20220218212424.1840077-6-trond.myklebust@hammerspace.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org From: Trond Myklebust Instead of pretending that we know the ratio of directory info vs readdirplus attribute info, just set the 'dircount' field to the same value as the 'maxcount' field. Signed-off-by: Trond Myklebust --- fs/nfs/nfs3xdr.c | 7 ++++--- fs/nfs/nfs4xdr.c | 6 +++--- 2 files changed, 7 insertions(+), 6 deletions(-) diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c index 9274c9c5efea..feb6e2e36138 100644 --- a/fs/nfs/nfs3xdr.c +++ b/fs/nfs/nfs3xdr.c @@ -1261,6 +1261,8 @@ static void nfs3_xdr_enc_readdir3args(struct rpc_rqst *req, static void encode_readdirplus3args(struct xdr_stream *xdr, const struct nfs3_readdirargs *args) { + uint32_t dircount = args->count; + uint32_t maxcount = args->count; __be32 *p; encode_nfs_fh3(xdr, args->fh); @@ -1273,9 +1275,8 @@ static void encode_readdirplus3args(struct xdr_stream *xdr, * readdirplus: need dircount + buffer size. 
* We just make sure we make dircount big enough */ - *p++ = cpu_to_be32(args->count >> 3); - - *p = cpu_to_be32(args->count); + *p++ = cpu_to_be32(dircount); + *p = cpu_to_be32(maxcount); } static void nfs3_xdr_enc_readdirplus3args(struct rpc_rqst *req, diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c index 8e70b92df4cc..b7780b97dc4d 100644 --- a/fs/nfs/nfs4xdr.c +++ b/fs/nfs/nfs4xdr.c @@ -1605,7 +1605,8 @@ static void encode_readdir(struct xdr_stream *xdr, const struct nfs4_readdir_arg FATTR4_WORD0_RDATTR_ERROR, FATTR4_WORD1_MOUNTED_ON_FILEID, }; - uint32_t dircount = readdir->count >> 1; + uint32_t dircount = readdir->count; + uint32_t maxcount = readdir->count; __be32 *p, verf[2]; uint32_t attrlen = 0; unsigned int i; @@ -1618,7 +1619,6 @@ static void encode_readdir(struct xdr_stream *xdr, const struct nfs4_readdir_arg FATTR4_WORD1_SPACE_USED|FATTR4_WORD1_TIME_ACCESS| FATTR4_WORD1_TIME_METADATA|FATTR4_WORD1_TIME_MODIFY; attrs[2] |= FATTR4_WORD2_SECURITY_LABEL; - dircount >>= 1; } /* Use mounted_on_fileid only if the server supports it */ if (!(readdir->bitmask[1] & FATTR4_WORD1_MOUNTED_ON_FILEID)) @@ -1634,7 +1634,7 @@ static void encode_readdir(struct xdr_stream *xdr, const struct nfs4_readdir_arg encode_nfs4_verifier(xdr, &readdir->verifier); p = reserve_space(xdr, 12 + (attrlen << 2)); *p++ = cpu_to_be32(dircount); - *p++ = cpu_to_be32(readdir->count); + *p++ = cpu_to_be32(maxcount); *p++ = cpu_to_be32(attrlen); for (i = 0; i < attrlen; i++) *p++ = cpu_to_be32(attrs[i]);
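To make the effect of the change concrete, the following userspace sketch compares the dircount/maxcount pair the encoders sent before and after this patch. It is an illustration, not kernel code. struct model_counts and the model_* helpers are hypothetical names, and the byte-order conversion done by cpu_to_be32() is left out. The condition guarding the second halving in the old NFSv4 encoder is not visible in the hunk above, so the sketch assumes it is the readdirplus case and passes that in as a plain flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical holder for the two size hints sent in a readdir request. */
struct model_counts {
	uint32_t dircount; /* budget for directory entry data */
	uint32_t maxcount; /* budget for the whole reply */
};

/* Old NFSv3 READDIRPLUS encoding: dircount was args->count >> 3. */
struct model_counts model_v3_old(uint32_t count)
{
	struct model_counts c = { count >> 3, count };
	return c;
}

/*
 * Old NFSv4 READDIR encoding: dircount started at readdir->count >> 1 and
 * was halved again in the branch that requests the extra attributes
 * (assumed here to be the readdirplus case).
 */
struct model_counts model_v4_old(uint32_t count, bool plus)
{
	struct model_counts c = { count >> 1, count };

	if (plus)
		c.dircount >>= 1;
	return c;
}

/* After the patch, both encoders send dircount == maxcount == count. */
struct model_counts model_new(uint32_t count)
{
	struct model_counts c = { count, count };
	return c;
}

/* Worked numbers for a 32 KiB buffer. */
void model_counts_selfcheck(void)
{
	const uint32_t count = 32768;

	assert(model_v3_old(count).dircount == 4096);
	assert(model_v4_old(count, false).dircount == 16384);
	assert(model_v4_old(count, true).dircount == 8192);
	assert(model_new(count).dircount == 32768);
	assert(model_new(count).maxcount == 32768);
}
```

With the old ratios, a server that honours dircount could stop filling the reply once the entry data reached one eighth (NFSv3) or one quarter (NFSv4 readdirplus) of the buffer, even when the attributes were small. Sending the same value in both fields leaves maxcount as the only limit.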