From patchwork Mon Sep 2 17:02:55 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Trond Myklebust X-Patchwork-Id: 11126981 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 578DE15E9 for ; Mon, 2 Sep 2019 17:06:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 2F6CC217F4 for ; Mon, 2 Sep 2019 17:06:28 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Nydhj0Tr" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726464AbfIBRG1 (ORCPT ); Mon, 2 Sep 2019 13:06:27 -0400 Received: from mail-io1-f67.google.com ([209.85.166.67]:33599 "EHLO mail-io1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726450AbfIBRG1 (ORCPT ); Mon, 2 Sep 2019 13:06:27 -0400 Received: by mail-io1-f67.google.com with SMTP id z3so30391007iog.0 for ; Mon, 02 Sep 2019 10:06:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=GA75ufOJonbpqC5MlA/RQIRk67dy/HSW1IeiLuiWlEU=; b=Nydhj0TrOlJwcX3rxlcNq3daKdSdzfoBLzD5Ggy0apwcRIbWv0q9oLMIVuGeMAsSMG Rh+unLBc4cf9hlZ/q2wuEWJzhce9BN849EFxFFXPi8nP8+DMlWhHGwRzoMSava5fuUtX GJ8pd6ohZScX04GLk96hhSTcPhK4jf83sUn/xfRcsAa3s5ZjtvTQb/chZ4/OlHPMN7Yy Ws+YlOB+X8JEp/QxfbqK4oEDVdVpljIc02bNMXiHvIJmJe7qEAT7eQbWzPsin6CA1KjQ yzbOGUH8rpX2kKUJR30nar87XABZ+bj3nBeKkFzXYCp9XBA0Lf8zVrWQbU1j0ci23JFA bPiw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=GA75ufOJonbpqC5MlA/RQIRk67dy/HSW1IeiLuiWlEU=; b=UwmyYZdEF08KYItUbg998KwIHAejc89ERD5+5w3/Qf0s6PDjLt6+CjzWh3beBysYAa hm5JVXmu5XN5YOUWjH+IRm/ZTEaq35/1CGgfk85DIQcj956hfWsYv/8zQzHjVAkZdDiC itYDyK0YALdBa9xlFnYLe6wQ2P3jKw6vqQ83NilMacfmxC543kn5EIbUbxrFZ6BdUFsI 6iVI7a5sdjqCVJTHdeENUwr8o5IxcFkv24G8kj+v7qzFXkvsSEaSoXXCUqNwkRoIoMXy S3ZxxniQtMcg+lcyLFgpplvQ6tQpZRpU26vfLVBypmkbrb2S0S8ISxxoxC8z0+M/Wy+l 5/hg== X-Gm-Message-State: APjAAAVa2+85pdy3fPfnaLjMzW6dlhiitw/wZpIBUhAzTzj2/IUMzeRC yEiauFzDGKwCGClJm7tBvk7WUlgv+Q== X-Google-Smtp-Source: APXvYqwhgOrXEKYr41FsXqz35/0cwABX4ZwOcJTzIdK5Z7siQBdvlrZ8p7txHWaCgxBRsL/Er5MMPg== X-Received: by 2002:a02:ad0e:: with SMTP id s14mr12482179jan.97.1567443986264; Mon, 02 Sep 2019 10:06:26 -0700 (PDT) Received: from localhost.localdomain (c-68-40-189-247.hsd1.mi.comcast.net. [68.40.189.247]) by smtp.gmail.com with ESMTPSA id o3sm7655322iob.64.2019.09.02.10.06.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Sep 2019 10:06:25 -0700 (PDT) From: Trond Myklebust X-Google-Original-From: Trond Myklebust To: "J.Bruce Fields" Cc: linux-nfs@vger.kernel.org Subject: [PATCH v2 1/4] nfsd: nfsd_file cache entries should be per net namespace Date: Mon, 2 Sep 2019 13:02:55 -0400 Message-Id: <20190902170258.92522-2-trond.myklebust@hammerspace.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190902170258.92522-1-trond.myklebust@hammerspace.com> References: <20190902170258.92522-1-trond.myklebust@hammerspace.com> MIME-Version: 1.0 Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org Ensure that we can safely clear out the file cache entries when the nfs server is shut down on a container. Otherwise, the file cache may end up pinning the mounts. Signed-off-by: Trond Myklebust --- fs/nfsd/export.c | 2 +- fs/nfsd/filecache.c | 33 +++++++++++++++++++++------------ fs/nfsd/filecache.h | 3 ++- fs/nfsd/nfssvc.c | 1 + 4 files changed, 25 insertions(+), 14 deletions(-) diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c index 052fac64b578..15422c951fd1 100644 --- a/fs/nfsd/export.c +++ b/fs/nfsd/export.c @@ -240,7 +240,7 @@ static void expkey_flush(void) * destroyed while we're in the middle of flushing. */ mutex_lock(&nfsd_mutex); - nfsd_file_cache_purge(); + nfsd_file_cache_purge(current->nsproxy->net_ns); mutex_unlock(&nfsd_mutex); } diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c index a2fcb251d2f6..d229fd3c825d 100644 --- a/fs/nfsd/filecache.c +++ b/fs/nfsd/filecache.c @@ -17,6 +17,7 @@ #include "vfs.h" #include "nfsd.h" #include "nfsfh.h" +#include "netns.h" #include "filecache.h" #include "trace.h" @@ -168,7 +169,8 @@ nfsd_file_mark_find_or_create(struct nfsd_file *nf) } static struct nfsd_file * -nfsd_file_alloc(struct inode *inode, unsigned int may, unsigned int hashval) +nfsd_file_alloc(struct inode *inode, unsigned int may, unsigned int hashval, + struct net *net) { struct nfsd_file *nf; @@ -178,6 +180,7 @@ nfsd_file_alloc(struct inode *inode, unsigned int may, unsigned int hashval) INIT_LIST_HEAD(&nf->nf_lru); nf->nf_file = NULL; nf->nf_cred = get_current_cred(); + nf->nf_net = net; nf->nf_flags = 0; nf->nf_inode = inode; nf->nf_hashval = hashval; @@ -608,10 +611,11 @@ nfsd_file_cache_init(void) * Note this can deadlock with nfsd_file_lru_cb. */ void -nfsd_file_cache_purge(void) +nfsd_file_cache_purge(struct net *net) { unsigned int i; struct nfsd_file *nf; + struct hlist_node *next; LIST_HEAD(dispose); bool del; @@ -619,10 +623,12 @@ nfsd_file_cache_purge(void) return; for (i = 0; i < NFSD_FILE_HASH_SIZE; i++) { - spin_lock(&nfsd_file_hashtbl[i].nfb_lock); - while(!hlist_empty(&nfsd_file_hashtbl[i].nfb_head)) { - nf = hlist_entry(nfsd_file_hashtbl[i].nfb_head.first, - struct nfsd_file, nf_node); + struct nfsd_fcache_bucket *nfb = &nfsd_file_hashtbl[i]; + + spin_lock(&nfb->nfb_lock); + hlist_for_each_entry_safe(nf, next, &nfb->nfb_head, nf_node) { + if (net && nf->nf_net != net) + continue; del = nfsd_file_unhash_and_release_locked(nf, &dispose); /* @@ -631,7 +637,7 @@ nfsd_file_cache_purge(void) */ WARN_ON_ONCE(!del); } - spin_unlock(&nfsd_file_hashtbl[i].nfb_lock); + spin_unlock(&nfb->nfb_lock); nfsd_file_dispose_list(&dispose); } } @@ -650,7 +656,7 @@ nfsd_file_cache_shutdown(void) * calling nfsd_file_cache_purge */ cancel_delayed_work_sync(&nfsd_filecache_laundrette); - nfsd_file_cache_purge(); + nfsd_file_cache_purge(NULL); list_lru_destroy(&nfsd_file_lru); rcu_barrier(); fsnotify_put_group(nfsd_file_fsnotify_group); @@ -686,7 +692,7 @@ nfsd_match_cred(const struct cred *c1, const struct cred *c2) static struct nfsd_file * nfsd_file_find_locked(struct inode *inode, unsigned int may_flags, - unsigned int hashval) + unsigned int hashval, struct net *net) { struct nfsd_file *nf; unsigned char need = may_flags & NFSD_FILE_MAY_MASK; @@ -697,6 +703,8 @@ nfsd_file_find_locked(struct inode *inode, unsigned int may_flags, continue; if (nf->nf_inode != inode) continue; + if (nf->nf_net != net) + continue; if (!nfsd_match_cred(nf->nf_cred, current_cred())) continue; if (nfsd_file_get(nf) != NULL) @@ -739,6 +747,7 @@ nfsd_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, unsigned int may_flags, struct nfsd_file **pnf) { __be32 status; + struct net *net = SVC_NET(rqstp); struct nfsd_file *nf, *new; struct inode *inode; unsigned int hashval; @@ -753,12 +762,12 @@ nfsd_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, hashval = (unsigned int)hash_long(inode->i_ino, NFSD_FILE_HASH_BITS); retry: rcu_read_lock(); - nf = nfsd_file_find_locked(inode, may_flags, hashval); + nf = nfsd_file_find_locked(inode, may_flags, hashval, net); rcu_read_unlock(); if (nf) goto wait_for_construction; - new = nfsd_file_alloc(inode, may_flags, hashval); + new = nfsd_file_alloc(inode, may_flags, hashval, net); if (!new) { trace_nfsd_file_acquire(rqstp, hashval, inode, may_flags, NULL, nfserr_jukebox); @@ -766,7 +775,7 @@ nfsd_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, } spin_lock(&nfsd_file_hashtbl[hashval].nfb_lock); - nf = nfsd_file_find_locked(inode, may_flags, hashval); + nf = nfsd_file_find_locked(inode, may_flags, hashval, net); if (nf == NULL) goto open_file; spin_unlock(&nfsd_file_hashtbl[hashval].nfb_lock); diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h index 0c0c67166b87..851d9abf54c2 100644 --- a/fs/nfsd/filecache.h +++ b/fs/nfsd/filecache.h @@ -34,6 +34,7 @@ struct nfsd_file { struct rcu_head nf_rcu; struct file *nf_file; const struct cred *nf_cred; + struct net *nf_net; #define NFSD_FILE_HASHED (0) #define NFSD_FILE_PENDING (1) #define NFSD_FILE_BREAK_READ (2) @@ -48,7 +49,7 @@ struct nfsd_file { }; int nfsd_file_cache_init(void); -void nfsd_file_cache_purge(void); +void nfsd_file_cache_purge(struct net *); void nfsd_file_cache_shutdown(void); void nfsd_file_put(struct nfsd_file *nf); struct nfsd_file *nfsd_file_get(struct nfsd_file *nf); diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c index d02712ca2685..b944553c6927 100644 --- a/fs/nfsd/nfssvc.c +++ b/fs/nfsd/nfssvc.c @@ -387,6 +387,7 @@ static void nfsd_shutdown_net(struct net *net) { struct nfsd_net *nn = net_generic(net, nfsd_net_id); + nfsd_file_cache_purge(net); nfs4_state_shutdown_net(net); if (nn->lockd_up) { lockd_down(net); From patchwork Mon Sep 2 17:02:56 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Trond Myklebust X-Patchwork-Id: 11126983 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 16778112C for ; Mon, 2 Sep 2019 17:06:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id E1BF7217F4 for ; Mon, 2 Sep 2019 17:06:28 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Q68KP2C/" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726450AbfIBRG2 (ORCPT ); Mon, 2 Sep 2019 13:06:28 -0400 Received: from mail-io1-f68.google.com ([209.85.166.68]:37164 "EHLO mail-io1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726439AbfIBRG2 (ORCPT ); Mon, 2 Sep 2019 13:06:28 -0400 Received: by mail-io1-f68.google.com with SMTP id r4so15074104iop.4 for ; Mon, 02 Sep 2019 10:06:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=bvuAQX80ApjXEKs6gpQsSKgCH/ycoY0ktnhdV2EAR0Y=; b=Q68KP2C/1HHj3Q2M1b7JC8hwlJCywjk7JjUhcssgkmkSttgw0CDarePHYkgfMeNi11 4Nbiduu++AXn4lFFThsWE/wFeaYPKMzXNQFyU3gI73kpa/LviaDB6MR07s+ckJUtfbsa VUsG3wY2dqAHkWjt96xcal1QFKujqA6829fODuy/nJwaaKtUCsmVbTNXM/Js0iKWyBqQ iSBmtiJU+DGD5gBby+xFdCrVN8r17Xc+R22HkTHGEQMLhZJfoRxQ2HGMrviAsQE992Wi tAxpTnosjcXVFTy5nLKjLTI+GnJnIFKIq6ZPLNMQ94t08POS3IsSDK1796TbE6DEysDM ZOvQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=bvuAQX80ApjXEKs6gpQsSKgCH/ycoY0ktnhdV2EAR0Y=; b=PuBUOBwD9jDB52TLebSXLE8C0UCwHjU0qUdJOWu1vXXJontlrqShIF9PF4I1YoLEAS 7VdfxGtV4O/860gOxBxbIrhZe8xAJjPiABzb3M+oilmaal85tMvTdz9ji3H3woj5drbP EWtH+4btA4nYB0y0NfUku/3HBOWxGuP7gBLlCRjUYWc2zkkIsMomcXSsBkgbFoeHMMWO qylcfR/820taR8q8+CLrnDy3qgYk9DqeOcoxfNskdo7NPv7e6NfwQG2WIOrtHFmh8rKV 9NIRCuNDxBxeytIv6kJ+TllgE3791KQ0Rdw/L5chjySC3qfNUyzD3Q+pwuoocwoTENsF JWmA== X-Gm-Message-State: APjAAAUaAvKS1btmAM1EkJ5+/bMWr5l2riyckgst+IhrGHRYVZuAUWkf jMND9b7laODpyS+PcnOYFXXYtEWb5Q== X-Google-Smtp-Source: APXvYqzRdMQE13nI0kDE0VY1ZCn+C6pn+oUDTG/X8hE687n6Ii7+OVGuN6wGVwvX78XPudY25k6Tew== X-Received: by 2002:a5d:9c0b:: with SMTP id 11mr557932ioe.192.1567443986954; Mon, 02 Sep 2019 10:06:26 -0700 (PDT) Received: from localhost.localdomain (c-68-40-189-247.hsd1.mi.comcast.net. [68.40.189.247]) by smtp.gmail.com with ESMTPSA id o3sm7655322iob.64.2019.09.02.10.06.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Sep 2019 10:06:26 -0700 (PDT) From: Trond Myklebust X-Google-Original-From: Trond Myklebust To: "J.Bruce Fields" Cc: linux-nfs@vger.kernel.org Subject: [PATCH v2 2/4] nfsd: Support the server resetting the boot verifier Date: Mon, 2 Sep 2019 13:02:56 -0400 Message-Id: <20190902170258.92522-3-trond.myklebust@hammerspace.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190902170258.92522-2-trond.myklebust@hammerspace.com> References: <20190902170258.92522-1-trond.myklebust@hammerspace.com> <20190902170258.92522-2-trond.myklebust@hammerspace.com> MIME-Version: 1.0 Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org Add support to allow the server to reset the boot verifier in order to force clients to resend I/O after a timeout failure. Signed-off-by: Trond Myklebust Signed-off-by: Lance Shelton --- fs/nfsd/netns.h | 4 ++++ fs/nfsd/nfs3xdr.c | 13 +++++++++---- fs/nfsd/nfs4proc.c | 14 ++++---------- fs/nfsd/nfsctl.c | 1 + fs/nfsd/nfssvc.c | 31 ++++++++++++++++++++++++++++++- 5 files changed, 48 insertions(+), 15 deletions(-) diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h index bdfe5bcb3dcd..9a4ef815fb8c 100644 --- a/fs/nfsd/netns.h +++ b/fs/nfsd/netns.h @@ -104,6 +104,7 @@ struct nfsd_net { /* Time of server startup */ struct timespec64 nfssvc_boot; + seqlock_t boot_lock; /* * Max number of connections this nfsd container will allow. Defaults @@ -179,4 +180,7 @@ struct nfsd_net { extern void nfsd_netns_free_versions(struct nfsd_net *nn); extern unsigned int nfsd_net_id; + +void nfsd_copy_boot_verifier(__be32 verf[2], struct nfsd_net *nn); +void nfsd_reset_boot_verifier(struct nfsd_net *nn); #endif /* __NFSD_NETNS_H__ */ diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c index fcf31822c74c..86e5658651f1 100644 --- a/fs/nfsd/nfs3xdr.c +++ b/fs/nfsd/nfs3xdr.c @@ -27,6 +27,7 @@ static u32 nfs3_ftypes[] = { NF3SOCK, NF3BAD, NF3LNK, NF3BAD, }; + /* * XDR functions for basic NFS types */ @@ -751,14 +752,16 @@ nfs3svc_encode_writeres(struct svc_rqst *rqstp, __be32 *p) { struct nfsd3_writeres *resp = rqstp->rq_resp; struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id); + __be32 verf[2]; p = encode_wcc_data(rqstp, p, &resp->fh); if (resp->status == 0) { *p++ = htonl(resp->count); *p++ = htonl(resp->committed); /* unique identifier, y2038 overflow can be ignored */ - *p++ = htonl((u32)nn->nfssvc_boot.tv_sec); - *p++ = htonl(nn->nfssvc_boot.tv_nsec); + nfsd_copy_boot_verifier(verf, nn); + *p++ = verf[0]; + *p++ = verf[1]; } return xdr_ressize_check(rqstp, p); } @@ -1125,13 +1128,15 @@ nfs3svc_encode_commitres(struct svc_rqst *rqstp, __be32 *p) { struct nfsd3_commitres *resp = rqstp->rq_resp; struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id); + __be32 verf[2]; p = encode_wcc_data(rqstp, p, &resp->fh); /* Write verifier */ if (resp->status == 0) { /* unique identifier, y2038 overflow can be ignored */ - *p++ = htonl((u32)nn->nfssvc_boot.tv_sec); - *p++ = htonl(nn->nfssvc_boot.tv_nsec); + nfsd_copy_boot_verifier(verf, nn); + *p++ = verf[0]; + *p++ = verf[1]; } return xdr_ressize_check(rqstp, p); } diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c index cb51893ec1cd..4e3e77b76411 100644 --- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -568,17 +568,11 @@ nfsd4_access(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, static void gen_boot_verifier(nfs4_verifier *verifier, struct net *net) { - __be32 verf[2]; - struct nfsd_net *nn = net_generic(net, nfsd_net_id); + __be32 *verf = (__be32 *)verifier->data; - /* - * This is opaque to client, so no need to byte-swap. Use - * __force to keep sparse happy. y2038 time_t overflow is - * irrelevant in this usage. - */ - verf[0] = (__force __be32)nn->nfssvc_boot.tv_sec; - verf[1] = (__force __be32)nn->nfssvc_boot.tv_nsec; - memcpy(verifier->data, verf, sizeof(verifier->data)); + BUILD_BUG_ON(2*sizeof(*verf) != sizeof(verifier->data)); + + nfsd_copy_boot_verifier(verf, net_generic(net, nfsd_net_id)); } static __be32 diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c index 3cf4f6aa48d6..33cbe2e5d937 100644 --- a/fs/nfsd/nfsctl.c +++ b/fs/nfsd/nfsctl.c @@ -1477,6 +1477,7 @@ static __net_init int nfsd_init_net(struct net *net) atomic_set(&nn->ntf_refcnt, 0); init_waitqueue_head(&nn->ntf_wq); + seqlock_init(&nn->boot_lock); mnt = vfs_kern_mount(&nfsd_fs_type, SB_KERNMOUNT, "nfsd", NULL); if (IS_ERR(mnt)) { diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c index b944553c6927..3caaf5675259 100644 --- a/fs/nfsd/nfssvc.c +++ b/fs/nfsd/nfssvc.c @@ -344,6 +344,35 @@ static bool nfsd_needs_lockd(struct nfsd_net *nn) return nfsd_vers(nn, 2, NFSD_TEST) || nfsd_vers(nn, 3, NFSD_TEST); } +void nfsd_copy_boot_verifier(__be32 verf[2], struct nfsd_net *nn) +{ + int seq = 0; + + do { + read_seqbegin_or_lock(&nn->boot_lock, &seq); + /* + * This is opaque to client, so no need to byte-swap. Use + * __force to keep sparse happy. y2038 time_t overflow is + * irrelevant in this usage + */ + verf[0] = (__force __be32)nn->nfssvc_boot.tv_sec; + verf[1] = (__force __be32)nn->nfssvc_boot.tv_nsec; + } while (need_seqretry(&nn->boot_lock, seq)); + done_seqretry(&nn->boot_lock, seq); +} + +void nfsd_reset_boot_verifier_locked(struct nfsd_net *nn) +{ + ktime_get_real_ts64(&nn->nfssvc_boot); +} + +void nfsd_reset_boot_verifier(struct nfsd_net *nn) +{ + write_seqlock(&nn->boot_lock); + nfsd_reset_boot_verifier_locked(nn); + write_sequnlock(&nn->boot_lock); +} + static int nfsd_startup_net(int nrservs, struct net *net, const struct cred *cred) { struct nfsd_net *nn = net_generic(net, nfsd_net_id); @@ -596,7 +625,7 @@ int nfsd_create_serv(struct net *net) #endif } atomic_inc(&nn->ntf_refcnt); - ktime_get_real_ts64(&nn->nfssvc_boot); /* record boot time */ + nfsd_reset_boot_verifier(nn); return 0; } From patchwork Mon Sep 2 17:02:57 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Trond Myklebust X-Patchwork-Id: 11126985 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B4AF91813 for ; Mon, 2 Sep 2019 17:06:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 95978217F4 for ; Mon, 2 Sep 2019 17:06:29 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="MBaUDGNL" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726472AbfIBRG3 (ORCPT ); Mon, 2 Sep 2019 13:06:29 -0400 Received: from mail-io1-f65.google.com ([209.85.166.65]:33904 "EHLO mail-io1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726395AbfIBRG2 (ORCPT ); Mon, 2 Sep 2019 13:06:28 -0400 Received: by mail-io1-f65.google.com with SMTP id s21so30378260ioa.1 for ; Mon, 02 Sep 2019 10:06:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=3XDL4KoWkmx9PlF4CXcLYUPo7i1ojatgLjc2RjXIZH4=; b=MBaUDGNLNJXN2opE3Mlb2yUf+wQl7uAwFF/53Ib4KhjCNse87OvHnjUq6ZxvCSNep8 UEltFVB8QUSD6Nltj14nHTZJkvZrrh3D7GwzpzKgKK8HeyucZMh+jjV/XN0tj/Ip1B+u iYNam8S86U5ANGmQpXxfz7+sEq1NrA5Bi4+Z1gbPy47MnqsyuhGJm8MGQE/IXWyAEySX RU2W5yQrKdwDnS2Ufei0uHa0HPOAQQR0mo/uLVNIHbifNg35LKr5XqCt6Kr78ltKhDxN 4hFvKESxQD6PMV8iy1ihKF1n9jdPEByOnsMhL+ZSo8JdPk3v2MB0Umw23DwUujYc7t/L 6blg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=3XDL4KoWkmx9PlF4CXcLYUPo7i1ojatgLjc2RjXIZH4=; b=I6A1HuQb1DHUTZSs/Dxvt3eemTPMmcSFUXlo4udVbJMRUvDReuXlcBFugcpL4vuTlJ EeyhCUZXtWEWILRrVlevxWFor79BESgbjG8LH3SQPuY5I3xRs8wqsAHefXQNWY4ZIo2L U07I+JFnGqb7ANvvn/l5WAzngJ3J0Os8plAMtGLRtZgcs7Tz5WbxAzK4Ov2FLXkadNNs OZipIOwAqFj2lcgmJOi4wNJkJZL+DzAOk+P7apvGA1i2+mskzRHiI5dyvQL2F0mwQMFT /2upgqjRoLNTOGdz5z/g2AYQluzuCY5YZzJE+0NlEc20F2cbEoJmg7JLAvWGzA3HcomN teAA== X-Gm-Message-State: APjAAAXi8uEMpzhBKgs9ERuopPFDCvvcyGmBKKDpqMCCfXsMJSAQobjb LICy7AELLDIyQP/jTdILdplOV8vI+A== X-Google-Smtp-Source: APXvYqzKUFdyU9py8huN8OHe9DejDec/x3FbbQNqv76ioQW/gjFtAY/wbxJ1EaeVKxwKVcSx76elDw== X-Received: by 2002:a02:3e86:: with SMTP id s128mr31322569jas.14.1567443987658; Mon, 02 Sep 2019 10:06:27 -0700 (PDT) Received: from localhost.localdomain (c-68-40-189-247.hsd1.mi.comcast.net. [68.40.189.247]) by smtp.gmail.com with ESMTPSA id o3sm7655322iob.64.2019.09.02.10.06.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Sep 2019 10:06:27 -0700 (PDT) From: Trond Myklebust X-Google-Original-From: Trond Myklebust To: "J.Bruce Fields" Cc: linux-nfs@vger.kernel.org Subject: [PATCH v2 3/4] nfsd: Don't garbage collect files that might contain write errors Date: Mon, 2 Sep 2019 13:02:57 -0400 Message-Id: <20190902170258.92522-4-trond.myklebust@hammerspace.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190902170258.92522-3-trond.myklebust@hammerspace.com> References: <20190902170258.92522-1-trond.myklebust@hammerspace.com> <20190902170258.92522-2-trond.myklebust@hammerspace.com> <20190902170258.92522-3-trond.myklebust@hammerspace.com> MIME-Version: 1.0 Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org If a file may contain unstable writes that can error out, then we want to avoid garbage collecting the struct nfsd_file that may be tracking those errors. So in the garbage collector, we try to avoid collecting files that aren't clean. Furthermore, we avoid immediately kicking off the garbage collector in the case where the reference drops to zero for the case where there is a write error that is being tracked. If the file is unhashed while an error is pending, then declare a reboot, to ensure the client resends any unstable writes. Signed-off-by: Trond Myklebust --- fs/nfsd/filecache.c | 43 ++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 42 insertions(+), 1 deletion(-) diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c index d229fd3c825d..92d0998824a0 100644 --- a/fs/nfsd/filecache.c +++ b/fs/nfsd/filecache.c @@ -216,6 +216,36 @@ nfsd_file_free(struct nfsd_file *nf) return flush; } +static bool +nfsd_file_check_writeback(struct nfsd_file *nf) +{ + struct file *file = nf->nf_file; + struct address_space *mapping; + + if (!file || !(file->f_mode & FMODE_WRITE)) + return false; + mapping = file->f_mapping; + return mapping_tagged(mapping, PAGECACHE_TAG_DIRTY) || + mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK); +} + +static int +nfsd_file_check_write_error(struct nfsd_file *nf) +{ + struct file *file = nf->nf_file; + + if (!file || !(file->f_mode & FMODE_WRITE)) + return 0; + return filemap_check_wb_err(file->f_mapping, READ_ONCE(file->f_wb_err)); +} + +static bool +nfsd_file_in_use(struct nfsd_file *nf) +{ + return nfsd_file_check_writeback(nf) || + nfsd_file_check_write_error(nf); +} + static void nfsd_file_do_unhash(struct nfsd_file *nf) { @@ -223,6 +253,8 @@ nfsd_file_do_unhash(struct nfsd_file *nf) trace_nfsd_file_unhash(nf); + if (nfsd_file_check_write_error(nf)) + nfsd_reset_boot_verifier(net_generic(nf->nf_net, nfsd_net_id)); --nfsd_file_hashtbl[nf->nf_hashval].nfb_count; hlist_del_rcu(&nf->nf_node); if (!list_empty(&nf->nf_lru)) @@ -277,9 +309,10 @@ void nfsd_file_put(struct nfsd_file *nf) { bool is_hashed = test_bit(NFSD_FILE_HASHED, &nf->nf_flags) != 0; + bool unused = !nfsd_file_in_use(nf); set_bit(NFSD_FILE_REFERENCED, &nf->nf_flags); - if (nfsd_file_put_noref(nf) == 1 && is_hashed) + if (nfsd_file_put_noref(nf) == 1 && is_hashed && unused) nfsd_file_schedule_laundrette(NFSD_FILE_LAUNDRETTE_MAY_FLUSH); } @@ -345,6 +378,14 @@ nfsd_file_lru_cb(struct list_head *item, struct list_lru_one *lru, */ if (atomic_read(&nf->nf_ref) > 1) goto out_skip; + + /* + * Don't throw out files that are still undergoing I/O or + * that have uncleared errors pending. + */ + if (nfsd_file_check_writeback(nf)) + goto out_skip; + if (test_and_clear_bit(NFSD_FILE_REFERENCED, &nf->nf_flags)) goto out_rescan; From patchwork Mon Sep 2 17:02:58 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Trond Myklebust X-Patchwork-Id: 11126987 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AC7F515E9 for ; Mon, 2 Sep 2019 17:06:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8CC59217F4 for ; Mon, 2 Sep 2019 17:06:30 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="WI0oEPHb" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726395AbfIBRGa (ORCPT ); Mon, 2 Sep 2019 13:06:30 -0400 Received: from mail-io1-f67.google.com ([209.85.166.67]:39155 "EHLO mail-io1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726439AbfIBRG3 (ORCPT ); Mon, 2 Sep 2019 13:06:29 -0400 Received: by mail-io1-f67.google.com with SMTP id d25so27777003iob.6 for ; Mon, 02 Sep 2019 10:06:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=6qGSzyND4rX/g7mlQIjH5fUw5mfZyJz2GUWpcwS3xfQ=; b=WI0oEPHbWiFh7rMo21/DKwuBP3xQl9UeZJguYnjgVrMLsIVPoZRpLIn6V77o6i98RP FLhv/qOSCMks/QHMczZO0Kcc+j+pm/qoamLL7Pl5zAfowI25epLsG2cx55FjebSmyC7G faFz09VPHAzoLlLAa4aV3dC5s/NAZ0Rx1Zkk0tLXeCjEF6s5ZI2zhZ6xXbgu6MiGsJCs VpN13I63izZAi82mqULenfK8ZQ/vRbLMAoMNia/WdVRbZO/TbHcDwKk0FJRVwCL42YFX SO745ymn6JpwX+sJJ3hHUlfDURLiLeR0+UsgmZHIQbPrRjm2/gW7RtPoZNdBHO0mE7v+ +CZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=6qGSzyND4rX/g7mlQIjH5fUw5mfZyJz2GUWpcwS3xfQ=; b=kNfpwx40SXtGqtTqIfGX+90gYk1SCHZbpWyZs4vFv4Ci8MMb8VCjrSXUiJ9g7cH/Uf kFFEhSYjzlZHfgrPazyxN8iQ+FiXD0wcFx0Q1cbC2GNuRjdBo1jMOX2WHfhvtz6bWQ90 ljSFgkds4nA+2IbGzxtx7J+/qL6SK70CQ1mpbQJJBLdHIxjCtXyIQkqjBTOj9fcJyHiH nA4mOsLJ3rchJTd80qoG0sDFifYb0E2I7tqnzvB10HakhDg3PYF9Wyr6erqdWSaz3Rf5 jWht9EudWDH3jr405mE4vOEYa/VDKcN95p6EnlV6jp4fVXgCojGwkgL0AI96fzSXXpiT 28nw== X-Gm-Message-State: APjAAAU8/sWDZk2sZ2uT5JVvmW7zEkQuUg2nh2YB6XtlIpEf7WCGFcFz qwLd25xFgEjGj8Qww4D2WvwHuaiKbQ== X-Google-Smtp-Source: APXvYqz51BaGZrIsPbTjSk5tUxmJJaZru4HrgZL6jli2DuGSFvbORyzYNhw56HxhHAK3+9jRSkbdFQ== X-Received: by 2002:a02:920b:: with SMTP id x11mr11584591jag.17.1567443988318; Mon, 02 Sep 2019 10:06:28 -0700 (PDT) Received: from localhost.localdomain (c-68-40-189-247.hsd1.mi.comcast.net. [68.40.189.247]) by smtp.gmail.com with ESMTPSA id o3sm7655322iob.64.2019.09.02.10.06.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Sep 2019 10:06:27 -0700 (PDT) From: Trond Myklebust X-Google-Original-From: Trond Myklebust To: "J.Bruce Fields" Cc: linux-nfs@vger.kernel.org Subject: [PATCH v2 4/4] nfsd: Reset the boot verifier on all write I/O errors Date: Mon, 2 Sep 2019 13:02:58 -0400 Message-Id: <20190902170258.92522-5-trond.myklebust@hammerspace.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190902170258.92522-4-trond.myklebust@hammerspace.com> References: <20190902170258.92522-1-trond.myklebust@hammerspace.com> <20190902170258.92522-2-trond.myklebust@hammerspace.com> <20190902170258.92522-3-trond.myklebust@hammerspace.com> <20190902170258.92522-4-trond.myklebust@hammerspace.com> MIME-Version: 1.0 Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org If multiple clients are writing to the same file, then due to the fact we share a single file descriptor between all NFSv3 clients writing to the file, we have a situation where clients can miss the fact that their file data was not persisted. While this should be rare, it could cause silent data loss in situations where multiple clients are using NLM locking or O_DIRECT to write to the same file. Unfortunately, the stateless nature of NFSv3 and the fact that we can only identify clients by their IP address means that we cannot trivially cache errors; we would not know when it is safe to release them from the cache. So the solution is to declare a reboot. We understand that this should be a rare occurrence, since disks are usually stable. The most frequent occurrence is likely to be ENOSPC, at which point all writes to the given filesystem are likely to fail anyway. So the expectation is that clients will be forced to retry their writes until they hit the fatal error. Signed-off-by: Trond Myklebust --- fs/nfsd/vfs.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 84e87772c2b8..0867d5319fdb 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -958,8 +958,12 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file, nfsdstats.io_write += *cnt; fsnotify_modify(file); - if (stable && use_wgather) + if (stable && use_wgather) { host_err = wait_for_concurrent_writes(file); + if (host_err < 0) + nfsd_reset_boot_verifier(net_generic(SVC_NET(rqstp), + nfsd_net_id)); + } out_nfserr: if (host_err >= 0) { @@ -1063,10 +1067,17 @@ nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, if (EX_ISSYNC(fhp->fh_export)) { int err2 = vfs_fsync_range(nf->nf_file, offset, end, 0); - if (err2 != -EINVAL) - err = nfserrno(err2); - else + switch (err2) { + case 0: + break; + case -EINVAL: err = nfserr_notsupp; + break; + default: + err = nfserrno(err2); + nfsd_reset_boot_verifier(net_generic(nf->nf_net, + nfsd_net_id)); + } } nfsd_file_put(nf);