diff mbox series

nfs: Fix oops in nfs_netfs_init_request() when copying to cache

Message ID 2128544.1733755560@warthog.procyon.org.uk (mailing list archive)
State New
Series nfs: Fix oops in nfs_netfs_init_request() when copying to cache

Commit Message

David Howells Dec. 9, 2024, 2:46 p.m. UTC
Hi Max,

Does this fix the issue?

David
---
nfs: Fix oops in nfs_netfs_init_request() when copying to cache

When netfslib wants to copy some data that has just been read on behalf of
nfs, it creates a new write request and calls nfs_netfs_init_request() to
initialise it, but with a NULL file pointer.  This causes
nfs_file_open_context() to oops - however, we don't actually need the nfs
context as we're only going to write to the cache.

Fix this by just returning if we aren't given a file pointer and emit a
warning if the request was for something other than copy-to-cache.

Further, fix nfs_netfs_free_request() so that it doesn't try to free the
context if the pointer is NULL.

Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Reported-by: Max Kellermann <max.kellermann@ionos.com>
Closes: https://lore.kernel.org/r/CAKPOu+986mTt1i9xGBXiQPVOmu4ZJTskrCt6f-99EL_s0rhz_A@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Trond Myklebust <trondmy@kernel.org>
cc: Anna Schumaker <anna@kernel.org>
cc: Dave Wysochanski <dwysocha@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-nfs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
 fs/nfs/fscache.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Max Kellermann Dec. 9, 2024, 5:12 p.m. UTC | #1
On Mon, Dec 9, 2024 at 3:46 PM David Howells <dhowells@redhat.com> wrote:
> Does this fix the issue?

The issue is with 6.11, but this patch fails to build with 6.11 and
I'm not sure how to backport that part:

 fs/nfs/fscache.c: In function ‘nfs_netfs_init_request’:
 fs/nfs/fscache.c:267:50: error: ‘NETFS_PGPRIV2_COPY_TO_CACHE’ undeclared (first use in this function); did you mean ‘NETFS_RREQ_COPY_TO_CACHE’?
   267 |                 if (WARN_ON_ONCE(rreq->origin != NETFS_PGPRIV2_COPY_TO_CACHE))
       |                                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~

Our production machines are all 6.11, because 6.12 has that other
netfs regression that freezes all transfers immediately
(https://lore.kernel.org/netfs/CAKPOu+_4m80thNy5_fvROoxBm689YtA0dZ-=gcmkzwYSY4syqw@mail.gmail.com/).
I guess this other bug only affects Ceph and not NFS, but after
experiencing so many kernel regressions recently, I had to become more
cautious with kernel updates (the past 2 months had more
netfs/NFS/Ceph regressions than the last 20 years combined).



Patch

diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index 810269ee0a50..d49e4ce27999 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -263,6 +263,12 @@  int nfs_netfs_readahead(struct readahead_control *ractl)
 static atomic_t nfs_netfs_debug_id;
 static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *file)
 {
+	if (!file) {
+		if (WARN_ON_ONCE(rreq->origin != NETFS_PGPRIV2_COPY_TO_CACHE))
+			return -EIO;
+		return 0;
+	}
+
 	rreq->netfs_priv = get_nfs_open_context(nfs_file_open_context(file));
 	rreq->debug_id = atomic_inc_return(&nfs_netfs_debug_id);
 	/* [DEPRECATED] Use PG_private_2 to mark folio being written to the cache. */
@@ -274,7 +280,8 @@  static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *fi
 
 static void nfs_netfs_free_request(struct netfs_io_request *rreq)
 {
-	put_nfs_open_context(rreq->netfs_priv);
+	if (rreq->netfs_priv)
+		put_nfs_open_context(rreq->netfs_priv);
 }
 
 static struct nfs_netfs_io_data *nfs_netfs_alloc(struct netfs_io_subrequest *sreq)
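
The guard pattern in the patch above can be modelled in userspace to check
the three cases it distinguishes: no file with a copy-to-cache origin, no
file with any other origin, and a normal request with a file.  This is only
a sketch under stated assumptions, not the kernel code: every model_* name
and both MODEL_ORIGIN_* values are hypothetical stand-ins, WARN_ON_ONCE() is
omitted, and the request is assumed to start out with a NULL private pointer.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum model_origin { MODEL_ORIGIN_READ, MODEL_ORIGIN_COPY_TO_CACHE };

struct model_ctx { int refcount; };
struct model_file { struct model_ctx *ctx; };
struct model_request {
	enum model_origin origin;
	struct model_ctx *priv;	/* plays the role of rreq->netfs_priv */
};

/* Take a reference on a context and hand it back to the caller. */
struct model_ctx *model_get_ctx(struct model_ctx *ctx)
{
	ctx->refcount++;
	return ctx;
}

/* Drop a reference.  This model dereferences ctx, so NULL must not be passed. */
void model_put_ctx(struct model_ctx *ctx)
{
	ctx->refcount--;
}

/* Mirrors the shape of the patched init hook: tolerate a missing file. */
int model_init_request(struct model_request *req, struct model_file *file)
{
	if (!file) {
		/* Only a copy-to-cache request may legitimately lack a file. */
		if (req->origin != MODEL_ORIGIN_COPY_TO_CACHE)
			return -EIO;
		return 0;	/* req->priv is left NULL */
	}

	req->priv = model_get_ctx(file->ctx);
	return 0;
}

/* Mirrors the shape of the patched free hook: skip the put when priv is NULL. */
void model_free_request(struct model_request *req)
{
	if (req->priv)
		model_put_ctx(req->priv);
}
```

Calling model_init_request() with a NULL file on a MODEL_ORIGIN_COPY_TO_CACHE
request returns 0 and leaves priv NULL, so model_free_request() is then a
no-op.  The same call on a MODEL_ORIGIN_READ request returns -EIO.  With a
file, the reference count goes up on init and back down on free.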