Message ID | 2128544.1733755560@warthog.procyon.org.uk (mailing list archive) |
---|---|
State | New |
Series | nfs: Fix oops in nfs_netfs_init_request() when copying to cache |
On Mon, Dec 9, 2024 at 3:46 PM David Howells <dhowells@redhat.com> wrote:
> Does this fix the issue?

The issue is with 6.11, but this patch fails to build with 6.11 and I'm
not sure how to backport that part:

 fs/nfs/fscache.c: In function ‘nfs_netfs_init_request’:
 fs/nfs/fscache.c:267:50: error: ‘NETFS_PGPRIV2_COPY_TO_CACHE’ undeclared (first use in this function); did you mean ‘NETFS_RREQ_COPY_TO_CACHE’?
   267 |                 if (WARN_ON_ONCE(rreq->origin != NETFS_PGPRIV2_COPY_TO_CACHE))
       |                                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~

Our production machines are all 6.11, because 6.12 has that other netfs
regression that freezes all transfers immediately
(https://lore.kernel.org/netfs/CAKPOu+_4m80thNy5_fvROoxBm689YtA0dZ-=gcmkzwYSY4syqw@mail.gmail.com/).
I guess this other bug only affects Ceph and not NFS, but after
experiencing so many kernel regressions recently, I had to become more
cautious with kernel updates (the past 2 months had more netfs/NFS/Ceph
regressions than the last 20 years combined).

>
> David
> ---
> nfs: Fix oops in nfs_netfs_init_request() when copying to cache
>
> When netfslib wants to copy some data that has just been read on behalf of
> nfs, it creates a new write request and calls nfs_netfs_init_request() to
> initialise it, but with a NULL file pointer.  This causes
> nfs_file_open_context() to oops - however, we don't actually need the nfs
> context as we're only going to write to the cache.
>
> Fix this by just returning if we aren't given a file pointer and emit a
> warning if the request was for something other than copy-to-cache.
>
> Further, fix nfs_netfs_free_request() so that it doesn't try to free the
> context if the pointer is NULL.
>
> Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
> Reported-by: Max Kellermann <max.kellermann@ionos.com>
> Closes: https://lore.kernel.org/r/CAKPOu+986mTt1i9xGBXiQPVOmu4ZJTskrCt6f-99EL_s0rhz_A@mail.gmail.com/
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Trond Myklebust <trondmy@kernel.org>
> cc: Anna Schumaker <anna@kernel.org>
> cc: Dave Wysochanski <dwysocha@redhat.com>
> cc: Jeff Layton <jlayton@kernel.org>
> cc: linux-nfs@vger.kernel.org
> cc: netfs@lists.linux.dev
> cc: linux-fsdevel@vger.kernel.org
> ---
>  fs/nfs/fscache.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
> index 810269ee0a50..d49e4ce27999 100644
> --- a/fs/nfs/fscache.c
> +++ b/fs/nfs/fscache.c
> @@ -263,6 +263,12 @@ int nfs_netfs_readahead(struct readahead_control *ractl)
>  static atomic_t nfs_netfs_debug_id;
>  static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *file)
>  {
> +	if (!file) {
> +		if (WARN_ON_ONCE(rreq->origin != NETFS_PGPRIV2_COPY_TO_CACHE))
> +			return -EIO;
> +		return 0;
> +	}
> +
>  	rreq->netfs_priv = get_nfs_open_context(nfs_file_open_context(file));
>  	rreq->debug_id = atomic_inc_return(&nfs_netfs_debug_id);
>  	/* [DEPRECATED] Use PG_private_2 to mark folio being written to the cache. */
> @@ -274,7 +280,8 @@ static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *fi
>
>  static void nfs_netfs_free_request(struct netfs_io_request *rreq)
>  {
> -	put_nfs_open_context(rreq->netfs_priv);
> +	if (rreq->netfs_priv)
> +		put_nfs_open_context(rreq->netfs_priv);
>  }
>
>  static struct nfs_netfs_io_data *nfs_netfs_alloc(struct netfs_io_subrequest *sreq)
>
diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index 810269ee0a50..d49e4ce27999 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -263,6 +263,12 @@ int nfs_netfs_readahead(struct readahead_control *ractl)
 static atomic_t nfs_netfs_debug_id;
 static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *file)
 {
+	if (!file) {
+		if (WARN_ON_ONCE(rreq->origin != NETFS_PGPRIV2_COPY_TO_CACHE))
+			return -EIO;
+		return 0;
+	}
+
 	rreq->netfs_priv = get_nfs_open_context(nfs_file_open_context(file));
 	rreq->debug_id = atomic_inc_return(&nfs_netfs_debug_id);
 	/* [DEPRECATED] Use PG_private_2 to mark folio being written to the cache. */
@@ -274,7 +280,8 @@ static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *fi
 
 static void nfs_netfs_free_request(struct netfs_io_request *rreq)
 {
-	put_nfs_open_context(rreq->netfs_priv);
+	if (rreq->netfs_priv)
+		put_nfs_open_context(rreq->netfs_priv);
 }
 
 static struct nfs_netfs_io_data *nfs_netfs_alloc(struct netfs_io_subrequest *sreq)
Hi Max,

Does this fix the issue?

David
---
nfs: Fix oops in nfs_netfs_init_request() when copying to cache

When netfslib wants to copy some data that has just been read on behalf of
nfs, it creates a new write request and calls nfs_netfs_init_request() to
initialise it, but with a NULL file pointer.  This causes
nfs_file_open_context() to oops - however, we don't actually need the nfs
context as we're only going to write to the cache.

Fix this by just returning if we aren't given a file pointer and emit a
warning if the request was for something other than copy-to-cache.

Further, fix nfs_netfs_free_request() so that it doesn't try to free the
context if the pointer is NULL.

Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Reported-by: Max Kellermann <max.kellermann@ionos.com>
Closes: https://lore.kernel.org/r/CAKPOu+986mTt1i9xGBXiQPVOmu4ZJTskrCt6f-99EL_s0rhz_A@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Trond Myklebust <trondmy@kernel.org>
cc: Anna Schumaker <anna@kernel.org>
cc: Dave Wysochanski <dwysocha@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-nfs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
 fs/nfs/fscache.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
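
For readers who want to see the guard from the commit message in isolation, here is a minimal user-space sketch of the pattern. It is not kernel code and not part of the patch. Every `sketch_*` and `SKETCH_*` name is a hypothetical stand-in for the kernel's request, file, and open-context types. The sketch also omits `WARN_ON_ONCE()`, so it returns the error silently.

```c
/*
 * User-space illustration only. All sketch_* / SKETCH_* names are
 * hypothetical stand-ins, not the kernel's netfs or NFS definitions.
 */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum sketch_origin {
	SKETCH_READAHEAD,      /* an ordinary read that comes with a file */
	SKETCH_COPY_TO_CACHE,  /* cache-only write, issued without a file */
};

/* Stands in for the refcounted NFS open context. */
struct sketch_ctx {
	int refcount;
};

struct sketch_file {
	struct sketch_ctx *ctx;
};

struct sketch_request {
	enum sketch_origin origin;
	struct sketch_ctx *priv;  /* stays NULL for cache-only requests */
};

static struct sketch_ctx *sketch_get_ctx(struct sketch_ctx *ctx)
{
	ctx->refcount++;
	return ctx;
}

static void sketch_put_ctx(struct sketch_ctx *ctx)
{
	ctx->refcount--;
}

/* Mirrors the shape of the patched init hook: tolerate a NULL file. */
static int sketch_init_request(struct sketch_request *rreq,
			       struct sketch_file *file)
{
	assert(rreq != NULL);
	if (!file) {
		/* Only a copy-to-cache request may legitimately lack a file. */
		if (rreq->origin != SKETCH_COPY_TO_CACHE)
			return -EIO;
		return 0;
	}
	rreq->priv = sketch_get_ctx(file->ctx);
	return 0;
}

/* Mirrors the shape of the patched free hook: skip the put if unset. */
static void sketch_free_request(struct sketch_request *rreq)
{
	assert(rreq != NULL);
	if (rreq->priv)
		sketch_put_ctx(rreq->priv);
}
```

The two checks belong together. Once the init hook can return without taking a context reference, the free hook must accept a request whose private pointer was never set.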