Message ID | 20240711-nfsd-next-v1-1-f9f944500503@kernel.org (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
Series | nfsd: remove unneeded EEXIST error check in nfsd_do_file_acquire | expand |
On Thu, Jul 11, 2024 at 03:11:13PM -0400, Jeff Layton wrote: > Given that we do the search and insertion while holding the i_lock, I > don't think it's possible for us to get EEXIST here. Remove this case. > > Cc: Youzhong Yang <youzhong@gmail.com> > Fixes: c6593366c0bf ("nfsd: don't kill nfsd_files because of lease break error") > Signed-off-by: Jeff Layton <jlayton@kernel.org> > --- > This is replacement for PATCH 1/3 in the series I sent yesterday. I > think it makes sense to just eliminate this case. > --- > fs/nfsd/filecache.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > index f84913691b78..b9dc7c22242c 100644 > --- a/fs/nfsd/filecache.c > +++ b/fs/nfsd/filecache.c > @@ -1038,8 +1038,6 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, > if (likely(ret == 0)) > goto open_file; > > - if (ret == -EEXIST) > - goto retry; > trace_nfsd_file_insert_err(rqstp, inode, may_flags, ret); > status = nfserr_jukebox; > goto construction_err; > > --- > base-commit: ec1772c39fa8dd85340b1a02040806377ffbff27 > change-id: 20240711-nfsd-next-c9d17f66e2bd > > Best regards, > -- > Jeff Layton <jlayton@kernel.org> Youzhong, can you replace 1/3 in Jeff's file cache series and test again please?
It's in progress, I will report back once it's done, most likely late this afternoon. This time it will have much more nfs load on the server. On Fri, Jul 12, 2024 at 9:32 AM Chuck Lever <chuck.lever@oracle.com> wrote: > > On Thu, Jul 11, 2024 at 03:11:13PM -0400, Jeff Layton wrote: > > Given that we do the search and insertion while holding the i_lock, I > > don't think it's possible for us to get EEXIST here. Remove this case. > > > > Cc: Youzhong Yang <youzhong@gmail.com> > > Fixes: c6593366c0bf ("nfsd: don't kill nfsd_files because of lease break error") > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > --- > > This is replacement for PATCH 1/3 in the series I sent yesterday. I > > think it makes sense to just eliminate this case. > > --- > > fs/nfsd/filecache.c | 2 -- > > 1 file changed, 2 deletions(-) > > > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > > index f84913691b78..b9dc7c22242c 100644 > > --- a/fs/nfsd/filecache.c > > +++ b/fs/nfsd/filecache.c > > @@ -1038,8 +1038,6 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, > > if (likely(ret == 0)) > > goto open_file; > > > > - if (ret == -EEXIST) > > - goto retry; > > trace_nfsd_file_insert_err(rqstp, inode, may_flags, ret); > > status = nfserr_jukebox; > > goto construction_err; > > > > --- > > base-commit: ec1772c39fa8dd85340b1a02040806377ffbff27 > > change-id: 20240711-nfsd-next-c9d17f66e2bd > > > > Best regards, > > -- > > Jeff Layton <jlayton@kernel.org> > > Youzhong, can you replace 1/3 in Jeff's file cache series and > test again please? > > -- > Chuck Lever
Testing is done with this patch applied. All good, no surprise. On Fri, Jul 12, 2024 at 9:32 AM Chuck Lever <chuck.lever@oracle.com> wrote: > > On Thu, Jul 11, 2024 at 03:11:13PM -0400, Jeff Layton wrote: > > Given that we do the search and insertion while holding the i_lock, I > > don't think it's possible for us to get EEXIST here. Remove this case. > > > > Cc: Youzhong Yang <youzhong@gmail.com> > > Fixes: c6593366c0bf ("nfsd: don't kill nfsd_files because of lease break error") > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > --- > > This is replacement for PATCH 1/3 in the series I sent yesterday. I > > think it makes sense to just eliminate this case. > > --- > > fs/nfsd/filecache.c | 2 -- > > 1 file changed, 2 deletions(-) > > > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > > index f84913691b78..b9dc7c22242c 100644 > > --- a/fs/nfsd/filecache.c > > +++ b/fs/nfsd/filecache.c > > @@ -1038,8 +1038,6 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, > > if (likely(ret == 0)) > > goto open_file; > > > > - if (ret == -EEXIST) > > - goto retry; > > trace_nfsd_file_insert_err(rqstp, inode, may_flags, ret); > > status = nfserr_jukebox; > > goto construction_err; > > > > --- > > base-commit: ec1772c39fa8dd85340b1a02040806377ffbff27 > > change-id: 20240711-nfsd-next-c9d17f66e2bd > > > > Best regards, > > -- > > Jeff Layton <jlayton@kernel.org> > > Youzhong, can you replace 1/3 in Jeff's file cache series and > test again please? > > -- > Chuck Lever
On Fri, Jul 12, 2024 at 02:30:04PM -0400, Youzhong Yang wrote: > Testing is done with this patch applied. All good, no surprise. Awesome, I've replaced 1/3 with this patch. These are all applied to a private tree at the moment. I will push them to the public nfsd-next branch when the 6.11 merge window closes. > On Fri, Jul 12, 2024 at 9:32 AM Chuck Lever <chuck.lever@oracle.com> wrote: > > > > On Thu, Jul 11, 2024 at 03:11:13PM -0400, Jeff Layton wrote: > > > Given that we do the search and insertion while holding the i_lock, I > > > don't think it's possible for us to get EEXIST here. Remove this case. > > > > > > Cc: Youzhong Yang <youzhong@gmail.com> > > > Fixes: c6593366c0bf ("nfsd: don't kill nfsd_files because of lease break error") > > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > > --- > > > This is replacement for PATCH 1/3 in the series I sent yesterday. I > > > think it makes sense to just eliminate this case. > > > --- > > > fs/nfsd/filecache.c | 2 -- > > > 1 file changed, 2 deletions(-) > > > > > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > > > index f84913691b78..b9dc7c22242c 100644 > > > --- a/fs/nfsd/filecache.c > > > +++ b/fs/nfsd/filecache.c > > > @@ -1038,8 +1038,6 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, > > > if (likely(ret == 0)) > > > goto open_file; > > > > > > - if (ret == -EEXIST) > > > - goto retry; > > > trace_nfsd_file_insert_err(rqstp, inode, may_flags, ret); > > > status = nfserr_jukebox; > > > goto construction_err; > > > > > > --- > > > base-commit: ec1772c39fa8dd85340b1a02040806377ffbff27 > > > change-id: 20240711-nfsd-next-c9d17f66e2bd > > > > > > Best regards, > > > -- > > > Jeff Layton <jlayton@kernel.org> > > > > Youzhong, can you replace 1/3 in Jeff's file cache series and > > test again please? > > > > -- > > Chuck Lever
On Fri, 12 Jul 2024, Jeff Layton wrote: > Given that we do the search and insertion while holding the i_lock, I > don't think it's possible for us to get EEXIST here. Remove this case. I was going to comment that as rhltable_insert() cannot return -EEXIST that is an extra reason to discard the check. But then I looked at the code an I cannot convince myself that it cannot. If __rhashtable_insert_fast() finds that tbl->future_tbl is not NULL it calls rhashtable_insert_slow(), and that seems to fail if the key already exists. But it shouldn't for an rhltable, it should just add the new item to the linked list for that key. It looks like this has always been broken: adding to an rhltable during a resize event can cause EEXIST.... Would anyone like to check my work? I'm surprise that hasn't been noticed if it is really the case. NeilBrown > > Cc: Youzhong Yang <youzhong@gmail.com> > Fixes: c6593366c0bf ("nfsd: don't kill nfsd_files because of lease break error") > Signed-off-by: Jeff Layton <jlayton@kernel.org> > --- > This is replacement for PATCH 1/3 in the series I sent yesterday. I > think it makes sense to just eliminate this case. > --- > fs/nfsd/filecache.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > index f84913691b78..b9dc7c22242c 100644 > --- a/fs/nfsd/filecache.c > +++ b/fs/nfsd/filecache.c > @@ -1038,8 +1038,6 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, > if (likely(ret == 0)) > goto open_file; > > - if (ret == -EEXIST) > - goto retry; > trace_nfsd_file_insert_err(rqstp, inode, may_flags, ret); > status = nfserr_jukebox; > goto construction_err; > > --- > base-commit: ec1772c39fa8dd85340b1a02040806377ffbff27 > change-id: 20240711-nfsd-next-c9d17f66e2bd > > Best regards, > -- > Jeff Layton <jlayton@kernel.org> > >
On Mon, 2024-07-15 at 10:27 +1000, NeilBrown wrote: > On Fri, 12 Jul 2024, Jeff Layton wrote: > > Given that we do the search and insertion while holding the i_lock, I > > don't think it's possible for us to get EEXIST here. Remove this case. > > I was going to comment that as rhltable_insert() cannot return -EEXIST > that is an extra reason to discard the check. But then I looked at the > code an I cannot convince myself that it cannot. > If __rhashtable_insert_fast() finds that tbl->future_tbl is not NULL it > calls rhashtable_insert_slow(), and that seems to fail if the key > already exists. But it shouldn't for an rhltable, it should just add > the new item to the linked list for that key. > > It looks like this has always been broken: adding to an rhltable during > a resize event can cause EEXIST.... > > Would anyone like to check my work? I'm surprise that hasn't been > noticed if it is really the case. > > I don't know this code well at all, but it looks correct to me: static void *rhashtable_try_insert(struct rhashtable *ht, const void *key, struct rhash_head *obj) { struct bucket_table *new_tbl; struct bucket_table *tbl; struct rhash_lock_head __rcu **bkt; unsigned long flags; unsigned int hash; void *data; new_tbl = rcu_dereference(ht->tbl); do { tbl = new_tbl; hash = rht_head_hashfn(ht, tbl, obj, ht->p); if (rcu_access_pointer(tbl->future_tbl)) /* Failure is OK */ bkt = rht_bucket_var(tbl, hash); else bkt = rht_bucket_insert(ht, tbl, hash); if (bkt == NULL) { new_tbl = rht_dereference_rcu(tbl->future_tbl, ht); data = ERR_PTR(-EAGAIN); } else { flags = rht_lock(tbl, bkt); data = rhashtable_lookup_one(ht, bkt, tbl, hash, key, obj); new_tbl = rhashtable_insert_one(ht, bkt, tbl, hash, obj, data); if (PTR_ERR(new_tbl) != -EEXIST) data = ERR_CAST(new_tbl); rht_unlock(tbl, bkt, flags); } } while (!IS_ERR_OR_NULL(new_tbl)); if (PTR_ERR(data) == -EAGAIN) data = ERR_PTR(rhashtable_insert_rehash(ht, tbl) ?: -EAGAIN); return data; } I'm assuming the part we need to worry about is where rhashtable_insert_one returns -EEXIST. It holds the rht_lock across the lookup and insert though. So if rhashtable_insert_one returns -EEXIST, then "data" must be something valid. In that case, "data" won't be overwritten and it will fall through and return the pointer to the entry already there. That said, this logic is really convoluted, so I may have missed something too. > > > > Cc: Youzhong Yang <youzhong@gmail.com> > > Fixes: c6593366c0bf ("nfsd: don't kill nfsd_files because of lease break error") > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > --- > > This is replacement for PATCH 1/3 in the series I sent yesterday. I > > think it makes sense to just eliminate this case. > > --- > > fs/nfsd/filecache.c | 2 -- > > 1 file changed, 2 deletions(-) > > > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > > index f84913691b78..b9dc7c22242c 100644 > > --- a/fs/nfsd/filecache.c > > +++ b/fs/nfsd/filecache.c > > @@ -1038,8 +1038,6 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, > > if (likely(ret == 0)) > > goto open_file; > > > > - if (ret == -EEXIST) > > - goto retry; > > trace_nfsd_file_insert_err(rqstp, inode, may_flags, ret); > > status = nfserr_jukebox; > > goto construction_err; > > > > --- > > base-commit: ec1772c39fa8dd85340b1a02040806377ffbff27 > > change-id: 20240711-nfsd-next-c9d17f66e2bd > > > > Best regards, > > -- > > Jeff Layton <jlayton@kernel.org> > > > > >
On Mon, Jul 15, 2024 at 08:25:53AM -0400, Jeff Layton wrote: > On Mon, 2024-07-15 at 10:27 +1000, NeilBrown wrote: > > On Fri, 12 Jul 2024, Jeff Layton wrote: > > > Given that we do the search and insertion while holding the i_lock, I > > > don't think it's possible for us to get EEXIST here. Remove this case. > > > > I was going to comment that as rhltable_insert() cannot return -EEXIST > > that is an extra reason to discard the check. But then I looked at the > > code an I cannot convince myself that it cannot. > > If __rhashtable_insert_fast() finds that tbl->future_tbl is not NULL it > > calls rhashtable_insert_slow(), and that seems to fail if the key > > already exists. But it shouldn't for an rhltable, it should just add > > the new item to the linked list for that key. > > > > It looks like this has always been broken: adding to an rhltable during > > a resize event can cause EEXIST.... > > > > Would anyone like to check my work? I'm surprise that hasn't been > > noticed if it is really the case. > > > > > > I don't know this code well at all, but it looks correct to me: > > static void *rhashtable_try_insert(struct rhashtable *ht, const void *key, > struct rhash_head *obj) > { > struct bucket_table *new_tbl; > struct bucket_table *tbl; > struct rhash_lock_head __rcu **bkt; > unsigned long flags; > unsigned int hash; > void *data; > > new_tbl = rcu_dereference(ht->tbl); > > do { > tbl = new_tbl; > hash = rht_head_hashfn(ht, tbl, obj, ht->p); > if (rcu_access_pointer(tbl->future_tbl)) > /* Failure is OK */ > bkt = rht_bucket_var(tbl, hash); > else > bkt = rht_bucket_insert(ht, tbl, hash); > if (bkt == NULL) { > new_tbl = rht_dereference_rcu(tbl->future_tbl, ht); > data = ERR_PTR(-EAGAIN); > } else { > flags = rht_lock(tbl, bkt); > data = rhashtable_lookup_one(ht, bkt, tbl, > hash, key, obj); > new_tbl = rhashtable_insert_one(ht, bkt, tbl, > hash, obj, data); > if (PTR_ERR(new_tbl) != -EEXIST) > data = ERR_CAST(new_tbl); > > rht_unlock(tbl, bkt, flags); > } > } while (!IS_ERR_OR_NULL(new_tbl)); > > if (PTR_ERR(data) == -EAGAIN) > data = ERR_PTR(rhashtable_insert_rehash(ht, tbl) ?: > -EAGAIN); > > return data; > } > > I'm assuming the part we need to worry about is where > rhashtable_insert_one returns -EEXIST. > > It holds the rht_lock across the lookup and insert though. So if > rhashtable_insert_one returns -EEXIST, then "data" must be something > valid. In that case, "data" won't be overwritten and it will fall > through and return the pointer to the entry already there. > > That said, this logic is really convoluted, so I may have missed > something too. This is the issue I was concerned about after my review: it's obvious that the rhtable API can return -EEXIST, but it's just really hard to tell whether the rh/l/table API will ever return -EEXIST. As Neil says, the rhtable "hash table full" case should not happen with rhltable. But can we prove that? If we are not yet confident, then maybe PATCH 1/3 should replace the "if (ret == -EEXIST)" with "WARN_ON(ret == -EEXIST)"...? It's also possible to ask the human(s) who constructed the rhltable code. :-) > > > Cc: Youzhong Yang <youzhong@gmail.com> > > > Fixes: c6593366c0bf ("nfsd: don't kill nfsd_files because of lease break error") > > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > > --- > > > This is replacement for PATCH 1/3 in the series I sent yesterday. I > > > think it makes sense to just eliminate this case. > > > --- > > > fs/nfsd/filecache.c | 2 -- > > > 1 file changed, 2 deletions(-) > > > > > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > > > index f84913691b78..b9dc7c22242c 100644 > > > --- a/fs/nfsd/filecache.c > > > +++ b/fs/nfsd/filecache.c > > > @@ -1038,8 +1038,6 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, > > > if (likely(ret == 0)) > > > goto open_file; > > > > > > - if (ret == -EEXIST) > > > - goto retry; > > > trace_nfsd_file_insert_err(rqstp, inode, may_flags, ret); > > > status = nfserr_jukebox; > > > goto construction_err; > > > > > > --- > > > base-commit: ec1772c39fa8dd85340b1a02040806377ffbff27 > > > change-id: 20240711-nfsd-next-c9d17f66e2bd > > > > > > Best regards, > > > -- > > > Jeff Layton <jlayton@kernel.org> > > > > > > > > > > -- > Jeff Layton <jlayton@kernel.org>
How is this going? any chance to move forward and deal with the EEXIST case in a future patch? I see no harm in keeping the EEXIST check. On Mon, Jul 15, 2024 at 10:06 AM Chuck Lever <chuck.lever@oracle.com> wrote: > > On Mon, Jul 15, 2024 at 08:25:53AM -0400, Jeff Layton wrote: > > On Mon, 2024-07-15 at 10:27 +1000, NeilBrown wrote: > > > On Fri, 12 Jul 2024, Jeff Layton wrote: > > > > Given that we do the search and insertion while holding the i_lock, I > > > > don't think it's possible for us to get EEXIST here. Remove this case. > > > > > > I was going to comment that as rhltable_insert() cannot return -EEXIST > > > that is an extra reason to discard the check. But then I looked at the > > > code an I cannot convince myself that it cannot. > > > If __rhashtable_insert_fast() finds that tbl->future_tbl is not NULL it > > > calls rhashtable_insert_slow(), and that seems to fail if the key > > > already exists. But it shouldn't for an rhltable, it should just add > > > the new item to the linked list for that key. > > > > > > It looks like this has always been broken: adding to an rhltable during > > > a resize event can cause EEXIST.... > > > > > > Would anyone like to check my work? I'm surprise that hasn't been > > > noticed if it is really the case. > > > > > > > > > > I don't know this code well at all, but it looks correct to me: > > > > static void *rhashtable_try_insert(struct rhashtable *ht, const void *key, > > struct rhash_head *obj) > > { > > struct bucket_table *new_tbl; > > struct bucket_table *tbl; > > struct rhash_lock_head __rcu **bkt; > > unsigned long flags; > > unsigned int hash; > > void *data; > > > > new_tbl = rcu_dereference(ht->tbl); > > > > do { > > tbl = new_tbl; > > hash = rht_head_hashfn(ht, tbl, obj, ht->p); > > if (rcu_access_pointer(tbl->future_tbl)) > > /* Failure is OK */ > > bkt = rht_bucket_var(tbl, hash); > > else > > bkt = rht_bucket_insert(ht, tbl, hash); > > if (bkt == NULL) { > > new_tbl = rht_dereference_rcu(tbl->future_tbl, ht); > > data = ERR_PTR(-EAGAIN); > > } else { > > flags = rht_lock(tbl, bkt); > > data = rhashtable_lookup_one(ht, bkt, tbl, > > hash, key, obj); > > new_tbl = rhashtable_insert_one(ht, bkt, tbl, > > hash, obj, data); > > if (PTR_ERR(new_tbl) != -EEXIST) > > data = ERR_CAST(new_tbl); > > > > rht_unlock(tbl, bkt, flags); > > } > > } while (!IS_ERR_OR_NULL(new_tbl)); > > > > if (PTR_ERR(data) == -EAGAIN) > > data = ERR_PTR(rhashtable_insert_rehash(ht, tbl) ?: > > -EAGAIN); > > > > return data; > > } > > > > I'm assuming the part we need to worry about is where > > rhashtable_insert_one returns -EEXIST. > > > > It holds the rht_lock across the lookup and insert though. So if > > rhashtable_insert_one returns -EEXIST, then "data" must be something > > valid. In that case, "data" won't be overwritten and it will fall > > through and return the pointer to the entry already there. > > > > That said, this logic is really convoluted, so I may have missed > > something too. > > This is the issue I was concerned about after my review: it's > obvious that the rhtable API can return -EEXIST, but it's just > really hard to tell whether the rh/l/table API will ever return > -EEXIST. > > As Neil says, the rhtable "hash table full" case should not happen > with rhltable. But can we prove that? > > If we are not yet confident, then maybe PATCH 1/3 should replace > the "if (ret == -EEXIST)" with "WARN_ON(ret == -EEXIST)"...? It's > also possible to ask the human(s) who constructed the rhltable > code. :-) > > > > > > Cc: Youzhong Yang <youzhong@gmail.com> > > > > Fixes: c6593366c0bf ("nfsd: don't kill nfsd_files because of lease break error") > > > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > > > --- > > > > This is replacement for PATCH 1/3 in the series I sent yesterday. I > > > > think it makes sense to just eliminate this case. > > > > --- > > > > fs/nfsd/filecache.c | 2 -- > > > > 1 file changed, 2 deletions(-) > > > > > > > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > > > > index f84913691b78..b9dc7c22242c 100644 > > > > --- a/fs/nfsd/filecache.c > > > > +++ b/fs/nfsd/filecache.c > > > > @@ -1038,8 +1038,6 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, > > > > if (likely(ret == 0)) > > > > goto open_file; > > > > > > > > - if (ret == -EEXIST) > > > > - goto retry; > > > > trace_nfsd_file_insert_err(rqstp, inode, may_flags, ret); > > > > status = nfserr_jukebox; > > > > goto construction_err; > > > > > > > > --- > > > > base-commit: ec1772c39fa8dd85340b1a02040806377ffbff27 > > > > change-id: 20240711-nfsd-next-c9d17f66e2bd > > > > > > > > Best regards, > > > > -- > > > > Jeff Layton <jlayton@kernel.org> > > > > > > > > > > > > > > > -- > > Jeff Layton <jlayton@kernel.org> > > -- > Chuck Lever
> On Jul 29, 2024, at 10:26 AM, Youzhong Yang <youzhong@gmail.com> wrote: > > How is this going? any chance to move forward and deal with the EEXIST > case in a future patch? I see no harm in keeping the EEXIST check. The EEXIST patch has been applied to nfsd-next for a while, along with the other patches in this series. > On Mon, Jul 15, 2024 at 10:06 AM Chuck Lever <chuck.lever@oracle.com> wrote: >> >> On Mon, Jul 15, 2024 at 08:25:53AM -0400, Jeff Layton wrote: >>> On Mon, 2024-07-15 at 10:27 +1000, NeilBrown wrote: >>>> On Fri, 12 Jul 2024, Jeff Layton wrote: >>>>> Given that we do the search and insertion while holding the i_lock, I >>>>> don't think it's possible for us to get EEXIST here. Remove this case. >>>> >>>> I was going to comment that as rhltable_insert() cannot return -EEXIST >>>> that is an extra reason to discard the check. But then I looked at the >>>> code an I cannot convince myself that it cannot. >>>> If __rhashtable_insert_fast() finds that tbl->future_tbl is not NULL it >>>> calls rhashtable_insert_slow(), and that seems to fail if the key >>>> already exists. But it shouldn't for an rhltable, it should just add >>>> the new item to the linked list for that key. >>>> >>>> It looks like this has always been broken: adding to an rhltable during >>>> a resize event can cause EEXIST.... >>>> >>>> Would anyone like to check my work? I'm surprise that hasn't been >>>> noticed if it is really the case. >>>> >>>> >>> >>> I don't know this code well at all, but it looks correct to me: >>> >>> static void *rhashtable_try_insert(struct rhashtable *ht, const void *key, >>> struct rhash_head *obj) >>> { >>> struct bucket_table *new_tbl; >>> struct bucket_table *tbl; >>> struct rhash_lock_head __rcu **bkt; >>> unsigned long flags; >>> unsigned int hash; >>> void *data; >>> >>> new_tbl = rcu_dereference(ht->tbl); >>> >>> do { >>> tbl = new_tbl; >>> hash = rht_head_hashfn(ht, tbl, obj, ht->p); >>> if (rcu_access_pointer(tbl->future_tbl)) >>> /* Failure is OK */ >>> bkt = rht_bucket_var(tbl, hash); >>> else >>> bkt = rht_bucket_insert(ht, tbl, hash); >>> if (bkt == NULL) { >>> new_tbl = rht_dereference_rcu(tbl->future_tbl, ht); >>> data = ERR_PTR(-EAGAIN); >>> } else { >>> flags = rht_lock(tbl, bkt); >>> data = rhashtable_lookup_one(ht, bkt, tbl, >>> hash, key, obj); >>> new_tbl = rhashtable_insert_one(ht, bkt, tbl, >>> hash, obj, data); >>> if (PTR_ERR(new_tbl) != -EEXIST) >>> data = ERR_CAST(new_tbl); >>> >>> rht_unlock(tbl, bkt, flags); >>> } >>> } while (!IS_ERR_OR_NULL(new_tbl)); >>> >>> if (PTR_ERR(data) == -EAGAIN) >>> data = ERR_PTR(rhashtable_insert_rehash(ht, tbl) ?: >>> -EAGAIN); >>> >>> return data; >>> } >>> >>> I'm assuming the part we need to worry about is where >>> rhashtable_insert_one returns -EEXIST. >>> >>> It holds the rht_lock across the lookup and insert though. So if >>> rhashtable_insert_one returns -EEXIST, then "data" must be something >>> valid. In that case, "data" won't be overwritten and it will fall >>> through and return the pointer to the entry already there. >>> >>> That said, this logic is really convoluted, so I may have missed >>> something too. >> >> This is the issue I was concerned about after my review: it's >> obvious that the rhtable API can return -EEXIST, but it's just >> really hard to tell whether the rh/l/table API will ever return >> -EEXIST. >> >> As Neil says, the rhtable "hash table full" case should not happen >> with rhltable. But can we prove that? >> >> If we are not yet confident, then maybe PATCH 1/3 should replace >> the "if (ret == -EEXIST)" with "WARN_ON(ret == -EEXIST)"...? It's >> also possible to ask the human(s) who constructed the rhltable >> code. :-) >> >> >>>>> Cc: Youzhong Yang <youzhong@gmail.com> >>>>> Fixes: c6593366c0bf ("nfsd: don't kill nfsd_files because of lease break error") >>>>> Signed-off-by: Jeff Layton <jlayton@kernel.org> >>>>> --- >>>>> This is replacement for PATCH 1/3 in the series I sent yesterday. I >>>>> think it makes sense to just eliminate this case. >>>>> --- >>>>> fs/nfsd/filecache.c | 2 -- >>>>> 1 file changed, 2 deletions(-) >>>>> >>>>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c >>>>> index f84913691b78..b9dc7c22242c 100644 >>>>> --- a/fs/nfsd/filecache.c >>>>> +++ b/fs/nfsd/filecache.c >>>>> @@ -1038,8 +1038,6 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, >>>>> if (likely(ret == 0)) >>>>> goto open_file; >>>>> >>>>> - if (ret == -EEXIST) >>>>> - goto retry; >>>>> trace_nfsd_file_insert_err(rqstp, inode, may_flags, ret); >>>>> status = nfserr_jukebox; >>>>> goto construction_err; >>>>> >>>>> --- >>>>> base-commit: ec1772c39fa8dd85340b1a02040806377ffbff27 >>>>> change-id: 20240711-nfsd-next-c9d17f66e2bd >>>>> >>>>> Best regards, >>>>> -- >>>>> Jeff Layton <jlayton@kernel.org> >>>>> >>>>> >>>> >>> >>> -- >>> Jeff Layton <jlayton@kernel.org> >> >> -- >> Chuck Lever -- Chuck Lever
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c index f84913691b78..b9dc7c22242c 100644 --- a/fs/nfsd/filecache.c +++ b/fs/nfsd/filecache.c @@ -1038,8 +1038,6 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp, if (likely(ret == 0)) goto open_file; - if (ret == -EEXIST) - goto retry; trace_nfsd_file_insert_err(rqstp, inode, may_flags, ret); status = nfserr_jukebox; goto construction_err;
Given that we do the search and insertion while holding the i_lock, I don't think it's possible for us to get EEXIST here. Remove this case. Cc: Youzhong Yang <youzhong@gmail.com> Fixes: c6593366c0bf ("nfsd: don't kill nfsd_files because of lease break error") Signed-off-by: Jeff Layton <jlayton@kernel.org> --- This is replacement for PATCH 1/3 in the series I sent yesterday. I think it makes sense to just eliminate this case. --- fs/nfsd/filecache.c | 2 -- 1 file changed, 2 deletions(-) --- base-commit: ec1772c39fa8dd85340b1a02040806377ffbff27 change-id: 20240711-nfsd-next-c9d17f66e2bd Best regards,