Message ID | 20250218153937.6125-3-cel@kernel.org (mailing list archive) |
---|---|
State | Under Review |
Delegated to: | Chuck Lever |
Headers | show |
Series | nfsd: filecache: various fixes | expand |
On Tue, 2025-02-18 at 10:39 -0500, cel@kernel.org wrote: > From: Chuck Lever <chuck.lever@oracle.com> > > Dave opines: > > IMO, there is no need to do this unnecessary work on every object > that is added to the LRU. Changing the gc worker to always run > every 2s and check if it has work to do like so: > > static void > nfsd_file_gc_worker(struct work_struct *work) > { > - nfsd_file_gc(); > - if (list_lru_count(&nfsd_file_lru)) > - nfsd_file_schedule_laundrette(); > + if (list_lru_count(&nfsd_file_lru)) > + nfsd_file_gc(); > + nfsd_file_schedule_laundrette(); > } > > means that nfsd_file_gc() will be run the same way and have the same > behaviour as the current code. When the system it idle, it does a > list_lru_count() check every 2 seconds and goes back to sleep. > That's going to be pretty much unnoticable on most machines that run > NFS servers. > > Suggested-by: Dave Chinner <david@fromorbit.com> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com> > --- > fs/nfsd/filecache.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > index 909b5bc72bd3..2933cba1e5f4 100644 > --- a/fs/nfsd/filecache.c > +++ b/fs/nfsd/filecache.c > @@ -549,9 +549,9 @@ nfsd_file_gc(void) > static void > nfsd_file_gc_worker(struct work_struct *work) > { > - nfsd_file_gc(); > + nfsd_file_schedule_laundrette(); > if (list_lru_count(&nfsd_file_lru)) > - nfsd_file_schedule_laundrette(); > + nfsd_file_gc(); > } > > static unsigned long Given that it's a delayed workqueue job, it probably doesn't matter, but why schedule the laundrette before doing the nfsd_file_gc() call?
On Tue, Feb 18, 2025 at 10:39:32AM -0500, cel@kernel.org wrote: > From: Chuck Lever <chuck.lever@oracle.com> > > Dave opines: > > IMO, there is no need to do this unnecessary work on every object > that is added to the LRU. Changing the gc worker to always run > every 2s and check if it has work to do like so: > > static void > nfsd_file_gc_worker(struct work_struct *work) > { > - nfsd_file_gc(); > - if (list_lru_count(&nfsd_file_lru)) > - nfsd_file_schedule_laundrette(); > + if (list_lru_count(&nfsd_file_lru)) > + nfsd_file_gc(); > + nfsd_file_schedule_laundrette(); > } > > means that nfsd_file_gc() will be run the same way and have the same > behaviour as the current code. When the system it idle, it does a > list_lru_count() check every 2 seconds and goes back to sleep. > That's going to be pretty much unnoticable on most machines that run > NFS servers. > > Suggested-by: Dave Chinner <david@fromorbit.com> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com> > --- > fs/nfsd/filecache.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > index 909b5bc72bd3..2933cba1e5f4 100644 > --- a/fs/nfsd/filecache.c > +++ b/fs/nfsd/filecache.c > @@ -549,9 +549,9 @@ nfsd_file_gc(void) > static void > nfsd_file_gc_worker(struct work_struct *work) > { > - nfsd_file_gc(); > + nfsd_file_schedule_laundrette(); > if (list_lru_count(&nfsd_file_lru)) > - nfsd_file_schedule_laundrette(); > + nfsd_file_gc(); > } IMO, the scheduling of new work is the wrong way around. It should be done on completion of gc work, not before gc work is started. i.e. If nfsd_file_gc() is overly delayed (because load, rt preempt, etc), then a new gc worker will be started in 2s regardless of whether the currently running gc worker has completed or not. Worse case, there's a spinlock hang bug in nfsd_file_gc(). This code will end up with N worker threads all spinning up in nfsd_file_gc() chewing up all the CPU in the system, not making any progress.... If we schedule new work after completion of this work, then gc might hang but it won't slowly drag the entire system down with it. -Dave.
On Wed, 19 Feb 2025, Dave Chinner wrote: > On Tue, Feb 18, 2025 at 10:39:32AM -0500, cel@kernel.org wrote: > > From: Chuck Lever <chuck.lever@oracle.com> > > > > Dave opines: > > > > IMO, there is no need to do this unnecessary work on every object > > that is added to the LRU. Changing the gc worker to always run > > every 2s and check if it has work to do like so: > > > > static void > > nfsd_file_gc_worker(struct work_struct *work) > > { > > - nfsd_file_gc(); > > - if (list_lru_count(&nfsd_file_lru)) > > - nfsd_file_schedule_laundrette(); > > + if (list_lru_count(&nfsd_file_lru)) > > + nfsd_file_gc(); > > + nfsd_file_schedule_laundrette(); > > } > > > > means that nfsd_file_gc() will be run the same way and have the same > > behaviour as the current code. When the system it idle, it does a > > list_lru_count() check every 2 seconds and goes back to sleep. > > That's going to be pretty much unnoticable on most machines that run > > NFS servers. > > > > Suggested-by: Dave Chinner <david@fromorbit.com> > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com> > > --- > > fs/nfsd/filecache.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > > index 909b5bc72bd3..2933cba1e5f4 100644 > > --- a/fs/nfsd/filecache.c > > +++ b/fs/nfsd/filecache.c > > @@ -549,9 +549,9 @@ nfsd_file_gc(void) > > static void > > nfsd_file_gc_worker(struct work_struct *work) > > { > > - nfsd_file_gc(); > > + nfsd_file_schedule_laundrette(); > > if (list_lru_count(&nfsd_file_lru)) > > - nfsd_file_schedule_laundrette(); > > + nfsd_file_gc(); > > } > > IMO, the scheduling of new work is the wrong way around. It should > be done on completion of gc work, not before gc work is started. > > i.e. If nfsd_file_gc() is overly delayed (because load, rt preempt, > etc), then a new gc worker will be started in 2s regardless of > whether the currently running gc worker has completed or not. > > Worse case, there's a spinlock hang bug in nfsd_file_gc(). This code > will end up with N worker threads all spinning up in nfsd_file_gc() > chewing up all the CPU in the system, not making any progress.... > If we schedule new work after completion of this work, then gc might > hang but it won't slowly drag the entire system down with it. While I agree that the enqueue is best done later rather than earlier, I think your worst-case is over-stated. queue_delayed_work() is a no-op if WORK_STRUCT_PENDING_BIT is still set. A given work_struct can only be running once. If the timer fires while nfsd_file_gc() is still running, nfsd_filecache_laundrette will be queued to start immediately that the currently running instance completes. So the worst cases is that there will always be one instance running. Thanks, NeilBrown
On 2/18/25 7:33 PM, Dave Chinner wrote: > On Tue, Feb 18, 2025 at 10:39:32AM -0500, cel@kernel.org wrote: >> From: Chuck Lever <chuck.lever@oracle.com> >> >> Dave opines: >> >> IMO, there is no need to do this unnecessary work on every object >> that is added to the LRU. Changing the gc worker to always run >> every 2s and check if it has work to do like so: >> >> static void >> nfsd_file_gc_worker(struct work_struct *work) >> { >> - nfsd_file_gc(); >> - if (list_lru_count(&nfsd_file_lru)) >> - nfsd_file_schedule_laundrette(); >> + if (list_lru_count(&nfsd_file_lru)) >> + nfsd_file_gc(); >> + nfsd_file_schedule_laundrette(); >> } >> >> means that nfsd_file_gc() will be run the same way and have the same >> behaviour as the current code. When the system it idle, it does a >> list_lru_count() check every 2 seconds and goes back to sleep. >> That's going to be pretty much unnoticable on most machines that run >> NFS servers. >> >> Suggested-by: Dave Chinner <david@fromorbit.com> >> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> >> --- >> fs/nfsd/filecache.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c >> index 909b5bc72bd3..2933cba1e5f4 100644 >> --- a/fs/nfsd/filecache.c >> +++ b/fs/nfsd/filecache.c >> @@ -549,9 +549,9 @@ nfsd_file_gc(void) >> static void >> nfsd_file_gc_worker(struct work_struct *work) >> { >> - nfsd_file_gc(); >> + nfsd_file_schedule_laundrette(); >> if (list_lru_count(&nfsd_file_lru)) >> - nfsd_file_schedule_laundrette(); >> + nfsd_file_gc(); >> } > > IMO, the scheduling of new work is the wrong way around. It should > be done on completion of gc work, not before gc work is started. > > i.e. If nfsd_file_gc() is overly delayed (because load, rt preempt, > etc), then a new gc worker will be started in 2s regardless of > whether the currently running gc worker has completed or not. > > Worse case, there's a spinlock hang bug in nfsd_file_gc(). This code > will end up with N worker threads all spinning up in nfsd_file_gc() > chewing up all the CPU in the system, not making any progress.... > If we schedule new work after completion of this work, then gc might > hang but it won't slowly drag the entire system down with it. My bad. I miscopied your suggestion. Will fix in my tree.
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c index 909b5bc72bd3..2933cba1e5f4 100644 --- a/fs/nfsd/filecache.c +++ b/fs/nfsd/filecache.c @@ -549,9 +549,9 @@ nfsd_file_gc(void) static void nfsd_file_gc_worker(struct work_struct *work) { - nfsd_file_gc(); + nfsd_file_schedule_laundrette(); if (list_lru_count(&nfsd_file_lru)) - nfsd_file_schedule_laundrette(); + nfsd_file_gc(); } static unsigned long