Message ID | 20220422141226.1831426-1-bfoster@redhat.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | xfs: fix soft lockup via spinning in filestream ag selection loop | expand |
On Fri, Apr 22, 2022 at 10:12:26AM -0400, Brian Foster wrote: > The filestream AG selection loop uses pagf data to aid in AG > selection, which depends on pagf initialization. If the in-core > structure is not initialized, the caller invokes the AGF read path > to do so and carries on. If another task enters the loop and finds > a pagf init already in progress, the AGF read returns -EAGAIN and > the task continues the loop. This does not increment the current ag > index, however, which means the task spins on the current AGF buffer > until unlocked. > > If the AGF read I/O submitted by the initial task happens to be > delayed for whatever reason, this results in soft lockup warnings Is there a specific 'whatever reason' going on here? > via the spinning task. This is reproduced by xfs/170. To avoid this > problem, fix the AGF trylock failure path to properly iterate to the > next AG. If a task iterates all AGs without making progress, the > trylock behavior is dropped in favor of blocking locks and thus a > soft lockup is no longer possible. > > Fixes: f48e2df8a877ca1c ("xfs: make xfs_*read_agf return EAGAIN to ALLOC_FLAG_TRYLOCK callers") Ooops, this was a major braino on my part. Reviewed-by: Darrick J. Wong <djwong@kernel.org> --D > Signed-off-by: Brian Foster <bfoster@redhat.com> > --- > > I included the Fixes: tag because this looks like a regression in said > commit, but I've not explicitly verified. > > Brian > > fs/xfs/xfs_filestream.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/fs/xfs/xfs_filestream.c b/fs/xfs/xfs_filestream.c > index 6a3ce0f6dc9e..be9bcf8a1f99 100644 > --- a/fs/xfs/xfs_filestream.c > +++ b/fs/xfs/xfs_filestream.c > @@ -128,11 +128,12 @@ xfs_filestream_pick_ag( > if (!pag->pagf_init) { > err = xfs_alloc_pagf_init(mp, NULL, ag, trylock); > if (err) { > - xfs_perag_put(pag); > - if (err != -EAGAIN) > + if (err != -EAGAIN) { > + xfs_perag_put(pag); > return err; > + } > /* Couldn't lock the AGF, skip this AG. */ > - continue; > + goto next_ag; > } > } > > -- > 2.34.1 >
On Fri, Apr 22, 2022 at 09:00:21AM -0700, Darrick J. Wong wrote: > On Fri, Apr 22, 2022 at 10:12:26AM -0400, Brian Foster wrote: > > The filestream AG selection loop uses pagf data to aid in AG > > selection, which depends on pagf initialization. If the in-core > > structure is not initialized, the caller invokes the AGF read path > > to do so and carries on. If another task enters the loop and finds > > a pagf init already in progress, the AGF read returns -EAGAIN and > > the task continues the loop. This does not increment the current ag > > index, however, which means the task spins on the current AGF buffer > > until unlocked. > > > > If the AGF read I/O submitted by the initial task happens to be > > delayed for whatever reason, this results in soft lockup warnings > > Is there a specific 'whatever reason' going on here? > Presumably.. given this seems to reproduce reliably or not at all in certain environments/configs, my suspicion was that either the timing of the test changes enough such that some other task involved with the test is able to load the bdev, or otherwise timing changes just enough to trigger the pagf_init race and the subsequent spinning is what exacerbates the delay (i.e. burning cpu and subsequent soft lockup BUG starve out some part(s) of the I/O submission/completion processing). I've no tangible evidence for either aside from the latter seems fairly logical when you consider that the test consistently completes in 3-4 seconds with the fix in place, but without it we consistently hit multiple instances of the soft lockup detector (on ~20s intervals IIRC) and the system seems to melt down indefinitely. *shrug* Brian > > via the spinning task. This is reproduced by xfs/170. To avoid this > > problem, fix the AGF trylock failure path to properly iterate to the > > next AG. If a task iterates all AGs without making progress, the > > trylock behavior is dropped in favor of blocking locks and thus a > > soft lockup is no longer possible. > > > > Fixes: f48e2df8a877ca1c ("xfs: make xfs_*read_agf return EAGAIN to ALLOC_FLAG_TRYLOCK callers") > > Ooops, this was a major braino on my part. > Reviewed-by: Darrick J. Wong <djwong@kernel.org> > > --D > > > Signed-off-by: Brian Foster <bfoster@redhat.com> > > --- > > > > I included the Fixes: tag because this looks like a regression in said > > commit, but I've not explicitly verified. > > > > Brian > > > > fs/xfs/xfs_filestream.c | 7 ++++--- > > 1 file changed, 4 insertions(+), 3 deletions(-) > > > > diff --git a/fs/xfs/xfs_filestream.c b/fs/xfs/xfs_filestream.c > > index 6a3ce0f6dc9e..be9bcf8a1f99 100644 > > --- a/fs/xfs/xfs_filestream.c > > +++ b/fs/xfs/xfs_filestream.c > > @@ -128,11 +128,12 @@ xfs_filestream_pick_ag( > > if (!pag->pagf_init) { > > err = xfs_alloc_pagf_init(mp, NULL, ag, trylock); > > if (err) { > > - xfs_perag_put(pag); > > - if (err != -EAGAIN) > > + if (err != -EAGAIN) { > > + xfs_perag_put(pag); > > return err; > > + } > > /* Couldn't lock the AGF, skip this AG. */ > > - continue; > > + goto next_ag; > > } > > } > > > > -- > > 2.34.1 > > >
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Fri, Apr 22, 2022 at 12:22:28PM -0400, Brian Foster wrote: > On Fri, Apr 22, 2022 at 09:00:21AM -0700, Darrick J. Wong wrote: > > On Fri, Apr 22, 2022 at 10:12:26AM -0400, Brian Foster wrote: > > > The filestream AG selection loop uses pagf data to aid in AG > > > selection, which depends on pagf initialization. If the in-core > > > structure is not initialized, the caller invokes the AGF read path > > > to do so and carries on. If another task enters the loop and finds > > > a pagf init already in progress, the AGF read returns -EAGAIN and > > > the task continues the loop. This does not increment the current ag > > > index, however, which means the task spins on the current AGF buffer > > > until unlocked. > > > > > > If the AGF read I/O submitted by the initial task happens to be > > > delayed for whatever reason, this results in soft lockup warnings > > > > Is there a specific 'whatever reason' going on here? > > > > Presumably.. given this seems to reproduce reliably or not at all in > certain environments/configs, my suspicion was that either the timing of > the test changes enough such that some other task involved with the test > is able to load the bdev, or otherwise timing changes just enough to > trigger the pagf_init race and the subsequent spinning is what > exacerbates the delay (i.e. burning cpu and subsequent soft lockup BUG > starve out some part(s) of the I/O submission/completion processing). > I've no tangible evidence for either aside from the latter seems fairly > logical when you consider that the test consistently completes in 3-4 > seconds with the fix in place, but without it we consistently hit > multiple instances of the soft lockup detector (on ~20s intervals IIRC) > and the system seems to melt down indefinitely. *shrug* Ah, ok, so there wasn't any specific event that was causing AGF IO to take a long time, it's just that a thread running the filestream allocator could fail the trylock loop for any reason for long enough to trip the hangcheck warning. --D > Brian > > > > via the spinning task. This is reproduced by xfs/170. To avoid this > > > problem, fix the AGF trylock failure path to properly iterate to the > > > next AG. If a task iterates all AGs without making progress, the > > > trylock behavior is dropped in favor of blocking locks and thus a > > > soft lockup is no longer possible. > > > > > > Fixes: f48e2df8a877ca1c ("xfs: make xfs_*read_agf return EAGAIN to ALLOC_FLAG_TRYLOCK callers") > > > > Ooops, this was a major braino on my part. > > Reviewed-by: Darrick J. Wong <djwong@kernel.org> > > > > --D > > > > > Signed-off-by: Brian Foster <bfoster@redhat.com> > > > --- > > > > > > I included the Fixes: tag because this looks like a regression in said > > > commit, but I've not explicitly verified. > > > > > > Brian > > > > > > fs/xfs/xfs_filestream.c | 7 ++++--- > > > 1 file changed, 4 insertions(+), 3 deletions(-) > > > > > > diff --git a/fs/xfs/xfs_filestream.c b/fs/xfs/xfs_filestream.c > > > index 6a3ce0f6dc9e..be9bcf8a1f99 100644 > > > --- a/fs/xfs/xfs_filestream.c > > > +++ b/fs/xfs/xfs_filestream.c > > > @@ -128,11 +128,12 @@ xfs_filestream_pick_ag( > > > if (!pag->pagf_init) { > > > err = xfs_alloc_pagf_init(mp, NULL, ag, trylock); > > > if (err) { > > > - xfs_perag_put(pag); > > > - if (err != -EAGAIN) > > > + if (err != -EAGAIN) { > > > + xfs_perag_put(pag); > > > return err; > > > + } > > > /* Couldn't lock the AGF, skip this AG. */ > > > - continue; > > > + goto next_ag; > > > } > > > } > > > > > > -- > > > 2.34.1 > > > > > >
diff --git a/fs/xfs/xfs_filestream.c b/fs/xfs/xfs_filestream.c index 6a3ce0f6dc9e..be9bcf8a1f99 100644 --- a/fs/xfs/xfs_filestream.c +++ b/fs/xfs/xfs_filestream.c @@ -128,11 +128,12 @@ xfs_filestream_pick_ag( if (!pag->pagf_init) { err = xfs_alloc_pagf_init(mp, NULL, ag, trylock); if (err) { - xfs_perag_put(pag); - if (err != -EAGAIN) + if (err != -EAGAIN) { + xfs_perag_put(pag); return err; + } /* Couldn't lock the AGF, skip this AG. */ - continue; + goto next_ag; } }
The filestream AG selection loop uses pagf data to aid in AG selection, which depends on pagf initialization. If the in-core structure is not initialized, the caller invokes the AGF read path to do so and carries on. If another task enters the loop and finds a pagf init already in progress, the AGF read returns -EAGAIN and the task continues the loop. This does not increment the current ag index, however, which means the task spins on the current AGF buffer until unlocked. If the AGF read I/O submitted by the initial task happens to be delayed for whatever reason, this results in soft lockup warnings via the spinning task. This is reproduced by xfs/170. To avoid this problem, fix the AGF trylock failure path to properly iterate to the next AG. If a task iterates all AGs without making progress, the trylock behavior is dropped in favor of blocking locks and thus a soft lockup is no longer possible. Fixes: f48e2df8a877ca1c ("xfs: make xfs_*read_agf return EAGAIN to ALLOC_FLAG_TRYLOCK callers") Signed-off-by: Brian Foster <bfoster@redhat.com> --- I included the Fixes: tag because this looks like a regression in said commit, but I've not explicitly verified. Brian fs/xfs/xfs_filestream.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)