[RFC] block: Revert 615939a2ae73

Message ID	169158653156.2034.8363987273532378512.stgit@bazille.1015granger.net (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-block-owner@vger.kernel.org> Subject: [PATCH RFC] block: Revert 615939a2ae73 From: Chuck Lever <cel@kernel.org> To: linux-block@vger.kernel.org Cc: Chuck Lever <chuck.lever@oracle.com> Date: Wed, 09 Aug 2023 09:08:51 -0400 Message-ID: <169158653156.2034.8363987273532378512.stgit@bazille.1015granger.net> User-Agent: StGit/1.5 MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Precedence: bulk
Series	[RFC] block: Revert 615939a2ae73 \| expand [RFC] block: Revert 615939a2ae73

Chuck Lever Aug. 9, 2023, 1:08 p.m. UTC

From: Chuck Lever <chuck.lever@oracle.com>

Even after commit 9f87fc4d72f5 ("block: queue data commands from the
flush state machine at the head"), I'm still seeing hangs on NFS
exports that reside on SATA devices.

Bisected to commit 615939a2ae73 ("blk-mq: defer to the normal
submission path for post-flush requests"). I've tested reverting
that commit on several kernels from 0aa69d53ac7c ("Merge tag
'for-6.5/io_uring-2023-06-23' of git://git.kernel.dk/linux") to
v6.5-rc5, and it seems to address the problem.

I also confirmed that commit 9f87fc4d72f5 ("block: queue data
commands from the flush state machine at the head") is still
necessary.

There is still no root cause.

Fixes: 615939a2ae73 ("blk-mq: defer to the normal submission path for post-flush requests")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 block/blk-flush.c |   11 -----------
 1 file changed, 11 deletions(-)

Jens Axboe Aug. 9, 2023, 3:51 p.m. UTC | #1

On 8/9/23 7:08 AM, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> Even after commit 9f87fc4d72f5 ("block: queue data commands from the
> flush state machine at the head"), I'm still seeing hangs on NFS
> exports that reside on SATA devices.
> 
> Bisected to commit 615939a2ae73 ("blk-mq: defer to the normal
> submission path for post-flush requests"). I've tested reverting
> that commit on several kernels from 0aa69d53ac7c ("Merge tag
> 'for-6.5/io_uring-2023-06-23' of git://git.kernel.dk/linux") to
> v6.5-rc5, and it seems to address the problem.
> 
> I also confirmed that commit 9f87fc4d72f5 ("block: queue data
> commands from the flush state machine at the head") is still
> necessary.

Adding the author - Christoph?

Christoph Hellwig Aug. 9, 2023, 4:11 p.m. UTC | #2

On Wed, Aug 09, 2023 at 09:51:18AM -0600, Jens Axboe wrote:
> > Bisected to commit 615939a2ae73 ("blk-mq: defer to the normal
> > submission path for post-flush requests"). I've tested reverting
> > that commit on several kernels from 0aa69d53ac7c ("Merge tag
> > 'for-6.5/io_uring-2023-06-23' of git://git.kernel.dk/linux") to
> > v6.5-rc5, and it seems to address the problem.
> > 
> > I also confirmed that commit 9f87fc4d72f5 ("block: queue data
> > commands from the flush state machine at the head") is still
> > necessary.
> 
> Adding the author - Christoph?

Hmm, weird.  I though we had all this sorted out a few weeks
ago when that 9f87fc4d72f5 fixed it for you?  What is different
in this round of testing?

Chuck Lever III Aug. 9, 2023, 4:26 p.m. UTC | #3

> On Aug 9, 2023, at 12:11 PM, Christoph Hellwig <hch@lst.de> wrote:
> 
> On Wed, Aug 09, 2023 at 09:51:18AM -0600, Jens Axboe wrote:
>>> Bisected to commit 615939a2ae73 ("blk-mq: defer to the normal
>>> submission path for post-flush requests"). I've tested reverting
>>> that commit on several kernels from 0aa69d53ac7c ("Merge tag
>>> 'for-6.5/io_uring-2023-06-23' of git://git.kernel.dk/linux") to
>>> v6.5-rc5, and it seems to address the problem.
>>> 
>>> I also confirmed that commit 9f87fc4d72f5 ("block: queue data
>>> commands from the flush state machine at the head") is still
>>> necessary.
>> 
>> Adding the author - Christoph?
> 
> Hmm, weird.  I though we had all this sorted out a few weeks
> ago when that 9f87fc4d72f5 fixed it for you?  What is different
> in this round of testing?

I don't know the answer to that... I suspect that I originally
tested the first revert on a kernel that also did not have 615939
applied. At the time we thought only one of those commits was
broken.

I've bisected this every way I can think of, and both fixes are
needed to fully stop the hangs.


--
Chuck Lever

Christoph Hellwig Aug. 9, 2023, 9:49 p.m. UTC | #4

Oh well.  I don't feel like we're going to find the root cause
given that its late in the merge window and I'm running out of
time I have to work due to the annual summer vacation.  Unless
someone like Chengming who knows the flush code pretty well
feels like working with chuck on a few more iterations we'll
have to revert it.  Which is going to be a very painful merge
with Chengming's work in the for-6.6 branch.

Chengming Zhou Aug. 10, 2023, 3:42 a.m. UTC | #5

On 2023/8/10 05:49, Christoph Hellwig wrote:
> Oh well.  I don't feel like we're going to find the root cause
> given that its late in the merge window and I'm running out of
> time I have to work due to the annual summer vacation.  Unless
> someone like Chengming who knows the flush code pretty well
> feels like working with chuck on a few more iterations we'll

Hi,

No problem, I can work with Chuck to find the root cause if Chuck
has time too, since I still can't reproduce it by myself.

Maybe we should first find what's the status of the hang request?
I can write a Drgn script to find if any request hang in the queue.

Christoph, it would be very helpful if you share some thoughts
and directions.

> have to revert it.  Which is going to be a very painful merge
> with Chengming's work in the for-6.6 branch.
> 

Maybe we can revert it manually if we still fail since that commit
just let postflushes go to the normal I/O path, instead of going to
the flush state machine.

So I think it should be fine if we just delete that case?

Chuck, could you please help to check this change on block/for-next?

Thanks!

diff --git a/block/blk-flush.c b/block/blk-flush.c
index e73dc22d05c1..7ea3c00f40ce 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -442,17 +442,6 @@ bool blk_insert_flush(struct request *rq)
                 * Queue for normal execution.
                 */
                return false;
-       case REQ_FSEQ_DATA | REQ_FSEQ_POSTFLUSH:
-               /*
-                * Initialize the flush fields and completion handler to trigger
-                * the post flush, and then just pass the command on.
-                */
-               blk_rq_init_flush(rq);
-               rq->flush.seq |= REQ_FSEQ_PREFLUSH;
-               spin_lock_irq(&fq->mq_flush_lock);
-               fq->flush_data_in_flight++;
-               spin_unlock_irq(&fq->mq_flush_lock);
-               return false;
        default:
                /*
                 * Mark the request as part of a flush sequence and submit it

Chuck Lever III Aug. 10, 2023, 1:47 p.m. UTC | #6

> On Aug 9, 2023, at 11:42 PM, Chengming Zhou <zhouchengming@bytedance.com> wrote:
> 
> On 2023/8/10 05:49, Christoph Hellwig wrote:
>> Oh well.  I don't feel like we're going to find the root cause
>> given that its late in the merge window and I'm running out of
>> time I have to work due to the annual summer vacation.  Unless
>> someone like Chengming who knows the flush code pretty well
>> feels like working with chuck on a few more iterations we'll
> 
> Hi,
> 
> No problem, I can work with Chuck to find the root cause if Chuck
> has time too, since I still can't reproduce it by myself.

Yes, I have some time to help.


> Maybe we should first find what's the status of the hang request?
> I can write a Drgn script to find if any request hang in the queue.
> 
> Christoph, it would be very helpful if you share some thoughts
> and directions.
> 
> 
>> have to revert it.  Which is going to be a very painful merge
>> with Chengming's work in the for-6.6 branch.
>> 
> 
> Maybe we can revert it manually if we still fail since that commit
> just let postflushes go to the normal I/O path, instead of going to
> the flush state machine.
> 
> So I think it should be fine if we just delete that case?

That's basically the fix I have been testing on v6.5-rc5 and earlier.


> Chuck, could you please help to check this change on block/for-next?

I'll set up a block/for-next kernel and try this out.


> Thanks!
> 
> diff --git a/block/blk-flush.c b/block/blk-flush.c
> index e73dc22d05c1..7ea3c00f40ce 100644
> --- a/block/blk-flush.c
> +++ b/block/blk-flush.c
> @@ -442,17 +442,6 @@ bool blk_insert_flush(struct request *rq)
>                 * Queue for normal execution.
>                 */
>                return false;
> -       case REQ_FSEQ_DATA | REQ_FSEQ_POSTFLUSH:
> -               /*
> -                * Initialize the flush fields and completion handler to trigger
> -                * the post flush, and then just pass the command on.
> -                */
> -               blk_rq_init_flush(rq);
> -               rq->flush.seq |= REQ_FSEQ_PREFLUSH;
> -               spin_lock_irq(&fq->mq_flush_lock);
> -               fq->flush_data_in_flight++;
> -               spin_unlock_irq(&fq->mq_flush_lock);
> -               return false;
>        default:
>                /*
>                 * Mark the request as part of a flush sequence and submit it

--
Chuck Lever

Christoph Hellwig Aug. 10, 2023, 3:58 p.m. UTC | #7

On Thu, Aug 10, 2023 at 11:42:50AM +0800, Chengming Zhou wrote:
> Christoph, it would be very helpful if you share some thoughts
> and directions.

I am not quite sure where the problems could even be.  I think the
interesting aspects are:

 - we are not directly processing through the normal submission path
   instead of adding the command to the flush_data_in_flight list and
   then going through blk_kick_flush.  So a command now gets executed
   earlier than it did before.

 - given that it only happens with SATA, it must be in some way related
   to the fact that ATA flush commands are not queud, that is when a
   flush is outstanding no other command can't.

Chuck Lever III Aug. 10, 2023, 4:02 p.m. UTC | #8

On Thu, Aug 10, 2023 at 11:42:50AM +0800, Chengming Zhou wrote:
> On 2023/8/10 05:49, Christoph Hellwig wrote:
> > Oh well.  I don't feel like we're going to find the root cause
> > given that its late in the merge window and I'm running out of
> > time I have to work due to the annual summer vacation.  Unless
> > someone like Chengming who knows the flush code pretty well
> > feels like working with chuck on a few more iterations we'll
> 
> Hi,
> 
> No problem, I can work with Chuck to find the root cause if Chuck
> has time too, since I still can't reproduce it by myself.
> 
> Maybe we should first find what's the status of the hang request?
> I can write a Drgn script to find if any request hang in the queue.
> 
> Christoph, it would be very helpful if you share some thoughts
> and directions.
> 
> 
> > have to revert it.  Which is going to be a very painful merge
> > with Chengming's work in the for-6.6 branch.
> > 
> 
> Maybe we can revert it manually if we still fail since that commit
> just let postflushes go to the normal I/O path, instead of going to
> the flush state machine.
> 
> So I think it should be fine if we just delete that case?
> 
> Chuck, could you please help to check this change on block/for-next?
> 
> Thanks!
> 
> diff --git a/block/blk-flush.c b/block/blk-flush.c
> index e73dc22d05c1..7ea3c00f40ce 100644
> --- a/block/blk-flush.c
> +++ b/block/blk-flush.c
> @@ -442,17 +442,6 @@ bool blk_insert_flush(struct request *rq)
>                  * Queue for normal execution.
>                  */
>                 return false;
> -       case REQ_FSEQ_DATA | REQ_FSEQ_POSTFLUSH:
> -               /*
> -                * Initialize the flush fields and completion handler to trigger
> -                * the post flush, and then just pass the command on.
> -                */
> -               blk_rq_init_flush(rq);
> -               rq->flush.seq |= REQ_FSEQ_PREFLUSH;
> -               spin_lock_irq(&fq->mq_flush_lock);
> -               fq->flush_data_in_flight++;
> -               spin_unlock_irq(&fq->mq_flush_lock);
> -               return false;
>         default:
>                 /*
>                  * Mark the request as part of a flush sequence and submit it

linux-block/for-next with the above hunk applied is not able to
reproduce the NFSD hangs on SATA devices.

[RFC] block: Revert 615939a2ae73

Commit Message

Comments

Patch