Message ID | 1634550083-202815-1-git-send-email-john.garry@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] blk-mq: Fix blk_mq_tagset_busy_iter() for shared tags | expand |
> -----Original Message----- > From: John Garry [mailto:john.garry@huawei.com] > Sent: Monday, October 18, 2021 3:11 PM > To: axboe@kernel.dk > Cc: ming.lei@redhat.com; linux-block@vger.kernel.org; linux- > kernel@vger.kernel.org; kashyap.desai@broadcom.com; hare@suse.de; John > Garry <john.garry@huawei.com> > Subject: [PATCH v2] blk-mq: Fix blk_mq_tagset_busy_iter() for shared tags > > Since it is now possible for a tagset to share a single set of tags, the iter > function should not re-iter the tags for the count of #hw queues in that case. > Rather it should just iter once. > > Fixes: e0fdf846c7bb ("blk-mq: Use shared tags for shared sbitmap support") > Reported-by: Kashyap Desai <kashyap.desai@broadcom.com> > Signed-off-by: John Garry <john.garry@huawei.com> > Reviewed-by: Ming Lei <ming.lei@redhat.com> > --- > Diff to v1: > - Add Ming's RB tag Now I noticed proper host_busy in my test. Still CPU hogging is not resolved, but issue addressed by this patch is resolved. Tested-by: Kashyap Desai <kashyap.desai@broadcom.com>
On 18/10/2021 19:49, Kashyap Desai wrote: >> -----Original Message----- >> From: John Garry [mailto:john.garry@huawei.com] >> Sent: Monday, October 18, 2021 3:11 PM >> To:axboe@kernel.dk >> Cc:ming.lei@redhat.com;linux-block@vger.kernel.org; linux- >> kernel@vger.kernel.org;kashyap.desai@broadcom.com;hare@suse.de; John >> Garry<john.garry@huawei.com> >> Subject: [PATCH v2] blk-mq: Fix blk_mq_tagset_busy_iter() for shared > tags >> Since it is now possible for a tagset to share a single set of tags, the > iter >> function should not re-iter the tags for the count of #hw queues in that > case. >> Rather it should just iter once. >> >> Fixes: e0fdf846c7bb ("blk-mq: Use shared tags for shared sbitmap > support") >> Reported-by: Kashyap Desai<kashyap.desai@broadcom.com> >> Signed-off-by: John Garry<john.garry@huawei.com> >> Reviewed-by: Ming Lei<ming.lei@redhat.com> >> --- >> Diff to v1: >> - Add Ming's RB tag > Now I noticed proper host_busy in my test. Still CPU hogging is not > resolved, but issue addressed by this patch is resolved. > > Tested-by: Kashyap Desai<kashyap.desai@broadcom.com> Hi Jens, Can you kindly consider picking up this patch? I'm still waiting for feedback from Kashyap on whether we should optimize the other iter functions for shared tags, but this one is a fix. Thanks!
On Mon, 18 Oct 2021 17:41:23 +0800, John Garry wrote: > Since it is now possible for a tagset to share a single set of tags, the > iter function should not re-iter the tags for the count of #hw queues in > that case. Rather it should just iter once. > > Applied, thanks! [1/1] blk-mq: Fix blk_mq_tagset_busy_iter() for shared tags commit: 0994c64eb4159ba019e7fedc7ba0dd6a69235b40 Best regards,
+ scsi mailing list > On Mon, 18 Oct 2021 17:41:23 +0800, John Garry wrote: > > Since it is now possible for a tagset to share a single set of tags, > > the iter function should not re-iter the tags for the count of #hw > > queues in that case. Rather it should just iter once. John - Recently we found issue of error hander thread never kicked off and this patch fix the issue. Without this patch, scsi error hander will not find correct host_busy counter. Take one simple case. There is one IO outstanding and that is getting timedout. Now SML wants to wake up EH thread only if, below condition met "scsi_host_busy(shost) == shost->host_failed" Without this patch, shared host tag enabled meagaraid_sas driver will find host_busy = actual outstanding * nr_hw_queues. Error handler thread will never be kicked-off. This patch is mandatory for fixing shared host tag feature and require to be part of stable kernel. Do you need more data for posting to stable kernel ? Kashyap > > > > > > Applied, thanks! > > [1/1] blk-mq: Fix blk_mq_tagset_busy_iter() for shared tags > commit: 0994c64eb4159ba019e7fedc7ba0dd6a69235b40 > > Best regards, > -- > Jens Axboe >
On 09/12/2021 13:52, Kashyap Desai wrote: > + scsi mailing list > >> On Mon, 18 Oct 2021 17:41:23 +0800, John Garry wrote: >>> Since it is now possible for a tagset to share a single set of tags, >>> the iter function should not re-iter the tags for the count of #hw >>> queues in that case. Rather it should just iter once. > John - Recently we found issue of error hander thread never kicked off and > this patch fix the issue. > Without this patch, scsi error hander will not find correct host_busy > counter. > > Take one simple case. There is one IO outstanding and that is getting > timedout. > Now SML wants to wake up EH thread only if, below condition met > "scsi_host_busy(shost) == shost->host_failed" > > Without this patch, shared host tag enabled meagaraid_sas driver will find > host_busy = actual outstanding * nr_hw_queues. > Error handler thread will never be kicked-off. > > This patch is mandatory for fixing shared host tag feature and require to be > part of stable kernel. > > Do you need more data for posting to stable kernel ? To be clear, are you saying that you see the issue which patch "blk-mq: Fix blk_mq_tagset_busy_iter() for shared tags" fixes before v5.16-rc? This patch (now commit 0994c64eb415) and the commit which it is supposed to fix, e155b0c238b2, will only be in v5.16, so I don't see anything which is needed in stable. Thanks, John
> On 09/12/2021 13:52, Kashyap Desai wrote: > > + scsi mailing list > > > >> On Mon, 18 Oct 2021 17:41:23 +0800, John Garry wrote: > >>> Since it is now possible for a tagset to share a single set of tags, > >>> the iter function should not re-iter the tags for the count of #hw > >>> queues in that case. Rather it should just iter once. > > John - Recently we found issue of error hander thread never kicked off > > and this patch fix the issue. > > Without this patch, scsi error hander will not find correct host_busy > > counter. > > > > Take one simple case. There is one IO outstanding and that is getting > > timedout. > > Now SML wants to wake up EH thread only if, below condition met > > "scsi_host_busy(shost) == shost->host_failed" > > > > Without this patch, shared host tag enabled meagaraid_sas driver will > > find host_busy = actual outstanding * nr_hw_queues. > > Error handler thread will never be kicked-off. > > > > This patch is mandatory for fixing shared host tag feature and require > > to be part of stable kernel. > > > > Do you need more data for posting to stable kernel ? > > To be clear, are you saying that you see the issue which patch "blk-mq: > Fix blk_mq_tagset_busy_iter() for shared tags" fixes before v5.16-rc? > > This patch (now commit 0994c64eb415) and the commit which it is supposed > to fix, e155b0c238b2, will only be in v5.16, so I don't see anything which > is > needed in stable. Hi John Yes. No need of posting this to stable. There is still an issue which we are tracking. It is not always reproducible. I am injecting artificial Task abort on my setup to reproduce it. It happens on rhel8.5 most of the time. It is a timing issue so thinking of reproducing on other kernel as well. I am suspecting issue might be due to missing commit - 67f3b2f822b7e71cfc9b42dbd9f3144fa2933e0b of [PATCH] blk-mq: avoid to iterate over stale request Whenever I notice the issue, there was a symptoms that host_busy is getting counted for each hctx individually. Let me collect more data and I will start another thread. Kashyap > > Thanks, > John
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c index 72a2724a4eee..c943b6529619 100644 --- a/block/blk-mq-tag.c +++ b/block/blk-mq-tag.c @@ -378,9 +378,12 @@ void blk_mq_all_tag_iter(struct blk_mq_tags *tags, busy_tag_iter_fn *fn, void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset, busy_tag_iter_fn *fn, void *priv) { - int i; + unsigned int flags = tagset->flags; + int i, nr_tags; + + nr_tags = blk_mq_is_shared_tags(flags) ? 1 : tagset->nr_hw_queues; - for (i = 0; i < tagset->nr_hw_queues; i++) { + for (i = 0; i < nr_tags; i++) { if (tagset->tags && tagset->tags[i]) __blk_mq_all_tag_iter(tagset->tags[i], fn, priv, BT_TAG_ITER_STARTED);