Message ID: 20230824144403.2135739-1-chengming.zhou@linux.dev (mailing list archive)
Series: blk-mq: optimize the queue_rqs() support
On 8/24/23 07:43, chengming.zhou@linux.dev wrote:
> From: Chengming Zhou <zhouchengming@bytedance.com>
>
> The current queue_rqs() support has limitation that it can't work on
> shared tags queue, which is resolved by patch 1-3. We move the account
> of active requests to where we really allocate the driver tag.
>
> This is clearer and matched with the unaccount side which now happen
> when we put the driver tag. And we can remove RQF_MQ_INFLIGHT, which
> was used to avoid double account problem of flush request.
>
> Another problem is that the driver that support queue_rqs() has to
> set inflight request table by itself, which is resolved in patch 4.
>
> The patch 5 fixes a potential race problem which may cause false
> timeout because of the reorder of rq->state and rq->deadline.
>
> The patch 6 add support queue_rqs() for null_blk, which showed a
> 3.6% IOPS improvement in fio/t/io_uring benchmark on my test VM.
> And we also use it for testing queue_rqs() on shared tags queue.

Hi Jens and Christoph,

This patch series would be simplified significantly if the code for
fair tag allocation would be removed first
(https://lore.kernel.org/linux-block/20230103195337.158625-1-bvanassche@acm.org/, January 2023).
It has been proposed to improve fair tag sharing but the complexity of
the proposed alternative is scary
(https://lore.kernel.org/linux-block/20230618160738.54385-1-yukuai1@huaweicloud.com/, June 2023).
Does everyone agree with removing the code for fair tag sharing - code
that significantly hurts performance of UFS devices and code that did
not exist in the legacy block layer?

Thanks,

Bart.
On 2023/8/25 01:02, Bart Van Assche wrote:
> On 8/24/23 07:43, chengming.zhou@linux.dev wrote:
>> From: Chengming Zhou <zhouchengming@bytedance.com>
>>
>> The current queue_rqs() support has limitation that it can't work on
>> shared tags queue, which is resolved by patch 1-3. We move the account
>> of active requests to where we really allocate the driver tag.
>>
>> This is clearer and matched with the unaccount side which now happen
>> when we put the driver tag. And we can remove RQF_MQ_INFLIGHT, which
>> was used to avoid double account problem of flush request.
>>
>> Another problem is that the driver that support queue_rqs() has to
>> set inflight request table by itself, which is resolved in patch 4.
>>
>> The patch 5 fixes a potential race problem which may cause false
>> timeout because of the reorder of rq->state and rq->deadline.
>>
>> The patch 6 add support queue_rqs() for null_blk, which showed a
>> 3.6% IOPS improvement in fio/t/io_uring benchmark on my test VM.
>> And we also use it for testing queue_rqs() on shared tags queue.
>
> Hi Jens and Christoph,
>
> This patch series would be simplified significantly if the code for
> fair tag allocation would be removed first
> (https://lore.kernel.org/linux-block/20230103195337.158625-1-bvanassche@acm.org/, January 2023).
> It has been proposed to improve fair tag sharing but the complexity of
> the proposed alternative is scary
> (https://lore.kernel.org/linux-block/20230618160738.54385-1-yukuai1@huaweicloud.com/, June 2023).
> Does everyone agree with removing the code for fair tag sharing - code
> that significantly hurts performance of UFS devices and code that did
> not exist in the legacy block layer?

Hi Bart, thanks for the references!

I don't know the details of the UFS devices bad performance problem.
But I feel it maybe caused by the too lazy queue idle handling, which
is now only handled in queue timeout work.

Another problem maybe the wakeup batch algorithm, which is too subtle.
And there were some IO hang problems caused by it in the past.

So yes, we should improve it, although I don't have good idea for now,
need to do some tests and analysis.

As for removing all this code, I don't know from my limited knowledge.
It was introduced to improve relative fair tags sharing between queues,
to avoid starvation. And the proposed alternative looks too complex to me.

Thanks.
On 8/25/23 01:24, Chengming Zhou wrote:
> I don't know the details of the UFS devices bad performance problem.
> But I feel it maybe caused by the too lazy queue idle handling, which
> is now only handled in queue timeout work.

Hi Chengming,

The root cause of the UFS performance problem is the fair sharing
algorithm itself: reducing the active queue count only happens after
the request queue timeout has expired. This is way too slow. Last time
it was proposed to remove that algorithm, Yu Kuai promised to replace
it with a better algorithm. Since progress on the replacement
algorithm has stalled, I'm asking again whether everyone agrees to
remove the fairness algorithm.

Thanks,

Bart.
On 2023/8/24 22:43, chengming.zhou@linux.dev wrote:
> From: Chengming Zhou <zhouchengming@bytedance.com>
>
> The current queue_rqs() support has limitation that it can't work on
> shared tags queue, which is resolved by patch 1-3. We move the account
> of active requests to where we really allocate the driver tag.
>
> This is clearer and matched with the unaccount side which now happen
> when we put the driver tag. And we can remove RQF_MQ_INFLIGHT, which
> was used to avoid double account problem of flush request.
>
> Another problem is that the driver that support queue_rqs() has to
> set inflight request table by itself, which is resolved in patch 4.
>
> The patch 5 fixes a potential race problem which may cause false
> timeout because of the reorder of rq->state and rq->deadline.
>
> The patch 6 add support queue_rqs() for null_blk, which showed a
> 3.6% IOPS improvement in fio/t/io_uring benchmark on my test VM.
> And we also use it for testing queue_rqs() on shared tags queue.

Hello, gentle ping.

Thanks.

>
> Thanks for review!
>
> Chengming Zhou (6):
>   blk-mq: account active requests when get driver tag
>   blk-mq: remove RQF_MQ_INFLIGHT
>   blk-mq: support batched queue_rqs() on shared tags queue
>   blk-mq: update driver tags request table when start request
>   blk-mq: fix potential reorder of request state and deadline
>   block/null_blk: add queue_rqs() support
>
>  block/blk-flush.c             | 11 ++-----
>  block/blk-mq-debugfs.c        |  1 -
>  block/blk-mq.c                | 53 ++++++++++++++------------------
>  block/blk-mq.h                | 57 ++++++++++++++++++++++++-----------
>  drivers/block/null_blk/main.c | 20 ++++++++++++
>  drivers/block/virtio_blk.c    |  2 --
>  drivers/nvme/host/pci.c       |  1 -
>  include/linux/blk-mq.h        |  2 --
>  8 files changed, 84 insertions(+), 63 deletions(-)
>
From: Chengming Zhou <zhouchengming@bytedance.com>

The current queue_rqs() support has limitation that it can't work on
shared tags queue, which is resolved by patch 1-3. We move the account
of active requests to where we really allocate the driver tag.

This is clearer and matched with the unaccount side which now happen
when we put the driver tag. And we can remove RQF_MQ_INFLIGHT, which
was used to avoid double account problem of flush request.

Another problem is that the driver that support queue_rqs() has to
set inflight request table by itself, which is resolved in patch 4.

The patch 5 fixes a potential race problem which may cause false
timeout because of the reorder of rq->state and rq->deadline.

The patch 6 add support queue_rqs() for null_blk, which showed a
3.6% IOPS improvement in fio/t/io_uring benchmark on my test VM.
And we also use it for testing queue_rqs() on shared tags queue.

Thanks for review!

Chengming Zhou (6):
  blk-mq: account active requests when get driver tag
  blk-mq: remove RQF_MQ_INFLIGHT
  blk-mq: support batched queue_rqs() on shared tags queue
  blk-mq: update driver tags request table when start request
  blk-mq: fix potential reorder of request state and deadline
  block/null_blk: add queue_rqs() support

 block/blk-flush.c             | 11 ++-----
 block/blk-mq-debugfs.c        |  1 -
 block/blk-mq.c                | 53 ++++++++++++++------------------
 block/blk-mq.h                | 57 ++++++++++++++++++++++++-----------
 drivers/block/null_blk/main.c | 20 ++++++++++++
 drivers/block/virtio_blk.c    |  2 --
 drivers/nvme/host/pci.c       |  1 -
 include/linux/blk-mq.h        |  2 --
 8 files changed, 84 insertions(+), 63 deletions(-)