Message ID | 20241217024047.1091893-5-yukuai1@huaweicloud.com (mailing list archive) |
---|---|
State | New, archived |
Series | lib/sbitmap: fix shallow_depth tag allocation |
On 12/16/24 6:40 PM, Yu Kuai wrote:
> +static unsigned int min_async_depth = 64;
> +module_param(min_async_depth, int, 0444);
> +MODULE_PARM_DESC(min_async_depth, "The minimal number of tags available for asynchronous requests");

Users may not like it that this parameter is read-only.

> @@ -513,9 +523,12 @@ static void dd_depth_updated(struct blk_mq_hw_ctx *hctx)
>  	struct deadline_data *dd = q->elevator->elevator_data;
>  	struct blk_mq_tags *tags = hctx->sched_tags;
>
> -	dd->async_depth = max(1UL, 3 * q->nr_requests / 4);

Shouldn't this assignment be retained instead of removing it? Additionally, some time ago a user requested to initialize dd->async_depth to q->nr_requests instead of 3/4 of that value because the lower value introduced a performance regression.

Thanks,

Bart.
Hi,

On 2024/12/18 6:13, Bart Van Assche wrote:
> On 12/16/24 6:40 PM, Yu Kuai wrote:
>> +static unsigned int min_async_depth = 64;
>> +module_param(min_async_depth, int, 0444);
>> +MODULE_PARM_DESC(min_async_depth, "The minimal number of tags
>> available for asynchronous requests");
>
> Users may not like it that this parameter is read-only.
>
>> @@ -513,9 +523,12 @@ static void dd_depth_updated(struct blk_mq_hw_ctx
>> *hctx)
>>  	struct deadline_data *dd = q->elevator->elevator_data;
>>  	struct blk_mq_tags *tags = hctx->sched_tags;
>> -	dd->async_depth = max(1UL, 3 * q->nr_requests / 4);
>
> Shouldn't this assignment be retained instead of removing it?
> Additionally, some time ago a user requested to initialize
> dd->async_depth to q->nr_requests instead of 3/4 of that value because
> the lower value introduced a performance regression.

dd->async_depth is initialized to 0 now; functionally I think it's the
same as q->nr_requests. And I do explain this in the commit message,
maybe it's not clear?

BTW, if the user sets a new nr_requests and async_depth < the new
nr_requests, async_depth won't be reset after this patch.

Thanks,
Kuai
Hi,

On 2024/12/18 9:12, Yu Kuai wrote:
>> Users may not like it that this parameter is read-only.

I can't make this read-write, because setting a lower value will cause
problems for the existing elevator, because wake_batch has to be updated
as well.

Thanks,
Kuai
On 12/17/24 5:14 PM, Yu Kuai wrote:
> I can't make this read-write, because set lower value will cause
> problems for existing elevator, because wake_batch has to be
> updated as well.

Should the request queue perhaps be frozen before wake_batch is updated?

Thanks,

Bart.
On 12/17/24 5:12 PM, Yu Kuai wrote:
> dd->async_depth is initialized to 0 now, functionally I think
> it's the same as q->nr_requests. And I do explain this in commit
> message, maybe it's not clear?

It would be good to add a comment in the source code that explains that
__blk_mq_get_tag() does not restrict tag allocation if dd->async_depth
is zero because that causes data->shallow_depth to be zero.

Thanks,

Bart.
Hi,

On 2024/12/19 2:00, Bart Van Assche wrote:
> On 12/17/24 5:14 PM, Yu Kuai wrote:
>> I can't make this read-write, because set lower value will cause
>> problems for existing elevator, because wake_batch has to be
>> updated as well.
>
> Should the request queue perhaps be frozen before wake_batch is updated?

Yes, we should. The good thing is that for now it's frozen already in:
- the update nr_requests context;
- elevator switching.

However, if you mean doing this while writing async_depth, freezing the
queue is not enough; we also have to ping all the hctx under
q->sysfs_lock, which is not possible.

Or if you mean doing this while writing a new min_async_depth, then we
would have to update wake_batch for all the queues in the system, too
crazy for me...

Thanks,
Kuai
Hi,

On 2024/12/19 2:06, Bart Van Assche wrote:
> On 12/17/24 5:12 PM, Yu Kuai wrote:
>> dd->async_depth is initialized to 0 now, functionally I think
>> it's the same as q->nr_requests. And I do explain this in commit
>> message, maybe it's not clear?
>
> It would be good to add a comment in the source code that explains that
> __blk_mq_get_tag() does not restrict tag allocation if dd->async_depth
> is zero because that causes data->shallow_depth to be zero.

Ok.

Thanks,
Kuai
On 12/18/24 5:21 PM, Yu Kuai wrote:
> Hi,
>
> On 2024/12/19 2:00, Bart Van Assche wrote:
>> On 12/17/24 5:14 PM, Yu Kuai wrote:
>>> I can't make this read-write, because set lower value will cause
>>> problems for existing elevator, because wake_batch has to be
>>> updated as well.
>>
>> Should the request queue perhaps be frozen before wake_batch is updated?
>
> Yes, we should. The good thing is for now it's frozen already:
> - update nr_requests context;
> - switch elevator;
>
> However, if you mean do this while writing async_depth, freeze queue
> is not enough, we have to ping all the hctx as well by q->sysfs_lock,
> which is not possible.
>
> Or if you mean do this while write the new min_async_depth, then we have
> to update wake_batch for all the queues in the system, too crazy for
> me...

Should min_async_depth perhaps be a request queue attribute instead of
an mq-deadline I/O scheduler attribute?

Thanks,

Bart.
Hi,

On 2024/12/20 3:25, Bart Van Assche wrote:
> On 12/18/24 5:21 PM, Yu Kuai wrote:
>> Hi,
>>
>> On 2024/12/19 2:00, Bart Van Assche wrote:
>>> On 12/17/24 5:14 PM, Yu Kuai wrote:
>>>> I can't make this read-write, because set lower value will cause
>>>> problems for existing elevator, because wake_batch has to be
>>>> updated as well.
>>>
>>> Should the request queue perhaps be frozen before wake_batch is updated?
>>
>> Yes, we should. The good thing is for now it's frozen already:
>> - update nr_requests context;
>> - switch elevator;
>>
>> However, if you mean do this while writing async_depth, freeze queue
>> is not enough, we have to ping all the hctx as well by q->sysfs_lock,
>> which is not possible.
>>
>> Or if you mean do this while write the new min_async_depth, then we have
>> to update wake_batch for all the queues in the system, too crazy for
>> me...
>
> Should min_async_depth perhaps be a request queue attribute instead of
> an mq-deadline I/O scheduler attribute?

Yes, I think this makes sense; at least kyber and deadline can both
benefit from it. And I might need to add a new async_depth_updated() API
to the elevator ops.

Thanks,
Kuai
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 1f0d175a941e..9be0a33985ce 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -24,6 +24,16 @@
 #include "blk-mq-debugfs.h"
 #include "blk-mq-sched.h"
 
+/*
+ * async_depth is used to reserve scheduler tags for synchronous requests,
+ * and the value will affect sbitmap wake_batch. The default minimal value is 64
+ * because the corresponding wake_batch is 8, and lower wake_batch may affect
+ * IO performance.
+ */
+static unsigned int min_async_depth = 64;
+module_param(min_async_depth, int, 0444);
+MODULE_PARM_DESC(min_async_depth, "The minimal number of tags available for asynchronous requests");
+
 /*
  * See Documentation/block/deadline-iosched.rst
  */
@@ -513,9 +523,12 @@ static void dd_depth_updated(struct blk_mq_hw_ctx *hctx)
 	struct deadline_data *dd = q->elevator->elevator_data;
 	struct blk_mq_tags *tags = hctx->sched_tags;
 
-	dd->async_depth = max(1UL, 3 * q->nr_requests / 4);
+	if (q->nr_requests > min_async_depth)
+		sbitmap_queue_min_shallow_depth(&tags->bitmap_tags,
+						min_async_depth);
 
-	sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, dd->async_depth);
+	if (q->nr_requests <= dd->async_depth)
+		dd->async_depth = 0;
 }
 
 /* Called by blk_mq_init_hctx() and blk_mq_init_sched(). */
@@ -814,7 +827,7 @@ STORE_JIFFIES(deadline_write_expire_store, &dd->fifo_expire[DD_WRITE], 0, INT_MA
 STORE_JIFFIES(deadline_prio_aging_expire_store, &dd->prio_aging_expire, 0, INT_MAX);
 STORE_INT(deadline_writes_starved_store, &dd->writes_starved, INT_MIN, INT_MAX);
 STORE_INT(deadline_front_merges_store, &dd->front_merges, 0, 1);
-STORE_INT(deadline_async_depth_store, &dd->async_depth, 1, INT_MAX);
+STORE_INT(deadline_async_depth_store, &dd->async_depth, min_async_depth, INT_MAX);
 STORE_INT(deadline_fifo_batch_store, &dd->fifo_batch, 0, INT_MAX);
 #undef STORE_FUNCTION
 #undef STORE_INT