Message ID | 20240627142606.3709394-1-lilingfeng@huaweicloud.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v2] block: flush all throttled bios when deleting the cgroup |
Hello, Li.

On Thu, Jun 27, 2024 at 10:26:06PM +0800, Li Lingfeng wrote:
> From: Li Lingfeng <lilingfeng3@huawei.com>
>
> When a process migrates to another cgroup and the original cgroup is deleted,
> the restrictions of throttled bios cannot be removed. If the restrictions
> are set too low, it will take a long time to complete these bios.
>
> Refer to the process of deleting a disk to remove the restrictions and
> issue bios when deleting the cgroup.
>
> This makes difference on the behavior of throttled bios:
> Before: the limit of the throttled bios can't be changed and the bios will
> complete under this limit;
> Now: the limit will be canceled and the throttled bios will be flushed
> immediately.

I'm not necessarily against this but the description doesn't explain why
this is better either. Can you please detail why this behavior is better?

Thanks.
On 2024/6/28 4:43, Tejun Heo wrote:
> Hello, Li.
>
> On Thu, Jun 27, 2024 at 10:26:06PM +0800, Li Lingfeng wrote:
>> From: Li Lingfeng <lilingfeng3@huawei.com>
>>
>> When a process migrates to another cgroup and the original cgroup is deleted,
>> the restrictions of throttled bios cannot be removed. If the restrictions
>> are set too low, it will take a long time to complete these bios.
>>
>> Refer to the process of deleting a disk to remove the restrictions and
>> issue bios when deleting the cgroup.
>>
>> This makes difference on the behavior of throttled bios:
>> Before: the limit of the throttled bios can't be changed and the bios will
>> complete under this limit;
>> Now: the limit will be canceled and the throttled bios will be flushed
>> immediately.
> I'm not necessarily against this but the description doesn't explain why
> this is better either. Can you please detail why this behavior is better?

I think it may be more appropriate to remove the limit on the bios once the
cgroup is deleted, rather than let the bios continue to be throttled by a
non-existent cgroup.

If the limit is set too low and the original cgroup has been deleted, we
currently have no way to make the bios complete immediately; we can only
wait for them to complete slowly under the limit.

Thanks.

>
> Thanks.
>
Hi,

On 2024/06/28 10:04, Li Lingfeng wrote:
>
> On 2024/6/28 4:43, Tejun Heo wrote:
>> Hello, Li.
>>
>> On Thu, Jun 27, 2024 at 10:26:06PM +0800, Li Lingfeng wrote:
>>> From: Li Lingfeng <lilingfeng3@huawei.com>
>>>
>>> When a process migrates to another cgroup and the original cgroup is
>>> deleted, the restrictions of throttled bios cannot be removed. If the
>>> restrictions are set too low, it will take a long time to complete
>>> these bios.
>>>
>>> Refer to the process of deleting a disk to remove the restrictions and
>>> issue bios when deleting the cgroup.
>>>
>>> This makes difference on the behavior of throttled bios:
>>> Before: the limit of the throttled bios can't be changed and the bios
>>> will complete under this limit;
>>> Now: the limit will be canceled and the throttled bios will be flushed
>>> immediately.
>> I'm not necessarily against this but the description doesn't explain why
>> this is better either. Can you please detail why this behavior is better?
> I think it may be more appropriate to remove the limit of bios after the
> cgroup is deleted, rather than let the bios continue to be throttled by a
> non-existent cgroup.

The background is that our test found this by:

1) setting a low limit in one cgroup;
2) binding a task to the cgroup and issuing lots of IO;
3) migrating the task to the root cgroup;
4) deleting the cgroup.

And oops, unless the disk is deleted, IO will hang for a long time and
there is no way to recover.

The good thing is that after flushing the throttled bios while deleting
the cgroup, this "IO hang" can be avoided. However, I'm not sure about
this change, because the user may still want the bios to be throttled.
Anyway, I don't think this will be a problem in real life.

Thanks,
Kuai

>
> If the limit is set too low, and the original cgroup has been deleted, we
> now have no way to make the bios complete immediately, but to wait for the
> bios to slowly complete under the limit.
>
> Thanks.
>
>>
>> Thanks.
>>
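To put a rough number on the "IO hang" in the reproduction scenario above, here is a back-of-the-envelope sketch in C. It assumes a pure bytes-per-second limit and ignores IOPS limits, throttle slices, and dispatch granularity. The helper names and the figures (64 MiB queued, 1 KiB/s limit) are hypothetical and are not taken from the reported test.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: seconds needed to drain `queued_bytes` of throttled
 * IO under a `bps_limit` bytes-per-second cap, rounded up. This sketch
 * treats bps_limit == 0 as "no limit" and returns 0 in that case.
 */
uint64_t drain_seconds(uint64_t queued_bytes, uint64_t bps_limit)
{
	if (bps_limit == 0)
		return 0;
	/* Divide first, then round up, so the sum cannot overflow. */
	return queued_bytes / bps_limit + (queued_bytes % bps_limit != 0);
}

/* Prints the estimate for one assumed workload. */
void print_drain_example(void)
{
	uint64_t queued = 64ULL * 1024 * 1024;	/* assumed: 64 MiB still queued */
	uint64_t limit = 1024;			/* assumed: 1 KiB/s limit */
	uint64_t secs = drain_seconds(queued, limit);

	printf("%" PRIu64 " bytes at %" PRIu64 " B/s: about %" PRIu64 " s (%.1f h)\n",
	       queued, limit, secs, (double)secs / 3600.0);
}
```

With these assumed figures the estimate is 65536 seconds, roughly 18 hours, during which the already-queued bios cannot be sped up once the cgroup holding the limit is gone.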
On Fri, Jun 28, 2024 at 10:04:20AM GMT, Li Lingfeng <lilingfeng@huaweicloud.com> wrote:
> I think it may be more appropriate to remove the limit of bios after the
> cgroup is deleted, rather than let the bios continue to be throttled by a
> non-existent cgroup.

I'm not that familiar with this part -- can this also happen for IOs
submitted by an exited task? (In contrast to a running task migrated
elsewhere.)

> If the limit is set too low, and the original cgroup has been deleted, we
> now have no way to make the bios complete immediately, but to wait for the
> bios to slowly complete under the limit.

It makes some sense; it's not unlike reparenting of memcg objects. IIRC
flushed bios would actually be passed to a parent throtl_grp, right?

Thanks,
Michal
On 2024/7/2 22:25, Michal Koutný wrote:
> On Fri, Jun 28, 2024 at 10:04:20AM GMT, Li Lingfeng <lilingfeng@huaweicloud.com> wrote:
>> I think it may be more appropriate to remove the limit of bios after the
>> cgroup is deleted, rather than let the bios continue to be throttled by a
>> non-existent cgroup.
> I'm not that familiar with this part -- can this also happen for IOs
> submitted by an exited task? (In contrast to a running task migrated
> elsewhere.)

Yes, IOs will be throttled regardless of whether the task that submitted
them has exited.

>> If the limit is set too low, and the original cgroup has been deleted, we
>> now have no way to make the bios complete immediately, but to wait for the
>> bios to slowly complete under the limit.
> It makes some sense, it's not unlike reparenting of memcg objects, IIRC
> flushed bios would actually be passed to a parent throtl_grp, right?

Yes, flushed bios would be throttled by the parent throtl_grp.

> Thanks,
> Michal
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index c1bf73f8c75d..a0e5b28951ca 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1534,6 +1534,42 @@ static void throtl_shutdown_wq(struct request_queue *q)
 	cancel_work_sync(&td->dispatch_work);
 }
 
+static void tg_cancel_bios(struct throtl_grp *tg)
+{
+	struct throtl_service_queue *sq = &tg->service_queue;
+
+	if (tg->flags & THROTL_TG_CANCELING)
+		return;
+	/*
+	 * Set the flag to make sure throtl_pending_timer_fn() won't
+	 * stop until all throttled bios are dispatched.
+	 */
+	tg->flags |= THROTL_TG_CANCELING;
+
+	/*
+	 * Do not dispatch cgroup without THROTL_TG_PENDING or cgroup
+	 * will be inserted to service queue without THROTL_TG_PENDING
+	 * set in tg_update_disptime below. Then IO dispatched from
+	 * child in tg_dispatch_one_bio will trigger double insertion
+	 * and corrupt the tree.
+	 */
+	if (!(tg->flags & THROTL_TG_PENDING))
+		return;
+
+	/*
+	 * Update disptime after setting the above flag to make sure
+	 * throtl_select_dispatch() won't exit without dispatching.
+	 */
+	tg_update_disptime(tg);
+
+	throtl_schedule_pending_timer(sq, jiffies + 1);
+}
+
+static void throtl_pd_offline(struct blkg_policy_data *pd)
+{
+	tg_cancel_bios(pd_to_tg(pd));
+}
+
 struct blkcg_policy blkcg_policy_throtl = {
 	.dfl_cftypes		= throtl_files,
 	.legacy_cftypes		= throtl_legacy_files,
@@ -1541,6 +1577,7 @@ struct blkcg_policy blkcg_policy_throtl = {
 	.pd_alloc_fn		= throtl_pd_alloc,
 	.pd_init_fn		= throtl_pd_init,
 	.pd_online_fn		= throtl_pd_online,
+	.pd_offline_fn		= throtl_pd_offline,
 	.pd_free_fn		= throtl_pd_free,
 };
 
@@ -1561,32 +1598,15 @@ void blk_throtl_cancel_bios(struct gendisk *disk)
 	 */
 	rcu_read_lock();
 	blkg_for_each_descendant_post(blkg, pos_css, q->root_blkg) {
-		struct throtl_grp *tg = blkg_to_tg(blkg);
-		struct throtl_service_queue *sq = &tg->service_queue;
-
-		/*
-		 * Set the flag to make sure throtl_pending_timer_fn() won't
-		 * stop until all throttled bios are dispatched.
-		 */
-		tg->flags |= THROTL_TG_CANCELING;
-
 		/*
-		 * Do not dispatch cgroup without THROTL_TG_PENDING or cgroup
-		 * will be inserted to service queue without THROTL_TG_PENDING
-		 * set in tg_update_disptime below. Then IO dispatched from
-		 * child in tg_dispatch_one_bio will trigger double insertion
-		 * and corrupt the tree.
+		 * disk_release will call pd_offline_fn to cancel bios.
+		 * However, disk_release can't be called if someone get
+		 * the refcount of device and issued bios which are
+		 * inflight after del_gendisk.
+		 * Cancel bios here to ensure no bios are inflight after
+		 * del_gendisk.
 		 */
-		if (!(tg->flags & THROTL_TG_PENDING))
-			continue;
-
-		/*
-		 * Update disptime after setting the above flag to make sure
-		 * throtl_select_dispatch() won't exit without dispatching.
-		 */
-		tg_update_disptime(tg);
-
-		throtl_schedule_pending_timer(sq, jiffies + 1);
+		tg_cancel_bios(blkg_to_tg(blkg));
 	}
 	rcu_read_unlock();
 	spin_unlock_irq(&q->queue_lock);
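
The guard logic of tg_cancel_bios() in the diff above can be followed more easily in a small userspace model. The sketch below is not kernel code: `struct model_tg`, the `MODEL_TG_*` bit values, and the `timers_scheduled` counter are all assumptions made for illustration, with the counter standing in for the tg_update_disptime() and throtl_schedule_pending_timer() calls. It only mirrors the order of the two flag checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits; the numeric values are assumptions of this sketch. */
enum {
	MODEL_TG_PENDING   = 1 << 0,	/* group is queued on its parent's service queue */
	MODEL_TG_CANCELING = 1 << 1,	/* group is flushing its throttled bios */
};

/* Hypothetical stand-in for struct throtl_grp. */
struct model_tg {
	unsigned int flags;
	int timers_scheduled;	/* how many times a dispatch was kicked off */
};

/*
 * Mirrors the control flow of tg_cancel_bios() from the patch.
 * Returns true only when a dispatch was scheduled by this call.
 */
bool model_cancel_bios(struct model_tg *tg)
{
	/* Already canceling: a second caller must not schedule again. */
	if (tg->flags & MODEL_TG_CANCELING)
		return false;
	tg->flags |= MODEL_TG_CANCELING;

	/* Not pending: mark as canceling, but do not touch the service queue. */
	if (!(tg->flags & MODEL_TG_PENDING))
		return false;

	tg->timers_scheduled++;
	return true;
}
```

In this model a pending group schedules exactly one dispatch no matter how many times it is canceled, which matters because after the patch both blk_throtl_cancel_bios() and throtl_pd_offline() can reach the same group. A group that is not pending is still marked as canceling but is never scheduled.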