Message ID | 20240627090856.2345018-1-lilingfeng@huaweicloud.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | blk-cgroup: don't clear stat in blkcg_reset_stats() | expand |
Hello, On Thu, Jun 27, 2024 at 05:08:56PM +0800, Li Lingfeng wrote: > The list corruption described in commit 6da668063279 ("blk-cgroup: fix > list corruption from resetting io stat") has no effect. It's unnecessary > to fix it. I find this paragraph a bit confusing. At the time, it was broken, right? And if we were to memset() now, it'd break again. > As for cgroup v1, it does not use iostat any more after commit > ad7c3b41e86b("blk-throttle: Fix io statistics for cgroup v1"), so using > memset to clear iostat has no real effect. Ah, okay, this is because we made the stats blk-throtl specific but didn't implement ->pd_reset_stat_fn(), right? > As for cgroup v2, it will not call blkcg_reset_stats() to corrupt the > list. > > The list of root cgroup can be used by both cgroup v1 and v2 while > non-root cgroup can't since it must be removed before switch between > cgroup v1 and v2. > So it may has effect if the list of root used by cgroup v2 was corrupted > after switching to cgroup v1, and switch back to cgroup v2 to use the > corrupted list again. > However, the root cgroup will not use the list any more after commit > ef45fe470e1e("blk-cgroup: show global disk stats in root cgroup io.stat"). Hmm... I'm still having a bit of trouble following this line of argument given that all the patch does is dropping stat clearing. > @@ -668,7 +645,6 @@ static int blkcg_reset_stats(struct cgroup_subsys_state *css, > * anyway. If you get hit by a race, retry. > */ > hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node) { > - blkg_clear_stat(blkg); > for (i = 0; i < BLKCG_MAX_POLS; i++) { > struct blkcg_policy *pol = blkcg_policy[i]; The patch looks fine to me although it'd be nice to follow up with a patch to implement ->pd_reset_stat_fn() for blk-throtl. I'm not quite following the list corruption part of argument. Thanks.
On 6/27/24 05:08, Li Lingfeng wrote: > From: Li Lingfeng <lilingfeng3@huawei.com> > > The list corruption described in commit 6da668063279 ("blk-cgroup: fix > list corruption from resetting io stat") has no effect. It's unnecessary > to fix it. > > As for cgroup v1, it does not use iostat any more after commit > ad7c3b41e86b("blk-throttle: Fix io statistics for cgroup v1"), so using > memset to clear iostat has no real effect. > As for cgroup v2, it will not call blkcg_reset_stats() to corrupt the > list. > > The list of root cgroup can be used by both cgroup v1 and v2 while > non-root cgroup can't since it must be removed before switch between > cgroup v1 and v2. > So it may has effect if the list of root used by cgroup v2 was corrupted > after switching to cgroup v1, and switch back to cgroup v2 to use the > corrupted list again. > However, the root cgroup will not use the list any more after commit > ef45fe470e1e("blk-cgroup: show global disk stats in root cgroup io.stat"). > > Although this has no negative effect, it is not necessary. Remove the > related code. You may be right that it may not be necessary in the mainline kernel, it does fix the issue on distros with older kernels or some stable releases where commit ad7c3b41e86b("blk-throttle: Fix io statistics for cgroup v1") may not be present. > > Fixes: 6da668063279 ("blk-cgroup: fix list corruption from resetting io stat") I don't think there should be a fixes tag or it will be backported to stable releases. > Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> > --- > block/blk-cgroup.c | 24 ------------------------ > 1 file changed, 24 deletions(-) > > diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c > index 37e6cc91d576..1113c398a742 100644 > --- a/block/blk-cgroup.c > +++ b/block/blk-cgroup.c > @@ -629,29 +629,6 @@ static void blkg_iostat_set(struct blkg_iostat *dst, struct blkg_iostat *src) > } > } > > -static void __blkg_clear_stat(struct blkg_iostat_set *bis) > -{ > - struct blkg_iostat cur = {0}; > - unsigned long flags; > - > - flags = u64_stats_update_begin_irqsave(&bis->sync); > - blkg_iostat_set(&bis->cur, &cur); > - blkg_iostat_set(&bis->last, &cur); > - u64_stats_update_end_irqrestore(&bis->sync, flags); > -} > - > -static void blkg_clear_stat(struct blkcg_gq *blkg) > -{ > - int cpu; > - > - for_each_possible_cpu(cpu) { > - struct blkg_iostat_set *s = per_cpu_ptr(blkg->iostat_cpu, cpu); > - > - __blkg_clear_stat(s); > - } > - __blkg_clear_stat(&blkg->iostat); > -} > - > static int blkcg_reset_stats(struct cgroup_subsys_state *css, > struct cftype *cftype, u64 val) > { > @@ -668,7 +645,6 @@ static int blkcg_reset_stats(struct cgroup_subsys_state *css, > * anyway. If you get hit by a race, retry. > */ > hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node) { > - blkg_clear_stat(blkg); > for (i = 0; i < BLKCG_MAX_POLS; i++) { > struct blkcg_policy *pol = blkcg_policy[i]; > If you are saying that iostat is no longer used in cgroup v1, why not remove the blkcg_reset_stats() and its supporting functions and deprecate the v1 reset_stats control file. The file should still be there to avoid userspace regression, but it will be a nop. Cheers, Longman
Hi, 在 2024/06/28 4:10, Tejun Heo 写道: > Hello, > > On Thu, Jun 27, 2024 at 05:08:56PM +0800, Li Lingfeng wrote: >> The list corruption described in commit 6da668063279 ("blk-cgroup: fix >> list corruption from resetting io stat") has no effect. It's unnecessary >> to fix it. > > I find this paragraph a bit confusing. At the time, it was broken, right? > And if we were to memset() now, it'd break again. > >> As for cgroup v1, it does not use iostat any more after commit >> ad7c3b41e86b("blk-throttle: Fix io statistics for cgroup v1"), so using >> memset to clear iostat has no real effect. > > Ah, okay, this is because we made the stats blk-throtl specific but didn't > implement ->pd_reset_stat_fn(), right? I'm afraid not... Implement pd_reset_stat_fn() or not is another problem, this patch should be just code cleanup, not fixing any real problems. > >> As for cgroup v2, it will not call blkcg_reset_stats() to corrupt the >> list. >> >> The list of root cgroup can be used by both cgroup v1 and v2 while >> non-root cgroup can't since it must be removed before switch between >> cgroup v1 and v2. >> So it may has effect if the list of root used by cgroup v2 was corrupted >> after switching to cgroup v1, and switch back to cgroup v2 to use the >> corrupted list again. >> However, the root cgroup will not use the list any more after commit >> ef45fe470e1e("blk-cgroup: show global disk stats in root cgroup io.stat"). > > Hmm... I'm still having a bit of trouble following this line of argument > given that all the patch does is dropping stat clearing. > >> @@ -668,7 +645,6 @@ static int blkcg_reset_stats(struct cgroup_subsys_state *css, >> * anyway. If you get hit by a race, retry. >> */ >> hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node) { >> - blkg_clear_stat(blkg); >> for (i = 0; i < BLKCG_MAX_POLS; i++) { >> struct blkcg_policy *pol = blkcg_policy[i]; > > The patch looks fine to me although it'd be nice to follow up with a patch > to implement ->pd_reset_stat_fn() for blk-throtl. I'm not quite following > the list corruption part of argument. The code deleted by this patch was claimed to fix a lsit corruption, however, the list corruption does not exist now hence related code can be removed: 1) Take a look at blk_cgroup_bio_start, now there are two conditions before this blkg can be added to the per_cpu list: blk_cgroup_bio_start if (!cgroup_subsys_on_dfl(io_cgrp_subsys)) -> the list will always be empty for cgroup v1 return; if (!cgroup_parent(blkcg->css.cgroup)) -> the list will always be empty for root blkg return; ... llist_add() 2) blkcg_reset_stats can only be called from cgroup v1 api, hence there is nothing to be cleared for blkg_clear_stat(); 3) Noted that user can switch from cgroup v2 to v1, however, we found that user must delete all the child cg to do that, hence only root blkg can be kept after switching to v1. And root blkg is bypassed from blk_cgroup_bio_start(), hence no problem. Thanks, Kuai > > Thanks. >
在 2024/6/28 5:03, Waiman Long 写道: > > On 6/27/24 05:08, Li Lingfeng wrote: >> From: Li Lingfeng <lilingfeng3@huawei.com> >> >> The list corruption described in commit 6da668063279 ("blk-cgroup: fix >> list corruption from resetting io stat") has no effect. It's unnecessary >> to fix it. >> >> As for cgroup v1, it does not use iostat any more after commit >> ad7c3b41e86b("blk-throttle: Fix io statistics for cgroup v1"), so using >> memset to clear iostat has no real effect. >> As for cgroup v2, it will not call blkcg_reset_stats() to corrupt the >> list. >> >> The list of root cgroup can be used by both cgroup v1 and v2 while >> non-root cgroup can't since it must be removed before switch between >> cgroup v1 and v2. >> So it may has effect if the list of root used by cgroup v2 was corrupted >> after switching to cgroup v1, and switch back to cgroup v2 to use the >> corrupted list again. >> However, the root cgroup will not use the list any more after commit >> ef45fe470e1e("blk-cgroup: show global disk stats in root cgroup >> io.stat"). >> >> Although this has no negative effect, it is not necessary. Remove the >> related code. > You may be right that it may not be necessary in the mainline kernel, > it does fix the issue on distros with older kernels or some stable > releases where commit ad7c3b41e86b("blk-throttle: Fix io statistics > for cgroup v1") may not be present. > >> >> Fixes: 6da668063279 ("blk-cgroup: fix list corruption from resetting >> io stat") > I don't think there should be a fixes tag or it will be backported to > stable releases. OK, I will remove it. >> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> >> --- >> block/blk-cgroup.c | 24 ------------------------ >> 1 file changed, 24 deletions(-) >> >> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c >> index 37e6cc91d576..1113c398a742 100644 >> --- a/block/blk-cgroup.c >> +++ b/block/blk-cgroup.c >> @@ -629,29 +629,6 @@ static void blkg_iostat_set(struct blkg_iostat >> *dst, struct blkg_iostat *src) >> } >> } >> -static void __blkg_clear_stat(struct blkg_iostat_set *bis) >> -{ >> - struct blkg_iostat cur = {0}; >> - unsigned long flags; >> - >> - flags = u64_stats_update_begin_irqsave(&bis->sync); >> - blkg_iostat_set(&bis->cur, &cur); >> - blkg_iostat_set(&bis->last, &cur); >> - u64_stats_update_end_irqrestore(&bis->sync, flags); >> -} >> - >> -static void blkg_clear_stat(struct blkcg_gq *blkg) >> -{ >> - int cpu; >> - >> - for_each_possible_cpu(cpu) { >> - struct blkg_iostat_set *s = per_cpu_ptr(blkg->iostat_cpu, cpu); >> - >> - __blkg_clear_stat(s); >> - } >> - __blkg_clear_stat(&blkg->iostat); >> -} >> - >> static int blkcg_reset_stats(struct cgroup_subsys_state *css, >> struct cftype *cftype, u64 val) >> { >> @@ -668,7 +645,6 @@ static int blkcg_reset_stats(struct >> cgroup_subsys_state *css, >> * anyway. If you get hit by a race, retry. >> */ >> hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node) { >> - blkg_clear_stat(blkg); >> for (i = 0; i < BLKCG_MAX_POLS; i++) { >> struct blkcg_policy *pol = blkcg_policy[i]; > > If you are saying that iostat is no longer used in cgroup v1, why not > remove the blkcg_reset_stats() and its supporting functions and > deprecate the v1 reset_stats control file. The file should still be > there to avoid userspace regression, but it will be a nop. I'm not sure if we can just remove blkcg_reset_stats() since bfq_pd_reset_stats() may be called in it. Thanks. > > Cheers, > Longman
Hi, +CC Jan and Paolo for bfq part. 在 2024/06/28 5:03, Waiman Long 写道: > > On 6/27/24 05:08, Li Lingfeng wrote: >> From: Li Lingfeng <lilingfeng3@huawei.com> >> >> The list corruption described in commit 6da668063279 ("blk-cgroup: fix >> list corruption from resetting io stat") has no effect. It's unnecessary >> to fix it. >> >> As for cgroup v1, it does not use iostat any more after commit >> ad7c3b41e86b("blk-throttle: Fix io statistics for cgroup v1"), so using >> memset to clear iostat has no real effect. >> As for cgroup v2, it will not call blkcg_reset_stats() to corrupt the >> list. >> >> The list of root cgroup can be used by both cgroup v1 and v2 while >> non-root cgroup can't since it must be removed before switch between >> cgroup v1 and v2. >> So it may has effect if the list of root used by cgroup v2 was corrupted >> after switching to cgroup v1, and switch back to cgroup v2 to use the >> corrupted list again. >> However, the root cgroup will not use the list any more after commit >> ef45fe470e1e("blk-cgroup: show global disk stats in root cgroup >> io.stat"). >> >> Although this has no negative effect, it is not necessary. Remove the >> related code. > You may be right that it may not be necessary in the mainline kernel, it > does fix the issue on distros with older kernels or some stable releases > where commit ad7c3b41e86b("blk-throttle: Fix io statistics for cgroup > v1") may not be present. > >> >> Fixes: 6da668063279 ("blk-cgroup: fix list corruption from resetting >> io stat") > I don't think there should be a fixes tag or it will be backported to > stable releases. Yes, and probably it's better to add: Depends-on: ad7c3b41e86b ("blk-throttle: Fix io statistics for cgroup v1") Depends-on: 0416f3be58c6 ("blk-cgroup: don't update io stat for root cgroup") >> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> >> --- >> block/blk-cgroup.c | 24 ------------------------ >> 1 file changed, 24 deletions(-) >> >> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c >> index 37e6cc91d576..1113c398a742 100644 >> --- a/block/blk-cgroup.c >> +++ b/block/blk-cgroup.c >> @@ -629,29 +629,6 @@ static void blkg_iostat_set(struct blkg_iostat >> *dst, struct blkg_iostat *src) >> } >> } >> -static void __blkg_clear_stat(struct blkg_iostat_set *bis) >> -{ >> - struct blkg_iostat cur = {0}; >> - unsigned long flags; >> - >> - flags = u64_stats_update_begin_irqsave(&bis->sync); >> - blkg_iostat_set(&bis->cur, &cur); >> - blkg_iostat_set(&bis->last, &cur); >> - u64_stats_update_end_irqrestore(&bis->sync, flags); >> -} >> - >> -static void blkg_clear_stat(struct blkcg_gq *blkg) >> -{ >> - int cpu; >> - >> - for_each_possible_cpu(cpu) { >> - struct blkg_iostat_set *s = per_cpu_ptr(blkg->iostat_cpu, cpu); >> - >> - __blkg_clear_stat(s); >> - } >> - __blkg_clear_stat(&blkg->iostat); >> -} >> - >> static int blkcg_reset_stats(struct cgroup_subsys_state *css, >> struct cftype *cftype, u64 val) >> { >> @@ -668,7 +645,6 @@ static int blkcg_reset_stats(struct >> cgroup_subsys_state *css, >> * anyway. If you get hit by a race, retry. >> */ >> hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node) { >> - blkg_clear_stat(blkg); >> for (i = 0; i < BLKCG_MAX_POLS; i++) { >> struct blkcg_policy *pol = blkcg_policy[i]; > > If you are saying that iostat is no longer used in cgroup v1, why not > remove the blkcg_reset_stats() and its supporting functions and > deprecate the v1 reset_stats control file. The file should still be > there to avoid userspace regression, but it will be a nop. No, this is not correct, blkg->iostat is not used in cgroup v1, however, bfq(and probably blk-throtl) has it's only stats, and they rely on pd_reset_stats_fn() from blkcg_reset_stats() to clear the stats. I don't think remove it is a good idea for now, bfq maintainer must agree with this. Thanks, Kuai > > Cheers, > Longman > > > . >
Friendly ping ... Thank. 在 2024/6/28 11:14, Yu Kuai 写道: > Hi, > > 在 2024/06/28 4:10, Tejun Heo 写道: >> Hello, >> >> On Thu, Jun 27, 2024 at 05:08:56PM +0800, Li Lingfeng wrote: >>> The list corruption described in commit 6da668063279 ("blk-cgroup: fix >>> list corruption from resetting io stat") has no effect. It's >>> unnecessary >>> to fix it. >> >> I find this paragraph a bit confusing. At the time, it was broken, >> right? >> And if we were to memset() now, it'd break again. >> >>> As for cgroup v1, it does not use iostat any more after commit >>> ad7c3b41e86b("blk-throttle: Fix io statistics for cgroup v1"), so using >>> memset to clear iostat has no real effect. >> >> Ah, okay, this is because we made the stats blk-throtl specific but >> didn't >> implement ->pd_reset_stat_fn(), right? > > I'm afraid not... Implement pd_reset_stat_fn() or not is another > problem, this patch should be just code cleanup, not fixing any real > problems. > >> >>> As for cgroup v2, it will not call blkcg_reset_stats() to corrupt the >>> list. >>> >>> The list of root cgroup can be used by both cgroup v1 and v2 while >>> non-root cgroup can't since it must be removed before switch between >>> cgroup v1 and v2. >>> So it may has effect if the list of root used by cgroup v2 was >>> corrupted >>> after switching to cgroup v1, and switch back to cgroup v2 to use the >>> corrupted list again. >>> However, the root cgroup will not use the list any more after commit >>> ef45fe470e1e("blk-cgroup: show global disk stats in root cgroup >>> io.stat"). >> >> Hmm... I'm still having a bit of trouble following this line of argument >> given that all the patch does is dropping stat clearing. >> >>> @@ -668,7 +645,6 @@ static int blkcg_reset_stats(struct >>> cgroup_subsys_state *css, >>> * anyway. If you get hit by a race, retry. >>> */ >>> hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node) { >>> - blkg_clear_stat(blkg); >>> for (i = 0; i < BLKCG_MAX_POLS; i++) { >>> struct blkcg_policy *pol = blkcg_policy[i]; >> >> The patch looks fine to me although it'd be nice to follow up with a >> patch >> to implement ->pd_reset_stat_fn() for blk-throtl. I'm not quite >> following >> the list corruption part of argument. > > The code deleted by this patch was claimed to fix a lsit corruption, > however, the list corruption does not exist now hence related code can > be removed: > > 1) Take a look at blk_cgroup_bio_start, now there are two conditions > before this blkg can be added to the per_cpu list: > > blk_cgroup_bio_start > if (!cgroup_subsys_on_dfl(io_cgrp_subsys)) > -> the list will always be empty for cgroup v1 > return; > if (!cgroup_parent(blkcg->css.cgroup)) > -> the list will always be empty for root blkg > return; > ... > llist_add() > > 2) blkcg_reset_stats can only be called from cgroup v1 api, hence > there is nothing to be cleared for blkg_clear_stat(); > > 3) Noted that user can switch from cgroup v2 to v1, however, we found > that user must delete all the child cg to do that, hence only root blkg > can be kept after switching to v1. And root blkg is bypassed from > blk_cgroup_bio_start(), hence no problem. > > Thanks, > Kuai >> >> Thanks. >> >
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index 37e6cc91d576..1113c398a742 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -629,29 +629,6 @@ static void blkg_iostat_set(struct blkg_iostat *dst, struct blkg_iostat *src) } } -static void __blkg_clear_stat(struct blkg_iostat_set *bis) -{ - struct blkg_iostat cur = {0}; - unsigned long flags; - - flags = u64_stats_update_begin_irqsave(&bis->sync); - blkg_iostat_set(&bis->cur, &cur); - blkg_iostat_set(&bis->last, &cur); - u64_stats_update_end_irqrestore(&bis->sync, flags); -} - -static void blkg_clear_stat(struct blkcg_gq *blkg) -{ - int cpu; - - for_each_possible_cpu(cpu) { - struct blkg_iostat_set *s = per_cpu_ptr(blkg->iostat_cpu, cpu); - - __blkg_clear_stat(s); - } - __blkg_clear_stat(&blkg->iostat); -} - static int blkcg_reset_stats(struct cgroup_subsys_state *css, struct cftype *cftype, u64 val) { @@ -668,7 +645,6 @@ static int blkcg_reset_stats(struct cgroup_subsys_state *css, * anyway. If you get hit by a race, retry. */ hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node) { - blkg_clear_stat(blkg); for (i = 0; i < BLKCG_MAX_POLS; i++) { struct blkcg_policy *pol = blkcg_policy[i];