Message ID | 20220516084045.96004-1-zhouchengming@bytedance.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | blk-iocost: fix very large vtime when iocg activate | expand |
On Mon, May 16, 2022 at 04:40:45PM +0800, Chengming Zhou wrote: > When the first iocg activate after blk_iocost_init(), now->vnow > maybe smaller than ioc->margins.target, cause very large vtarget > since it's u64. > > vtarget = now->vnow - ioc->margins.target; > atomic64_add(vtarget - vtime, &iocg->vtime); > > Then the iocg's vtime will be very large too, larger than now->vnow. It's a wrapping counter. Please take a look at how time_before64() and friends work. Nacked-by: Tejun Heo <tj@kernel.org> Again, please spend more effort understanding the code before sending these subtle patches. Thanks.
On 2022/5/17 02:46, Tejun Heo wrote: > On Mon, May 16, 2022 at 04:40:45PM +0800, Chengming Zhou wrote: >> When the first iocg activate after blk_iocost_init(), now->vnow >> maybe smaller than ioc->margins.target, cause very large vtarget >> since it's u64. >> >> vtarget = now->vnow - ioc->margins.target; >> atomic64_add(vtarget - vtime, &iocg->vtime); >> >> Then the iocg's vtime will be very large too, larger than now->vnow. > > It's a wrapping counter. Please take a look at how time_before64() and > friends work. Hi Tejun, below is from the trace of test on original code: iocost_iocg_activate: [vda:/user.slice] now=38343468:2171657838 vrate=137438 \ period=0->0 vtime=18446744007162209454 weight=6553600/6553600 hweight=65536/65536 The vtime value is very large, much larger than vnow. Maybe the commit message is a little misleading? And I take a look at how time_before64() work: #define time_after64(a,b) \ (typecheck(__u64, a) && \ typecheck(__u64, b) && \ ((__s64)((b) - (a)) < 0)) #define time_before64(a,b) time_after64(b,a) I still don't get why my changes are wrong. :-) > > Nacked-by: Tejun Heo <tj@kernel.org> > > Again, please spend more effort understanding the code before sending these > subtle patches. Ok, will do. This problem is found from the trace of test, then verified fixed using the trace of the same test with this patch. Thanks. > > Thanks. >
On Tue, May 17, 2022 at 08:57:55AM +0800, Chengming Zhou wrote: > #define time_after64(a,b) \ > (typecheck(__u64, a) && \ > typecheck(__u64, b) && \ > ((__s64)((b) - (a)) < 0)) > #define time_before64(a,b) time_after64(b,a) > > I still don't get why my changes are wrong. :-) It's a wrapping timestamp where a lower value doesn't necessarily mean earlier. The before/after relationship is defined only in relation to each other. Imagine a cirle representing the whole value range and picking two spots in the circle, if one is in the clockwise half from the other, the former is said to be earlier than the latter and vice-versa. vtime runs way faster than nanosecs and wraps regularly, so we can't use absolute values to compare before/after. Thanks.
On 2022/5/17 09:03, Tejun Heo wrote: > On Tue, May 17, 2022 at 08:57:55AM +0800, Chengming Zhou wrote: >> #define time_after64(a,b) \ >> (typecheck(__u64, a) && \ >> typecheck(__u64, b) && \ >> ((__s64)((b) - (a)) < 0)) >> #define time_before64(a,b) time_after64(b,a) >> >> I still don't get why my changes are wrong. :-) > > It's a wrapping timestamp where a lower value doesn't necessarily mean > earlier. The before/after relationship is defined only in relation to each > other. Imagine a cirle representing the whole value range and picking two > spots in the circle, if one is in the clockwise half from the other, the > former is said to be earlier than the latter and vice-versa. vtime runs way > faster than nanosecs and wraps regularly, so we can't use absolute values to > compare before/after. Yes, thanks for the explanation. But the problem is not comparing two timestamp, since ioc->margins.target is not a timestamp. This patch just fix a corner case when now->vnow < ioc->margins.target: u64 vtarget; vtarget = now->vnow - ioc->margins.target; --> vtarget should be a timestamp earlier than vnow. But when now->vnow < ioc->margins.target, vtarget would be a timestamp after vnow. Thanks. > > Thanks. >
On 2022/5/17 09:03, Tejun Heo wrote: > On Tue, May 17, 2022 at 08:57:55AM +0800, Chengming Zhou wrote: >> #define time_after64(a,b) \ >> (typecheck(__u64, a) && \ >> typecheck(__u64, b) && \ >> ((__s64)((b) - (a)) < 0)) >> #define time_before64(a,b) time_after64(b,a) >> >> I still don't get why my changes are wrong. :-) > > It's a wrapping timestamp where a lower value doesn't necessarily mean > earlier. The before/after relationship is defined only in relation to each > other. Imagine a cirle representing the whole value range and picking two > spots in the circle, if one is in the clockwise half from the other, the > former is said to be earlier than the latter and vice-versa. vtime runs way > faster than nanosecs and wraps regularly, so we can't use absolute values to > compare before/after. Please ignore my previous reply, you are right. I should fix the tracing analysis tools to test again. Thanks. > > Thanks. >
diff --git a/block/blk-iocost.c b/block/blk-iocost.c index 5019dff524a4..a2ee12175c49 100644 --- a/block/blk-iocost.c +++ b/block/blk-iocost.c @@ -1248,7 +1248,7 @@ static bool iocg_activate(struct ioc_gq *iocg, struct ioc_now *now) { struct ioc *ioc = iocg->ioc; u64 last_period, cur_period; - u64 vtime, vtarget; + u64 vtime, vtarget = 0; int i; /* @@ -1290,7 +1290,8 @@ static bool iocg_activate(struct ioc_gq *iocg, struct ioc_now *now) * Always start with the target budget. On deactivation, we throw away * anything above it. */ - vtarget = now->vnow - ioc->margins.target; + if (now->vnow > ioc->margins.target) + vtarget = now->vnow - ioc->margins.target; vtime = atomic64_read(&iocg->vtime); atomic64_add(vtarget - vtime, &iocg->vtime);
When the first iocg activate after blk_iocost_init(), now->vnow maybe smaller than ioc->margins.target, cause very large vtarget since it's u64. vtarget = now->vnow - ioc->margins.target; atomic64_add(vtarget - vtime, &iocg->vtime); Then the iocg's vtime will be very large too, larger than now->vnow. Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> --- block/blk-iocost.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)