Message ID | 20220117081615.45449-1-zhangshaokun@hisilicon.com (mailing list archive) |
---|---|
State | Under Review, archived |
Headers | show |
Series | cpuidle: menu: Fix long delay issue when tick stopped | expand |
On Mon, Jan 17, 2022 at 9:16 AM Shaokun Zhang <zhangshaokun@hisilicon.com> wrote: > > From: Guo Yang <guoyang2@huawei.com> > > The network delay was always big on arm server tested by qperf, > the reason was that the cpu entered deep power down idle state(like intel > C6) and can't goto a shallow one. > > The intervals in @get_typical_interval() was much smaller than predicted_ns > in @menu_select(), so the predict state is always deepest and cause long > time network delay. > > Every time when the cpu got an interrupt from the network, the cpu was > waken up and did the IRQ, after that the cpu enter @menu_select() > but the @tick_nohz_tick_stopped() was true and get a big data->next_timer_ns, > the cpu can never goto a shallow state util the data->next_timer_ns timeout. > Below was the print when the issue occurrence. > > [ 37.082861] intervals = 36us > [ 37.082875] intervals = 15us > [ 37.082888] intervals = 22us > [ 37.082902] intervals = 35us > [ 37.082915] intervals = 34us > [ 37.082929] intervals = 39us > [ 37.082942] intervals = 39us > [ 37.082956] intervals = 35us > [ 37.082970] target_residency_ns = 10000, predicted_ns = 35832710 > [ 37.082998] target_residency_ns = 600000, predicted_ns = 35832710 > [ 37.083037] intervals = 36us > [ 37.083050] intervals = 15us > [ 37.083064] intervals = 22us > [ 37.083077] intervals = 35us > [ 37.083091] intervals = 34us > [ 37.083104] intervals = 39us > [ 37.083118] intervals = 39us > [ 37.083131] intervals = 35us > [ 37.083145] target_residency_ns = 10000, predicted_ns = 35657420 > [ 37.083174] target_residency_ns = 600000, predicted_ns = 35657420 > [ 37.083212] intervals = 36us > [ 37.083225] intervals = 15us > [ 37.083239] intervals = 22us > [ 37.083253] intervals = 35us > [ 37.083266] intervals = 34us > [ 37.083279] intervals = 39us > [ 37.083293] intervals = 39us > [ 37.083307] intervals = 35us > [ 37.083320] target_residency_ns = 10000, predicted_ns = 35482140 > [ 37.083349] target_residency_ns = 600000, predicted_ns = 35482140 > > Add idle tick wakeup judge before change predicted_ns. > > Cc: "Rafael J. Wysocki" <rafael@kernel.org> > Cc: Daniel Lezcano <daniel.lezcano@linaro.org> > Signed-off-by: Guo Yang <guoyang2@huawei.com> > Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> > --- > drivers/cpuidle/governors/menu.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c > index c492268..3f03843 100644 > --- a/drivers/cpuidle/governors/menu.c > +++ b/drivers/cpuidle/governors/menu.c > @@ -313,7 +313,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, > get_typical_interval(data, predicted_us)) * > NSEC_PER_USEC; > > - if (tick_nohz_tick_stopped()) { > + if (tick_nohz_tick_stopped() && data->tick_wakeup) { data->tick_wakeup is only true if tick_nohz_idle_got_tick() has returned true, but I'm not sure how this can happen after stopping the tick. IOW, it looks like the change simply makes the condition be always false. > /* > * If the tick is already stopped, the cost of possible short > * idle duration misprediction is much higher, because the CPU > -- > 1.8.3.1 >
Hi Rafael, Apologies that reply later. On 2022/1/21 2:55, Rafael J. Wysocki wrote: > On Mon, Jan 17, 2022 at 9:16 AM Shaokun Zhang > <zhangshaokun@hisilicon.com> wrote: >> ...<snip>... >> [ 37.083307] intervals = 35us >> [ 37.083320] target_residency_ns = 10000, predicted_ns = 35482140 >> [ 37.083349] target_residency_ns = 600000, predicted_ns = 35482140 >> >> Add idle tick wakeup judge before change predicted_ns. >> >> Cc: "Rafael J. Wysocki" <rafael@kernel.org> >> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> >> Signed-off-by: Guo Yang <guoyang2@huawei.com> >> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> >> --- >> drivers/cpuidle/governors/menu.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c >> index c492268..3f03843 100644 >> --- a/drivers/cpuidle/governors/menu.c >> +++ b/drivers/cpuidle/governors/menu.c >> @@ -313,7 +313,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, >> get_typical_interval(data, predicted_us)) * >> NSEC_PER_USEC; >> >> - if (tick_nohz_tick_stopped()) { >> + if (tick_nohz_tick_stopped() && data->tick_wakeup) { > > data->tick_wakeup is only true if tick_nohz_idle_got_tick() has > returned true, but I'm not sure how this can happen after stopping the > tick. In order to debug this, call trace is added and as follow: if (predicted_us < TICK_USEC) predicted_us = ktime_to_us(delta_next); printk("predicted_us = %uus\n", predicted_us); dump_stack(); //add call trace print } When the issue came, the CPU was waken up by network interrupts [ 1048.130033] intervals = 1us [ 1048.130034] intervals = 1us [ 1048.130035] intervals = 1us [ 1048.130036] intervals = 1us [ 1048.130037] intervals = 1us [ 1048.130038] intervals = 1us [ 1048.130039] intervals = 1us [ 1048.130040] intervals = 1us [ 1048.130041] predicted_us = 484143us [ 1048.130043] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G OE 5.3.0-rc6 #23 [ 1048.130044] Hardware name: Huawei 2288H V5/BC11SPSCB0, BIOS 0.39 12/01/2017 [ 1048.130045] Call Trace: [ 1048.130048] dump_stack+0x5a/0x73 [ 1048.130052] menu_select+0x3b0/0x6c0 [ 1048.130058] do_idle+0x1b4/0x290 [ 1048.130063] cpu_startup_entry+0x19/0x20 [ 1048.130067] start_secondary+0x155/0x1b0 [ 1048.130070] secondary_startup_64+0xa4/0xb0 [ 1048.130078] intervals = 1us [ 1048.130079] intervals = 1us [ 1048.130080] intervals = 1us [ 1048.130081] intervals = 1us [ 1048.130081] intervals = 1us [ 1048.130082] intervals = 1us [ 1048.130083] intervals = 1us [ 1048.130084] intervals = 1us [ 1048.130085] predicted_us = 484097us [ 1048.130087] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G OE 5.3.0-rc6 #23 [ 1048.130088] Hardware name: Huawei 2288H V5/BC11SPSCB0, BIOS 0.39 12/01/2017 [ 1048.130089] Call Trace: [ 1048.130093] dump_stack+0x5a/0x73 [ 1048.130097] menu_select+0x3b0/0x6c0 [ 1048.130102] do_idle+0x1b4/0x290 [ 1048.130107] cpu_startup_entry+0x19/0x20 [ 1048.130112] start_secondary+0x155/0x1b0 [ 1048.130115] secondary_startup_64+0xa4/0xb0 [ 1048.130123] intervals = 1us [ 1048.130123] intervals = 1us [ 1048.130124] intervals = 1us [ 1048.130125] intervals = 1us [ 1048.130126] intervals = 1us [ 1048.130127] intervals = 1us [ 1048.130128] intervals = 1us [ 1048.130129] intervals = 1us [ 1048.130130] predicted_us = 484053us [ 1048.130132] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G OE 5.3.0-rc6 #23 [ 1048.130133] Hardware name: Huawei 2288H V5/BC11SPSCB0, BIOS 0.39 12/01/2017 [ 1048.130134] Call Trace: [ 1048.130137] dump_stack+0x5a/0x73 [ 1048.130141] menu_select+0x3b0/0x6c0 [ 1048.130147] do_idle+0x1b4/0x290 [ 1048.130152] cpu_startup_entry+0x19/0x20 [ 1048.130156] start_secondary+0x155/0x1b0 [ 1048.130159] secondary_startup_64+0xa4/0xb0 > > IOW, it looks like the change simply makes the condition be always false. > Agree, any good feedback is welcome and we can try it. Thanks, Shaokun >> /* >> * If the tick is already stopped, the cost of possible short >> * idle duration misprediction is much higher, because the CPU >> -- >> 1.8.3.1 >> > . >
diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c index c492268..3f03843 100644 --- a/drivers/cpuidle/governors/menu.c +++ b/drivers/cpuidle/governors/menu.c @@ -313,7 +313,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, get_typical_interval(data, predicted_us)) * NSEC_PER_USEC; - if (tick_nohz_tick_stopped()) { + if (tick_nohz_tick_stopped() && data->tick_wakeup) { /* * If the tick is already stopped, the cost of possible short * idle duration misprediction is much higher, because the CPU