Message ID | CAJZ5v0iThFDEjnwTbpAhwHY_vF_KDdAUyhDL1CdB4GJsG5eNRQ@mail.gmail.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
On 2018-03-21 23:15, Rafael J. Wysocki wrote:
> On Wed, Mar 21, 2018 at 6:59 PM, Thomas Ilsche
> <thomas.ilsche@tu-dresden.de> wrote:
>> On 2018-03-21 15:36, Rafael J. Wysocki wrote:
>>>
>>>
>>> So please disregard this one entirely and take the v7.2 replacement
>>> instead of it: https://patchwork.kernel.org/patch/10299429/
>>>
>>> The current versions (including the above) is in the git branch at
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
>>> idle-loop-v7.2
>>
>>
>> With v7.2 (tested on SKL-SP from git) I see similar behavior in idle
>> as with v5: several cores which just keep the sched tick enabled.
>> Worse yet, some go only in C1 (not even C1E!?) despite sleeping the
>> full sched tick.
>> The resulting power consumption is ~105 W instead of ~ 70 W.
>>
>> https://wwwpub.zih.tu-dresden.de/~tilsche/powernightmares/v7_2_skl_sp_idle.png
>>
>> I have briefly ran v7 and I believe it was also affected.
>
> Then it looks like menu_select() stubbornly thinks that the idle
> duration will be within the tick boundary on those cores.
>
> That may be because the bumping up of the correction factor in
> menu_reflect() is too conservative or it may be necessary to do
> something radical to measured_us in menu_update() in case of a tick
> wakeup combined with a large next_timer_us value.
>
> For starters, please see if the attached patch (on top of the
> idle-loop-v7.2 git branch) changes this behavior in any way.
>

The patch on top of idle-loop-v7.2 doesn't improve idle behavior on
SKL-SP. Overall it is pretty erratic, I have not seen any regular
patterns. Sometimes only few cpus are affected, here's a screenshot of
almost all cpus being affected after a short burst workload.

https://wwwpub.zih.tu-dresden.de/~tilsche/powernightmares/v7_2_reflect_skl_sp_idle.png

On Thursday, March 22, 2018 2:18:59 PM CET Thomas Ilsche wrote:
> On 2018-03-21 23:15, Rafael J. Wysocki wrote:
> > On Wed, Mar 21, 2018 at 6:59 PM, Thomas Ilsche
> > <thomas.ilsche@tu-dresden.de> wrote:
> >> On 2018-03-21 15:36, Rafael J. Wysocki wrote:
> >>>
> >>>
> >>> So please disregard this one entirely and take the v7.2 replacement
> >>> instead of it: https://patchwork.kernel.org/patch/10299429/
> >>>
> >>> The current versions (including the above) is in the git branch at
> >>>
> >>> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
> >>> idle-loop-v7.2
> >>
> >>
> >> With v7.2 (tested on SKL-SP from git) I see similar behavior in idle
> >> as with v5: several cores which just keep the sched tick enabled.
> >> Worse yet, some go only in C1 (not even C1E!?) despite sleeping the
> >> full sched tick.
> >> The resulting power consumption is ~105 W instead of ~ 70 W.
> >>
> >> https://wwwpub.zih.tu-dresden.de/~tilsche/powernightmares/v7_2_skl_sp_idle.png
> >>
> >> I have briefly ran v7 and I believe it was also affected.
> >
> > Then it looks like menu_select() stubbornly thinks that the idle
> > duration will be within the tick boundary on those cores.
> >
> > That may be because the bumping up of the correction factor in
> > menu_reflect() is too conservative or it may be necessary to do
> > something radical to measured_us in menu_update() in case of a tick
> > wakeup combined with a large next_timer_us value.
> >
> > For starters, please see if the attached patch (on top of the
> > idle-loop-v7.2 git branch) changes this behavior in any way.
> >
>
> The patch on top of idle-loop-v7.2 doesn't improve idle behavior on
> SKL-SP. Overall it is pretty erratic, I have not seen any regular
> patterns. Sometimes only few cpus are affected, here's a screenshot of
> almost all cpus being affected after a short burst workload.
>
> https://wwwpub.zih.tu-dresden.de/~tilsche/powernightmares/v7_2_reflect_skl_sp_idle.png

Thanks for the information!

I will post a v7.3 of patch [5/8] shortly that appears to give good results for me. It may be selecting deep states quite aggressively, but let's see.
---
 drivers/cpuidle/governors/menu.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-pm/drivers/cpuidle/governors/menu.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/governors/menu.c
+++ linux-pm/drivers/cpuidle/governors/menu.c
@@ -498,7 +498,7 @@ static void menu_reflect(struct cpuidle_
 		 * correction factor.  Use 0.75 * RESOLUTION (which is easy
 		 * enough to get) that should work fine on the average.
 		 */
-		new_factor += RESOLUTION / 2 + RESOLUTION / 4;
+		new_factor += RESOLUTION;
 		data->correction_factor[data->bucket] = new_factor;
 	} else {
 		data->needs_update = 1;
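
For readers following the arithmetic, here is a rough sketch of why the size of the bump matters. It assumes the factor is first decayed by 1/DECAY and then bumped, as menu_update() does in mainline; whether the patched menu_reflect() in idle-loop-v7.2 does exactly this is an assumption, not something the diff context shows. RESOLUTION (1024) and DECAY (8) are the values defined in menu.c; the helper names decay_and_bump(), settle() and predict_us() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in drivers/cpuidle/governors/menu.c. */
#define RESOLUTION 1024
#define DECAY 8

/*
 * Hypothetical helper: one update step, assuming the factor is decayed
 * by 1/DECAY before the bump is added (the menu_update() pattern).
 */
static unsigned int decay_and_bump(unsigned int factor, unsigned int bump)
{
	factor -= factor / DECAY;
	return factor + bump;
}

/* Hypothetical helper: apply the same update for a number of rounds. */
static unsigned int settle(unsigned int factor, unsigned int bump, int rounds)
{
	assert(rounds >= 0);
	while (rounds-- > 0)
		factor = decay_and_bump(factor, bump);
	return factor;
}

/*
 * Scale the time until the next timer by the correction factor.
 * A factor of RESOLUTION * DECAY corresponds to a ratio of 1.0.
 */
static uint64_t predict_us(uint64_t next_timer_us, unsigned int factor)
{
	return next_timer_us * factor / (RESOLUTION * DECAY);
}
```

Under these assumptions, repeatedly bumping by RESOLUTION / 2 + RESOLUTION / 4 pulls the factor towards roughly 0.75 * RESOLUTION * DECAY, so predict_us() keeps returning about three quarters of next_timer_us. Bumping by RESOLUTION lets the factor stay at RESOLUTION * DECAY, so the prediction can reach the full next_timer_us, which fits the remark above that the change may select deep states quite aggressively.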