[RFC/RFT,v2] cpuidle: New timer events oriented governor for tickless systems

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The venerable menu governor does some things that are quite
questionable in my view.  First, it includes timer wakeups in
the pattern detection data and mixes them up with wakeups from
other sources which in some cases causes it to expect what
essentially would be a timer wakeup in a time frame in which
no timer wakeups are possible (because it knows the time until
the next timer event and that is later than the expected wakeup
time).  Second, it uses the extra exit latency limit based on
the predicted idle duration and depending on the number of tasks
waiting on I/O, even though those tasks may run on a different
CPU when they are woken up.  Moreover, the time ranges used by it
for the sleep length correction factors depend on whether or not
there are tasks waiting on I/O, which again doesn't imply anything
in particular, and they are not correlated to the list of available
idle states in any way whatever.  Also, the pattern detection code
in menu may end up considering values that are too large to matter
at all, in which case running it is a waste of time.

A major rework of the menu governor would be required to address
these issues and the performance of at least some workloads (tuned
specifically to the current behavior of the menu governor) is likely
to suffer from that.  It is thus better to introduce an entirely new
governor without them and let everybody use the governor that works
better with their actual workloads.

The new governor introduced here, the timer events oriented (TEO)
governor, uses the same basic strategy as menu: it always tries to
find the deepest idle state that can be used in the given conditions.
However, it applies a different approach to that problem.  First, it
doesn't use "correction factors" for the time till the closest timer,
but instead it tries to correlate the measured idle duration values
with the available idle states and use that information to pick
the idle state that is most likely to "match" the upcoming CPU idle
interval.  Second, it doesn't take the number of "I/O waiters" into
account at all and the pattern detection code in it tries to avoid
taking timer wakeups into account.  It also only uses idle duration
values less than the current time till the closest timer (with the
tick excluded) for that purpose.
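
To make the selection idea above more concrete, here is a minimal
user-space sketch.  It is not the code from teo.c: the struct, the
example table, MAX_STATES, both function names and the "most hits wins"
rule are assumptions made for illustration only.  It maps recent idle
durations onto the idle states they would have matched, skips the
durations that are not shorter than the time till the closest timer,
and falls back to the timer-based choice when nothing else qualifies.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_STATES 8	/* illustrative bound, not a kernel constant */

/* Hypothetical, simplified idle state description (not struct cpuidle_state). */
struct idle_state {
	unsigned int target_residency_us;
};

/* Example table, sorted from the shallowest to the deepest state. */
static const struct idle_state example_states[] = {
	{ 0 }, { 20 }, { 200 }, { 1000 },
};

/* Deepest state whose target residency does not exceed the given duration. */
static size_t deepest_state_for(const struct idle_state *states,
				size_t nstates, unsigned int duration_us)
{
	size_t i, best = 0;

	assert(nstates > 0);
	for (i = 1; i < nstates; i++) {
		if (states[i].target_residency_us > duration_us)
			break;
		best = i;
	}
	return best;
}

/*
 * Assumed selection rule: count how many recent idle intervals matched
 * each state and pick the state with the most matches, preferring the
 * deeper state on ties.  Intervals not shorter than the time till the
 * closest timer are ignored, as they most likely were timer wakeups.
 */
static size_t select_state(const struct idle_state *states, size_t nstates,
			   const unsigned int *recent_us, size_t nrecent,
			   unsigned int sleep_length_us)
{
	size_t hits[MAX_STATES] = { 0 };
	size_t timer_state, best, i;

	assert(nstates > 0 && nstates <= MAX_STATES);
	timer_state = deepest_state_for(states, nstates, sleep_length_us);

	for (i = 0; i < nrecent; i++) {
		if (recent_us[i] >= sleep_length_us)
			continue;
		hits[deepest_state_for(states, nstates, recent_us[i])]++;
	}

	/* Start from the timer-based choice and walk towards shallower states. */
	best = timer_state;
	for (i = timer_state; i-- > 0; ) {
		if (hits[i] > hits[best])
			best = i;
	}
	return best;
}
```

With the example table and 5000 us till the closest timer, an empty
history gives state 3 (the timer-based choice), whereas three recent
wakeups after 30, 40 and 50 us pull the selection down to state 1.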

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

The v2 is a re-write of major parts of the original patch.

The approach is the same in general, but the details have changed significantly
with respect to the previous version.  In particular:
* The decay of the idle state metrics is implemented differently.
* There is a more "clever" pattern detection (sort of along the lines
  of what menu does, but simplified quite a bit and trying to avoid
  including timer wakeups).
* The "promotion" from the "polling" state is gone.
* The "safety net" wakeups are treated as if the CPU might have been idle
  until the closest timer.
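
The last item can be sketched as follows.  This is an illustration
only: the helper name and its parameters are hypothetical, and it
assumes that a "safety net" wakeup is one that cut the idle period
short artificially (for example, a tick wakeup) rather than a wakeup
the governor should learn from.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: idle duration to feed into the governor's
 * statistics.  For a "safety net" wakeup the measured value is
 * replaced with the time till the closest timer, because the CPU
 * might have stayed idle that long had the safety net not kicked in.
 */
static unsigned int effective_idle_us(unsigned int measured_us,
				      unsigned int sleep_length_us,
				      bool safety_net_wakeup)
{
	if (safety_net_wakeup && measured_us < sleep_length_us)
		return sleep_length_us;

	return measured_us;
}
```

The point of doing it this way is that an artificially short interval
does not bias the statistics towards shallow idle states.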

I'm running this governor on all of my systems now without any
visible adverse effects.

Overall, it selects deeper idle states more often than menu on average, but
that doesn't seem to make a significant difference in the majority of cases.

In this preliminary revision it overtakes menu as the default governor
for tickless systems (due to the higher rating), but that is likely
to change going forward.  At this point I'm mostly asking for feedback
and possibly testing with whatever workloads you can throw at it.

The patch should apply on top of 4.19, although I'm running it on
top of my linux-next branch.  This version hasn't been run through
benchmarks yet and that likely will take some time as I will be
traveling quite a bit during the next few weeks.

---
 drivers/cpuidle/Kconfig            |   11 
 drivers/cpuidle/governors/Makefile |    1 
 drivers/cpuidle/governors/teo.c    |  491 +++++++++++++++++++++++++++++++++++++
 3 files changed, 503 insertions(+)

Message ID	1899281.leNo2RexrE@aspire.rjw.lan (mailing list archive)
State	RFC
From: "Rafael J. Wysocki" <rjw@rjwysocki.net>
To: Linux PM <linux-pm@vger.kernel.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
 Peter Zijlstra <peterz@infradead.org>,
 LKML <linux-kernel@vger.kernel.org>,
 Frederic Weisbecker <frederic@kernel.org>,
 Mel Gorman <mgorman@suse.de>,
 Giovanni Gherdovich <ggherdovich@suse.cz>,
 Doug Smythies <dsmythies@telus.net>,
 Daniel Lezcano <daniel.lezcano@linaro.org>
Subject: [RFC/RFT][PATCH v2] cpuidle: New timer events oriented governor for tickless systems
Date: Fri, 26 Oct 2018 11:12:21 +0200