From patchwork Fri Jun 8 01:07:33 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chen Yu X-Patchwork-Id: 10453697 X-Patchwork-Delegate: rjw@sisk.pl Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 10ABE60318 for ; Fri, 8 Jun 2018 01:02:36 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DE38229866 for ; Fri, 8 Jun 2018 01:02:35 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CF09F29877; Fri, 8 Jun 2018 01:02:35 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00, MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1E51329866 for ; Fri, 8 Jun 2018 01:02:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752202AbeFHBCe (ORCPT ); Thu, 7 Jun 2018 21:02:34 -0400 Received: from mga01.intel.com ([192.55.52.88]:47612 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752174AbeFHBCd (ORCPT ); Thu, 7 Jun 2018 21:02:33 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 07 Jun 2018 18:02:33 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.49,488,1520924400"; d="scan'208";a="61374574" Received: from sandybridge-desktop.sh.intel.com ([10.239.160.116]) by fmsmga004.fm.intel.com with ESMTP; 07 Jun 2018 18:02:32 -0700 From: Chen Yu To: "Rafael J. Wysocki" , Viresh Kumar Cc: linux-kernel@vger.kernel.org, Chen Yu , Artem S Tashkinov , linux-pm@vger.kernel.org Subject: [PATCH][v2] sched: cpufreq: Fix long idle judgement logic in load calculation Date: Fri, 8 Jun 2018 09:07:33 +0800 Message-Id: <1528420053-22884-1-git-send-email-yu.c.chen@intel.com> X-Mailer: git-send-email 2.7.4 Sender: linux-pm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP According to current code implementation, detecting the long idle period is done by checking if the interval between two adjacent utilization update handers is long enough. Although this mechanism can detect if the idle period is long enough (no utilization hooks invoked during idle period), it might not contain a corner case: if the task has occupied the cpu for too long which causes no context switch during that period, then no utilization handler will be launched until this high prio task is switched out. As a result, the idle_periods field might be calculated incorrectly because it regards the 100% load as 0% and makes the conservative governor who uses this field confusing. Change the judgement to compare the idle_time with sampling_rate directly. Reported-by: Artem S. Tashkinov Cc: Artem S Tashkinov Cc: "Rafael J. Wysocki" Cc: Viresh Kumar Cc: linux-pm@vger.kernel.org Signed-off-by: Chen Yu Acked-by: Viresh Kumar --- v2: Per Viresh's suggestion, ignore idle_time longer than 30mins and simplify the code. --- drivers/cpufreq/cpufreq_governor.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c index 871bf9c..1d50e97 100644 --- a/drivers/cpufreq/cpufreq_governor.c +++ b/drivers/cpufreq/cpufreq_governor.c @@ -165,7 +165,7 @@ unsigned int dbs_update(struct cpufreq_policy *policy) * calls, so the previous load value can be used then. */ load = j_cdbs->prev_load; - } else if (unlikely(time_elapsed > 2 * sampling_rate && + } else if (unlikely((int)idle_time > 2 * sampling_rate && j_cdbs->prev_load)) { /* * If the CPU had gone completely idle and a task has @@ -185,10 +185,8 @@ unsigned int dbs_update(struct cpufreq_policy *policy) * clear prev_load to guarantee that the load will be * computed again next time. * - * Detecting this situation is easy: the governor's - * utilization update handler would not have run during - * CPU-idle periods. Hence, an unusually large - * 'time_elapsed' (as compared to the sampling rate) + * Detecting this situation is easy: an unusually large + * 'idle_time' (as compared to the sampling rate) * indicates this scenario. */ load = j_cdbs->prev_load; @@ -217,8 +215,8 @@ unsigned int dbs_update(struct cpufreq_policy *policy) j_cdbs->prev_load = load; } - if (time_elapsed > 2 * sampling_rate) { - unsigned int periods = time_elapsed / sampling_rate; + if (unlikely((int)idle_time > 2 * sampling_rate)) { + unsigned int periods = idle_time / sampling_rate; if (periods < idle_periods) idle_periods = periods;