[3/3] cpufreq: CPPC: Eliminate the impact of cpc_read() latency error

We have found significant differences in the latency of cpc_read() between
regular scenarios and scenarios with high memory access pressure. Ignoring
this latency error can cause the get-rate interface to occasionally return
absurd values.

Below is a sample test that generates high memory access pressure with
stress-ng. My local testing platform has 160 CPUs, the CPC registers are
accessed through MMIO, and the cpuidle feature is disabled (so the AMU is
always online):

~~~
./stress-ng --memrate 160 --timeout 180
~~~

The following data comes from ftrace statistics for
cppc_get_perf_ctrs():

              Regular scenarios               ||      High memory access pressure scenarios
104)               |  cppc_get_perf_ctrs() {  ||  133)               |  cppc_get_perf_ctrs() {
104)   0.800 us    |    cpc_read.isra.0();    ||  133)   4.580 us    |    cpc_read.isra.0();
104)   0.640 us    |    cpc_read.isra.0();    ||  133)   7.780 us    |    cpc_read.isra.0();
104)   0.450 us    |    cpc_read.isra.0();    ||  133)   2.550 us    |    cpc_read.isra.0();
104)   0.430 us    |    cpc_read.isra.0();    ||  133)   0.570 us    |    cpc_read.isra.0();
104)   4.610 us    |  }                       ||  133) ! 157.610 us  |  }
104)               |  cppc_get_perf_ctrs() {  ||  133)               |  cppc_get_perf_ctrs() {
104)   0.720 us    |    cpc_read.isra.0();    ||  133)   0.760 us    |    cpc_read.isra.0();
104)   0.720 us    |    cpc_read.isra.0();    ||  133)   4.480 us    |    cpc_read.isra.0();
104)   0.510 us    |    cpc_read.isra.0();    ||  133)   0.520 us    |    cpc_read.isra.0();
104)   0.500 us    |    cpc_read.isra.0();    ||  133) + 10.100 us   |    cpc_read.isra.0();
104)   3.460 us    |  }                       ||  133) ! 120.850 us  |  }
108)               |  cppc_get_perf_ctrs() {  ||   87)               |  cppc_get_perf_ctrs() {
108)   0.820 us    |    cpc_read.isra.0();    ||   87) ! 255.200 us  |    cpc_read.isra.0();
108)   0.850 us    |    cpc_read.isra.0();    ||   87)   2.910 us    |    cpc_read.isra.0();
108)   0.590 us    |    cpc_read.isra.0();    ||   87)   5.160 us    |    cpc_read.isra.0();
108)   0.610 us    |    cpc_read.isra.0();    ||   87)   4.340 us    |    cpc_read.isra.0();
108)   5.080 us    |  }                       ||   87) ! 315.790 us  |  }
108)               |  cppc_get_perf_ctrs() {  ||   87)               |  cppc_get_perf_ctrs() {
108)   0.630 us    |    cpc_read.isra.0();    ||   87)   0.800 us    |    cpc_read.isra.0();
108)   0.630 us    |    cpc_read.isra.0();    ||   87)   6.310 us    |    cpc_read.isra.0();
108)   0.420 us    |    cpc_read.isra.0();    ||   87)   1.190 us    |    cpc_read.isra.0();
108)   0.430 us    |    cpc_read.isra.0();    ||   87) + 11.620 us   |    cpc_read.isra.0();
108)   3.780 us    |  }                       ||   87) ! 207.010 us  |  }

My local testing platform runs at 3000000 kHz (3 GHz), but the
cpuinfo_cur_freq interface returns values that are not even close to the
actual frequency:

[root@localhost ~]# cd /sys/devices/system/cpu
[root@localhost cpu]# for i in {0..159}; do cat cpu$i/cpufreq/cpuinfo_cur_freq; done
5127812
2952127
3069001
3496183
922989768
2419194
3427042
2331869
3594611
8238499
...

The reason is that, under heavy memory access pressure, the execution time
of cpc_read() increases from sub-microsecond to several hundred
microseconds. Moving the cpc_read() calls into a critical section with
IRQs disabled has minimal impact on the result.

  cppc_get_perf_ctrs()[0]                    cppc_get_perf_ctrs()[1]
/                    \                      /                      \
cpc_read         cpc_read                  cpc_read            cpc_read
 ref[0]        delivered[0]                 ref[1]            delivered[1]
    |              |                           |                    |
    v              v                           v                    v
-----------------------------------------------------------------------> time
     <--delta[0]--> <------sample_period------> <-----delta[1]----->

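Given that
  freq = ref_freq * (delivered[1] - delivered[0]) / (ref[1] - ref[0])
and
  delivered[1] - delivered[0] = freq * (delta[1] + sample_period),
  ref[1] - ref[0] = ref_freq * (delta[0] + sample_period)
the computed value is the actual frequency scaled by
  (delta[1] + sample_period) / (delta[0] + sample_period)
so it is accurate only when delta[0] and delta[1] are equal, or when both
are negligible compared with sample_period.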
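
As a rough numerical check of the relations above, the following
user-space C sketch evaluates both forms. It is not part of the patch: the
function names, the kHz/ns units and the sample numbers are assumptions
chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* First relation: rate computed from two samples of the feedback counters. */
static uint64_t rate_from_counters(uint64_t ref_freq_khz,
				   uint64_t ref0, uint64_t delivered0,
				   uint64_t ref1, uint64_t delivered1)
{
	uint64_t ref_delta = ref1 - ref0;
	uint64_t delivered_delta = delivered1 - delivered0;

	assert(ref_delta != 0);
	return ref_freq_khz * delivered_delta / ref_delta;
}

/*
 * Same relation with the two counting windows substituted: the delivered
 * counter runs for (delta1 + period), the reference counter for
 * (delta0 + period). freq_khz is the actual (assumed constant) frequency.
 */
static uint64_t reported_rate_khz(uint64_t freq_khz, uint64_t delta0_ns,
				  uint64_t delta1_ns, uint64_t period_ns)
{
	assert(delta0_ns + period_ns != 0);
	return freq_khz * (delta1_ns + period_ns) / (delta0_ns + period_ns);
}
```

With equal latencies (delta[0] = delta[1] = 800 ns) and a 2 us sample
period, reported_rate_khz(3000000, 800, 800, 2000) gives back 3000000.
With delta[1] = 255200 ns, a latency of the size of the slowest cpc_read()
traced above, the same call yields 275571428.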

To eliminate the impact of system memory access latency, a sampling period
of 2us is far from sufficient. Consequently, we suggest that
cppc_cpufreq_get_rate() only be called in process context, and that it
adopt a longer sampling period to neutralize the impact of random latency.
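
To get a feel for how long the sampling period needs to be, the sketch
below bounds the error using the same timing model: if each read latency
lies between 0 and max_latency, the computed rate is off by at most
max_latency / sample_period. The helper names, the ppm units and the 300 us
latency figure are illustrative assumptions; the period actually used by
the patch is not derived here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Worst-case relative error in parts per million, reached when
 * delta[0] = 0 and delta[1] = max_latency:
 *   (max_latency + period) / period - 1 = max_latency / period
 */
static uint64_t worst_case_error_ppm(uint64_t max_latency_ns,
				     uint64_t period_ns)
{
	assert(period_ns != 0);
	return max_latency_ns * 1000000ull / period_ns;
}

/* Smallest period (rounded up) that keeps the error within tolerance_ppm. */
static uint64_t min_period_ns(uint64_t max_latency_ns, uint64_t tolerance_ppm)
{
	assert(tolerance_ppm != 0);
	return (max_latency_ns * 1000000ull + tolerance_ppm - 1) / tolerance_ppm;
}
```

Under these assumptions, a 2 us period with 300 us of latency allows an
error of up to 150x (150000000 ppm), while keeping the error within 1%
(10000 ppm) would need a period of about 30 ms.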

Here we call cond_resched() instead of sleep-like functions so that
`taskset -c $i cat cpu$i/cpufreq/cpuinfo_cur_freq` still works when the
cpuidle feature is enabled.
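
The waiting pattern can be illustrated in user space. In the sketch below,
sched_yield() stands in for cond_resched() and CLOCK_MONOTONIC for the
kernel time source; both are substitutions made for illustration, and
wait_yielding_ns() is a hypothetical helper, not code from this patch. The
caller stays runnable for the whole period and only gives way to other
runnable tasks, whereas a sleep would take it off the CPU.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <sched.h>
#include <stdint.h>
#include <time.h>

/* Monotonic timestamp in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;
	int ret = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(ret == 0);
	(void)ret;
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Wait for at least period_ns without sleeping: keep polling the clock
 * and offer the CPU to other runnable tasks on every iteration.
 * Returns the time that actually elapsed.
 */
static uint64_t wait_yielding_ns(uint64_t period_ns)
{
	uint64_t start = now_ns();
	uint64_t elapsed;

	do {
		sched_yield();
		elapsed = now_ns() - start;
	} while (elapsed < period_ns);

	return elapsed;
}
```

The returned value is never smaller than the requested period, and it can
be noticeably larger when other tasks are runnable on the same CPU.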

Reported-by: Yang Shi <yang@os.amperecomputing.com>
Link: https://lore.kernel.org/all/20230328193846.8757-1-yang@os.amperecomputing.com/
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
---
 drivers/cpufreq/cppc_cpufreq.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

Message ID: 20231025093847.3740104-4-zengheng4@huawei.com (mailing list archive)
State: New, archived
From: Zeng Heng <zengheng4@huawei.com>
Date: Wed, 25 Oct 2023 17:38:47 +0800
Series: Make the cpuinfo_cur_freq interface read correctly
  [0/3] Make the cpuinfo_cur_freq interface read correctly
  [1/3] arm64: cpufeature: Export cpu_has_amu_feat()
  [2/3] cpufreq: CPPC: Keep the target core awake when reading its cpufreq rate
  [3/3] cpufreq: CPPC: Eliminate the impact of cpc_read() latency error
