From patchwork Mon Jul 19 06:33:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gautham R Shenoy X-Patchwork-Id: 12384797 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 874C6C636CD for ; Mon, 19 Jul 2021 06:33:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6B653611AE for ; Mon, 19 Jul 2021 06:33:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233493AbhGSGgs (ORCPT ); Mon, 19 Jul 2021 02:36:48 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:17160 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233048AbhGSGgs (ORCPT ); Mon, 19 Jul 2021 02:36:48 -0400 Received: from pps.filterd (m0187473.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.43/8.16.0.43) with SMTP id 16J64BHm153323; Mon, 19 Jul 2021 02:33:35 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=pp1; bh=MN9XLLEAnxmc+w1MpuwlLANXnCDdalG4ijujfQRd/o4=; b=W3YYif6RkOQk8cXfgAWsmF5bSRCUDbbUJGCLq2/qlpCiE4p4pdA79XKpDAkuu0rQip62 WOOV4r9fyEDEfbqBgERBQA+v6/QVSX3SNtAc/an125siF0D5InYtqgMSMrThM9PqrG2J qBByGuE8mmv60AyJiZEoAtbp900SNt+GbvGOKBjFUiX3vCe1UDT5MjQ2h8zcn9rSZu0z ACNUCQXkEb5p/zLdKEp1kmF4HJcSDHdms+vPjp9FCoMynwSuOfCooodhz4kRi2b07bH2 cX5/XbOCsuHef6VIEXIgW5OVP/I7/8fcXWiaxt+sZyfo9zYR1dmT0HGI2iw6gu516v8Q PA== Received: from ppma05wdc.us.ibm.com (1b.90.2fa9.ip4.static.sl-reverse.com [169.47.144.27]) by mx0a-001b2d01.pphosted.com with ESMTP id 39vxwpf41y-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 19 Jul 2021 02:33:35 -0400 Received: from pps.filterd (ppma05wdc.us.ibm.com [127.0.0.1]) by ppma05wdc.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 16J6Hrks006070; Mon, 19 Jul 2021 06:33:34 GMT Received: from b01cxnp22033.gho.pok.ibm.com (b01cxnp22033.gho.pok.ibm.com [9.57.198.23]) by ppma05wdc.us.ibm.com with ESMTP id 39upu9vbwd-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 19 Jul 2021 06:33:34 +0000 Received: from b01ledav003.gho.pok.ibm.com (b01ledav003.gho.pok.ibm.com [9.57.199.108]) by b01cxnp22033.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 16J6XXlq35651956 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 19 Jul 2021 06:33:33 GMT Received: from b01ledav003.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 9AD77B205F; Mon, 19 Jul 2021 06:33:33 +0000 (GMT) Received: from b01ledav003.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 0CBABB206B; Mon, 19 Jul 2021 06:33:33 +0000 (GMT) Received: from sofia.ibm.com (unknown [9.85.122.97]) by b01ledav003.gho.pok.ibm.com (Postfix) with ESMTP; Mon, 19 Jul 2021 06:33:32 +0000 (GMT) Received: by sofia.ibm.com (Postfix, from userid 1000) id D82332E2D0E; Mon, 19 Jul 2021 12:03:30 +0530 (IST) From: "Gautham R. Shenoy" To: "Rafael J. Wysocki" , Daniel Lezcano , Michael Ellerman , "Aneesh Kumar K.V" , Vaidyanathan Srinivasan , Michal Suchanek Cc: linux-pm@vger.kernel.org, joedecke@de.ibm.com, linuxppc-dev@lists.ozlabs.org, "Gautham R. Shenoy" , Vaidyanathan Srinivasan Subject: [PATCH v5 1/2] cpuidle/pseries: Fixup CEDE0 latency only for POWER10 onwards Date: Mon, 19 Jul 2021 12:03:18 +0530 Message-Id: <1626676399-15975-2-git-send-email-ego@linux.vnet.ibm.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1626676399-15975-1-git-send-email-ego@linux.vnet.ibm.com> References: <1626676399-15975-1-git-send-email-ego@linux.vnet.ibm.com> X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: ky_QaEFudg8UK-dVwpv6xaf1kTKNdpBT X-Proofpoint-GUID: ky_QaEFudg8UK-dVwpv6xaf1kTKNdpBT X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.391,18.0.790 definitions=2021-07-19_02:2021-07-16,2021-07-19 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 phishscore=0 malwarescore=0 impostorscore=0 adultscore=0 mlxlogscore=999 lowpriorityscore=0 bulkscore=0 clxscore=1015 mlxscore=0 suspectscore=0 priorityscore=1501 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2104190000 definitions=main-2107190033 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org From: "Gautham R. Shenoy" Commit d947fb4c965c ("cpuidle: pseries: Fixup exit latency for CEDE(0)") sets the exit latency of CEDE(0) based on the latency values of the Extended CEDE states advertised by the platform On POWER9 LPARs, the firmwares advertise a very low value of 2us for CEDE1 exit latency on a Dedicated LPAR. The latency advertized by the PHYP hypervisor corresponds to the latency required to wakeup from the underlying hardware idle state. However the wakeup latency from the LPAR perspective should include 1. The time taken to transition the CPU from the Hypervisor into the LPAR post wakeup from platform idle state 2. Time taken to send the IPI from the source CPU (waker) to the idle target CPU (wakee). 1. can be measured via timer idle test, where we queue a timer, say for 1ms, and enter the CEDE state. When the timer fires, in the timer handler we compute how much extra timer over the expected 1ms have we consumed. On a a POWER9 LPAR the numbers are CEDE latency measured using a timer (numbers in ns) N Min Median Avg 90%ile 99%ile Max Stddev 400 2601 5677 5668.74 5917 6413 9299 455.01 1. and 2. combined can be determined by an IPI latency test where we send an IPI to an idle CPU and in the handler compute the time difference between when the IPI was sent and when the handler ran. We see the following numbers on POWER9 LPAR. CEDE latency measured using an IPI (numbers in ns) N Min Median Avg 90%ile 99%ile Max Stddev 400 711 7564 7369.43 8559 9514 9698 1200.01 Suppose, we consider the 99th percentile latency value measured using the IPI to be the wakeup latency, the value would be 9.5us This is in the ballpark of the default value of 10us. Hence, use the exit latency of CEDE(0) based on the latency values advertized by platform only from POWER10 onwards. The values advertized on POWER10 platforms is more realistic and informed by the latency measurements. For earlier platforms stick to the default value of 10us. The fix was suggested by Michael Ellerman. Reported-by: Enrico Joedecke Fixes: commit d947fb4c965c ("cpuidle: pseries: Fixup exit latency for CEDE(0)") Cc: Michal Suchanek Cc: Vaidyanathan Srinivasan Signed-off-by: Gautham R. Shenoy --- drivers/cpuidle/cpuidle-pseries.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c index a2b5c6f..e592280d 100644 --- a/drivers/cpuidle/cpuidle-pseries.c +++ b/drivers/cpuidle/cpuidle-pseries.c @@ -419,7 +419,21 @@ static int pseries_idle_probe(void) cpuidle_state_table = shared_states; max_idle_state = ARRAY_SIZE(shared_states); } else { - fixup_cede0_latency(); + /* + * Use firmware provided latency values + * starting with POWER10 platforms. In the + * case that we are running on a POWER10 + * platform but in an earlier compat mode, we + * can still use the firmware provided values. + * + * However, on platforms prior to POWER10, we + * cannot rely on the accuracy of the firmware + * provided latency values. On such platforms, + * go with the conservative default estimate + * of 10us. + */ + if (cpu_has_feature(CPU_FTR_ARCH_31) || pvr_version_is(PVR_POWER10)) + fixup_cede0_latency(); cpuidle_state_table = dedicated_states; max_idle_state = NR_DEDICATED_STATES; } From patchwork Mon Jul 19 06:33:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gautham R Shenoy X-Patchwork-Id: 12384799 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 91474C636CB for ; Mon, 19 Jul 2021 06:33:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 75BC560BD3 for ; Mon, 19 Jul 2021 06:33:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233632AbhGSGgw (ORCPT ); Mon, 19 Jul 2021 02:36:52 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:28996 "EHLO mx0b-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233048AbhGSGgu (ORCPT ); Mon, 19 Jul 2021 02:36:50 -0400 Received: from pps.filterd (m0098417.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.43/8.16.0.43) with SMTP id 16J6SiRh013093; Mon, 19 Jul 2021 02:33:35 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=pp1; bh=xhsFBQQhd+v0g2hFgAVn2UjRNynXKnOoX1vUoPJcaTE=; b=s+y7PGu3M2MiMUXrnMNBpX157i65jMU/sfF9/j6ZyDw7wfg2w6UKibpRrMrjHMwOyjOA v7BlQ4OsOHoZ7T02asfyNUi4n2E8Wd6gpTj45YB7wyTo84N34rfKB6disFl9nCjFHyFF OACvM/l/2FLKvmpZXquL3Yzc2VJ/jIe7O7XfK/OcBJWKas//o+GFQiaebXi3ihLyJ7Ir 4pQqX+WQh1+y+BbvN8kxzFHt2euHCxwYc6agZhUpM+gFHf1MJ54NA+1alHoWcoPDqWEy pUoeC9/f64i7AgHuAtoSnfHvKX1kRrVqEQ2d8M5YnsQ3VlIY3XTe6EU15Gja10c7vRGO 9g== Received: from ppma04dal.us.ibm.com (7a.29.35a9.ip4.static.sl-reverse.com [169.53.41.122]) by mx0a-001b2d01.pphosted.com with ESMTP id 39w0mt4ar7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 19 Jul 2021 02:33:35 -0400 Received: from pps.filterd (ppma04dal.us.ibm.com [127.0.0.1]) by ppma04dal.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 16J6Wvnv009171; Mon, 19 Jul 2021 06:33:34 GMT Received: from b01cxnp22035.gho.pok.ibm.com (b01cxnp22035.gho.pok.ibm.com [9.57.198.25]) by ppma04dal.us.ibm.com with ESMTP id 39upuaddnr-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 19 Jul 2021 06:33:34 +0000 Received: from b01ledav004.gho.pok.ibm.com (b01ledav004.gho.pok.ibm.com [9.57.199.109]) by b01cxnp22035.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 16J6XXVm19988864 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 19 Jul 2021 06:33:33 GMT Received: from b01ledav004.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id ABBD26A74D; Mon, 19 Jul 2021 06:33:33 +0000 (GMT) Received: from b01ledav004.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 179696A74C; Mon, 19 Jul 2021 06:33:33 +0000 (GMT) Received: from sofia.ibm.com (unknown [9.85.122.97]) by b01ledav004.gho.pok.ibm.com (Postfix) with ESMTP; Mon, 19 Jul 2021 06:33:32 +0000 (GMT) Received: by sofia.ibm.com (Postfix, from userid 1000) id DD1F72E2D40; Mon, 19 Jul 2021 12:03:30 +0530 (IST) From: "Gautham R. Shenoy" To: "Rafael J. Wysocki" , Daniel Lezcano , Michael Ellerman , "Aneesh Kumar K.V" , Vaidyanathan Srinivasan , Michal Suchanek Cc: linux-pm@vger.kernel.org, joedecke@de.ibm.com, linuxppc-dev@lists.ozlabs.org, "Gautham R. Shenoy" Subject: [PATCH v5 2/2] cpuidle/pseries: Do not cap the CEDE0 latency in fixup_cede0_latency() Date: Mon, 19 Jul 2021 12:03:19 +0530 Message-Id: <1626676399-15975-3-git-send-email-ego@linux.vnet.ibm.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1626676399-15975-1-git-send-email-ego@linux.vnet.ibm.com> References: <1626676399-15975-1-git-send-email-ego@linux.vnet.ibm.com> X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: nAKZk-yhKhhOyvD5KGc_zypugulw-9aK X-Proofpoint-GUID: nAKZk-yhKhhOyvD5KGc_zypugulw-9aK X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.391,18.0.790 definitions=2021-07-19_02:2021-07-16,2021-07-19 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 suspectscore=0 impostorscore=0 malwarescore=0 adultscore=0 clxscore=1011 mlxlogscore=999 phishscore=0 priorityscore=1501 mlxscore=0 bulkscore=0 spamscore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2104190000 definitions=main-2107190033 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org From: "Gautham R. Shenoy" Currently in fixup_cede0_latency() code, we perform the fixup the CEDE(0) exit latency value only if minimum advertized extended CEDE latency values are less than 10us. This was done so as to not break the expected behaviour on POWER8 platforms where the advertised latency was higher than the default 10us, which would delay the SMT folding on the core. However, after the earlier patch "cpuidle/pseries: Fixup CEDE0 latency only for POWER10 onwards", we can be sure that the fixup of CEDE0 latency is going to happen only from POWER10 onwards. Hence unconditionally use the minimum exit latency provided by the platform. Signed-off-by: Gautham R. Shenoy --- drivers/cpuidle/cpuidle-pseries.c | 59 ++++++++++++++++++++------------------- 1 file changed, 30 insertions(+), 29 deletions(-) diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c index e592280d..bba449b 100644 --- a/drivers/cpuidle/cpuidle-pseries.c +++ b/drivers/cpuidle/cpuidle-pseries.c @@ -346,11 +346,9 @@ static int pseries_cpuidle_driver_init(void) static void __init fixup_cede0_latency(void) { struct xcede_latency_payload *payload; - u64 min_latency_us; + u64 min_xcede_latency_us = UINT_MAX; int i; - min_latency_us = dedicated_states[1].exit_latency; // CEDE latency - if (parse_cede_parameters()) return; @@ -358,42 +356,45 @@ static void __init fixup_cede0_latency(void) nr_xcede_records); payload = &xcede_latency_parameter.payload; + + /* + * The CEDE idle state maps to CEDE(0). While the hypervisor + * does not advertise CEDE(0) exit latency values, it does + * advertise the latency values of the extended CEDE states. + * We use the lowest advertised exit latency value as a proxy + * for the exit latency of CEDE(0). + */ for (i = 0; i < nr_xcede_records; i++) { struct xcede_latency_record *record = &payload->records[i]; + u8 hint = record->hint; u64 latency_tb = be64_to_cpu(record->latency_ticks); u64 latency_us = DIV_ROUND_UP_ULL(tb_to_ns(latency_tb), NSEC_PER_USEC); - if (latency_us == 0) - pr_warn("cpuidle: xcede record %d has an unrealistic latency of 0us.\n", i); - - if (latency_us < min_latency_us) - min_latency_us = latency_us; - } - - /* - * By default, we assume that CEDE(0) has exit latency 10us, - * since there is no way for us to query from the platform. - * - * However, if the wakeup latency of an Extended CEDE state is - * smaller than 10us, then we can be sure that CEDE(0) - * requires no more than that. - * - * Perform the fix-up. - */ - if (min_latency_us < dedicated_states[1].exit_latency) { /* - * We set a minimum of 1us wakeup latency for cede0 to - * distinguish it from snooze + * We expect the exit latency of an extended CEDE + * state to be non-zero, it to since it takes at least + * a few nanoseconds to wakeup the idle CPU and + * dispatch the virtual processor into the Linux + * Guest. + * + * So we consider only non-zero value for performing + * the fixup of CEDE(0) latency. */ - u64 cede0_latency = 1; + if (latency_us == 0) { + pr_warn("cpuidle: Skipping xcede record %d [hint=%d]. Exit latency = 0us\n", + i, hint); + continue; + } - if (min_latency_us > cede0_latency) - cede0_latency = min_latency_us - 1; + if (latency_us < min_xcede_latency_us) + min_xcede_latency_us = latency_us; + } - dedicated_states[1].exit_latency = cede0_latency; - dedicated_states[1].target_residency = 10 * (cede0_latency); + if (min_xcede_latency_us != UINT_MAX) { + dedicated_states[1].exit_latency = min_xcede_latency_us; + dedicated_states[1].target_residency = 10 * (min_xcede_latency_us); pr_info("cpuidle: Fixed up CEDE exit latency to %llu us\n", - cede0_latency); + min_xcede_latency_us); } }