From patchwork Thu Jun 1 18:27:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arjan van de Ven X-Patchwork-Id: 13264380 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7A933C77B7A for ; Thu, 1 Jun 2023 18:30:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229672AbjFASaw (ORCPT ); Thu, 1 Jun 2023 14:30:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35890 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232674AbjFASao (ORCPT ); Thu, 1 Jun 2023 14:30:44 -0400 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3EF66193 for ; Thu, 1 Jun 2023 11:30:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1685644242; x=1717180242; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=mBO1UNUmkqWGbAWUvgXaq7kgcyWPB2AxPCN7VsVrLJM=; b=XUNOecTeuPTx+3BOmRtBZeGDmopDMEvEZyEgyCP6znshrWvdHcWwBIR4 vAzZqGnMe6EnVISFZW5VkSl4OuOYLFDqdXrp4PAJlC2bVq/pruxZa/a5m aM0neTIiWTPgZdDeRHRGGQ4izGImiaGTDEXfuGk5y4uekOKmx7N1bZhRz 7SLU/M+Ig7m5/3+tRblotiJx0EMGG74ESJ0XayD2LzS41LI1XZewYIanK X81J0qaZxh6GQJZHKbnk4DG7Psi22bopa0wQJ7CIfem/zszEH4XVz9wiJ V1VKgJerQAyDWuwHt45ZamOQGs1r/+a+x8wWwla6jguK6TB7HN7uJ+6DA w==; X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="383921612" X-IronPort-AV: E=Sophos;i="6.00,210,1681196400"; d="scan'208";a="383921612" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jun 2023 11:28:17 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="657900905" X-IronPort-AV: 
E=Sophos;i="6.00,210,1681196400"; d="scan'208";a="657900905" Received: from arjan-box.jf.intel.com ([10.54.74.119]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jun 2023 11:28:17 -0700 From: arjan@linux.intel.com To: linux-pm@vger.kernel.org Cc: artem.bityutskiy@linux.intel.com, rafael@kernel.org, Arjan van de Ven , Arjan van de Ven Subject: [PATCH 1/7] intel_idle: refactor state->enter manipulation into its own function Date: Thu, 1 Jun 2023 18:27:55 +0000 Message-Id: <20230601182801.2622044-2-arjan@linux.intel.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230601182801.2622044-1-arjan@linux.intel.com> References: <20230601182801.2622044-1-arjan@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org From: Arjan van de Ven Since the 6.4 kernel, the logic for updating a state's enter method based on "environmental conditions" (command line options, CPU side-channel workarounds, etc.) has gotten pretty complex. This patch refactors this logic into a separate, small, self-contained function (no behavior changes) for improved readability and to make future changes to this logic easier to do and understand. Signed-off-by: Arjan van de Ven --- drivers/idle/intel_idle.c | 50 ++++++++++++++++++++++----------------- 1 file changed, 28 insertions(+), 22 deletions(-) diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index aa2d19db2b1d..c351b21c0875 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -1839,6 +1839,32 @@ static bool __init intel_idle_verify_cstate(unsigned int mwait_hint) return true; } +static void state_update_enter_method(struct cpuidle_state *state, int cstate) +{ + if (state->flags & CPUIDLE_FLAG_INIT_XSTATE) { + /* + * Combining with XSTATE with IBRS or IRQ_ENABLE flags + * is not currently supported but this driver. 
+ */ + WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IBRS); + WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IRQ_ENABLE); + state->enter = intel_idle_xstate; + } else if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) && + state->flags & CPUIDLE_FLAG_IBRS) { + /* + * IBRS mitigation requires that C-states are entered + * with interrupts disabled. + */ + WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IRQ_ENABLE); + state->enter = intel_idle_ibrs; + } else if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE) { + state->enter = intel_idle_irq; + } else if (force_irq_on) { + pr_info("forced intel_idle_irq for state %d\n", cstate); + state->enter = intel_idle_irq; + } +} + static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv) { int cstate; @@ -1894,28 +1920,8 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv) drv->states[drv->state_count] = cpuidle_state_table[cstate]; state = &drv->states[drv->state_count]; - if (state->flags & CPUIDLE_FLAG_INIT_XSTATE) { - /* - * Combining with XSTATE with IBRS or IRQ_ENABLE flags - * is not currently supported but this driver. - */ - WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IBRS); - WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IRQ_ENABLE); - state->enter = intel_idle_xstate; - } else if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) && - state->flags & CPUIDLE_FLAG_IBRS) { - /* - * IBRS mitigation requires that C-states are entered - * with interrupts disabled. 
- */ - WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IRQ_ENABLE); - state->enter = intel_idle_ibrs; - } else if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE) { - state->enter = intel_idle_irq; - } else if (force_irq_on) { - pr_info("forced intel_idle_irq for state %d\n", cstate); - state->enter = intel_idle_irq; - } + state_update_enter_method(state, cstate); + if ((disabled_states_mask & BIT(drv->state_count)) || ((icpu->use_acpi || force_use_acpi) && From patchwork Thu Jun 1 18:27:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arjan van de Ven X-Patchwork-Id: 13264379 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2E374C7EE32 for ; Thu, 1 Jun 2023 18:30:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231303AbjFASav (ORCPT ); Thu, 1 Jun 2023 14:30:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35882 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232350AbjFASao (ORCPT ); Thu, 1 Jun 2023 14:30:44 -0400 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3F151196 for ; Thu, 1 Jun 2023 11:30:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1685644242; x=1717180242; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=mWfrGVa+PTb4FgQoPCZsWhwVeY1hyETZCVFPyRVU/GM=; b=KqdnTM6Dio8UvkCMKgVhm9Ncp6zTcv0PYrLDhLvRlNYAOTZNE8vkR/eA MiOn57OD356+ZFyX7YYk39wTLlRLNLuae/UAf6j9l/S40MUy5x1b8YGL8 QxaxnDM8oy8E0K0jKONBcq5GBOrk3KOhDTcldEd1p58ZOikWsH82jsMzV VZqw5G+J2ADKuwIUP3hL5rMP70aTvx8PCxz8/ixqqor6wKwSY/aKMSd/s 
e4hakcWp/z4pGekOWo6sqDZkOcMWRzMqFZsurjFDj9Gr7GQroF5Vu63cW Hae9l6r7iSwI1vyGc/SkZ45eKGNbFe+o1KJvqc/G3onyQoUu0x4JPGtl9 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="383921626" X-IronPort-AV: E=Sophos;i="6.00,210,1681196400"; d="scan'208";a="383921626" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jun 2023 11:28:19 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="657900910" X-IronPort-AV: E=Sophos;i="6.00,210,1681196400"; d="scan'208";a="657900910" Received: from arjan-box.jf.intel.com ([10.54.74.119]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jun 2023 11:28:19 -0700 From: arjan@linux.intel.com To: linux-pm@vger.kernel.org Cc: artem.bityutskiy@linux.intel.com, rafael@kernel.org, Arjan van de Ven , Arjan van de Ven Subject: [PATCH 2/7] intel_idle: clean up the (new) state_update_enter_method function Date: Thu, 1 Jun 2023 18:27:56 +0000 Message-Id: <20230601182801.2622044-3-arjan@linux.intel.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230601182801.2622044-1-arjan@linux.intel.com> References: <20230601182801.2622044-1-arjan@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org From: Arjan van de Ven Now that the logic for state_update_enter_method() is in its own function, the long if .. else if .. else if .. else if chain can be simplified by just returning from the function at the various places. This does not change functionality, but it makes the logic much simpler to read or modify later. 
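For readers who want to convince themselves that the early-return form keeps the same precedence as the old else-if chain, here is a minimal userspace sketch. It is not the driver code: the FLAG_* bit values, the enter_method enum and pick_enter_method() are hypothetical stand-ins for the CPUIDLE_FLAG_* bits and the intel_idle* callbacks, and the WARN_ON_ONCE() checks are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; the real CPUIDLE_FLAG_* values differ. */
#define FLAG_INIT_XSTATE 0x1u
#define FLAG_IBRS        0x2u
#define FLAG_IRQ_ENABLE  0x4u

/* Stand-ins for the enter callbacks; ENTER_UNCHANGED means "leave ->enter as is". */
enum enter_method { ENTER_UNCHANGED, ENTER_XSTATE, ENTER_IBRS, ENTER_IRQ };

/*
 * Same decision order as state_update_enter_method() after this patch:
 * each case picks a method and returns, so no else-if chain is needed.
 */
static enum enter_method pick_enter_method(unsigned int flags,
					    bool kernel_ibrs, bool force_irq_on)
{
	if (flags & FLAG_INIT_XSTATE)
		return ENTER_XSTATE;

	if (kernel_ibrs && (flags & FLAG_IBRS))
		return ENTER_IBRS;

	if (flags & FLAG_IRQ_ENABLE)
		return ENTER_IRQ;

	if (force_irq_on)
		return ENTER_IRQ;

	return ENTER_UNCHANGED;
}

/* Small self-check of the precedence rules. */
static void pick_enter_method_selftest(void)
{
	/* XSTATE wins over everything else. */
	assert(pick_enter_method(FLAG_INIT_XSTATE | FLAG_IBRS, true, true) == ENTER_XSTATE);
	/* IBRS wins over IRQ_ENABLE, but only when the CPU feature is enabled. */
	assert(pick_enter_method(FLAG_IBRS | FLAG_IRQ_ENABLE, true, false) == ENTER_IBRS);
	assert(pick_enter_method(FLAG_IBRS | FLAG_IRQ_ENABLE, false, false) == ENTER_IRQ);
	/* force_irq_on only matters when no flag decided earlier. */
	assert(pick_enter_method(0, false, true) == ENTER_IRQ);
	assert(pick_enter_method(0, false, false) == ENTER_UNCHANGED);
}
```

Calling pick_enter_method_selftest() from a test main() exercises each branch once; the order of the if blocks is the only thing that encodes priority, which is what makes later insertions easy to review.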
Signed-off-by: Arjan van de Ven --- drivers/idle/intel_idle.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index c351b21c0875..256c2d42e350 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -1849,7 +1849,10 @@ static void state_update_enter_method(struct cpuidle_state *state, int cstate) WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IBRS); WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IRQ_ENABLE); state->enter = intel_idle_xstate; - } else if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) && + return; + } + + if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) && state->flags & CPUIDLE_FLAG_IBRS) { /* * IBRS mitigation requires that C-states are entered @@ -1857,9 +1860,15 @@ static void state_update_enter_method(struct cpuidle_state *state, int cstate) */ WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IRQ_ENABLE); state->enter = intel_idle_ibrs; - } else if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE) { + return; + } + + if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE) { state->enter = intel_idle_irq; - } else if (force_irq_on) { + return; + } + + if (force_irq_on) { pr_info("forced intel_idle_irq for state %d\n", cstate); state->enter = intel_idle_irq; } From patchwork Thu Jun 1 18:27:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arjan van de Ven X-Patchwork-Id: 13264378 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 06616C7EE2F for ; Thu, 1 Jun 2023 18:30:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232656AbjFASat (ORCPT ); Thu, 1 Jun 2023 14:30:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35426 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by 
vger.kernel.org with ESMTP id S232258AbjFASan (ORCPT ); Thu, 1 Jun 2023 14:30:43 -0400 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 42ED1197 for ; Thu, 1 Jun 2023 11:30:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1685644242; x=1717180242; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TV9+KotZgTt2wCEdDqxarJhA1EkexUK6f2+xSypUoxQ=; b=VcrMwIQCOhffYBvwC/Qhblc5czevZDF+rHuSVqPo0cIaOgLT4oec42SU t4zxGUFl9APz13lRvqDKIVq7lDp47N9TNrUAKB99sHG8VR4MBhv/wtGW7 fLoleOciMNjC0i3Ra4Ya406VfD2rQ8npP+ZcQLI5+SAXaPiKWveU7CmGL MoIo6ct+hL8X8e+EVpqiaSz0gCScLusZhL7ym/nukt+p8rBSHZHoox8kc fcnjgRbEYVAulILGGLwsK48z98RcFkaBlSk0PcwEofjQ/CJPzCyEN9YdI ZbGGga0GS7z0cfxgNkKuY/86Nrm9IBPu4rnod1lgH9j8MOlMVsaqKZSa1 w==; X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="383921632" X-IronPort-AV: E=Sophos;i="6.00,210,1681196400"; d="scan'208";a="383921632" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jun 2023 11:28:19 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="657900914" X-IronPort-AV: E=Sophos;i="6.00,210,1681196400"; d="scan'208";a="657900914" Received: from arjan-box.jf.intel.com ([10.54.74.119]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jun 2023 11:28:19 -0700 From: arjan@linux.intel.com To: linux-pm@vger.kernel.org Cc: artem.bityutskiy@linux.intel.com, rafael@kernel.org, Arjan van de Ven , Arjan van de Ven Subject: [PATCH 3/7] intel_idle: Add a sanity check in the new state_update_enter_method function Date: Thu, 1 Jun 2023 18:27:57 +0000 Message-Id: <20230601182801.2622044-4-arjan@linux.intel.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230601182801.2622044-1-arjan@linux.intel.com> References: <20230601182801.2622044-1-arjan@linux.intel.com> MIME-Version: 1.0 
Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org From: Arjan van de Ven The state_update_enter_method function updates a state's enter function pointer, but does so assuming that the current function is "intel_idle" or "intel_idle_irq". In the code currently that's basically the case, but soon this will change. Add a sanity check early in the function to make the assumption explicit, and return early if the precondition is not met. Signed-off-by: Arjan van de Ven --- drivers/idle/intel_idle.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index 256c2d42e350..8415965372c7 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -1841,6 +1841,14 @@ static bool __init intel_idle_verify_cstate(unsigned int mwait_hint) static void state_update_enter_method(struct cpuidle_state *state, int cstate) { + /* + * The updates below are only valid if state->enter is actually the + * 'intel_idle' or 'intel_idle_irq' functions; for all other cases + * we just bow out early. 
+ */ + if (state->enter != intel_idle && state->enter != intel_idle_irq ) + return; + if (state->flags & CPUIDLE_FLAG_INIT_XSTATE) { /* * Combining with XSTATE with IBRS or IRQ_ENABLE flags From patchwork Thu Jun 1 18:27:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arjan van de Ven X-Patchwork-Id: 13264382 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4D937C7EE2A for ; Thu, 1 Jun 2023 18:31:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232777AbjFASbJ (ORCPT ); Thu, 1 Jun 2023 14:31:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35564 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232250AbjFASar (ORCPT ); Thu, 1 Jun 2023 14:30:47 -0400 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5EFF71B6 for ; Thu, 1 Jun 2023 11:30:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1685644246; x=1717180246; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=U1Jor66igQKrXAvCg2YBr0y0//OW9WNCJBVHPvJAG7E=; b=O0dy3hsZ5RijWCS3wFRasbhD+mlvc0XEbA4tao8ZcBszc51juLRik/gv a8hq6mM4YGsTEmCQmtxGz4KUcODedoNVA7OeKusZnXs9gqI4+tYKQwGQ8 B5uF+5hIXEzUNoj3Ro68vx08AzGw06kE9QnRV1h4bGHGw/hGrXTgVxIup T5uoDvZss5TkSCX5+rV323HYMi/w9ZKDJK3O9q5cC29XGlQOApeR8rd3B RKV29sNNkFNQviHsz+n/OUAY0VEoDP1XRod570VfQ3kih7v615My9vaKr XAPs4f5NdFfSVjp6p+hP6GbXmqzIwshu8LybiBh/PUrq9R5vW3WB3Brer g==; X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="383921638" X-IronPort-AV: E=Sophos;i="6.00,210,1681196400"; d="scan'208";a="383921638" Received: from orsmga003.jf.intel.com 
([10.7.209.27]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jun 2023 11:28:20 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="657900919" X-IronPort-AV: E=Sophos;i="6.00,210,1681196400"; d="scan'208";a="657900919" Received: from arjan-box.jf.intel.com ([10.54.74.119]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jun 2023 11:28:20 -0700 From: arjan@linux.intel.com To: linux-pm@vger.kernel.org Cc: artem.bityutskiy@linux.intel.com, rafael@kernel.org, Arjan van de Ven , Arjan van de Ven Subject: [PATCH 4/7] intel_idle: Add helper functions to support 'hlt' as idle state Date: Thu, 1 Jun 2023 18:27:58 +0000 Message-Id: <20230601182801.2622044-5-arjan@linux.intel.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230601182801.2622044-1-arjan@linux.intel.com> References: <20230601182801.2622044-1-arjan@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org From: Arjan van de Ven Currently the intel_idle driver has a family of functions called intel_idle/intel_idle_irq/intel_idle_xstate/... that use the mwait instruction to enter a low power state. x86 CPUs can also use the legacy "hlt" instruction for this, and in some cases (VM guests for example) the mwait instruction might not be available. Because of this, add the basic helpers to allow 'hlt' to be used to enter a low power state (will be used in later patches), both in the regular and the _irq enabled variant. 
Signed-off-by: Arjan van de Ven --- drivers/idle/intel_idle.c | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+) diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index 8415965372c7..66d262fd267e 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -199,6 +199,43 @@ static __cpuidle int intel_idle_xstate(struct cpuidle_device *dev, return __intel_idle(dev, drv, index); } +static __always_inline int __intel_idle_hlt(struct cpuidle_device *dev, + struct cpuidle_driver *drv, int index) +{ + raw_safe_halt(); + raw_local_irq_disable(); + return index; +} + +/** + * intel_idle_hlt - Ask the processor to enter the given idle state using hlt. + * @dev: cpuidle device of the target CPU. + * @drv: cpuidle driver (assumed to point to intel_idle_driver). + * @index: Target idle state index. + * + * Use the HLT instruction to notify the processor that the CPU represented by + * @dev is idle and it can try to enter the idle state corresponding to @index. + * + * Must be called under local_irq_disable(). + */ +static __cpuidle int intel_idle_hlt(struct cpuidle_device *dev, + struct cpuidle_driver *drv, int index) +{ + return __intel_idle_hlt(dev, drv, index); +} + +static __cpuidle int intel_idle_hlt_irq(struct cpuidle_device *dev, + struct cpuidle_driver *drv, int index) +{ + int ret; + + raw_local_irq_enable(); + ret = __intel_idle_hlt(dev, drv, index); + raw_local_irq_disable(); + + return ret; +} + /** * intel_idle_s2idle - Ask the processor to enter the given idle state. * @dev: cpuidle device of the target CPU. 
From patchwork Thu Jun 1 18:27:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arjan van de Ven X-Patchwork-Id: 13264381 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6D41AC7EE23 for ; Thu, 1 Jun 2023 18:31:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232984AbjFASbI (ORCPT ); Thu, 1 Jun 2023 14:31:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35672 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231279AbjFASar (ORCPT ); Thu, 1 Jun 2023 14:30:47 -0400 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6035F1B7 for ; Thu, 1 Jun 2023 11:30:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1685644246; x=1717180246; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GKdhPPBYvkATBkf66MVCgik12brZChNrB577n2CaSSo=; b=ByTVkmxGUbgUkirfal2rpPCa+kZ31LYUU/30ps40lYoYnNErsuoflbcY j/fHb/q09Tq3lhbwVS91V0j8Duk5PqOIo59KeYjWXOW8Qi3E3hQTv0JLb jmGjYduz32WyThV64VnpZIvBDtjcsQ+PjVE4e28Zc56j99lQ54S/8txaG fNpfPSBXYMWriRq9yy7epqSYlGvRHdXZkD0QC8EerfFRNo6P9xGzr/cLK dKDNNXgmxuk+USPh26cfQ5sq9tdUlbPsGpm6sTgIaWWqv3Lx/5moO8efd MA7hSWi0huaaRDH3rlR354PUdkC5Gq+6gb6KEFmJvlm7ZBQ9bQdhFzh6i Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="383921641" X-IronPort-AV: E=Sophos;i="6.00,210,1681196400"; d="scan'208";a="383921641" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jun 2023 11:28:21 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="657900922" X-IronPort-AV: 
E=Sophos;i="6.00,210,1681196400"; d="scan'208";a="657900922" Received: from arjan-box.jf.intel.com ([10.54.74.119]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jun 2023 11:28:21 -0700 From: arjan@linux.intel.com To: linux-pm@vger.kernel.org Cc: artem.bityutskiy@linux.intel.com, rafael@kernel.org, Arjan van de Ven , Arjan van de Ven Subject: [PATCH 5/7] intel_idle: Add a way to skip the mwait check on all states Date: Thu, 1 Jun 2023 18:27:59 +0000 Message-Id: <20230601182801.2622044-6-arjan@linux.intel.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230601182801.2622044-1-arjan@linux.intel.com> References: <20230601182801.2622044-1-arjan@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org From: Arjan van de Ven Currently, intel_idle uses the cpuid instruction to verify that the mwait value for a state is actually supported by the CPU. Going forward, when running in a VM guest, that check will not work and we're going to need a way to turn it off. Add a global bool for this, and use it in the check function to short-circuit the cpuid check. 
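As an illustration of what the short circuit does, here is a simplified userspace model of the check. It is a sketch under assumptions: verify_cstate() takes the CPUID sub-state word as a parameter instead of reading the driver's global, the MWAIT_* constants are what I believe the x86 mwait header defines (4-bit fields), the TSC handling of the real function is omitted, and the shift guard is only there to keep the sketch well defined in plain C.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to mirror the kernel's mwait field layout: 4 bits per C-state. */
#define MWAIT_SUBSTATE_MASK 0xfu
#define MWAIT_SUBSTATE_SIZE 4
#define MWAIT_CSTATE_MASK   0xfu

/* Models the new global; set when the hypervisor hides mwait from the guest. */
static bool skip_mwait_check;

/*
 * Simplified model of intel_idle_verify_cstate(): mwait_substates stands in
 * for the sub-state counts reported by CPUID, mwait_hint is the hint of the
 * state being checked.
 */
static bool verify_cstate(unsigned int mwait_substates, unsigned int mwait_hint)
{
	unsigned int mwait_cstate =
		((mwait_hint >> MWAIT_SUBSTATE_SIZE) & MWAIT_CSTATE_MASK) + 1;
	unsigned int shift = mwait_cstate * 4;
	/* Sketch-only guard: shifting a 32-bit value by >= 32 is undefined in C. */
	unsigned int num_substates =
		shift < 32 ? (mwait_substates >> shift) & MWAIT_SUBSTATE_MASK : 0;

	/* The short circuit added by this patch: trust the state table. */
	if (skip_mwait_check)
		return true;

	/* Ignore the C-state if CPUID reports no sub-states for it. */
	return num_substates != 0;
}

/* Small self-check: only C1 (bits 7:4) reports sub-states here. */
static void verify_cstate_selftest(void)
{
	skip_mwait_check = false;
	assert(verify_cstate(0x20u, 0x00u));	/* C1 hint, 2 sub-states */
	assert(!verify_cstate(0x20u, 0x10u));	/* C2 hint, none reported */

	skip_mwait_check = true;
	assert(verify_cstate(0x20u, 0x10u));	/* check bypassed */
	skip_mwait_check = false;
}
```

With the flag clear, a state is accepted only when CPUID advertises at least one sub-state for it; with the flag set, every state in the table is accepted, which is what a guest without mwait enumeration needs.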
Signed-off-by: Arjan van de Ven --- drivers/idle/intel_idle.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index 66d262fd267e..55c3e6ece3dd 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -69,6 +69,7 @@ static int max_cstate = CPUIDLE_STATE_MAX - 1; static unsigned int disabled_states_mask __read_mostly; static unsigned int preferred_states_mask __read_mostly; static bool force_irq_on __read_mostly; +static bool skip_mwait_check __read_mostly; static struct cpuidle_device __percpu *intel_idle_cpuidle_devices; @@ -1866,6 +1867,9 @@ static bool __init intel_idle_verify_cstate(unsigned int mwait_hint) unsigned int num_substates = (mwait_substates >> mwait_cstate * 4) & MWAIT_SUBSTATE_MASK; + if (skip_mwait_check) + return true; + /* Ignore the C-state if there are NO sub-states in CPUID for it. */ if (num_substates == 0) return false; From patchwork Thu Jun 1 18:28:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arjan van de Ven X-Patchwork-Id: 13264383 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6616CC7EE2E for ; Thu, 1 Jun 2023 18:31:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232758AbjFASbK (ORCPT ); Thu, 1 Jun 2023 14:31:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35428 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232776AbjFASas (ORCPT ); Thu, 1 Jun 2023 14:30:48 -0400 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 606C11B8 for ; Thu, 1 Jun 2023 11:30:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; 
i=@intel.com; q=dns/txt; s=Intel; t=1685644246; x=1717180246; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=y2t8/LIeUlBkDeRC1K4Q3OOvQDYIW/K1pQfYY1RRulw=; b=k0L0IRQ+FDd1MRXdbFlO4aXPdBggAk9u4ROrPwNzPc+i9SzgblTpY6Sa sdL1x+EHDXbELdje6k+126TA6MPmU9b/BrGZj5/Cs1o01x3IqB6WiYEMg /xAxT3sZJxMmw28cUh2gLnO9xdJ1uxBOxXUwbI5WjETeaTk2X7utg7gpa wZl6vOUuZSJkvLfNBff57YD7DxaGoz36WNUZNkD/2jaIPT7Rqyia3KmPw eWRYLBfqsisbIUV1iuAXEo1Ir85ic7UNf/1sncK+2znErqAXz93TYMZrh 1AMe92GLHPXGKyT+E5dQW981KbsqCH3krmaOk+sz02cLGWnMnz0ceVzNO w==; X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="383921645" X-IronPort-AV: E=Sophos;i="6.00,210,1681196400"; d="scan'208";a="383921645" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jun 2023 11:28:22 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10728"; a="657900926" X-IronPort-AV: E=Sophos;i="6.00,210,1681196400"; d="scan'208";a="657900926" Received: from arjan-box.jf.intel.com ([10.54.74.119]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jun 2023 11:28:21 -0700 From: arjan@linux.intel.com To: linux-pm@vger.kernel.org Cc: artem.bityutskiy@linux.intel.com, rafael@kernel.org, Arjan van de Ven , Arjan van de Ven Subject: [PATCH 6/7] intel_idle: Add support for using intel_idle in a VM guest using just hlt Date: Thu, 1 Jun 2023 18:28:00 +0000 Message-Id: <20230601182801.2622044-7-arjan@linux.intel.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230601182801.2622044-1-arjan@linux.intel.com> References: <20230601182801.2622044-1-arjan@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org From: Arjan van de Ven In a typical VM guest, the mwait instruction is not available, leaving only the 'hlt' instruction (which causes a VMEXIT to the host). 
So currently, intel_idle will detect the lack of mwait and fail to initialize (after which another idle method steps in and always uses hlt). By providing the capability to do this with the intel_idle driver, we can do better than this fallback. While this current change only gets us parity with the existing behavior, later patches in this series will add new capabilities. In order to do this, a simplified version of the initialization function for VM guests is created, and this will be called if the CPU is recognized, but mwait is not supported, and we're in a VM guest. One thing to note is that the latency (and break even) of this C1 state is higher than the typical bare metal C1 state. This is because hlt causes a vmexit, and the cost of vmexit + hypervisor overhead + vmenter is typically on the order of up to 5 microseconds, even if the hypervisor does not actually go into a hardware power saving state. Signed-off-by: Arjan van de Ven --- drivers/idle/intel_idle.c | 54 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 54 insertions(+) diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index 55c3e6ece3dd..c4929d8a35a4 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -1280,6 +1280,18 @@ static struct cpuidle_state snr_cstates[] __initdata = { .enter = NULL } }; +static struct cpuidle_state vmguest_cstates[] __initdata = { + { + .name = "C1", + .desc = "HLT", + .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_IRQ_ENABLE, + .exit_latency = 5, + .target_residency = 10, + .enter = &intel_idle_hlt_irq, }, + { + .enter = NULL } +}; + static const struct idle_cpu idle_cpu_nehalem __initconst = { .state_table = nehalem_cstates, .auto_demotion_disable_flags = NHM_C1_AUTO_DEMOTE | NHM_C3_AUTO_DEMOTE, @@ -2105,6 +2117,46 @@ static void __init intel_idle_cpuidle_devices_uninit(void) cpuidle_unregister_device(per_cpu_ptr(intel_idle_cpuidle_devices, i)); } +static int __init intel_idle_vminit(const struct x86_cpu_id *id) +{ + int 
retval; + + cpuidle_state_table = vmguest_cstates; + skip_mwait_check = true; /* hypervisor hides mwait from us normally */ + + icpu = (const struct idle_cpu *)id->driver_data; + + pr_debug("v" INTEL_IDLE_VERSION " model 0x%X\n", + boot_cpu_data.x86_model); + + intel_idle_cpuidle_devices = alloc_percpu(struct cpuidle_device); + if (!intel_idle_cpuidle_devices) + return -ENOMEM; + + intel_idle_cpuidle_driver_init(&intel_idle_driver); + + retval = cpuidle_register_driver(&intel_idle_driver); + if (retval) { + struct cpuidle_driver *drv = cpuidle_get_driver(); + printk(KERN_DEBUG pr_fmt("intel_idle yielding to %s\n"), + drv ? drv->name : "none"); + goto init_driver_fail; + } + + retval = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "idle/intel:online", + intel_idle_cpu_online, NULL); + if (retval < 0) + goto hp_setup_fail; + + return 0; +hp_setup_fail: + intel_idle_cpuidle_devices_uninit(); + cpuidle_unregister_driver(&intel_idle_driver); +init_driver_fail: + free_percpu(intel_idle_cpuidle_devices); + return retval; +} + static int __init intel_idle_init(void) { const struct x86_cpu_id *id; @@ -2123,6 +2175,8 @@ static int __init intel_idle_init(void) id = x86_match_cpu(intel_idle_ids); if (id) { if (!boot_cpu_has(X86_FEATURE_MWAIT)) { + if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) + return intel_idle_vminit(id); pr_debug("Please enable MWAIT in BIOS SETUP\n"); return -ENODEV; } From patchwork Thu Jun 1 18:28:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arjan van de Ven X-Patchwork-Id: 13264384 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9811AC77B7A for ; Thu, 1 Jun 2023 18:31:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232250AbjFASbJ (ORCPT ); Thu, 1 Jun 2023 14:31:09 
From: arjan@linux.intel.com
To: linux-pm@vger.kernel.org
Cc: artem.bityutskiy@linux.intel.com, rafael@kernel.org, Arjan van de Ven, Arjan van de Ven
Subject: [PATCH 7/7] intel_idle: Add a "Long HLT" C1 state for the VM guest mode
Date: Thu, 1 Jun 2023 18:28:01 +0000
Message-Id: <20230601182801.2622044-8-arjan@linux.intel.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230601182801.2622044-1-arjan@linux.intel.com>
References: <20230601182801.2622044-1-arjan@linux.intel.com>
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: linux-pm@vger.kernel.org

From: Arjan van de Ven

intel_idle will, for the bare metal case, usually have one or more deep
power states that have the CPUIDLE_FLAG_TLB_FLUSHED flag set. When a state
with this flag is selected by the cpuidle framework, it will also flush the
TLBs as part of entering this state. The benefit of doing this is that the
kernel does not need to wake the CPU out of this deep power state just to
flush the TLBs, an operation whose latency can be very high because of the
exit latency of deep power states.

In a VM guest currently, this benefit of avoiding the wakeup does not exist,
while the problem (long exit latency) is even more severe. Linux will need
to wake up a vCPU (causing the host to either come out of a deep C state,
or the VMM to have to deschedule something else to schedule the vCPU),
which can take a very long time and adds a lot of latency to TLB flush
operations (including munmap and others).

To solve this, add a "Long HLT" C state to the state table for the VM guest
case that has the CPUIDLE_FLAG_TLB_FLUSHED flag set. The result is that for
long idle periods (where the VMM is likely to do things that cause large
latency) the cpuidle framework will flush the TLBs (and avoid the wakeups),
while for short/quick idle durations, the existing behavior is retained.

Now, there is still only "hlt" available in the guest, but for long idle,
the host can go to a deeper state (say C6).

There is a reasonable debate to be had about what to set for the
exit_latency and break even point of this "Long HLT" state. The good news is
that intel_idle has these values available for the underlying CPU (even when
mwait is not exposed). The solution thus is to just use the latency and
break even of the deepest state from the bare metal CPU.
This is under the assumption that this is a reasonable estimate of the
latency the VMM would cause.

Signed-off-by: Arjan van de Ven
---
 drivers/idle/intel_idle.c | 54 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index c4929d8a35a4..e056e1ec64a9 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -1288,6 +1288,13 @@ static struct cpuidle_state vmguest_cstates[] __initdata = {
 		.exit_latency = 5,
 		.target_residency = 10,
 		.enter = &intel_idle_hlt_irq, },
+	{
+		.name = "C1L",
+		.desc = "Long HLT",
+		.flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_TLB_FLUSHED,
+		.exit_latency = 5,
+		.target_residency = 200,
+		.enter = &intel_idle_hlt, },
 	{
 		.enter = NULL }
 };
@@ -2117,6 +2124,44 @@ static void __init intel_idle_cpuidle_devices_uninit(void)
 		cpuidle_unregister_device(per_cpu_ptr(intel_idle_cpuidle_devices, i));
 }
 
+/*
+ * Match up the latency and break even point of the bare metal (CPU based)
+ * states with the deepest VM available state.
+ *
+ * We only want to do this for the deepest state, the one that has
+ * the TLB_FLUSHED flag set.
+ *
+ * All our short idle states are dominated by vmexit/vmenter latencies,
+ * not the underlying hardware latencies, so we keep our values for these.
+ */
+static void matchup_vm_state_with_baremetal(void)
+{
+	int cstate;
+
+	for (cstate = 0; cstate < CPUIDLE_STATE_MAX; ++cstate) {
+		int matching_cstate;
+
+		if (intel_idle_max_cstate_reached(cstate))
+			break;
+
+		if (!cpuidle_state_table[cstate].enter &&
+		    !cpuidle_state_table[cstate].enter_s2idle)
+			break;
+
+		if (!(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_TLB_FLUSHED))
+			continue;
+
+		for (matching_cstate = 0; matching_cstate < CPUIDLE_STATE_MAX; ++matching_cstate) {
+			if (icpu->state_table[matching_cstate].exit_latency > cpuidle_state_table[cstate].exit_latency) {
+				cpuidle_state_table[cstate].exit_latency = icpu->state_table[matching_cstate].exit_latency;
+				cpuidle_state_table[cstate].target_residency = icpu->state_table[matching_cstate].target_residency;
+			}
+		}
+
+	}
+}
+
+
 static int __init intel_idle_vminit(const struct x86_cpu_id *id)
 {
 	int retval;
@@ -2133,6 +2178,15 @@ static int __init intel_idle_vminit(const struct x86_cpu_id *id)
 	if (!intel_idle_cpuidle_devices)
 		return -ENOMEM;
 
+	/*
+	 * We don't know exactly what the host will do when we go idle, but as a worst-case estimate
+	 * we can assume that the exit latency of the deepest host state will be hit for our
+	 * deep (long duration) guest idle state.
+	 * The same logic applies to the break even point for the long duration guest idle state.
+	 * So let's copy these two properties from the table we found for the host CPU type.
+	 */
+	matchup_vm_state_with_baremetal();
+
 	intel_idle_cpuidle_driver_init(&intel_idle_driver);
 
 	retval = cpuidle_register_driver(&intel_idle_driver);
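
The matching step in matchup_vm_state_with_baremetal() can be tried outside the kernel. What follows is only a minimal userspace sketch, not the driver code: `struct demo_state`, `demo_matchup()`, `DEMO_FLAG_TLB_FLUSHED`, `DEMO_STATE_MAX` and the two example tables are hypothetical stand-ins. The guest numbers (5/10 for C1, 5/200 for C1L) come from the vmguest_cstates entries in this series, while the host numbers are made up for illustration. The sketch also assumes both tables end with an entry whose `enter` is NULL and stops scanning the host table there, a guard that the inner loop in the patch as posted does not have.

```c
#include <stddef.h>

#define DEMO_STATE_MAX 10
#define DEMO_FLAG_TLB_FLUSHED 0x1u

/* Hypothetical, simplified stand-in for struct cpuidle_state. */
struct demo_state {
	const char *name;
	unsigned int flags;
	unsigned int exit_latency;     /* microseconds */
	unsigned int target_residency; /* microseconds */
	int (*enter)(void);            /* NULL marks the end of a table */
};

static int demo_enter(void) { return 0; }

/* Guest table: values mirror the C1 and C1L entries of vmguest_cstates. */
static struct demo_state demo_guest[] = {
	{ "C1",  0,                     5, 10,  demo_enter },
	{ "C1L", DEMO_FLAG_TLB_FLUSHED, 5, 200, demo_enter },
	{ NULL,  0,                     0, 0,   NULL }
};

/* Host table: illustrative numbers only, not taken from a real CPU table. */
static const struct demo_state demo_host[] = {
	{ "C1", 0,                     2,   2,   demo_enter },
	{ "C6", DEMO_FLAG_TLB_FLUSHED, 170, 600, demo_enter },
	{ NULL, 0,                     0,   0,   NULL }
};

/*
 * For every guest state flagged TLB_FLUSHED, take over the exit latency and
 * target residency of the host state with the largest exit latency, provided
 * it exceeds the guest's current value. Unflagged guest states are untouched.
 */
static void demo_matchup(struct demo_state *guest, const struct demo_state *host)
{
	int g, h;

	for (g = 0; g < DEMO_STATE_MAX && guest[g].enter; g++) {
		if (!(guest[g].flags & DEMO_FLAG_TLB_FLUSHED))
			continue;

		/* Added guard: stop at the sentinel instead of scanning to MAX. */
		for (h = 0; h < DEMO_STATE_MAX && host[h].enter; h++) {
			if (host[h].exit_latency > guest[g].exit_latency) {
				guest[g].exit_latency = host[h].exit_latency;
				guest[g].target_residency = host[h].target_residency;
			}
		}
	}
}
```

With these tables, calling demo_matchup(demo_guest, demo_host) leaves C1 at 5/10 and raises C1L to the host C6 values, 170/600. A governor would then pick C1L only when the predicted idle period is long enough to absorb the deepest host exit latency.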