diff mbox series

[1/1] KVM: x86/vPMU: Check PMU is enabled for vCPU before searching for PMC

Message ID 20231109180646.2963718-2-khorenko@virtuozzo.com (mailing list archive)
State New, archived
Headers show
Series KVM: x86/vPMU: Speed up vmexit for AMD Zen 4 CPUs | expand

Commit Message

Konstantin Khorenko Nov. 9, 2023, 6:06 p.m. UTC
The following 2 mainstream patches have introduced extra
events accounting:

  018d70ffcfec ("KVM: x86: Update vPMCs when retiring branch instructions")
  9cd803d496e7 ("KVM: x86: Update vPMCs when retiring instructions")

kvm_pmu_trigger_event() iterates over all PMCs looking for enabled and
this appeared to be fast on Intel CPUs and quite expensive for AMD CPUs.

kvm_pmu_trigger_event() can be optimized not to iterate over all PMCs in
the following cases:

  * if PMU is completely disabled for a VM, which is the default
    configuration
  * if PMU v2 is enabled, but no PMCs are configured

For Intel CPUs:
  * By default PMU is disabled for KVM VMs (<pmu state='off'/> or absent
    in the VM config xml which results in "-cpu pmu=off" qemu option).
    In this case pmu->version is reported as 0 for the appropriate vCPU.

  * According to Intel® 64 and IA-32 Architectures Software Developer’s
    Manual PMU version 2 and higher provide IA32_PERF_GLOBAL_CTRL MSR
    which in particular contains bits which can be used for efficient
    detection which fixed-function performance and general-purpose
    performance monitoring counters are enabled at the moment.

  * Searching for enabled PMCs is fast and the optimization does not
    bring noticeable performance increase.

For AMD CPUs:
  * For CPUs older than Zen 4 pmu->version is always reported as "1" for
    the appropriate vCPU, no matter if PMU is disabled for KVM VMs
    (<pmu state='off'/>) or enabled.
    So for "old" CPUs currently it's impossible to detect when PMU is
    disabled for a VM and skip the iteration by PMCs efficiently.

  * Since Zen 4 AMD CPUs support PMU v2 and in this case pmu->version
    should be reported as "2" and IA32_PERF_GLOBAL_CTRL MSR is available
    and can be used for fast and efficient check for enabled PMCs.
    https://www.phoronix.com/news/AMD-PerfMonV2-Linux-Patches
    https://www.phoronix.com/news/AMD-PerfMonV2-Guests-KVM

  * Optimized preliminary check for enabled PMCs on AMD Zen 4 CPUs
    should give quite noticeable performance improvement.

AMD performance results:
CPU: AMD Zen 3 (three!): AMD EPYC 7443P 24-Core Processor

 * The test binary is run inside an AlmaLinux 9 VM with their stock kernel
   5.14.0-284.11.1.el9_2.x86_64.
 * Test binary checks the CPUID instractions rate (instructions per sec).
 * Default VM config (PMU is off, pmu->version is reported as 1).
 * The Host runs the kernel under test.

 # for i in 1 2 3 4 5 ; do ./at_cpu_cpuid.pub ; done | \
   awk -e '{print $4;}' | \
   cut -f1 --delimiter='.' | \
   ./avg.sh

Measurements:
1. Host runs stock latest mainstream kernel commit 305230142ae0.
2. Host runs same mainstream kernel + current patch.
3. Host runs same mainstream kernel + current patch + force
   guest_pmu_is_enabled() to always return "false" using following change:

   -       if (pmu->version >= 2 && !(pmu->global_ctrl & ~pmu->global_ctrl_mask))
   +       if (pmu->version == 1 && !(pmu->global_ctrl & ~pmu->global_ctrl_mask))

   --------------------------------------
   | Kernels	| CPUID rate		|
   --------------------------------------
   | 1.		| 1360250		|
   | 2.		| 1365536 (+ 0.4%)	|
   | 3.		| 1541850 (+13.4%)	|
   --------------------------------------

Measurement (2) gives some fluctuation, the performance is not increased
because the test was done on a Zen 3 CPU, so we are unable to use fast
check for active PMCs.
Measurement (3) shows expected performance boost on a Zen 4 CPU under
the same test.

Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
---
 arch/x86/kvm/pmu.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

Comments

Sean Christopherson Nov. 9, 2023, 8:09 p.m. UTC | #1
On Thu, Nov 09, 2023, Konstantin Khorenko wrote:
> The following 2 mainstream patches have introduced extra
> events accounting:
> 
>   018d70ffcfec ("KVM: x86: Update vPMCs when retiring branch instructions")
>   9cd803d496e7 ("KVM: x86: Update vPMCs when retiring instructions")
> 
> kvm_pmu_trigger_event() iterates over all PMCs looking for enabled and
> this appeared to be fast on Intel CPUs and quite expensive for AMD CPUs.
> 
> kvm_pmu_trigger_event() can be optimized not to iterate over all PMCs in
> the following cases:

Heh, I'm just putting the finishing touches on a series to optimize this mess.

> ---
>  arch/x86/kvm/pmu.c | 26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index 9ae07db6f0f6..290d407f339b 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -731,12 +731,38 @@ static inline bool cpl_is_matched(struct kvm_pmc *pmc)
>  	return (static_call(kvm_x86_get_cpl)(pmc->vcpu) == 0) ? select_os : select_user;
>  }
>  
> +static inline bool guest_pmu_is_enabled(struct kvm_pmu *pmu)
> +{
> +	/*
> +	 * Currently VMs do not have PMU settings in configs which defaults
> +	 * to "pmu=off".
> +	 *
> +	 * For Intel currently this means pmu->version will be 0.
> +	 * For AMD currently PMU cannot be disabled:
> +	 * pmu->version should be 2 for Zen 4 cpus and 1 otherwise.
> +	 */
> +	if (pmu->version == 0)
> +		return false;
> +
> +	/*
> +	 * Starting with PMU v2 IA32_PERF_GLOBAL_CTRL MSR is available and
> +	 * it can be used to check if none PMCs are enabled.
> +	 */
> +	if (pmu->version >= 2 && !(pmu->global_ctrl & ~pmu->global_ctrl_mask))
> +		return false;
> +
> +	return true;
> +}
> +
>  void kvm_pmu_trigger_event(struct kvm_vcpu *vcpu, u64 perf_hw_id)
>  {
>  	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
>  	struct kvm_pmc *pmc;
>  	int i;
>  
> +	if (!guest_pmu_is_enabled(pmu))
> +		return;

This _should_ be unnecessary, because all_valid_pmc_idx _should_ be zero when the
PMU is disabled.  Architecturally, AMD doesn't provide a way to disable the PMU,
but KVM open that can of worms a long time ago, e.g. see the enable_pmu check in
get_gp_pmc_amd().  Kernels have long since learned to not panic if the PMU isn't
available, precisely because hypervisors have a long history of not virtualizing
a PMU.

The issue is that KVM stupidly doesn't zero out the metadata, i.e. configures
all_valid_pmc_idx as if the vCPU has a PMU even though KVM will deny access to
all assets in the end.

The below should resolve the issue.  Note, it won't apply on kvm-x86/next due to
multiple dependencies on other PMU changes I have in flight.  Give me a few hours
to test; with luck I'll get this posted by end-of-day.

--
From: Sean Christopherson <seanjc@google.com>
Date: Thu, 9 Nov 2023 11:03:48 -0800
Subject: [PATCH 1/4] KVM: x86/pmu: Zero out PMU metadata on AMD if PMU is
 disabled

Move the purging of common PMU metadata from intel_pmu_refresh() to
kvm_pmu_refresh(), and invoke the vendor refresh() hook if and only if
the VM is supposed to have a vPMU.

KVM already denies access to the PMU based on kvm->arch.enable_pmu, as
get_gp_pmc_amd() returns NULL for all PMCs in that case, i.e. KVM already
violates AMD's architecture by not virtualizing a PMU (kernels have long
since learned to not panic when the PMU is unavailable).  But configuring
the PMU as if it were enabled causes unwanted side effects, e.g. calls to
kvm_pmu_trigger_event() waste an absurd number of cycles due to the
all_valid_pmc_idx bitmap being non-zero.

Fixes: b1d66dad65dc ("KVM: x86/svm: Add module param to control PMU virtualization")
Reported-by: Konstantin Khorenko <khorenko@virtuozzo.com>
Closes: https://lore.kernel.org/all/20231109180646.2963718-2-khorenko@virtuozzo.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/pmu.c           | 20 ++++++++++++++++++--
 arch/x86/kvm/vmx/pmu_intel.c | 16 ++--------------
 2 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 1b74a29ed250..b52bab7dc422 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -739,6 +739,8 @@ static void kvm_pmu_reset(struct kvm_vcpu *vcpu)
  */
 void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
 {
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+
 	if (KVM_BUG_ON(kvm_vcpu_has_run(vcpu), vcpu->kvm))
 		return;
 
@@ -748,8 +750,22 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
 	 */
 	kvm_pmu_reset(vcpu);
 
-	bitmap_zero(vcpu_to_pmu(vcpu)->all_valid_pmc_idx, X86_PMC_IDX_MAX);
-	static_call(kvm_x86_pmu_refresh)(vcpu);
+	pmu->version = 0;
+	pmu->nr_arch_gp_counters = 0;
+	pmu->nr_arch_fixed_counters = 0;
+	pmu->counter_bitmask[KVM_PMC_GP] = 0;
+	pmu->counter_bitmask[KVM_PMC_FIXED] = 0;
+	pmu->reserved_bits = 0xffffffff00200000ull;
+	pmu->raw_event_mask = X86_RAW_EVENT_MASK;
+	pmu->global_ctrl_mask = ~0ull;
+	pmu->global_status_mask = ~0ull;
+	pmu->fixed_ctr_ctrl_mask = ~0ull;
+	pmu->pebs_enable_mask = ~0ull;
+	pmu->pebs_data_cfg_mask = ~0ull;
+	bitmap_zero(pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX);
+
+	if (vcpu->kvm->arch.enable_pmu)
+		static_call(kvm_x86_pmu_refresh)(vcpu);
 }
 
 void kvm_pmu_init(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index c3a841d8df27..0d2fd9fdcf4b 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -463,19 +463,6 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	u64 counter_mask;
 	int i;
 
-	pmu->nr_arch_gp_counters = 0;
-	pmu->nr_arch_fixed_counters = 0;
-	pmu->counter_bitmask[KVM_PMC_GP] = 0;
-	pmu->counter_bitmask[KVM_PMC_FIXED] = 0;
-	pmu->version = 0;
-	pmu->reserved_bits = 0xffffffff00200000ull;
-	pmu->raw_event_mask = X86_RAW_EVENT_MASK;
-	pmu->global_ctrl_mask = ~0ull;
-	pmu->global_status_mask = ~0ull;
-	pmu->fixed_ctr_ctrl_mask = ~0ull;
-	pmu->pebs_enable_mask = ~0ull;
-	pmu->pebs_data_cfg_mask = ~0ull;
-
 	memset(&lbr_desc->records, 0, sizeof(lbr_desc->records));
 
 	/*
@@ -487,8 +474,9 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 		return;
 
 	entry = kvm_find_cpuid_entry(vcpu, 0xa);
-	if (!entry || !vcpu->kvm->arch.enable_pmu)
+	if (!entry)
 		return;
+
 	eax.full = entry->eax;
 	edx.full = entry->edx;
 

base-commit: 45c6565ff59d0b254ca3755cb4e14776a2c0b324
--
Jim Mattson Nov. 9, 2023, 8:13 p.m. UTC | #2
On Thu, Nov 9, 2023 at 10:24 AM Konstantin Khorenko
<khorenko@virtuozzo.com> wrote:
>
> The following 2 mainstream patches have introduced extra
> events accounting:
>
>   018d70ffcfec ("KVM: x86: Update vPMCs when retiring branch instructions")
>   9cd803d496e7 ("KVM: x86: Update vPMCs when retiring instructions")
>
> kvm_pmu_trigger_event() iterates over all PMCs looking for enabled and
> this appeared to be fast on Intel CPUs and quite expensive for AMD CPUs.
>
> kvm_pmu_trigger_event() can be optimized not to iterate over all PMCs in
> the following cases:
>
>   * if PMU is completely disabled for a VM, which is the default
>     configuration
>   * if PMU v2 is enabled, but no PMCs are configured
>
> For Intel CPUs:
>   * By default PMU is disabled for KVM VMs (<pmu state='off'/> or absent
>     in the VM config xml which results in "-cpu pmu=off" qemu option).
>     In this case pmu->version is reported as 0 for the appropriate vCPU.
>
>   * According to Intel® 64 and IA-32 Architectures Software Developer’s
>     Manual PMU version 2 and higher provide IA32_PERF_GLOBAL_CTRL MSR
>     which in particular contains bits which can be used for efficient
>     detection which fixed-function performance and general-purpose
>     performance monitoring counters are enabled at the moment.
>
>   * Searching for enabled PMCs is fast and the optimization does not
>     bring noticeable performance increase.
>
> For AMD CPUs:
>   * For CPUs older than Zen 4 pmu->version is always reported as "1" for
>     the appropriate vCPU, no matter if PMU is disabled for KVM VMs
>     (<pmu state='off'/>) or enabled.
>     So for "old" CPUs currently it's impossible to detect when PMU is
>     disabled for a VM and skip the iteration by PMCs efficiently.
>
>   * Since Zen 4 AMD CPUs support PMU v2 and in this case pmu->version
>     should be reported as "2" and IA32_PERF_GLOBAL_CTRL MSR is available
>     and can be used for fast and efficient check for enabled PMCs.
>     https://www.phoronix.com/news/AMD-PerfMonV2-Linux-Patches
>     https://www.phoronix.com/news/AMD-PerfMonV2-Guests-KVM
>
>   * Optimized preliminary check for enabled PMCs on AMD Zen 4 CPUs
>     should give quite noticeable performance improvement.
>
> AMD performance results:
> CPU: AMD Zen 3 (three!): AMD EPYC 7443P 24-Core Processor
>
>  * The test binary is run inside an AlmaLinux 9 VM with their stock kernel
>    5.14.0-284.11.1.el9_2.x86_64.
>  * Test binary checks the CPUID instractions rate (instructions per sec).
>  * Default VM config (PMU is off, pmu->version is reported as 1).
>  * The Host runs the kernel under test.
>
>  # for i in 1 2 3 4 5 ; do ./at_cpu_cpuid.pub ; done | \
>    awk -e '{print $4;}' | \
>    cut -f1 --delimiter='.' | \
>    ./avg.sh
>
> Measurements:
> 1. Host runs stock latest mainstream kernel commit 305230142ae0.
> 2. Host runs same mainstream kernel + current patch.
> 3. Host runs same mainstream kernel + current patch + force
>    guest_pmu_is_enabled() to always return "false" using following change:
>
>    -       if (pmu->version >= 2 && !(pmu->global_ctrl & ~pmu->global_ctrl_mask))
>    +       if (pmu->version == 1 && !(pmu->global_ctrl & ~pmu->global_ctrl_mask))
>
>    --------------------------------------
>    | Kernels    | CPUID rate            |
>    --------------------------------------
>    | 1.         | 1360250               |
>    | 2.         | 1365536 (+ 0.4%)      |
>    | 3.         | 1541850 (+13.4%)      |
>    --------------------------------------
>
> Measurement (2) gives some fluctuation, the performance is not increased
> because the test was done on a Zen 3 CPU, so we are unable to use fast
> check for active PMCs.
> Measurement (3) shows expected performance boost on a Zen 4 CPU under
> the same test.
>
> Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
> ---
>  arch/x86/kvm/pmu.c | 26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
>
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index 9ae07db6f0f6..290d407f339b 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -731,12 +731,38 @@ static inline bool cpl_is_matched(struct kvm_pmc *pmc)
>         return (static_call(kvm_x86_get_cpl)(pmc->vcpu) == 0) ? select_os : select_user;
>  }
>
> +static inline bool guest_pmu_is_enabled(struct kvm_pmu *pmu)
> +{
> +       /*
> +        * Currently VMs do not have PMU settings in configs which defaults
> +        * to "pmu=off".
> +        *
> +        * For Intel currently this means pmu->version will be 0.
> +        * For AMD currently PMU cannot be disabled:

Isn't that what KVM_PMU_CAP_DISABLE is for?
Sean Christopherson Nov. 9, 2023, 11:05 p.m. UTC | #3
On Thu, Nov 09, 2023, Jim Mattson wrote:
> On Thu, Nov 9, 2023 at 10:24 AM Konstantin Khorenko <khorenko@virtuozzo.com> wrote:
> > ---
> >  arch/x86/kvm/pmu.c | 26 ++++++++++++++++++++++++++
> >  1 file changed, 26 insertions(+)
> >
> > diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> > index 9ae07db6f0f6..290d407f339b 100644
> > --- a/arch/x86/kvm/pmu.c
> > +++ b/arch/x86/kvm/pmu.c
> > @@ -731,12 +731,38 @@ static inline bool cpl_is_matched(struct kvm_pmc *pmc)
> >         return (static_call(kvm_x86_get_cpl)(pmc->vcpu) == 0) ? select_os : select_user;
> >  }
> >
> > +static inline bool guest_pmu_is_enabled(struct kvm_pmu *pmu)
> > +{
> > +       /*
> > +        * Currently VMs do not have PMU settings in configs which defaults
> > +        * to "pmu=off".
> > +        *
> > +        * For Intel currently this means pmu->version will be 0.
> > +        * For AMD currently PMU cannot be disabled:
> 
> Isn't that what KVM_PMU_CAP_DISABLE is for?

Yeah, see my response.  KVM doesn't clear the metadata, so internally it looks
like the PMU is enabled even though it's effectively disabled from the guest's
perspective.
diff mbox series

Patch

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 9ae07db6f0f6..290d407f339b 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -731,12 +731,38 @@  static inline bool cpl_is_matched(struct kvm_pmc *pmc)
 	return (static_call(kvm_x86_get_cpl)(pmc->vcpu) == 0) ? select_os : select_user;
 }
 
+static inline bool guest_pmu_is_enabled(struct kvm_pmu *pmu)
+{
+	/*
+	 * Currently VMs do not have PMU settings in configs which defaults
+	 * to "pmu=off".
+	 *
+	 * For Intel currently this means pmu->version will be 0.
+	 * For AMD currently PMU cannot be disabled:
+	 * pmu->version should be 2 for Zen 4 cpus and 1 otherwise.
+	 */
+	if (pmu->version == 0)
+		return false;
+
+	/*
+	 * Starting with PMU v2 IA32_PERF_GLOBAL_CTRL MSR is available and
+	 * it can be used to check if none PMCs are enabled.
+	 */
+	if (pmu->version >= 2 && !(pmu->global_ctrl & ~pmu->global_ctrl_mask))
+		return false;
+
+	return true;
+}
+
 void kvm_pmu_trigger_event(struct kvm_vcpu *vcpu, u64 perf_hw_id)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 	struct kvm_pmc *pmc;
 	int i;
 
+	if (!guest_pmu_is_enabled(pmu))
+		return;
+
 	for_each_set_bit(i, pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX) {
 		pmc = static_call(kvm_x86_pmu_pmc_idx_to_pmc)(pmu, i);