Message ID | 20250302220112.17653-5-dongli.zhang@oracle.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | target/i386/kvm/pmu: PMU Enhancement, Bugfix and Cleanup | expand |
On 3/3/2025 6:00 AM, Dongli Zhang wrote: > Although AMD PERFCORE and PerfMonV2 are removed when "-pmu" is configured, > there is no way to fully disable KVM AMD PMU virtualization. Neither > "-cpu host,-pmu" nor "-cpu EPYC" achieves this. This looks like a KVM bug. Anyway, since QEMU can achieve its goal with KVM_PMU_CAP_DISABLE with current KVM, I'm fine with it. I have one nit below, otherwise Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> > As a result, the following message still appears in the VM dmesg: > > [ 0.263615] Performance Events: AMD PMU driver. > > However, the expected output should be: > > [ 0.596381] Performance Events: PMU not available due to virtualization, using software events only. > [ 0.600972] NMI watchdog: Perf NMI watchdog permanently disabled > > This occurs because AMD does not use any CPUID bit to indicate PMU > availability. > > To address this, KVM_CAP_PMU_CAPABILITY is used to set KVM_PMU_CAP_DISABLE > when "-pmu" is configured. > > Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> > --- > Changed since v1: > - Switch back to the initial implementation with "-pmu". > https://lore.kernel.org/all/20221119122901.2469-3-dongli.zhang@oracle.com > - Mention that "KVM_PMU_CAP_DISABLE doesn't change the PMU behavior on > Intel platform because current "pmu" property works as expected." > > target/i386/kvm/kvm.c | 31 +++++++++++++++++++++++++++++++ > 1 file changed, 31 insertions(+) > > diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c > index f41e190fb8..5c8a852dbd 100644 > --- a/target/i386/kvm/kvm.c > +++ b/target/i386/kvm/kvm.c > @@ -176,6 +176,8 @@ static int has_triple_fault_event; > > static bool has_msr_mcg_ext_ctl; > > +static int has_pmu_cap; > + > static struct kvm_cpuid2 *cpuid_cache; > static struct kvm_cpuid2 *hv_cpuid_cache; > static struct kvm_msr_list *kvm_feature_msrs; > @@ -2053,6 +2055,33 @@ full: > > int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp) > { > + static bool first = true; > + int ret; > + > + if (first) { > + first = false; > + > + /* > + * Since Linux v5.18, KVM provides a VM-level capability to easily > + * disable PMUs; however, QEMU has been providing PMU property per > + * CPU since v1.6. In order to accommodate both, have to configure > + * the VM-level capability here. > + * > + * KVM_PMU_CAP_DISABLE doesn't change the PMU > + * behavior on Intel platform because current "pmu" property works > + * as expected. > + */ > + if (has_pmu_cap && !X86_CPU(cpu)->enable_pmu) { One nit, it's safer to use (has_pmu_cap & KVM_PMU_CAP_DISABLE) && !X86_CPU(cpu)->enable_pmu Maybe we can rename has_pmu_cap to pmu_cap as well. > + ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_PMU_CAPABILITY, 0, > + KVM_PMU_CAP_DISABLE); > + if (ret < 0) { > + error_setg_errno(errp, -ret, > + "Failed to set KVM_PMU_CAP_DISABLE"); > + return ret; > + } > + } > + } > + > return 0; > } > > @@ -3351,6 +3380,8 @@ int kvm_arch_init(MachineState *ms, KVMState *s) > } > } > > + has_pmu_cap = kvm_check_extension(s, KVM_CAP_PMU_CAPABILITY); > + > return 0; > } >
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index f41e190fb8..5c8a852dbd 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -176,6 +176,8 @@ static int has_triple_fault_event; static bool has_msr_mcg_ext_ctl; +static int has_pmu_cap; + static struct kvm_cpuid2 *cpuid_cache; static struct kvm_cpuid2 *hv_cpuid_cache; static struct kvm_msr_list *kvm_feature_msrs; @@ -2053,6 +2055,33 @@ full: int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp) { + static bool first = true; + int ret; + + if (first) { + first = false; + + /* + * Since Linux v5.18, KVM provides a VM-level capability to easily + * disable PMUs; however, QEMU has been providing PMU property per + * CPU since v1.6. In order to accommodate both, have to configure + * the VM-level capability here. + * + * KVM_PMU_CAP_DISABLE doesn't change the PMU + * behavior on Intel platform because current "pmu" property works + * as expected. + */ + if (has_pmu_cap && !X86_CPU(cpu)->enable_pmu) { + ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_PMU_CAPABILITY, 0, + KVM_PMU_CAP_DISABLE); + if (ret < 0) { + error_setg_errno(errp, -ret, + "Failed to set KVM_PMU_CAP_DISABLE"); + return ret; + } + } + } + return 0; } @@ -3351,6 +3380,8 @@ int kvm_arch_init(MachineState *ms, KVMState *s) } } + has_pmu_cap = kvm_check_extension(s, KVM_CAP_PMU_CAPABILITY); + return 0; }
Although AMD PERFCORE and PerfMonV2 are removed when "-pmu" is configured, there is no way to fully disable KVM AMD PMU virtualization. Neither "-cpu host,-pmu" nor "-cpu EPYC" achieves this. As a result, the following message still appears in the VM dmesg: [ 0.263615] Performance Events: AMD PMU driver. However, the expected output should be: [ 0.596381] Performance Events: PMU not available due to virtualization, using software events only. [ 0.600972] NMI watchdog: Perf NMI watchdog permanently disabled This occurs because AMD does not use any CPUID bit to indicate PMU availability. To address this, KVM_CAP_PMU_CAPABILITY is used to set KVM_PMU_CAP_DISABLE when "-pmu" is configured. Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> --- Changed since v1: - Switch back to the initial implementation with "-pmu". https://lore.kernel.org/all/20221119122901.2469-3-dongli.zhang@oracle.com - Mention that "KVM_PMU_CAP_DISABLE doesn't change the PMU behavior on Intel platform because current "pmu" property works as expected." target/i386/kvm/kvm.c | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+)