diff mbox series

[v2,17/54] KVM: x86/pmu: Always set global enable bits in passthrough mode

Message ID 20240506053020.3911940-18-mizhang@google.com (mailing list archive)
State New, archived
Headers show
Series Mediated Passthrough vPMU 2.0 for x86 | expand

Commit Message

Mingwei Zhang May 6, 2024, 5:29 a.m. UTC
From: Sandipan Das <sandipan.das@amd.com>

Currently, the global control bits for a vcpu are restored to the reset
state only if the guest PMU version is less than 2. This works for
emulated PMU as the MSRs are intercepted and backing events are created
for and managed by the host PMU [1].

If such a guest in run with passthrough PMU, the counters no longer work
because the global enable bits are cleared. Hence, set the global enable
bits to their reset state if passthrough PMU is used.

A passthrough-capable host may not necessarily support PMU version 2 and
it can choose to restore or save the global control state from struct
kvm_pmu in the PMU context save and restore helpers depending on the
availability of the global control register.

[1] 7b46b733bdb4 ("KVM: x86/pmu: Set enable bits for GP counters in PERF_GLOBAL_CTRL at "RESET"");
Reported-by: Mingwei Zhang <mizhang@google.com>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
[removed the fixes tag]
---
 arch/x86/kvm/pmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Mi, Dapeng May 8, 2024, 4:18 a.m. UTC | #1
On 5/6/2024 1:29 PM, Mingwei Zhang wrote:
> From: Sandipan Das <sandipan.das@amd.com>
>
> Currently, the global control bits for a vcpu are restored to the reset
> state only if the guest PMU version is less than 2. This works for
> emulated PMU as the MSRs are intercepted and backing events are created
> for and managed by the host PMU [1].
>
> If such a guest in run with passthrough PMU, the counters no longer work
> because the global enable bits are cleared. Hence, set the global enable
> bits to their reset state if passthrough PMU is used.
>
> A passthrough-capable host may not necessarily support PMU version 2 and
> it can choose to restore or save the global control state from struct
> kvm_pmu in the PMU context save and restore helpers depending on the
> availability of the global control register.
>
> [1] 7b46b733bdb4 ("KVM: x86/pmu: Set enable bits for GP counters in PERF_GLOBAL_CTRL at "RESET"");
> Reported-by: Mingwei Zhang <mizhang@google.com>
> Signed-off-by: Sandipan Das <sandipan.das@amd.com>
> [removed the fixes tag]
> ---
>  arch/x86/kvm/pmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index 5768ea2935e9..e656f72fdace 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -787,7 +787,7 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
>  	 * in the global controls).  Emulate that behavior when refreshing the
>  	 * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
>  	 */
> -	if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)
> +	if ((pmu->passthrough || kvm_pmu_has_perf_global_ctrl(pmu)) && pmu->nr_arch_gp_counters)

The logic seems not correct. we could support perfmon version 1 for
meidated vPMU (passthrough vPMU) as well in the future.  pmu->passthrough
is ture doesn't guarantee GLOBAL_CTRL MSR always exists.


>  		pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
>  }
>
Mingwei Zhang May 8, 2024, 4:36 a.m. UTC | #2
On Tue, May 7, 2024 at 9:19 PM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
>
>
> On 5/6/2024 1:29 PM, Mingwei Zhang wrote:
> > From: Sandipan Das <sandipan.das@amd.com>
> >
> > Currently, the global control bits for a vcpu are restored to the reset
> > state only if the guest PMU version is less than 2. This works for
> > emulated PMU as the MSRs are intercepted and backing events are created
> > for and managed by the host PMU [1].
> >
> > If such a guest in run with passthrough PMU, the counters no longer work
> > because the global enable bits are cleared. Hence, set the global enable
> > bits to their reset state if passthrough PMU is used.
> >
> > A passthrough-capable host may not necessarily support PMU version 2 and
> > it can choose to restore or save the global control state from struct
> > kvm_pmu in the PMU context save and restore helpers depending on the
> > availability of the global control register.
> >
> > [1] 7b46b733bdb4 ("KVM: x86/pmu: Set enable bits for GP counters in PERF_GLOBAL_CTRL at "RESET"");
> > Reported-by: Mingwei Zhang <mizhang@google.com>
> > Signed-off-by: Sandipan Das <sandipan.das@amd.com>
> > [removed the fixes tag]
> > ---
> >  arch/x86/kvm/pmu.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> > index 5768ea2935e9..e656f72fdace 100644
> > --- a/arch/x86/kvm/pmu.c
> > +++ b/arch/x86/kvm/pmu.c
> > @@ -787,7 +787,7 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
> >        * in the global controls).  Emulate that behavior when refreshing the
> >        * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
> >        */
> > -     if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)
> > +     if ((pmu->passthrough || kvm_pmu_has_perf_global_ctrl(pmu)) && pmu->nr_arch_gp_counters)
>
> The logic seems not correct. we could support perfmon version 1 for
> meidated vPMU (passthrough vPMU) as well in the future.  pmu->passthrough
> is ture doesn't guarantee GLOBAL_CTRL MSR always exists.

heh, the logic is correct here. However, I would say the code change
may not reflect that clearly.

The if condition combines the handling of global ctrl registers for
both the legacy vPMU and the mediated passthrough vPMU.

In legacy pmu, the logic should be this:

if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)

Because, since KVM emulates the MSR, if the global ctrl register does
not exist, then there is no point resetting it to any value. However,
if it does exist, there are non-zero number of GP counters, we should
reset it to some value (all enabling bits are set for GP counters)
according to SDM.

The logic for mediated passthrough PMU is different as follows:

if (pmu->passthrough && pmu->nr_arch_gp_counters)

Since mediated passthrough PMU requires PerfMon v4 in Intel (PerfMon
v2 in AMD), once it is enabled (pmu->passthrough = true), then global
ctrl _must_ exist phyiscally. Regardless of whether we expose it to
the guest VM, at reset time, we need to ensure enabling bits for GP
counters are set (behind the screen). This is critical for AMD, since
most of the guests are usually in (AMD) PerfMon v1 in which global
ctrl MSR is inaccessible, but does exist and is operating in HW.

Yes, if we eliminate that requirement (pmu->passthrough -> Perfmon v4
Intel / Perfmon v2 AMD), then this code will have to change. However,
that is currently not in our RFCv2.

Thanks.
-Mingwei







>
>
> >               pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
> >  }
> >
Mi, Dapeng May 8, 2024, 6:27 a.m. UTC | #3
On 5/8/2024 12:36 PM, Mingwei Zhang wrote:
> On Tue, May 7, 2024 at 9:19 PM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
>>
>> On 5/6/2024 1:29 PM, Mingwei Zhang wrote:
>>> From: Sandipan Das <sandipan.das@amd.com>
>>>
>>> Currently, the global control bits for a vcpu are restored to the reset
>>> state only if the guest PMU version is less than 2. This works for
>>> emulated PMU as the MSRs are intercepted and backing events are created
>>> for and managed by the host PMU [1].
>>>
>>> If such a guest in run with passthrough PMU, the counters no longer work
>>> because the global enable bits are cleared. Hence, set the global enable
>>> bits to their reset state if passthrough PMU is used.
>>>
>>> A passthrough-capable host may not necessarily support PMU version 2 and
>>> it can choose to restore or save the global control state from struct
>>> kvm_pmu in the PMU context save and restore helpers depending on the
>>> availability of the global control register.
>>>
>>> [1] 7b46b733bdb4 ("KVM: x86/pmu: Set enable bits for GP counters in PERF_GLOBAL_CTRL at "RESET"");
>>> Reported-by: Mingwei Zhang <mizhang@google.com>
>>> Signed-off-by: Sandipan Das <sandipan.das@amd.com>
>>> [removed the fixes tag]
>>> ---
>>>  arch/x86/kvm/pmu.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
>>> index 5768ea2935e9..e656f72fdace 100644
>>> --- a/arch/x86/kvm/pmu.c
>>> +++ b/arch/x86/kvm/pmu.c
>>> @@ -787,7 +787,7 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
>>>        * in the global controls).  Emulate that behavior when refreshing the
>>>        * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
>>>        */
>>> -     if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)
>>> +     if ((pmu->passthrough || kvm_pmu_has_perf_global_ctrl(pmu)) && pmu->nr_arch_gp_counters)
>> The logic seems not correct. we could support perfmon version 1 for
>> meidated vPMU (passthrough vPMU) as well in the future.  pmu->passthrough
>> is ture doesn't guarantee GLOBAL_CTRL MSR always exists.
> heh, the logic is correct here. However, I would say the code change
> may not reflect that clearly.
>
> The if condition combines the handling of global ctrl registers for
> both the legacy vPMU and the mediated passthrough vPMU.
>
> In legacy pmu, the logic should be this:
>
> if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)
>
> Because, since KVM emulates the MSR, if the global ctrl register does
> not exist, then there is no point resetting it to any value. However,
> if it does exist, there are non-zero number of GP counters, we should
> reset it to some value (all enabling bits are set for GP counters)
> according to SDM.
>
> The logic for mediated passthrough PMU is different as follows:
>
> if (pmu->passthrough && pmu->nr_arch_gp_counters)
>
> Since mediated passthrough PMU requires PerfMon v4 in Intel (PerfMon
> v2 in AMD), once it is enabled (pmu->passthrough = true), then global
> ctrl _must_ exist phyiscally. Regardless of whether we expose it to
> the guest VM, at reset time, we need to ensure enabling bits for GP
> counters are set (behind the screen). This is critical for AMD, since
> most of the guests are usually in (AMD) PerfMon v1 in which global
> ctrl MSR is inaccessible, but does exist and is operating in HW.
>
> Yes, if we eliminate that requirement (pmu->passthrough -> Perfmon v4
> Intel / Perfmon v2 AMD), then this code will have to change. However,
Yeah, that's what I'm worrying about. We ever discussed to support mediated
vPMU on HW below perfmon v4. When someone implements this, he may not
notice this place needs to be changed as well, this introduces a potential
bug and we should avoid this.
> that is currently not in our RFCv2.
>
> Thanks.
> -Mingwei
>
>
>
>
>
>
>
>>
>>>               pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
>>>  }
>>>
Sean Christopherson May 8, 2024, 2:13 p.m. UTC | #4
On Wed, May 08, 2024, Dapeng Mi wrote:
> 
> On 5/8/2024 12:36 PM, Mingwei Zhang wrote:
> > if (pmu->passthrough && pmu->nr_arch_gp_counters)
> >
> > Since mediated passthrough PMU requires PerfMon v4 in Intel (PerfMon
> > v2 in AMD), once it is enabled (pmu->passthrough = true), then global
> > ctrl _must_ exist phyiscally. Regardless of whether we expose it to
> > the guest VM, at reset time, we need to ensure enabling bits for GP
> > counters are set (behind the screen). This is critical for AMD, since
> > most of the guests are usually in (AMD) PerfMon v1 in which global
> > ctrl MSR is inaccessible, but does exist and is operating in HW.
> >
> > Yes, if we eliminate that requirement (pmu->passthrough -> Perfmon v4
> > Intel / Perfmon v2 AMD), then this code will have to change. However,
> Yeah, that's what I'm worrying about. We ever discussed to support mediated
> vPMU on HW below perfmon v4. When someone implements this, he may not
> notice this place needs to be changed as well, this introduces a potential
> bug and we should avoid this.

Just add a WARN on the PMU version.  I haven't thought much about whether or not
KVM should support mediated PMU for earlier hardware, but having a sanity check
on the assumptions of this code is reasonable even if we don't _plan_ on supporting
earlier hardware.
Mingwei Zhang May 9, 2024, 12:13 a.m. UTC | #5
On Wed, May 8, 2024 at 7:13 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, May 08, 2024, Dapeng Mi wrote:
> >
> > On 5/8/2024 12:36 PM, Mingwei Zhang wrote:
> > > if (pmu->passthrough && pmu->nr_arch_gp_counters)
> > >
> > > Since mediated passthrough PMU requires PerfMon v4 in Intel (PerfMon
> > > v2 in AMD), once it is enabled (pmu->passthrough = true), then global
> > > ctrl _must_ exist phyiscally. Regardless of whether we expose it to
> > > the guest VM, at reset time, we need to ensure enabling bits for GP
> > > counters are set (behind the screen). This is critical for AMD, since
> > > most of the guests are usually in (AMD) PerfMon v1 in which global
> > > ctrl MSR is inaccessible, but does exist and is operating in HW.
> > >
> > > Yes, if we eliminate that requirement (pmu->passthrough -> Perfmon v4
> > > Intel / Perfmon v2 AMD), then this code will have to change. However,
> > Yeah, that's what I'm worrying about. We ever discussed to support mediated
> > vPMU on HW below perfmon v4. When someone implements this, he may not
> > notice this place needs to be changed as well, this introduces a potential
> > bug and we should avoid this.

I think you might have worried too much about future problems, but
yes, things are under the radar. For Intel, this version constraint
might be ok as Perfmon v4 is skylake, which is already pretty early.

For AMD, things are slightly different, PerfMon v2 in AMD requires
Genoa, which is pretty new. So, this problem probably could be
something for AMD if they want to extend the new vPMU design to Milan,
but we will see how people think. So one potential (easy) extension
for AMD is host PerfMon v1 + guest PerfMon v1 support for mediated
passthrough vPMU.

>
> Just add a WARN on the PMU version.  I haven't thought much about whether or not
> KVM should support mediated PMU for earlier hardware, but having a sanity check
> on the assumptions of this code is reasonable even if we don't _plan_ on supporting
> earlier hardware.

Sure. That sounds pretty reasonable.

Thanks.
-Mingwei
Mi, Dapeng May 9, 2024, 12:30 a.m. UTC | #6
On 5/9/2024 8:13 AM, Mingwei Zhang wrote:
> On Wed, May 8, 2024 at 7:13 AM Sean Christopherson <seanjc@google.com> wrote:
>> On Wed, May 08, 2024, Dapeng Mi wrote:
>>> On 5/8/2024 12:36 PM, Mingwei Zhang wrote:
>>>> if (pmu->passthrough && pmu->nr_arch_gp_counters)
>>>>
>>>> Since mediated passthrough PMU requires PerfMon v4 in Intel (PerfMon
>>>> v2 in AMD), once it is enabled (pmu->passthrough = true), then global
>>>> ctrl _must_ exist phyiscally. Regardless of whether we expose it to
>>>> the guest VM, at reset time, we need to ensure enabling bits for GP
>>>> counters are set (behind the screen). This is critical for AMD, since
>>>> most of the guests are usually in (AMD) PerfMon v1 in which global
>>>> ctrl MSR is inaccessible, but does exist and is operating in HW.
>>>>
>>>> Yes, if we eliminate that requirement (pmu->passthrough -> Perfmon v4
>>>> Intel / Perfmon v2 AMD), then this code will have to change. However,
>>> Yeah, that's what I'm worrying about. We ever discussed to support mediated
>>> vPMU on HW below perfmon v4. When someone implements this, he may not
>>> notice this place needs to be changed as well, this introduces a potential
>>> bug and we should avoid this.
> I think you might have worried too much about future problems, but
> yes, things are under the radar. For Intel, this version constraint
> might be ok as Perfmon v4 is skylake, which is already pretty early.
No, I don't think this is redundant worry since we did discuss this
requirement before and it could need to be supported in the future. We need
to consider the code's extensibility and not introduce potential issue.
>
> For AMD, things are slightly different, PerfMon v2 in AMD requires
> Genoa, which is pretty new. So, this problem probably could be
> something for AMD if they want to extend the new vPMU design to Milan,
> but we will see how people think. So one potential (easy) extension
> for AMD is host PerfMon v1 + guest PerfMon v1 support for mediated
> passthrough vPMU.
>
>> Just add a WARN on the PMU version.  I haven't thought much about whether or not
>> KVM should support mediated PMU for earlier hardware, but having a sanity check
>> on the assumptions of this code is reasonable even if we don't _plan_ on supporting
>> earlier hardware.
> Sure. That sounds pretty reasonable.
Good for me.
>
> Thanks.
> -Mingwei
>
Mi, Dapeng May 9, 2024, 12:38 a.m. UTC | #7
On 5/8/2024 10:13 PM, Sean Christopherson wrote:
> On Wed, May 08, 2024, Dapeng Mi wrote:
>> On 5/8/2024 12:36 PM, Mingwei Zhang wrote:
>>> if (pmu->passthrough && pmu->nr_arch_gp_counters)
>>>
>>> Since mediated passthrough PMU requires PerfMon v4 in Intel (PerfMon
>>> v2 in AMD), once it is enabled (pmu->passthrough = true), then global
>>> ctrl _must_ exist phyiscally. Regardless of whether we expose it to
>>> the guest VM, at reset time, we need to ensure enabling bits for GP
>>> counters are set (behind the screen). This is critical for AMD, since
>>> most of the guests are usually in (AMD) PerfMon v1 in which global
>>> ctrl MSR is inaccessible, but does exist and is operating in HW.
>>>
>>> Yes, if we eliminate that requirement (pmu->passthrough -> Perfmon v4
>>> Intel / Perfmon v2 AMD), then this code will have to change. However,
>> Yeah, that's what I'm worrying about. We ever discussed to support mediated
>> vPMU on HW below perfmon v4. When someone implements this, he may not
>> notice this place needs to be changed as well, this introduces a potential
>> bug and we should avoid this.
> Just add a WARN on the PMU version.  I haven't thought much about whether or not
> KVM should support mediated PMU for earlier hardware, but having a sanity check
> on the assumptions of this code is reasonable even if we don't _plan_ on supporting
> earlier hardware.
I have no preference on whether supporting the old hardware (especially for
Intel CPUs below v4 which has no GLOBAL_STATUS_SET MSR), but I think the
key question is whether we want to totally drop the emulated vPMU after
mediated passthrough vPMU becomes mature. If so, we may have to support the
old platforms.
diff mbox series

Patch

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 5768ea2935e9..e656f72fdace 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -787,7 +787,7 @@  void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
 	 * in the global controls).  Emulate that behavior when refreshing the
 	 * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
 	 */
-	if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)
+	if ((pmu->passthrough || kvm_pmu_has_perf_global_ctrl(pmu)) && pmu->nr_arch_gp_counters)
 		pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
 }