
[1/1] KVM: arm64: PMU: Avoid inappropriate use of host's PMUVer

Message ID 20230610194510.4146549-1-reijiw@google.com (mailing list archive)
State New, archived

Commit Message

Reiji Watanabe June 10, 2023, 7:45 p.m. UTC
Avoid using the PMUVer of the host's PMU hardware (including
the sanitized value of the PMUVer) for vPMU control purposes,
except in a few cases, as the value of the host's PMUVer may
differ from the value of ID_AA64DFR0_EL1.PMUVer for the guest.

The first case is when using the PMUVer as the limit value of
ID_AA64DFR0_EL1.PMUVer for the guest. The second case is when
using the PMUVer to determine the valid range of events for
KVM_ARM_VCPU_PMU_V3_FILTER, as KVM has always allowed userspace
to specify events that are valid for the PMU hardware, regardless
of the value of the guest's ID_AA64DFR0_EL1.PMUVer. KVM will
still limit the range of events that the guest can actually use
based on the value of the guest's ID_AA64DFR0_EL1.PMUVer, though.
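
To illustrate the distinction (a hypothetical snippet, not part of the
patch itself; hw_mask/guest_mask are names made up for illustration,
using the helpers touched by the diff below):

	/* Filter validation: bounded by the PMUVer limit of the hardware */
	u32 hw_mask    = __kvm_pmu_event_mask(kvm_arm_pmu_get_pmuver_limit());

	/* Guest-facing emulation: bounded by the guest's ID_AA64DFR0_EL1.PMUVer */
	u32 guest_mask = kvm_pmu_event_mask(kvm);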

Signed-off-by: Reiji Watanabe <reijiw@google.com>
---
 arch/arm64/kvm/pmu-emul.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

Comments

Oliver Upton June 11, 2023, 12:57 a.m. UTC | #1
Hi Reiji,

On Sat, Jun 10, 2023 at 12:45:10PM -0700, Reiji Watanabe wrote:
> @@ -735,7 +736,7 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
>  		 * Don't advertise STALL_SLOT, as PMMIR_EL0 is handled
>  		 * as RAZ
>  		 */
> -		if (vcpu->kvm->arch.arm_pmu->pmuver >= ID_AA64DFR0_EL1_PMUVer_V3P4)
> +		if (vcpu->kvm->arch.dfr0_pmuver.imp >= ID_AA64DFR0_EL1_PMUVer_V3P4)
>  			val &= ~BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT - 32);

I don't think this conditional masking is correct in the first place,
and this change would only make it worse.

We emulate reads of PMCEID1_EL0 using the literal value of the CPU. The
_advertised_ PMU version has no bearing on the core PMU version. So,
assuming we hit this on a v3p5+ part with userspace (stupidly)
advertising an older implementation level, we never clear the bit for
STALL_SLOT.

So let's just fix the issue by unconditionally masking the bit.
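
A minimal sketch of that (assuming the surrounding code in
kvm_pmu_get_pmceid() stays as shown above):

	/* STALL_SLOT is never advertised, regardless of any PMUVer value */
	val &= ~BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT - 32);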

>  		base = 32;
>  	}
> @@ -932,11 +933,17 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
>  		return 0;
>  	}
>  	case KVM_ARM_VCPU_PMU_V3_FILTER: {
> +		u8 pmuver = kvm_arm_pmu_get_pmuver_limit();
>  		struct kvm_pmu_event_filter __user *uaddr;
>  		struct kvm_pmu_event_filter filter;
>  		int nr_events;
>  
> -		nr_events = kvm_pmu_event_mask(kvm) + 1;
> +		/*
> +		 * Allow userspace to specify an event filter for the entire
> +		 * event range supported by PMUVer of the hardware, rather
> +		 * than the guest's PMUVer for KVM backward compatibility.
> +		 */
> +		nr_events = __kvm_pmu_event_mask(pmuver) + 1;

This is a rather significant change from the existing behavior though,
no?

The 'raw' PMU version of the selected instance has been used as the
basis of the maximum event list, but this uses the sanitised value. I'd
rather we consistently use the selected PMU instance as the basis for
all guest-facing PMU emulation.

I get that asymmetry in this department is exceedingly rare in the wild,
but I'd rather keep a consistent model in the PMU emulation code where
all our logic is based on the selected PMU instance.
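
Concretely, that would mean something like the following (a sketch, not
a tested change):

	nr_events = __kvm_pmu_event_mask(kvm->arch.arm_pmu->pmuver) + 1;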

--
Thanks,
Oliver
Reiji Watanabe June 11, 2023, 4:54 a.m. UTC | #2
Hi Oliver,

On Sat, Jun 10, 2023 at 05:57:34PM -0700, Oliver Upton wrote:
> Hi Reiji,
> 
> On Sat, Jun 10, 2023 at 12:45:10PM -0700, Reiji Watanabe wrote:
> > @@ -735,7 +736,7 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
> >  		 * Don't advertise STALL_SLOT, as PMMIR_EL0 is handled
> >  		 * as RAZ
> >  		 */
> > -		if (vcpu->kvm->arch.arm_pmu->pmuver >= ID_AA64DFR0_EL1_PMUVer_V3P4)
> > +		if (vcpu->kvm->arch.dfr0_pmuver.imp >= ID_AA64DFR0_EL1_PMUVer_V3P4)
> >  			val &= ~BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT - 32);
> 
> I don't think this conditional masking is correct in the first place,

I'm not sure why this conditional masking would be incorrect.
Could you please elaborate?


> and this change would only make it worse.
> 
> We emulate reads of PMCEID1_EL0 using the literal value of the CPU. The
> _advertised_ PMU version has no bearing on the core PMU version. So,
> assuming we hit this on a v3p5+ part with userspace (stupidly)
> advertising an older implementation level, we never clear the bit for
> STALL_SLOT.

I'm not sure if I understand this comment correctly.
When the guest's PMUVer is older than v3p4, I don't think we need
to clear the bit for STALL_SLOT, as PMMIR_EL1 is not implemented
for the guest (PMMIR_EL1 is implemented only on v3p4 or newer).
Or am I missing something ?

BTW, as KVM doesn't expose vPMU to the guest on non-uniform PMUVer
systems (as the sanitized value of ID_AA64DFR0_EL1.PMUVer on such
systems is zero), it is unlikely that the guest on such systems will
read this register (KVM should inject UNDEFINED in this case,
although KVM doesn't do that).
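
(A hypothetical sketch of what injecting UNDEFINED could look like in a
sysreg accessor — again, this is not what KVM currently does:)

	/* Hypothetical: UNDEF the access when no vPMU is exposed */
	if (!kvm_vcpu_has_pmu(vcpu)) {
		kvm_inject_undefined(vcpu);
		return false;
	}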


> 
> So let's just fix the issue by unconditionally masking the bit.
> 
> >  		base = 32;
> >  	}
> > @@ -932,11 +933,17 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> >  		return 0;
> >  	}
> >  	case KVM_ARM_VCPU_PMU_V3_FILTER: {
> > +		u8 pmuver = kvm_arm_pmu_get_pmuver_limit();
> >  		struct kvm_pmu_event_filter __user *uaddr;
> >  		struct kvm_pmu_event_filter filter;
> >  		int nr_events;
> >  
> > -		nr_events = kvm_pmu_event_mask(kvm) + 1;
> > +		/*
> > +		 * Allow userspace to specify an event filter for the entire
> > +		 * event range supported by PMUVer of the hardware, rather
> > +		 * than the guest's PMUVer for KVM backward compatibility.
> > +		 */
> > +		nr_events = __kvm_pmu_event_mask(pmuver) + 1;
> 
> This is a rather significant change from the existing behavior though,
> no?
> 
> The 'raw' PMU version of the selected instance has been used as the
> basis of the maximum event list, but this uses the sanitised value. I'd
> rather we consistently use the selected PMU instance as the basis for
> all guest-facing PMU emulation.
> 
> I get that asymmetry in this department is exceedingly rare in the wild,
> but I'd rather keep a consistent model in the PMU emulation code where
> all our logic is based on the selected PMU instance.

Oh, sorry, I forgot to update this from the previous (slightly different)
series [1], where kvm_arm_pmu_get_pmuver_limit() returned
kvm->arch.arm_pmu->pmuver.  Although the sanitized value will always be
the same as kvm->arch.arm_pmu->pmuver with the series [2], I didn't mean
to change this in this patch.

[1] https://lore.kernel.org/all/20230527040236.1875860-1-reijiw@google.com/
[2] https://lore.kernel.org/all/20230610061520.3026530-1-reijiw@google.com/
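
(For reference, kvm_arm_pmu_get_pmuver_limit() derives its value from
the sanitised ID_AA64DFR0_EL1 — roughly along these lines; this is a
sketch reconstructed for illustration, the exact helper lives outside
this diff and details may differ:)

	u8 kvm_arm_pmu_get_pmuver_limit(void)
	{
		u64 tmp = read_sanitised_ftr_reg(SYS_ID_AA64DFR0_EL1);

		/* Cap the PMU version KVM is willing to expose at v3p5 */
		tmp = cpuid_feature_cap_perfmon_field(tmp,
						      ID_AA64DFR0_EL1_PMUVer_SHIFT,
						      ID_AA64DFR0_EL1_PMUVer_V3P5);
		return FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), tmp);
	}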

Thank you,
Reiji
Oliver Upton June 11, 2023, 7:47 a.m. UTC | #3
On Sat, Jun 10, 2023 at 09:54:30PM -0700, Reiji Watanabe wrote:
> On Sat, Jun 10, 2023 at 05:57:34PM -0700, Oliver Upton wrote:
> > Hi Reiji,
> > 
> > On Sat, Jun 10, 2023 at 12:45:10PM -0700, Reiji Watanabe wrote:
> > > @@ -735,7 +736,7 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
> > >  		 * Don't advertise STALL_SLOT, as PMMIR_EL0 is handled
> > >  		 * as RAZ
> > >  		 */
> > > -		if (vcpu->kvm->arch.arm_pmu->pmuver >= ID_AA64DFR0_EL1_PMUVer_V3P4)
> > > +		if (vcpu->kvm->arch.dfr0_pmuver.imp >= ID_AA64DFR0_EL1_PMUVer_V3P4)
> > >  			val &= ~BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT - 32);
> > 
> > I don't think this conditional masking is correct in the first place,
> 
> I'm not sure why this conditional masking is correct.
> Could you please elaborate ?

On second thought, the original code works, but for a rather non-obvious
reason. I was concerned about the case where kvm->arch.arm_pmu->pmuver does
not match the current CPU, but as you say we hide PMU from the guest in this
case.

My concern remains, though, for the proposed fix.

> > and this change would only make it worse.
> > 
> > We emulate reads of PMCEID1_EL0 using the literal value of the CPU. The
> > _advertised_ PMU version has no bearing on the core PMU version. So,
> > assuming we hit this on a v3p5+ part with userspace (stupidly)
> > advertising an older implementation level, we never clear the bit for
> > STALL_SLOT.
> 
> I'm not sure if I understand this comment correctly.
> When the guest's PMUVer is older than v3p4, I don't think we need
> to clear the bit for STALL_SLOT, as PMMIR_EL1 is not implemented
> for the guest (PMMIR_EL1 is implemented only on v3p4 or newer).
> Or am I missing something ?

The guest's PMU version has no influence on the *hardware* value of
PMCEID1_EL0.

Suppose KVM is running on a v3p5+ implementation, but userspace has set
ID_AA64DFR0_EL1.PMUVer to v3p0. In this case the read of PMCEID1_EL0 on
the preceding line would advertise the STALL_SLOT event, and KVM fails
to mask it due to the ID register value. The fact we do not support the
event is an invariant, and in the worst case we wind up clearing a bit
that's already 0.

This is why I'd suggested just unconditionally clearing the bit. While
we're on the topic, doesn't the same reasoning hold for
STALL_SLOT_{FRONTEND,BACKEND}? We probably want to hide those too.
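
For those, the analogous masking would be (a sketch, assuming the usual
ARMV8_PMUV3_PERFCTR_* event definitions):

	val &= ~BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT_FRONTEND - 32);
	val &= ~BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT_BACKEND - 32);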

--
Thanks,
Oliver
Reiji Watanabe June 11, 2023, 4:01 p.m. UTC | #4
Hi Oliver,

Thank you for the clarification!
But, I still have some questions on your comments.

> > > We emulate reads of PMCEID1_EL0 using the literal value of the CPU. The
> > > _advertised_ PMU version has no bearing on the core PMU version. So,
> > > assuming we hit this on a v3p5+ part with userspace (stupidly)
> > > advertising an older implementation level, we never clear the bit for
> > > STALL_SLOT.
> > 
> > I'm not sure if I understand this comment correctly.
> > When the guest's PMUVer is older than v3p4, I don't think we need
> > to clear the bit for STALL_SLOT, as PMMIR_EL1 is not implemented
> > for the guest (PMMIR_EL1 is implemented only on v3p4 or newer).
> > Or am I missing something ?
> 
> The guest's PMU version has no influence on the *hardware* value of
> PMCEID1_EL0.
> 
> Suppose KVM is running on a v3p5+ implementation, but userspace has set
> ID_AA64DFR0_EL1.PMUVer to v3p0. In this case the read of PMCEID1_EL0 on
> the preceding line would advertise the STALL_SLOT event, and KVM fails
> to mask it due to the ID register value. The fact we do not support the
> event is an invariant, in the worst case we wind up clearing a bit
> that's already 0.

As far as I checked ArmARM, the STALL_SLOT event can be supported on
any PMUv3 version (including on v3p0).  Assuming that is true, I don't
see any reason to not expose the event to the guest in this particular
example. Or can the STALL_SLOT event only be implemented from certain
versions of PMUv3 ?


> This is why I'd suggested just unconditionally clearing the bit. While

When the hardware supports the STALL_SLOT event (again, I assume any
PMUv3 version hardware can support the event), and the guest's PMUVer
is older than v3p4, what is the reason why we want to clear the bit ?


> we're on the topic, doesn't the same reasoning hold for
> STALL_SLOT_{FRONTEND,BACKEND}? We probably want to hide those too.

Yes, I agree on that.
I will include the fix for that as part of this series!

Thank you,
Reiji
Oliver Upton June 12, 2023, 7:36 p.m. UTC | #5
On Sun, Jun 11, 2023 at 09:01:05AM -0700, Reiji Watanabe wrote:

[...]

> > Suppose KVM is running on a v3p5+ implementation, but userspace has set
> > ID_AA64DFR0_EL1.PMUVer to v3p0. In this case the read of PMCEID1_EL0 on
> > the preceding line would advertise the STALL_SLOT event, and KVM fails
> > to mask it due to the ID register value. The fact we do not support the
> > event is an invariant, in the worst case we wind up clearing a bit
> > that's already 0.
> 
> As far as I checked ArmARM, the STALL_SLOT event can be supported on
> any PMUv3 version (including on v3p0).  Assuming that is true, I don't
> see any reason to not expose the event to the guest in this particular
> example. Or can the STALL_SLOT event only be implemented from certain
> versions of PMUv3 ?

Well, users of the event don't get the full picture w/o PMMIR_EL1.SLOTS,
which is only available on v3p4+. We probably should start exposing the
register + event (separate from this change).

> > This is why I'd suggested just unconditionally clearing the bit. While
> 
> When the hardware supports the STALL_SLOT event (again, I assume any
> PMUv3 version hardware can support the event), and the guest's PMUVer
> is older than v3p4, what is the reason why we want to clear the bit ?

What's the value of the event w/o PMMIR_EL1? I agree there's no
fundamental issue with letting it past, but I'd rather we start
exposing the feature when we provide all the necessary detail.

--
Thanks,
Oliver
Reiji Watanabe June 13, 2023, 12:26 a.m. UTC | #6
On Mon, Jun 12, 2023 at 09:36:38PM +0200, Oliver Upton wrote:
> On Sun, Jun 11, 2023 at 09:01:05AM -0700, Reiji Watanabe wrote:
> 
> [...]
> 
> > > Suppose KVM is running on a v3p5+ implementation, but userspace has set
> > > ID_AA64DFR0_EL1.PMUVer to v3p0. In this case the read of PMCEID1_EL0 on
> > > the preceding line would advertise the STALL_SLOT event, and KVM fails
> > > to mask it due to the ID register value. The fact we do not support the
> > > event is an invariant, in the worst case we wind up clearing a bit
> > > that's already 0.
> > 
> > As far as I checked ArmARM, the STALL_SLOT event can be supported on
> > any PMUv3 version (including on v3p0).  Assuming that is true, I don't
> > see any reason to not expose the event to the guest in this particular
> > example. Or can the STALL_SLOT event only be implemented from certain
> > versions of PMUv3 ?
> 
> Well, users of the event don't get the full picture w/o PMMIR_EL1.SLOTS,
> which is only available on v3p4+. We probably should start exposing the
> register + event (separate from this change).
> 
> > > This is why I'd suggested just unconditionally clearing the bit. While
> > 
> > When the hardware supports the STALL_SLOT event (again, I assume any
> > PMUv3 version hardware can support the event), and the guest's PMUVer
> > is older than v3p4, what is the reason why we want to clear the bit ?
> 
> What's the value of the event w/o PMMIR_EL1? I agree there's no

I agree that the value of the event w/o PMMIR_EL1 is pretty limited.


> fundamental issue with letting it past, but I'd rather we start
> exposing the feature when we provide all the necessary detail.

To confirm, are you suggesting that we stop exposing the event even on
hosts w/o PMMIR_EL1, until KVM is ready to support PMMIR_EL1 ?
(guests on those hosts won't get PMMIR_EL1 in any case though?)
Could you please explain why ?

I think I would rather keep the code as it is
(since I'm simply not sure what the benefits of that would be).

Thank you,
Reiji
Oliver Upton June 14, 2023, 12:41 p.m. UTC | #7
On Mon, Jun 12, 2023 at 05:26:33PM -0700, Reiji Watanabe wrote:
> On Mon, Jun 12, 2023 at 09:36:38PM +0200, Oliver Upton wrote:
> > I'd rather we start exposing the feature when we provide all the
> > necessary detail.
> 
> To confirm, are you suggesting to stop exposing the event even on hosts
> w/o PMMIR_EL1 until KVM gets ready to support PMMIR_EL1 ?
> (guests on those hosts won't get PMMIR_EL1 in any case though?)
> Could you please explain why ?
> 
> Perhaps I think I would rather keep the code as it is?
> (since I'm simply not sure what would be the benefits of that)

I'd rather not keep confusing code hanging around. The fact that KVM
does not support the STALL_SLOT event is invariant of both the hardware
PMU implementation and the userspace value for the ID register field.
Let's make sure the implementation exactly matches this position.
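
Matching that position, the PMCEID1_EL0 masking would become
unconditional and cover all three stall-slot events — roughly (a sketch,
not the committed fix):

	/*
	 * Don't advertise STALL_SLOT*, as PMMIR_EL1 is handled as RAZ
	 */
	val &= ~(BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT - 32) |
		 BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT_FRONTEND - 32) |
		 BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT_BACKEND - 32));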

Patch

diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 491ca7eb2a4c..2d52f44de4a1 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -35,12 +35,8 @@  static struct kvm_pmc *kvm_vcpu_idx_to_pmc(struct kvm_vcpu *vcpu, int cnt_idx)
 	return &vcpu->arch.pmu.pmc[cnt_idx];
 }
 
-static u32 kvm_pmu_event_mask(struct kvm *kvm)
+static u32 __kvm_pmu_event_mask(unsigned int pmuver)
 {
-	unsigned int pmuver;
-
-	pmuver = kvm->arch.arm_pmu->pmuver;
-
 	switch (pmuver) {
 	case ID_AA64DFR0_EL1_PMUVer_IMP:
 		return GENMASK(9, 0);
@@ -55,6 +51,11 @@  static u32 kvm_pmu_event_mask(struct kvm *kvm)
 	}
 }
 
+static u32 kvm_pmu_event_mask(struct kvm *kvm)
+{
+	return __kvm_pmu_event_mask(kvm->arch.dfr0_pmuver.imp);
+}
+
 /**
  * kvm_pmc_is_64bit - determine if counter is 64bit
  * @pmc: counter context
@@ -735,7 +736,7 @@  u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
 		 * Don't advertise STALL_SLOT, as PMMIR_EL0 is handled
 		 * as RAZ
 		 */
-		if (vcpu->kvm->arch.arm_pmu->pmuver >= ID_AA64DFR0_EL1_PMUVer_V3P4)
+		if (vcpu->kvm->arch.dfr0_pmuver.imp >= ID_AA64DFR0_EL1_PMUVer_V3P4)
 			val &= ~BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT - 32);
 		base = 32;
 	}
@@ -932,11 +933,17 @@  int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 		return 0;
 	}
 	case KVM_ARM_VCPU_PMU_V3_FILTER: {
+		u8 pmuver = kvm_arm_pmu_get_pmuver_limit();
 		struct kvm_pmu_event_filter __user *uaddr;
 		struct kvm_pmu_event_filter filter;
 		int nr_events;
 
-		nr_events = kvm_pmu_event_mask(kvm) + 1;
+		/*
+		 * Allow userspace to specify an event filter for the entire
+		 * event range supported by PMUVer of the hardware, rather
+		 * than the guest's PMUVer for KVM backward compatibility.
+		 */
+		nr_events = __kvm_pmu_event_mask(pmuver) + 1;
 
 		uaddr = (struct kvm_pmu_event_filter __user *)(long)attr->addr;