diff mbox series

[v8,07/13] KVM: arm64: PMU: Allow userspace to limit PMCR_EL0.N for the guest

Message ID 20231020214053.2144305-8-rananta@google.com (mailing list archive)
State New, archived
Headers show
Series KVM: arm64: PMU: Allow userspace to limit the number of PMCs on vCPU | expand

Commit Message

Raghavendra Rao Ananta Oct. 20, 2023, 9:40 p.m. UTC
From: Reiji Watanabe <reijiw@google.com>

KVM does not yet support userspace modifying PMCR_EL0.N (With
the previous patch, KVM ignores what is written by userspace).
Add support userspace limiting PMCR_EL0.N.

Disallow userspace to set PMCR_EL0.N to a value that is greater
than the host value as KVM doesn't support more event counters
than what the host HW implements. Also, make this register
immutable after the VM has started running. To maintain the
existing expectations, instead of returning an error, KVM
returns a success for these two cases.

Finally, ignore writes to read-only bits that are cleared on
vCPU reset, and RES{0,1} bits (including writable bits that
KVM doesn't support yet), as those bits shouldn't be modified
(at least with the current KVM).

Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
---
 arch/arm64/kvm/sys_regs.c | 57 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 55 insertions(+), 2 deletions(-)

Comments

Marc Zyngier Oct. 23, 2023, 1 p.m. UTC | #1
On Fri, 20 Oct 2023 22:40:47 +0100,
Raghavendra Rao Ananta <rananta@google.com> wrote:
> 
> From: Reiji Watanabe <reijiw@google.com>
> 
> KVM does not yet support userspace modifying PMCR_EL0.N (With
> the previous patch, KVM ignores what is written by userspace).
> Add support userspace limiting PMCR_EL0.N.
> 
> Disallow userspace to set PMCR_EL0.N to a value that is greater
> than the host value as KVM doesn't support more event counters
> than what the host HW implements. Also, make this register
> immutable after the VM has started running. To maintain the
> existing expectations, instead of returning an error, KVM
> returns a success for these two cases.
> 
> Finally, ignore writes to read-only bits that are cleared on
> vCPU reset, and RES{0,1} bits (including writable bits that
> KVM doesn't support yet), as those bits shouldn't be modified
> (at least with the current KVM).
> 
> Signed-off-by: Reiji Watanabe <reijiw@google.com>
> Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
> ---
>  arch/arm64/kvm/sys_regs.c | 57 +++++++++++++++++++++++++++++++++++++--
>  1 file changed, 55 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 2e5d497596ef8..a2c5f210b3d6b 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -1176,6 +1176,59 @@ static int get_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
>  	return 0;
>  }
>  
> +static int set_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> +		    u64 val)
> +{
> +	struct kvm *kvm = vcpu->kvm;
> +	u64 new_n, mutable_mask;

Really, this lacks consistency. Either you make N a u8 everywhere, or
a u64 everywhere. I don't mind either, but the type confusion is not
great.

> +
> +	mutex_lock(&kvm->arch.config_lock);
> +
> +	/*
> +	 * Make PMCR immutable once the VM has started running, but
> +	 * do not return an error to meet the existing expectations.
> +	 */
> +	if (kvm_vm_has_ran_once(vcpu->kvm)) {
> +		mutex_unlock(&kvm->arch.config_lock);
> +		return 0;
> +	}
> +
> +	new_n = (val >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK;
> +	if (new_n != kvm->arch.pmcr_n) {

Why do we need to check this?

> +		u8 pmcr_n_limit = kvm_arm_pmu_get_max_counters(kvm);

Can you see why I'm annoyed?

> +
> +		/*
> +		 * The vCPU can't have more counters than the PMU hardware
> +		 * implements. Ignore this error to maintain compatibility
> +		 * with the existing KVM behavior.
> +		 */
> +		if (new_n <= pmcr_n_limit)

Isn't this the only thing that actually matters?

> +			kvm->arch.pmcr_n = new_n;
> +	}
> +	mutex_unlock(&kvm->arch.config_lock);
> +
> +	/*
> +	 * Ignore writes to RES0 bits, read only bits that are cleared on
> +	 * vCPU reset, and writable bits that KVM doesn't support yet.
> +	 * (i.e. only PMCR.N and bits [7:0] are mutable from userspace)
> +	 * The LP bit is RES0 when FEAT_PMUv3p5 is not supported on the vCPU.
> +	 * But, we leave the bit as it is here, as the vCPU's PMUver might
> +	 * be changed later (NOTE: the bit will be cleared on first vCPU run
> +	 * if necessary).
> +	 */
> +	mutable_mask = (ARMV8_PMU_PMCR_MASK |
> +			(ARMV8_PMU_PMCR_N_MASK << ARMV8_PMU_PMCR_N_SHIFT));

Why is N part of the 'mutable' mask? The only bits that should make it
into the register are ARMV8_PMU_PMCR_MASK.

> +	val &= mutable_mask;
> +	val |= (__vcpu_sys_reg(vcpu, r->reg) & ~mutable_mask);
> +
> +	/* The LC bit is RES1 when AArch32 is not supported */
> +	if (!kvm_supports_32bit_el0())
> +		val |= ARMV8_PMU_PMCR_LC;
> +
> +	__vcpu_sys_reg(vcpu, r->reg) = val;
> +	return 0;

I think this should be rewritten as:

	val &= ARMV8_PMU_PMCR_MASK;
	/* The LC bit is RES1 when AArch32 is not supported */
	if (!kvm_supports_32bit_el0())
		val |= ARMV8_PMU_PMCR_LC;

	__vcpu_sys_reg(vcpu, r->reg) = val;
	return 0;

And that's it. Drop this 'mutable_mask' nonsense, as we should be
getting the correct value (merge of the per-vcpu register and VM-wide
N) since patch 4.

	M.
Raghavendra Rao Ananta Oct. 23, 2023, 5:53 p.m. UTC | #2
On Mon, Oct 23, 2023 at 6:00 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Fri, 20 Oct 2023 22:40:47 +0100,
> Raghavendra Rao Ananta <rananta@google.com> wrote:
> >
> > From: Reiji Watanabe <reijiw@google.com>
> >
> > KVM does not yet support userspace modifying PMCR_EL0.N (With
> > the previous patch, KVM ignores what is written by userspace).
> > Add support userspace limiting PMCR_EL0.N.
> >
> > Disallow userspace to set PMCR_EL0.N to a value that is greater
> > than the host value as KVM doesn't support more event counters
> > than what the host HW implements. Also, make this register
> > immutable after the VM has started running. To maintain the
> > existing expectations, instead of returning an error, KVM
> > returns a success for these two cases.
> >
> > Finally, ignore writes to read-only bits that are cleared on
> > vCPU reset, and RES{0,1} bits (including writable bits that
> > KVM doesn't support yet), as those bits shouldn't be modified
> > (at least with the current KVM).
> >
> > Signed-off-by: Reiji Watanabe <reijiw@google.com>
> > Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
> > ---
> >  arch/arm64/kvm/sys_regs.c | 57 +++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 55 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> > index 2e5d497596ef8..a2c5f210b3d6b 100644
> > --- a/arch/arm64/kvm/sys_regs.c
> > +++ b/arch/arm64/kvm/sys_regs.c
> > @@ -1176,6 +1176,59 @@ static int get_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> >       return 0;
> >  }
> >
> > +static int set_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> > +                 u64 val)
> > +{
> > +     struct kvm *kvm = vcpu->kvm;
> > +     u64 new_n, mutable_mask;
>
> Really, this lacks consistency. Either you make N a u8 everywhere, or
> a u64 everywhere. I don't mind either, but the type confusion is not
> great.
>
Sorry about that. I'll make it u8 across the board.

> > +
> > +     mutex_lock(&kvm->arch.config_lock);
> > +
> > +     /*
> > +      * Make PMCR immutable once the VM has started running, but
> > +      * do not return an error to meet the existing expectations.
> > +      */
> > +     if (kvm_vm_has_ran_once(vcpu->kvm)) {
> > +             mutex_unlock(&kvm->arch.config_lock);
> > +             return 0;
> > +     }
> > +
> > +     new_n = (val >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK;
> > +     if (new_n != kvm->arch.pmcr_n) {
>
> Why do we need to check this?
>
Hmm, it may be redundant. I guess we can skip this, check for the
limit, and directly write new_n to kvm->arch.pmcr_n.

> > +             u8 pmcr_n_limit = kvm_arm_pmu_get_max_counters(kvm);
>
> Can you see why I'm annoyed?
>
Yes. I'll make these consistent.

> > +
> > +             /*
> > +              * The vCPU can't have more counters than the PMU hardware
> > +              * implements. Ignore this error to maintain compatibility
> > +              * with the existing KVM behavior.
> > +              */
> > +             if (new_n <= pmcr_n_limit)
>
> Isn't this the only thing that actually matters?
>
Yes, I'll remove the above check.

> > +                     kvm->arch.pmcr_n = new_n;
> > +     }
> > +     mutex_unlock(&kvm->arch.config_lock);
> > +
> > +     /*
> > +      * Ignore writes to RES0 bits, read only bits that are cleared on
> > +      * vCPU reset, and writable bits that KVM doesn't support yet.
> > +      * (i.e. only PMCR.N and bits [7:0] are mutable from userspace)
> > +      * The LP bit is RES0 when FEAT_PMUv3p5 is not supported on the vCPU.
> > +      * But, we leave the bit as it is here, as the vCPU's PMUver might
> > +      * be changed later (NOTE: the bit will be cleared on first vCPU run
> > +      * if necessary).
> > +      */
> > +     mutable_mask = (ARMV8_PMU_PMCR_MASK |
> > +                     (ARMV8_PMU_PMCR_N_MASK << ARMV8_PMU_PMCR_N_SHIFT));
>
> Why is N part of the 'mutable' mask? The only bits that should make it
> into the register are ARMV8_PMU_PMCR_MASK.
>
> > +     val &= mutable_mask;
> > +     val |= (__vcpu_sys_reg(vcpu, r->reg) & ~mutable_mask);
> > +
> > +     /* The LC bit is RES1 when AArch32 is not supported */
> > +     if (!kvm_supports_32bit_el0())
> > +             val |= ARMV8_PMU_PMCR_LC;
> > +
> > +     __vcpu_sys_reg(vcpu, r->reg) = val;
> > +     return 0;
>
> I think this should be rewritten as:
>
>         val &= ARMV8_PMU_PMCR_MASK;
>         /* The LC bit is RES1 when AArch32 is not supported */
>         if (!kvm_supports_32bit_el0())
>                 val |= ARMV8_PMU_PMCR_LC;
>
>         __vcpu_sys_reg(vcpu, r->reg) = val;
>         return 0;
>
> And that's it. Drop this 'mutable_mask' nonsense, as we should be
> getting the correct value (merge of the per-vcpu register and VM-wide
> N) since patch 4.
>
Sure, I'll consider this.

Thank you.
Raghavendra
Oliver Upton Oct. 24, 2023, 6:37 p.m. UTC | #3
On Fri, Oct 20, 2023 at 09:40:47PM +0000, Raghavendra Rao Ananta wrote:

[...]

> +	/*
> +	 * Make PMCR immutable once the VM has started running, but
> +	 * do not return an error to meet the existing expectations.
> +	 */
> +	if (kvm_vm_has_ran_once(vcpu->kvm)) {
> +		mutex_unlock(&kvm->arch.config_lock);
> +		return 0;
> +	}

Marc pointed out offline that PMCR_EL0 needs to be mutable at runtime as
well. I'll admit, the architecture isn't very helpful as it is both used
for identification _and_ configuration.

What I had in mind a few revisions ago when I gave the suggestion was to
ignore writes to _just_ the PMCR_EL0.N field after the VM has started.
diff mbox series

Patch

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 2e5d497596ef8..a2c5f210b3d6b 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1176,6 +1176,59 @@  static int get_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
 	return 0;
 }
 
+static int set_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
+		    u64 val)
+{
+	struct kvm *kvm = vcpu->kvm;
+	u64 new_n, mutable_mask;
+
+	mutex_lock(&kvm->arch.config_lock);
+
+	/*
+	 * Make PMCR immutable once the VM has started running, but
+	 * do not return an error to meet the existing expectations.
+	 */
+	if (kvm_vm_has_ran_once(vcpu->kvm)) {
+		mutex_unlock(&kvm->arch.config_lock);
+		return 0;
+	}
+
+	new_n = (val >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK;
+	if (new_n != kvm->arch.pmcr_n) {
+		u8 pmcr_n_limit = kvm_arm_pmu_get_max_counters(kvm);
+
+		/*
+		 * The vCPU can't have more counters than the PMU hardware
+		 * implements. Ignore this error to maintain compatibility
+		 * with the existing KVM behavior.
+		 */
+		if (new_n <= pmcr_n_limit)
+			kvm->arch.pmcr_n = new_n;
+	}
+	mutex_unlock(&kvm->arch.config_lock);
+
+	/*
+	 * Ignore writes to RES0 bits, read only bits that are cleared on
+	 * vCPU reset, and writable bits that KVM doesn't support yet.
+	 * (i.e. only PMCR.N and bits [7:0] are mutable from userspace)
+	 * The LP bit is RES0 when FEAT_PMUv3p5 is not supported on the vCPU.
+	 * But, we leave the bit as it is here, as the vCPU's PMUver might
+	 * be changed later (NOTE: the bit will be cleared on first vCPU run
+	 * if necessary).
+	 */
+	mutable_mask = (ARMV8_PMU_PMCR_MASK |
+			(ARMV8_PMU_PMCR_N_MASK << ARMV8_PMU_PMCR_N_SHIFT));
+	val &= mutable_mask;
+	val |= (__vcpu_sys_reg(vcpu, r->reg) & ~mutable_mask);
+
+	/* The LC bit is RES1 when AArch32 is not supported */
+	if (!kvm_supports_32bit_el0())
+		val |= ARMV8_PMU_PMCR_LC;
+
+	__vcpu_sys_reg(vcpu, r->reg) = val;
+	return 0;
+}
+
 /* Silly macro to expand the DBG{BCR,BVR,WVR,WCR}n_EL1 registers in one go */
 #define DBG_BCR_BVR_WCR_WVR_EL1(n)					\
 	{ SYS_DESC(SYS_DBGBVRn_EL1(n)),					\
@@ -2309,8 +2362,8 @@  static const struct sys_reg_desc sys_reg_descs[] = {
 	{ SYS_DESC(SYS_CTR_EL0), access_ctr },
 	{ SYS_DESC(SYS_SVCR), undef_access },
 
-	{ PMU_SYS_REG(PMCR_EL0), .access = access_pmcr,
-	  .reset = reset_pmcr, .reg = PMCR_EL0, .get_user = get_pmcr },
+	{ PMU_SYS_REG(PMCR_EL0), .access = access_pmcr, .reset = reset_pmcr,
+	  .reg = PMCR_EL0, .get_user = get_pmcr, .set_user = set_pmcr },
 	{ PMU_SYS_REG(PMCNTENSET_EL0),
 	  .access = access_pmcnten, .reg = PMCNTENSET_EL0,
 	  .get_user = get_pmcnten, .set_user = set_pmcnten },