[v6,4/4] arm64: paravirt: Enable errata based on implementation CPUs

Message ID 20250205132222.55816-5-shameerali.kolothum.thodi@huawei.com (mailing list archive)
State New
Series KVM: arm64: Errata management for VM Live migration

Commit Message

Shameerali Kolothum Thodi Feb. 5, 2025, 1:22 p.m. UTC
Retrieve any migration target implementation CPUs using the hypercall
and enable associated errata.

Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 arch/arm64/include/asm/cputype.h    | 24 ++++++++++-
 arch/arm64/include/asm/hypervisor.h |  1 +
 arch/arm64/kernel/cpu_errata.c      | 20 +++++++--
 arch/arm64/kernel/cpufeature.c      |  2 +
 arch/arm64/kernel/image-vars.h      |  2 +
 drivers/firmware/smccc/kvm_guest.c  | 63 +++++++++++++++++++++++++++++
 6 files changed, 107 insertions(+), 5 deletions(-)
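
For readers skimming the series, the guest-side flow this patch adds (see the
drivers/firmware/smccc/kvm_guest.c hunk below) is roughly the following sketch;
the version check, memblock allocation and error handling are omitted here for
brevity:

	struct arm_smccc_res res;
	u64 max_cpus;
	int i;

	/* Both discovery hypercalls must be advertised by the hypervisor */
	if (!kvm_arm_hyp_service_available(ARM_SMCCC_KVM_FUNC_DISCOVER_IMPL_VER) ||
	    !kvm_arm_hyp_service_available(ARM_SMCCC_KVM_FUNC_DISCOVER_IMPL_CPUS))
		return;

	/* a1 carries the interface version, a2 the number of target CPU entries */
	arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_KVM_DISCOVER_IMPL_VER_FUNC_ID, 0, &res);
	max_cpus = res.a2;

	/* One hypercall per entry: a1/a2/a3 return that entry's MIDR/REVIDR/AIDR */
	for (i = 0; i < max_cpus; i++) {
		arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_KVM_DISCOVER_IMPL_CPUS_FUNC_ID,
				     i, &res);
		target_impl_cpus[i].midr   = res.a1;
		target_impl_cpus[i].revidr = res.a2;
		target_impl_cpus[i].aidr   = res.a3;
	}
	target_impl_cpu_num = max_cpus;

Once target_impl_cpu_num is non-zero, is_midr_in_range() and
is_affected_midr_range() match errata against this list instead of the booting
CPU's own MIDR/REVIDR.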

Comments

Catalin Marinas Feb. 7, 2025, 2:08 p.m. UTC | #1
On Wed, Feb 05, 2025 at 01:22:22PM +0000, Shameer Kolothum wrote:
>  static inline bool is_midr_in_range(struct midr_range const *range)
>  {
> -	return midr_is_cpu_model_range(read_cpuid_id(), range->model,
> -				       range->rv_min, range->rv_max);
> +	int i;
> +
> +	if (!target_impl_cpu_num)
> +		return midr_is_cpu_model_range(read_cpuid_id(), range->model,
> +					       range->rv_min, range->rv_max);
> +
> +	for (i = 0; i < target_impl_cpu_num; i++) {
> +		if (midr_is_cpu_model_range(target_impl_cpus[i].midr,
> +					    range->model,
> +					    range->rv_min, range->rv_max))
> +			return true;
> +	}
> +	return false;
>  }

It's an interesting approach but how does this work in practice if an
erratum requires a firmware counterpart? Do we expect firmwares on all
machines involved to have workarounds for the other machines? Or is KVM
going to intercept those SMCs and pretend the EL3 counterpart is there?
Marc Zyngier Feb. 7, 2025, 2:31 p.m. UTC | #2
On Fri, 07 Feb 2025 14:08:44 +0000,
Catalin Marinas <catalin.marinas@arm.com> wrote:
> 
> On Wed, Feb 05, 2025 at 01:22:22PM +0000, Shameer Kolothum wrote:
> >  static inline bool is_midr_in_range(struct midr_range const *range)
> >  {
> > -	return midr_is_cpu_model_range(read_cpuid_id(), range->model,
> > -				       range->rv_min, range->rv_max);
> > +	int i;
> > +
> > +	if (!target_impl_cpu_num)
> > +		return midr_is_cpu_model_range(read_cpuid_id(), range->model,
> > +					       range->rv_min, range->rv_max);
> > +
> > +	for (i = 0; i < target_impl_cpu_num; i++) {
> > +		if (midr_is_cpu_model_range(target_impl_cpus[i].midr,
> > +					    range->model,
> > +					    range->rv_min, range->rv_max))
> > +			return true;
> > +	}
> > +	return false;
> >  }
> 
> It's an interesting approach but how does this work in practice if an
> erratum requires a firmware counterpart? Do we expect firmwares on all
> machines involved to have workarounds for the other machines? Or is KVM
> going to intercept those SMCs and pretend the EL3 counterpart is there?

KVM already traps SMCs, and could do something on behalf of the guest
(such as pretending that the mitigation has happened if not on the
correct host) *IF* the mitigation is architected (à la WA{1,2,3}).

If it is implementation specific, then we can immediately stop
pretending that a guest running on those systems can be migrated.

The only thing it helps a bit with is big-little.

Overall, this isn't meant to be foolproof at all. It is only giving
people enough ammunition to make it plain that the cross-platform
story is a bit of a sad joke right now.

Thanks,

	M.
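
For context on the architected workarounds mentioned above: WA1/WA2/WA3 are
discovered by the guest through the standard SMCCC ARCH_FEATURES query rather
than through MIDR matching, which is what makes them interceptable by a
hypervisor. A minimal sketch of that discovery path, using WORKAROUND_1 and the
constants from include/linux/arm-smccc.h (a simplified illustration, not code
from this series):

	#include <linux/arm-smccc.h>

	/*
	 * Ask the firmware/hypervisor whether SMCCC_ARCH_WORKAROUND_1
	 * (the Spectre-v2 mitigation) is implemented. Because this goes
	 * through the SMCCC conduit, KVM can trap the call and answer on
	 * behalf of the real EL3 firmware.
	 */
	static int wa1_firmware_state(void)
	{
		struct arm_smccc_res res;

		arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
				     ARM_SMCCC_ARCH_WORKAROUND_1, &res);

		switch ((int)res.a0) {
		case SMCCC_RET_SUCCESS:
			return 1;	/* mitigation needed and implemented */
		case SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED:
			return 0;	/* this CPU is not affected */
		default:
			return -1;	/* not supported by the firmware */
		}
	}

Implementation-specific errata have no such architected discovery channel,
hence the point above that migration across such implementations cannot be
pretended away.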
Catalin Marinas Feb. 7, 2025, 6:10 p.m. UTC | #3
On Fri, Feb 07, 2025 at 02:31:12PM +0000, Marc Zyngier wrote:
> On Fri, 07 Feb 2025 14:08:44 +0000,
> Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Wed, Feb 05, 2025 at 01:22:22PM +0000, Shameer Kolothum wrote:
> > >  static inline bool is_midr_in_range(struct midr_range const *range)
> > >  {
> > > -	return midr_is_cpu_model_range(read_cpuid_id(), range->model,
> > > -				       range->rv_min, range->rv_max);
> > > +	int i;
> > > +
> > > +	if (!target_impl_cpu_num)
> > > +		return midr_is_cpu_model_range(read_cpuid_id(), range->model,
> > > +					       range->rv_min, range->rv_max);
> > > +
> > > +	for (i = 0; i < target_impl_cpu_num; i++) {
> > > +		if (midr_is_cpu_model_range(target_impl_cpus[i].midr,
> > > +					    range->model,
> > > +					    range->rv_min, range->rv_max))
> > > +			return true;
> > > +	}
> > > +	return false;
> > >  }
> > 
> > It's an interesting approach but how does this work in practice if an
> > erratum requires a firmware counterpart? Do we expect firmwares on all
> > machines involved to have workarounds for the other machines? Or is KVM
> > going to intercept those SMCs and pretend the EL3 counterpart is there?
> 
> KVM already traps SMCs, and could do something on behalf of the guest
> (such as pretending that the mitigation has happened if not on the
> correct host) *IF* the mitigation is architected (à la WA{1,2,3}).

That's the main thing I had in mind. I don't think we have any other
erratum that requires firmware run-time discovery and interaction, though
you never know when we'll add a new one.

> If it is implementation specific, then we can immediately stop
> pretending that a guest running on those systems can be migrated.

Makes sense.

> The only thing it helps a bit with is big-little.

It does help a bit or, at least, we have some code for handling these
variations that can be extended. However, with this patchset, the host
only probes the availability of the workarounds on the SoC it booted on. It
has no idea about the extra MIDRs the VMM picks and what the other
machines in the cloud support.

Anyway, let's hope the VMs only migrate between platforms that are
equally broken.
Marc Zyngier Feb. 7, 2025, 6:17 p.m. UTC | #4
On Fri, 07 Feb 2025 18:10:08 +0000,
Catalin Marinas <catalin.marinas@arm.com> wrote:
> 
> On Fri, Feb 07, 2025 at 02:31:12PM +0000, Marc Zyngier wrote:
> > On Fri, 07 Feb 2025 14:08:44 +0000,
> > Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > On Wed, Feb 05, 2025 at 01:22:22PM +0000, Shameer Kolothum wrote:
> > > >  static inline bool is_midr_in_range(struct midr_range const *range)
> > > >  {
> > > > -	return midr_is_cpu_model_range(read_cpuid_id(), range->model,
> > > > -				       range->rv_min, range->rv_max);
> > > > +	int i;
> > > > +
> > > > +	if (!target_impl_cpu_num)
> > > > +		return midr_is_cpu_model_range(read_cpuid_id(), range->model,
> > > > +					       range->rv_min, range->rv_max);
> > > > +
> > > > +	for (i = 0; i < target_impl_cpu_num; i++) {
> > > > +		if (midr_is_cpu_model_range(target_impl_cpus[i].midr,
> > > > +					    range->model,
> > > > +					    range->rv_min, range->rv_max))
> > > > +			return true;
> > > > +	}
> > > > +	return false;
> > > >  }
> > > 
> > > It's an interesting approach but how does this work in practice if an
> > > erratum requires a firmware counterpart? Do we expect firmwares on all
> > > machines involved to have workarounds for the other machines? Or is KVM
> > > going to intercept those SMCs and pretend the EL3 counterpart is there?
> > 
> > KVM already traps SMCs, and could do something on behalf of the guest
> > (such as pretending that the mitigation has happened if not on the
> > correct host) *IF* the mitigation is architected (à la WA{1,2,3}).
> 
> That's the main thing I had in mind. I don't think we have any other
> erratum that requires firmware run-time discovery and interaction, though
> you never know when we'll add a new one.
> 
> > If it is implementation specific, then we can immediately stop
> > pretending that a guest running on those systems can be migrated.
> 
> Makes sense.
> 
> > The only thing it helps a bit with is big-little.
> 
> It does help a bit or, at least, we have some code for handling these
> variations that can be extended. However, with this patchset, the host
> only probes the availability of the workarounds on the SoC it booted on. It
> has no idea about the extra MIDRs the VMM picks and what the other
> machines in the cloud support.

But that's the contract. The VMM has to be omniscient and know exactly
what it can safely migrate to. It literally says "trust me, I know
what I'm doing".

> Anyway, let's hope the VMs only migrate between platforms that are
> equally broken.

No shortage of that, I'm afraid! :)

	M.

Patch

diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index 2a76f0e30006..975fbdfc9354 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -267,6 +267,15 @@  struct midr_range {
 #define MIDR_REV(m, v, r) MIDR_RANGE(m, v, r, v, r)
 #define MIDR_ALL_VERSIONS(m) MIDR_RANGE(m, 0, 0, 0xf, 0xf)
 
+struct target_impl_cpu {
+	u64 midr;
+	u64 revidr;
+	u64 aidr;
+};
+
+extern u32 target_impl_cpu_num;
+extern struct target_impl_cpu *target_impl_cpus;
+
 static inline bool midr_is_cpu_model_range(u32 midr, u32 model, u32 rv_min,
 					   u32 rv_max)
 {
@@ -278,8 +287,19 @@  static inline bool midr_is_cpu_model_range(u32 midr, u32 model, u32 rv_min,
 
 static inline bool is_midr_in_range(struct midr_range const *range)
 {
-	return midr_is_cpu_model_range(read_cpuid_id(), range->model,
-				       range->rv_min, range->rv_max);
+	int i;
+
+	if (!target_impl_cpu_num)
+		return midr_is_cpu_model_range(read_cpuid_id(), range->model,
+					       range->rv_min, range->rv_max);
+
+	for (i = 0; i < target_impl_cpu_num; i++) {
+		if (midr_is_cpu_model_range(target_impl_cpus[i].midr,
+					    range->model,
+					    range->rv_min, range->rv_max))
+			return true;
+	}
+	return false;
 }
 
 static inline bool
diff --git a/arch/arm64/include/asm/hypervisor.h b/arch/arm64/include/asm/hypervisor.h
index 409e239834d1..a12fd897c877 100644
--- a/arch/arm64/include/asm/hypervisor.h
+++ b/arch/arm64/include/asm/hypervisor.h
@@ -6,6 +6,7 @@ 
 
 void kvm_init_hyp_services(void);
 bool kvm_arm_hyp_service_available(u32 func_id);
+void kvm_arm_target_impl_cpu_init(void);
 
 #ifdef CONFIG_ARM_PKVM_GUEST
 void pkvm_init_hyp_services(void);
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 99b55893fc4e..1177a3094cf2 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -14,6 +14,9 @@ 
 #include <asm/kvm_asm.h>
 #include <asm/smp_plat.h>
 
+u32 target_impl_cpu_num;
+struct target_impl_cpu *target_impl_cpus;
+
 static bool __maybe_unused
 __is_affected_midr_range(const struct arm64_cpu_capabilities *entry,
 			 u32 midr, u32 revidr)
@@ -32,9 +35,20 @@  __is_affected_midr_range(const struct arm64_cpu_capabilities *entry,
 static bool __maybe_unused
 is_affected_midr_range(const struct arm64_cpu_capabilities *entry, int scope)
 {
-	WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());
-	return __is_affected_midr_range(entry, read_cpuid_id(),
-					read_cpuid(REVIDR_EL1));
+	int i;
+
+	if (!target_impl_cpu_num) {
+		WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());
+		return __is_affected_midr_range(entry, read_cpuid_id(),
+						read_cpuid(REVIDR_EL1));
+	}
+
+	for (i = 0; i < target_impl_cpu_num; i++) {
+		if (__is_affected_midr_range(entry, target_impl_cpus[i].midr,
+					     target_impl_cpus[i].revidr))
+			return true;
+	}
+	return false;
 }
 
 static bool __maybe_unused
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 72e876f37cd4..5c61d9d9f097 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -86,6 +86,7 @@ 
 #include <asm/kvm_host.h>
 #include <asm/mmu_context.h>
 #include <asm/mte.h>
+#include <asm/hypervisor.h>
 #include <asm/processor.h>
 #include <asm/smp.h>
 #include <asm/sysreg.h>
@@ -3679,6 +3680,7 @@  unsigned long cpu_get_elf_hwcap3(void)
 
 static void __init setup_boot_cpu_capabilities(void)
 {
+	kvm_arm_target_impl_cpu_init();
 	/*
 	 * The boot CPU's feature register values have been recorded. Detect
 	 * boot cpucaps and local cpucaps for the boot CPU, then enable and
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index ef3a69cc398e..a6926750faaf 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -49,6 +49,8 @@  PROVIDE(__pi_arm64_sw_feature_override	= arm64_sw_feature_override);
 PROVIDE(__pi_arm64_use_ng_mappings	= arm64_use_ng_mappings);
 #ifdef CONFIG_CAVIUM_ERRATUM_27456
 PROVIDE(__pi_cavium_erratum_27456_cpus	= cavium_erratum_27456_cpus);
+PROVIDE(__pi_target_impl_cpu_num	= target_impl_cpu_num);
+PROVIDE(__pi_target_impl_cpus		= target_impl_cpus);
 #endif
 PROVIDE(__pi__ctype			= _ctype);
 PROVIDE(__pi_memstart_offset_seed	= memstart_offset_seed);
diff --git a/drivers/firmware/smccc/kvm_guest.c b/drivers/firmware/smccc/kvm_guest.c
index f3319be20b36..f0b0154150b5 100644
--- a/drivers/firmware/smccc/kvm_guest.c
+++ b/drivers/firmware/smccc/kvm_guest.c
@@ -6,8 +6,11 @@ 
 #include <linux/bitmap.h>
 #include <linux/cache.h>
 #include <linux/kernel.h>
+#include <linux/memblock.h>
 #include <linux/string.h>
 
+#include <uapi/linux/psci.h>
+
 #include <asm/hypervisor.h>
 
 static DECLARE_BITMAP(__kvm_arm_hyp_services, ARM_SMCCC_KVM_NUM_FUNCS) __ro_after_init = { };
@@ -51,3 +54,63 @@  bool kvm_arm_hyp_service_available(u32 func_id)
 	return test_bit(func_id, __kvm_arm_hyp_services);
 }
 EXPORT_SYMBOL_GPL(kvm_arm_hyp_service_available);
+
+void __init kvm_arm_target_impl_cpu_init(void)
+{
+	int i;
+	u32 ver;
+	u64 max_cpus;
+	struct arm_smccc_res res;
+
+	/* Nothing to do if the targets have already been set */
+	if (target_impl_cpu_num)
+		return;
+
+	if (!kvm_arm_hyp_service_available(ARM_SMCCC_KVM_FUNC_DISCOVER_IMPL_VER) ||
+	    !kvm_arm_hyp_service_available(ARM_SMCCC_KVM_FUNC_DISCOVER_IMPL_CPUS))
+		return;
+
+	arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_KVM_DISCOVER_IMPL_VER_FUNC_ID,
+			     0, &res);
+	if (res.a0 != SMCCC_RET_SUCCESS)
+		return;
+
+	/* Version info is in lower 32 bits and is in SMCCC_VERSION format */
+	ver = lower_32_bits(res.a1);
+	if (PSCI_VERSION_MAJOR(ver) != 1) {
+		pr_warn("Unsupported target CPU implementation version v%d.%d\n",
+			PSCI_VERSION_MAJOR(ver), PSCI_VERSION_MINOR(ver));
+		return;
+	}
+
+	if (!res.a2) {
+		pr_warn("No target implementation CPUs specified\n");
+		return;
+	}
+
+	max_cpus = res.a2;
+	target_impl_cpus = memblock_alloc(sizeof(*target_impl_cpus) * max_cpus,
+					  __alignof__(*target_impl_cpus));
+	if (!target_impl_cpus) {
+		pr_warn("Not enough memory for struct target_impl_cpu\n");
+		return;
+	}
+
+	for (i = 0; i < max_cpus; i++) {
+		arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_KVM_DISCOVER_IMPL_CPUS_FUNC_ID,
+				     i, &res);
+		if (res.a0 != SMCCC_RET_SUCCESS) {
+			memblock_free(target_impl_cpus,
+				      sizeof(*target_impl_cpus) * max_cpus);
+			target_impl_cpus = NULL;
+			pr_warn("Discovering target implementation CPUs failed\n");
+			return;
+		}
+		target_impl_cpus[i].midr = res.a1;
+		target_impl_cpus[i].revidr = res.a2;
+		target_impl_cpus[i].aidr = res.a3;
+	}
+
+	target_impl_cpu_num = max_cpus;
+	pr_info("Number of target implementation CPUs is %d\n", target_impl_cpu_num);
+}