diff mbox series

x86/cpu-policy: Fix migration from Ice Lake to Cascade Lake

Message ID 20240507112957.1701824-1-andrew.cooper3@citrix.com (mailing list archive)
State Superseded

Commit Message

Andrew Cooper May 7, 2024, 11:29 a.m. UTC
Ever since Xen 4.14, there has been a latent bug with migration.

While some toolstacks can level the features properly, they don't shrink
feat.max_subleaf when all features have been dropped.  This is because
we *still* have not completed the toolstack side work for full CPU Policy
objects.

As a consequence, even when properly feature levelled, VMs can't migrate
"backwards" across hardware which reduces feat.max_subleaf.  One such example
is Ice Lake (max_subleaf=2 for INTEL_PSFD) to Cascade Lake (max_subleaf=0).
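
The failure mode can be modelled with a short C sketch. This is an illustration only, not Xen's actual code: `struct feat_policy` and `guest_fits_host()` are hypothetical names, and the rule that a guest's max_subleaf must not exceed the host's is an assumption about how the compatibility check behaves.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of the CPUID leaf 7 part of a policy. */
struct feat_policy {
    uint32_t max_subleaf;   /* Highest valid subleaf of CPUID leaf 7. */
};

/*
 * Assumed compatibility rule: a guest policy is accepted only if it does
 * not claim more subleaves than the host policy advertises.
 */
static bool guest_fits_host(const struct feat_policy *guest,
                            const struct feat_policy *host)
{
    return guest->max_subleaf <= host->max_subleaf;
}
```

Under this model, a guest created on Ice Lake (max_subleaf=2) is rejected by a Cascade Lake host (max_subleaf=0) even if every feature bit in subleaves 1 and 2 has been levelled away.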

Extend the host policy's feat.max_subleaf to the highest number Xen knows
about, similarly to how we extend extd.max_leaf for LFENCE_DISPATCH.  This
will allow VMs with a higher feat.max_subleaf than strictly necessary to
migrate in.
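
As a rough sketch of the difference, the two functions below contrast clamping the hardware value with always advertising the highest known subleaf. The names `host_max_subleaf_clamped()`, `host_max_subleaf_extended()` and `KNOWN_FEAT_SUBLEAVES` are hypothetical, and the value 3 is an assumption standing in for ARRAY_SIZE(p->feat.raw).

```c
#include <stdint.h>

/* Assumed stand-in for ARRAY_SIZE(p->feat.raw); the value is illustrative. */
#define KNOWN_FEAT_SUBLEAVES 3u

/* Existing behaviour: clamp the hardware value to what can be stored. */
static uint32_t host_max_subleaf_clamped(uint32_t hw_max_subleaf)
{
    uint32_t limit = KNOWN_FEAT_SUBLEAVES - 1;

    return hw_max_subleaf < limit ? hw_max_subleaf : limit;
}

/* Proposed behaviour: always advertise the highest known subleaf. */
static uint32_t host_max_subleaf_extended(uint32_t hw_max_subleaf)
{
    (void)hw_max_subleaf;   /* The hardware value no longer matters. */

    return KNOWN_FEAT_SUBLEAVES - 1;
}
```

With clamping, hardware reporting max_subleaf=0 yields a host value of 0; with the extension it yields 2, so a guest carrying max_subleaf=2 is no longer above the host's limit.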

Eventually we'll manage to teach the toolstack how to avoid creating such VMs
in the first place, but there's still more work to do there.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/cpu-policy.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)


base-commit: ebab808eb1bb8f24c7d0dd41b956e48cb1824b81

Comments

Roger Pau Monné May 7, 2024, 11:50 a.m. UTC | #1
On Tue, May 07, 2024 at 12:29:57PM +0100, Andrew Cooper wrote:
> Ever since Xen 4.14, there has been a latent bug with migration.
> 
> While some toolstacks can level the features properly, they don't shrink
> feat.max_subleaf when all features have been dropped.  This is because
> we *still* have not completed the toolstack side work for full CPU Policy
> objects.
> 
> As a consequence, even when properly feature levelled, VMs can't migrate
> "backwards" across hardware which reduces feat.max_subleaf.  One such example
> is Ice Lake (max_subleaf=2 for INTEL_PSFD) to Cascade Lake (max_subleaf=0).
> 
> Extend the host policy's feat.max_subleaf to the highest number Xen knows
> about, similarly to how we extend extd.max_leaf for LFENCE_DISPATCH.  This
> will allow VMs with a higher feat.max_subleaf than strictly necessary to
> migrate in.

Seeing what we do for max_extd_leaf, shouldn't we switch to doing what
you propose for feat.max_subleaf to max_extd_leaf also?

To allow migration between hosts that have 0x80000021.eax and hosts
that don't have such extended leaf.

cpu_has_lfence_dispatch kind of does that, but if lfence cannot be
made serializing then the max extended leaf is not expanded.  And we
should also likely account for more feature leafs possibly appearing
after 0x80000021?

Thanks, Roger.

Andrew Cooper May 7, 2024, 12:19 p.m. UTC | #2
On 07/05/2024 12:50 pm, Roger Pau Monné wrote:
> On Tue, May 07, 2024 at 12:29:57PM +0100, Andrew Cooper wrote:
>> Ever since Xen 4.14, there has been a latent bug with migration.
>>
>> While some toolstacks can level the features properly, they don't shrink
>> feat.max_subleaf when all features have been dropped.  This is because
>> we *still* have not completed the toolstack side work for full CPU Policy
>> objects.
>>
>> As a consequence, even when properly feature levelled, VMs can't migrate
>> "backwards" across hardware which reduces feat.max_subleaf.  One such example
>> is Ice Lake (max_subleaf=2 for INTEL_PSFD) to Cascade Lake (max_subleaf=0).
>>
>> Extend the host policy's feat.max_subleaf to the highest number Xen knows
>> about, similarly to how we extend extd.max_leaf for LFENCE_DISPATCH.  This
>> will allow VMs with a higher feat.max_subleaf than strictly necessary to
>> migrate in.
> Seeing what we do for max_extd_leaf, shouldn't we switch to doing what
> you propose for feat.max_subleaf to max_extd_leaf also?
>
> To allow migration between hosts that have 0x80000021.eax and hosts
> that don't have such extended leaf.
>
> cpu_has_lfence_dispatch kind of does that, but if lfence cannot be
> made serializing then the max extended leaf is not expanded.  And we
> should also likely account for more feature leafs possibly appearing
> after 0x80000021?

On second thoughts, this adjustment ought to be in the max policies only.

It's slightly different to LFENCE_DISPATCH, in that we don't actually
have any set bits in those leaves.

I'll do a different patch.

~Andrew

Patch

diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index 4b6d96276399..a216fc8b886f 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -373,8 +373,13 @@  static void __init calculate_host_policy(void)
 
     p->basic.max_leaf =
         min_t(uint32_t, p->basic.max_leaf,   ARRAY_SIZE(p->basic.raw) - 1);
-    p->feat.max_subleaf =
-        min_t(uint32_t, p->feat.max_subleaf, ARRAY_SIZE(p->feat.raw) - 1);
+
+    /*
+     * p->feat is "just" featureset information.  We know about more than may
+     * be present in this hardware.  Also, VMs may have a higher max_subleaf
+     * than strictly necessary, and we can accept those too.
+     */
+    p->feat.max_subleaf = ARRAY_SIZE(p->feat.raw) - 1;
 
     max_extd_leaf = p->extd.max_leaf;