Message ID | 20240507112957.1701824-1-andrew.cooper3@citrix.com (mailing list archive) |
---|---|
State | Superseded |
Series | x86/cpu-policy: Fix migration from Ice Lake to Cascade Lake |
On Tue, May 07, 2024 at 12:29:57PM +0100, Andrew Cooper wrote:
> Ever since Xen 4.14, there has been a latent bug with migration.
>
> While some toolstacks can level the features properly, they don't shrink
> feat.max_subleaf when all features have been dropped. This is because
> we *still* have not completed the toolstack side work for full CPU Policy
> objects.
>
> As a consequence, even when properly feature levelled, VMs can't migrate
> "backwards" across hardware which reduces feat.max_subleaf. One such example
> is Ice Lake (max_subleaf=2 for INTEL_PSFD) to Cascade Lake (max_subleaf=0).
>
> Extend the host policy's feat.max_subleaf to the highest number Xen knows
> about, similarly to how we extend extd.max_leaf for LFENCE_DISPATCH. This
> will allow VMs with a higher feat.max_subleaf than strictly necessary to
> migrate in.

Seeing what we do for max_extd_leaf, shouldn't we switch to doing what
you propose for feat.max_subleaf to max_extd_leaf also?

To allow migration between hosts that have 0x80000021.eax and hosts
that don't have such extended leaf.

cpu_has_lfence_dispatch kind of does that, but if lfence cannot be
made serializing then the max extended leaf is not expanded. And we
should also likely account for more feature leafs possibly appearing
after 0x80000021?

Thanks, Roger.

On 07/05/2024 12:50 pm, Roger Pau Monné wrote:
> On Tue, May 07, 2024 at 12:29:57PM +0100, Andrew Cooper wrote:
>> Ever since Xen 4.14, there has been a latent bug with migration.
>>
>> While some toolstacks can level the features properly, they don't shrink
>> feat.max_subleaf when all features have been dropped. This is because
>> we *still* have not completed the toolstack side work for full CPU Policy
>> objects.
>>
>> As a consequence, even when properly feature levelled, VMs can't migrate
>> "backwards" across hardware which reduces feat.max_subleaf. One such example
>> is Ice Lake (max_subleaf=2 for INTEL_PSFD) to Cascade Lake (max_subleaf=0).
>>
>> Extend the host policy's feat.max_subleaf to the highest number Xen knows
>> about, similarly to how we extend extd.max_leaf for LFENCE_DISPATCH. This
>> will allow VMs with a higher feat.max_subleaf than strictly necessary to
>> migrate in.
> Seeing what we do for max_extd_leaf, shouldn't we switch to doing what
> you propose for feat.max_subleaf to max_extd_leaf also?
>
> To allow migration between hosts that have 0x80000021.eax and hosts
> that don't have such extended leaf.
>
> cpu_has_lfence_dispatch kind of does that, but if lfence cannot be
> made serializing then the max extended leaf is not expanded. And we
> should also likely account for more feature leafs possibly appearing
> after 0x80000021?

On second thoughts, this adjustment ought to be in the max policies
only.

It's slightly different to LFENCE_DISPATCH, in that we don't actually
have any set bits in those leaves.

I'll do a different patch.

~Andrew

diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index 4b6d96276399..a216fc8b886f 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -373,8 +373,13 @@ static void __init calculate_host_policy(void)
 
     p->basic.max_leaf =
         min_t(uint32_t, p->basic.max_leaf, ARRAY_SIZE(p->basic.raw) - 1);
-    p->feat.max_subleaf =
-        min_t(uint32_t, p->feat.max_subleaf, ARRAY_SIZE(p->feat.raw) - 1);
+
+    /*
+     * p->feat is "just" featureset information. We know about more than may
+     * be present in this hardware. Also, VMs may have a higher max_subleaf
+     * than strictly necessary, and we can accept those too.
+     */
+    p->feat.max_subleaf = ARRAY_SIZE(p->feat.raw) - 1;
 
     max_extd_leaf = p->extd.max_leaf;
 

Ever since Xen 4.14, there has been a latent bug with migration.

While some toolstacks can level the features properly, they don't shrink
feat.max_subleaf when all features have been dropped. This is because
we *still* have not completed the toolstack side work for full CPU Policy
objects.

As a consequence, even when properly feature levelled, VMs can't migrate
"backwards" across hardware which reduces feat.max_subleaf. One such example
is Ice Lake (max_subleaf=2 for INTEL_PSFD) to Cascade Lake (max_subleaf=0).

Extend the host policy's feat.max_subleaf to the highest number Xen knows
about, similarly to how we extend extd.max_leaf for LFENCE_DISPATCH. This
will allow VMs with a higher feat.max_subleaf than strictly necessary to
migrate in.

Eventually we'll manage to teach the toolstack how to avoid creating such VMs
in the first place, but there's still more work to do there.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/cpu-policy.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

base-commit: ebab808eb1bb8f24c7d0dd41b956e48cb1824b81
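
For readers following the thread, the failure mode described in the commit message can be modelled in a few lines of C. This is only an illustrative sketch, not Xen code: `struct feat_policy`, `FEAT_SUBLEAVES`, `guest_fits_host()` and `extend_max_subleaf()` are hypothetical names, and the check is a simplified stand-in for the real policy compatibility logic. It assumes a guest is rejected when its max_subleaf exceeds the host's, even if every feature bit in the extra subleaves has been levelled away.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: number of leaf 7 subleaves the hypervisor knows about. */
#define FEAT_SUBLEAVES 3u

/* Hypothetical, heavily simplified model of the "feat" part of a policy. */
struct feat_policy {
    uint32_t max_subleaf;          /* highest subleaf advertised */
    uint32_t raw[FEAT_SUBLEAVES];  /* one feature word per subleaf */
};

/*
 * Simplified compatibility check (an assumption, not Xen's real one):
 * the guest must not advertise more subleaves than the host, and must
 * not use any feature bit the host lacks.
 */
static bool guest_fits_host(const struct feat_policy *host,
                            const struct feat_policy *guest)
{
    if (guest->max_subleaf > host->max_subleaf)
        return false;

    for (uint32_t i = 0; i < FEAT_SUBLEAVES; i++)
        if (guest->raw[i] & ~host->raw[i])
            return false;

    return true;
}

/*
 * Mirrors the idea of the patch: advertise every subleaf the hypervisor
 * knows about, whether or not this hardware populates it.
 */
static void extend_max_subleaf(struct feat_policy *host)
{
    host->max_subleaf = FEAT_SUBLEAVES - 1;
}
```

In this model, a levelled guest that still carries max_subleaf=2 is rejected by a host reporting max_subleaf=0, and is accepted once `extend_max_subleaf()` has been applied to the host. A guest that actually sets a bit in subleaf 2 is still rejected, because the host's feature words are unchanged.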