Message ID | 05257008-13e3-0d49-cd1d-6a8c9eee2ce5@suse.com
---|---
State | Superseded
Series | x86: port self-snoop related patches from Linux
On 18/07/2019 13:09, Jan Beulich wrote:
> --- a/xen/arch/x86/cpu/intel.c
> +++ b/xen/arch/x86/cpu/intel.c
> @@ -15,6 +15,32 @@
>  #include "cpu.h"
>  
>  /*
> + * Processors which have self-snooping capability can handle conflicting
> + * memory type across CPUs by snooping its own cache. However, there exists
> + * CPU models in which having conflicting memory types still leads to
> + * unpredictable behavior, machine check errors, or hangs. Clear this
> + * feature to prevent its use on machines with known erratas.
> + */
> +static void __init check_memory_type_self_snoop_errata(void)
> +{
> +	switch (boot_cpu_data.x86_model) {
> +	case 0x0f: /* Merom */
> +	case 0x16: /* Merom L */
> +	case 0x17: /* Penryn */
> +	case 0x1d: /* Dunnington */
> +	case 0x1e: /* Nehalem */
> +	case 0x1f: /* Auburndale / Havendale */
> +	case 0x1a: /* Nehalem EP */
> +	case 0x2e: /* Nehalem EX */
> +	case 0x25: /* Westmere */
> +	case 0x2c: /* Westmere EP */
> +	case 0x2a: /* SandyBridge */

It would have been nice if the errata had actually been identified...

> +		setup_clear_cpu_cap(X86_FEATURE_SS);

I'm regretting exposing SS to guests at this point.

As this stands, it will result in a migration compatibility issue,
because updating Xen will cause a feature to disappear. If we had a
default vs full policy split, this would be easy enough to work around
in a compatible way. I wonder if there is anything clever we can do in
the meantime as a stopgap workaround.

~Andrew
On 18.07.2019 14:23, Andrew Cooper wrote:
> On 18/07/2019 13:09, Jan Beulich wrote:
>> --- a/xen/arch/x86/cpu/intel.c
>> +++ b/xen/arch/x86/cpu/intel.c
>> @@ -15,6 +15,32 @@
>>  #include "cpu.h"
>>  
>>  /*
>> + * Processors which have self-snooping capability can handle conflicting
>> + * memory type across CPUs by snooping its own cache. However, there exists
>> + * CPU models in which having conflicting memory types still leads to
>> + * unpredictable behavior, machine check errors, or hangs. Clear this
>> + * feature to prevent its use on machines with known erratas.
>> + */
>> +static void __init check_memory_type_self_snoop_errata(void)
>> +{
>> +	switch (boot_cpu_data.x86_model) {
>> +	case 0x0f: /* Merom */
>> +	case 0x16: /* Merom L */
>> +	case 0x17: /* Penryn */
>> +	case 0x1d: /* Dunnington */
>> +	case 0x1e: /* Nehalem */
>> +	case 0x1f: /* Auburndale / Havendale */
>> +	case 0x1a: /* Nehalem EP */
>> +	case 0x2e: /* Nehalem EX */
>> +	case 0x25: /* Westmere */
>> +	case 0x2c: /* Westmere EP */
>> +	case 0x2a: /* SandyBridge */
>
> It would have been nice if the errata had actually been identified...

Indeed; I hope you don't expect me to go hunt them down. I'm
cloning a Linux commit here only, after all.

>> +		setup_clear_cpu_cap(X86_FEATURE_SS);
>
> I'm regretting exposing SS to guests at this point.
>
> As this stands, it will result in a migration compatibility issue,
> because updating Xen will cause a feature to disappear. If we had a
> default vs full policy split, this would be easy enough to work around
> in a compatible way. I wonder if there is anything clever we can do in
> the meantime as a stopgap workaround.

Should we perhaps introduce X86_FEATURE_XEN_SELFSNOOP, just like
we do for SMEP and SMAP, such that we can leave the real one alone?

Jan
On 18/07/2019 14:07, Jan Beulich wrote:
> On 18.07.2019 14:23, Andrew Cooper wrote:
>> On 18/07/2019 13:09, Jan Beulich wrote:
>>> --- a/xen/arch/x86/cpu/intel.c
>>> +++ b/xen/arch/x86/cpu/intel.c
>>> @@ -15,6 +15,32 @@
>>>  #include "cpu.h"
>>>  
>>>  /*
>>> + * Processors which have self-snooping capability can handle conflicting
>>> + * memory type across CPUs by snooping its own cache. However, there exists
>>> + * CPU models in which having conflicting memory types still leads to
>>> + * unpredictable behavior, machine check errors, or hangs. Clear this
>>> + * feature to prevent its use on machines with known erratas.
>>> + */
>>> +static void __init check_memory_type_self_snoop_errata(void)
>>> +{
>>> +	switch (boot_cpu_data.x86_model) {
>>> +	case 0x0f: /* Merom */
>>> +	case 0x16: /* Merom L */
>>> +	case 0x17: /* Penryn */
>>> +	case 0x1d: /* Dunnington */
>>> +	case 0x1e: /* Nehalem */
>>> +	case 0x1f: /* Auburndale / Havendale */
>>> +	case 0x1a: /* Nehalem EP */
>>> +	case 0x2e: /* Nehalem EX */
>>> +	case 0x25: /* Westmere */
>>> +	case 0x2c: /* Westmere EP */
>>> +	case 0x2a: /* SandyBridge */
>> It would have been nice if the errata had actually been identified...
> Indeed; I hope you don't expect me to go hunt them down. I'm
> cloning a Linux commit here only, after all.

No - I don't expect you to hunt them down. I'll try to keep an eye out
next time I'm in the relevant documents.

>
>>> +		setup_clear_cpu_cap(X86_FEATURE_SS);
>> I'm regretting exposing SS to guests at this point.
>>
>> As this stands, it will result in a migration compatibility issue,
>> because updating Xen will cause a feature to disappear. If we had a
>> default vs full policy split, this would be easy enough to work around
>> in a compatible way. I wonder if there is anything clever we can do in
>> the meantime as a stopgap workaround.
> Should we perhaps introduce X86_FEATURE_XEN_SELFSNOOP, just like
> we do for SMEP and SMAP, such that we can leave the real one alone?

Ah yes - that will do just fine. When we get a default/max split, I'll
try to remember to take it back out.

~Andrew
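For reference, a minimal sketch of the synthetic-flag approach agreed above, assuming the XEN_CPUFEATURE()/X86_SYNTH() pattern already used for XEN_SMEP and XEN_SMAP in xen/include/asm-x86/cpufeatures.h. The synth bit number and the inverted switch logic (forcing the synthetic flag on unaffected parts instead of clearing X86_FEATURE_SS) are illustrative placeholders, not the committed change:

/* xen/include/asm-x86/cpufeatures.h -- hypothetical new synthetic bit,
 * modelled on XEN_SMEP / XEN_SMAP (the bit number 20 is a placeholder). */
XEN_CPUFEATURE(XEN_SELFSNOOP, X86_SYNTH(20)) /* SELFSNOOP gets used by Xen itself */

/* xen/arch/x86/cpu/intel.c -- the errata check then marks self-snoop as
 * usable by Xen only on SS-capable, unaffected models, and never touches
 * the architectural X86_FEATURE_SS that guests see and migrate with. */
static void __init check_memory_type_self_snoop_errata(void)
{
	if (!boot_cpu_has(X86_FEATURE_SS))
		return;

	switch (boot_cpu_data.x86_model) {
	case 0x0f: /* Merom */
	/* ... remaining affected models from the list above ... */
	case 0x2a: /* SandyBridge */
		return;
	}

	setup_force_cpu_cap(X86_FEATURE_XEN_SELFSNOOP);
}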
--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -15,6 +15,32 @@
 #include "cpu.h"
 
 /*
+ * Processors which have self-snooping capability can handle conflicting
+ * memory type across CPUs by snooping its own cache. However, there exists
+ * CPU models in which having conflicting memory types still leads to
+ * unpredictable behavior, machine check errors, or hangs. Clear this
+ * feature to prevent its use on machines with known erratas.
+ */
+static void __init check_memory_type_self_snoop_errata(void)
+{
+	switch (boot_cpu_data.x86_model) {
+	case 0x0f: /* Merom */
+	case 0x16: /* Merom L */
+	case 0x17: /* Penryn */
+	case 0x1d: /* Dunnington */
+	case 0x1e: /* Nehalem */
+	case 0x1f: /* Auburndale / Havendale */
+	case 0x1a: /* Nehalem EP */
+	case 0x2e: /* Nehalem EX */
+	case 0x25: /* Westmere */
+	case 0x2c: /* Westmere EP */
+	case 0x2a: /* SandyBridge */
+		setup_clear_cpu_cap(X86_FEATURE_SS);
+		break;
+	}
+}
+
+/*
  * Set caps in expected_levelling_cap, probe a specific masking MSR, and set
  * caps in levelling_caps if it is found, or clobber the MSR index if missing.
  * If preset, reads the default value into msr_val.
@@ -256,8 +282,11 @@ static void early_init_intel(struct cpui
 	    (boot_cpu_data.x86_mask == 3 || boot_cpu_data.x86_mask == 4))
 		paddr_bits = 36;
 
-	if (c == &boot_cpu_data)
+	if (c == &boot_cpu_data) {
+		check_memory_type_self_snoop_errata();
+
 		intel_init_levelling();
+	}
 
 	ctxt_switch_levelling(NULL);
 }
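As a usage illustration (not part of this patch), code relying on self-snoop would typically use it to avoid an otherwise mandatory full cache flush when a page's effective memory type changes. flush_all(FLUSH_CACHE) is Xen's existing full-cache-flush primitive; the surrounding check is a hypothetical sketch:

/* Hypothetical sketch: with working self-snoop the CPU resolves conflicting
 * memory types through its own cache, so the expensive flush can be skipped;
 * on the models listed above the feature bit has been cleared, so the flush
 * continues to happen. */
if (!boot_cpu_has(X86_FEATURE_SS))
	flush_all(FLUSH_CACHE);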