Message ID | c9c3228982cc81c79cab4ced983f80296107124a.1630929059.git.jane.malalane@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Series | x86/cpuid: Use AMD's NullSelectorClearsBase CPUID bit |
On 06.09.2021 14:00, Jane Malalane wrote:
> --- a/xen/arch/x86/cpu/amd.c
> +++ b/xen/arch/x86/cpu/amd.c
> @@ -681,6 +681,19 @@ void amd_init_lfence(struct cpuinfo_x86 *c)
>  			  c->x86_capability);
>  }
>
> +void detect_zen2_null_seg_behaviour(void)

This can in principle be marked __init.

> +{
> +	uint64_t base;
> +
> +	wrmsrl(MSR_FS_BASE, 1);
> +	asm volatile ( "mov %0, %%fs" :: "rm" (0) );

While I don't strictly mind the "m" part of the constraint to remain
there (in the hope for compilers actually to support this), iirc it's
not useful to have when the value is a constant: Last time I checked,
the compiler would not instantiate an anonymous (stack) variable to
fulfill this constraint (as can be seen when dropping the "r" part of
the constraint).

> @@ -731,6 +744,11 @@ static void init_amd(struct cpuinfo_x86 *c)
>  	else /* Implicily "== 0x10 || >= 0x12" by being 64bit. */
>  		amd_init_lfence(c);
>
> +	/* Probe for NSCB on Zen2 CPUs when not virtualised */
> +	if (!cpu_has_hypervisor && !cpu_has_nscb && c == &boot_cpu_data &&
> +	    c->x86 == 0x17 && c->x86_model >= 30 && c->x86_model <= 0x5f)

DYM 0x30 here? Or 0x1e? In any event 0x5f should be accompanied by
another hex constant. And it would also help if in the description
you said where these bounds as well as ...

> --- a/xen/arch/x86/cpu/hygon.c
> +++ b/xen/arch/x86/cpu/hygon.c
> @@ -34,6 +34,11 @@ static void init_hygon(struct cpuinfo_x86 *c)
>
>  	amd_init_lfence(c);
>
> +	/* Probe for NSCB on Zen2 CPUs when not virtualised */
> +	if (!cpu_has_hypervisor && !cpu_has_nscb && c == &boot_cpu_data &&
> +	    c->x86 == 0x18 && c->x86_model >= 4)

... this one come from.

Jan
On 06/09/2021 16:17, Jan Beulich wrote:
> On 06.09.2021 14:00, Jane Malalane wrote:
>> --- a/xen/arch/x86/cpu/amd.c
>> +++ b/xen/arch/x86/cpu/amd.c
>> @@ -681,6 +681,19 @@ void amd_init_lfence(struct cpuinfo_x86 *c)
>>  			  c->x86_capability);
>>  }
>>
>> +void detect_zen2_null_seg_behaviour(void)
> This can in principle be marked __init.
>
>> +{
>> +	uint64_t base;
>> +
>> +	wrmsrl(MSR_FS_BASE, 1);
>> +	asm volatile ( "mov %0, %%fs" :: "rm" (0) );
> While I don't strictly mind the "m" part of the constraint to remain
> there (in the hope for compilers actually to support this), iirc it's
> not useful to have when the value is a constant: Last time I checked,
> the compiler would not instantiate an anonymous (stack) variable to
> fulfill this constraint (as can be seen when dropping the "r" part of
> the constraint).

This is "rm" because it is what we use elsewhere in Xen for selectors,
and because it is the correct constraints based on the legal instruction
encodings.

If you want to work around what you perceive to be bugs in compilers
then submit an independent change yourself.

>> @@ -731,6 +744,11 @@ static void init_amd(struct cpuinfo_x86 *c)
>>  	else /* Implicily "== 0x10 || >= 0x12" by being 64bit. */
>>  		amd_init_lfence(c);
>>
>> +	/* Probe for NSCB on Zen2 CPUs when not virtualised */
>> +	if (!cpu_has_hypervisor && !cpu_has_nscb && c == &boot_cpu_data &&
>> +	    c->x86 == 0x17 && c->x86_model >= 30 && c->x86_model <= 0x5f)
> DYM 0x30 here?

0x30, although it turns out that some of the mobile Zen2 CPUs exceed
0x60 in terms of model number.

As Zen3 changes the family number to 0x19, I'd just drop the upper bound.

> Or 0x1e? In any event 0x5f should be accompanied by
> another hex constant. And it would also help if in the description
> you said where these bounds

From talking to people at AMD.

> as well as ...
>
>> --- a/xen/arch/x86/cpu/hygon.c
>> +++ b/xen/arch/x86/cpu/hygon.c
>> @@ -34,6 +34,11 @@ static void init_hygon(struct cpuinfo_x86 *c)
>>
>>  	amd_init_lfence(c);
>>
>> +	/* Probe for NSCB on Zen2 CPUs when not virtualised */
>> +	if (!cpu_has_hypervisor && !cpu_has_nscb && c == &boot_cpu_data &&
>> +	    c->x86 == 0x18 && c->x86_model >= 4)
> ... this one come from.

From talking to people at Hygon.

~Andrew
On 06.09.2021 20:07, Andrew Cooper wrote:
> On 06/09/2021 16:17, Jan Beulich wrote:
>> On 06.09.2021 14:00, Jane Malalane wrote:
>>> --- a/xen/arch/x86/cpu/amd.c
>>> +++ b/xen/arch/x86/cpu/amd.c
>>> @@ -681,6 +681,19 @@ void amd_init_lfence(struct cpuinfo_x86 *c)
>>>  			  c->x86_capability);
>>>  }
>>>
>>> +void detect_zen2_null_seg_behaviour(void)
>> This can in principle be marked __init.
>>
>>> +{
>>> +	uint64_t base;
>>> +
>>> +	wrmsrl(MSR_FS_BASE, 1);
>>> +	asm volatile ( "mov %0, %%fs" :: "rm" (0) );
>> While I don't strictly mind the "m" part of the constraint to remain
>> there (in the hope for compilers actually to support this), iirc it's
>> not useful to have when the value is a constant: Last time I checked,
>> the compiler would not instantiate an anonymous (stack) variable to
>> fulfill this constraint (as can be seen when dropping the "r" part of
>> the constraint).
>
> This is "rm" because it is what we use elsewhere in Xen for selectors,
> and because it is the correct constraints based on the legal instruction
> encodings.

grep-ing for "%%[defgs]s" reveals:

efi_arch_post_exit_boot(), svm_ctxt_switch_to(),
and do_set_segment_base() all use just "r". This grep has not produced
any use of "rm". What are you talking about?

> If you want to work around what you perceive to be bugs in compilers
> then submit an independent change yourself.

I don't perceive this as a bug; perhaps a desirable feature. I also
did start my response with "While I don't strictly mind the "m"
part ..." - was this not careful enough to indicate I'm not going
to insist on the change, but I'd prefer it to be made?

>>> @@ -731,6 +744,11 @@ static void init_amd(struct cpuinfo_x86 *c)
>>>  	else /* Implicily "== 0x10 || >= 0x12" by being 64bit. */
>>>  		amd_init_lfence(c);
>>>
>>> +	/* Probe for NSCB on Zen2 CPUs when not virtualised */
>>> +	if (!cpu_has_hypervisor && !cpu_has_nscb && c == &boot_cpu_data &&
>>> +	    c->x86 == 0x17 && c->x86_model >= 30 && c->x86_model <= 0x5f)
>> DYM 0x30 here?
>
> 0x30, although it turns out that some of the mobile Zen2 CPUs exceed
> 0x60 in terms of model number.
>
> As Zen3 changes the family number to 0x19, I'd just drop the upper bound.

Minor note: Even if it didn't, the !cpu_has_nscb would also be enough
to avoid the probing there.

>> Or 0x1e? In any event 0x5f should be accompanied by
>> another hex constant. And it would also help if in the description
>> you said where these bounds
>
> From talking to people at AMD.
>
>> as well as ...
>>
>>> --- a/xen/arch/x86/cpu/hygon.c
>>> +++ b/xen/arch/x86/cpu/hygon.c
>>> @@ -34,6 +34,11 @@ static void init_hygon(struct cpuinfo_x86 *c)
>>>
>>>  	amd_init_lfence(c);
>>>
>>> +	/* Probe for NSCB on Zen2 CPUs when not virtualised */
>>> +	if (!cpu_has_hypervisor && !cpu_has_nscb && c == &boot_cpu_data &&
>>> +	    c->x86 == 0x18 && c->x86_model >= 4)
>> ... this one come from.
>
> From talking to people at Hygon.

Fair enough, but imo this wants mentioning in the description.

Jan
On 07/09/2021 07:09, Jan Beulich wrote:
> On 06.09.2021 20:07, Andrew Cooper wrote:
>> On 06/09/2021 16:17, Jan Beulich wrote:
>>> On 06.09.2021 14:00, Jane Malalane wrote:
>>>> --- a/xen/arch/x86/cpu/amd.c
>>>> +++ b/xen/arch/x86/cpu/amd.c
>>>> @@ -681,6 +681,19 @@ void amd_init_lfence(struct cpuinfo_x86 *c)
>>>>  			  c->x86_capability);
>>>>  }
>>>>
>>>> +void detect_zen2_null_seg_behaviour(void)
>>> This can in principle be marked __init.
>>>
>>>> +{
>>>> +	uint64_t base;
>>>> +
>>>> +	wrmsrl(MSR_FS_BASE, 1);
>>>> +	asm volatile ( "mov %0, %%fs" :: "rm" (0) );
>>> While I don't strictly mind the "m" part of the constraint to remain
>>> there (in the hope for compilers actually to support this), iirc it's
>>> not useful to have when the value is a constant: Last time I checked,
>>> the compiler would not instantiate an anonymous (stack) variable to
>>> fulfill this constraint (as can be seen when dropping the "r" part of
>>> the constraint).
>> This is "rm" because it is what we use elsewhere in Xen for selectors,
>> and because it is the correct constraints based on the legal instruction
>> encodings.
> grep-ing for "%%[defgs]s" reveals:
>
> efi_arch_post_exit_boot(), svm_ctxt_switch_to(),

These are writing multiple selectors in one go, and a register
constraint is the only sane option.

> and do_set_segment_base() all use just "r".

I had missed this one.

> This grep has not produced
> any use of "rm". What are you talking about?

TRY_LOAD_SEG(), pv_emul_read_descriptor() for both lar and lsl,
do_double_fault() for another lsl, lldt(), ltr().

So ok - not everything, but most.

>
>> If you want to work around what you perceive to be bugs in compilers
>> then submit an independent change yourself.
> I don't perceive this as a bug; perhaps a desirable feature. I also
> did start my response with "While I don't strictly mind the "m"
> part ..." - was this not careful enough to indicate I'm not going
> to insist on the change, but I'd prefer it to be made?

No, because a maintainer saying "I'd prefer this to be changed" is still
an instruction to the submitter to make the change.

But the request is inappropriate. "Last time I checked, the compiler
would" presumably means you've checked GCC and not Clang, and therefore
any conclusions about the behaviour are incomplete.

Unless there is a real concrete compiler bug to work around, "rm" is the
appropriate constraint to use, all other things being equal. If the
compiler is merely doing something dumb with the flexibility it has been
permitted, then fix the compiler and the problem will resolve itself the
proper way.

>
>>>> @@ -731,6 +744,11 @@ static void init_amd(struct cpuinfo_x86 *c)
>>>>  	else /* Implicily "== 0x10 || >= 0x12" by being 64bit. */
>>>>  		amd_init_lfence(c);
>>>>
>>>> +	/* Probe for NSCB on Zen2 CPUs when not virtualised */
>>>> +	if (!cpu_has_hypervisor && !cpu_has_nscb && c == &boot_cpu_data &&
>>>> +	    c->x86 == 0x17 && c->x86_model >= 30 && c->x86_model <= 0x5f)
>>> DYM 0x30 here?
>> 0x30, although it turns out that some of the mobile Zen2 CPUs exceed
>> 0x60 in terms of model number.
>>
>> As Zen3 changes the family number to 0x19, I'd just drop the upper bound.
> Minor note: Even if it didn't, the !cpu_has_nscb would also be enough
> to avoid the probing there.

There is actually a problem. From a non-AMD source, I've found the
Sucor Z+ CPU which is a Fam17h Model 0x50 Zen1.

So instead I'm going to recommend dropping all model checks and just
keeping the family checks. This will extend the probe function to Zen1
too, but it is once on boot, trivial in terms of complexity, and really
not worth the time/effort it has taken to discover that the model list
wasn't correct to start with.

~Andrew
On 07.09.2021 15:27, Andrew Cooper wrote:
> On 07/09/2021 07:09, Jan Beulich wrote:
>> On 06.09.2021 20:07, Andrew Cooper wrote:
>>> On 06/09/2021 16:17, Jan Beulich wrote:
>>>> On 06.09.2021 14:00, Jane Malalane wrote:
>>>>> --- a/xen/arch/x86/cpu/amd.c
>>>>> +++ b/xen/arch/x86/cpu/amd.c
>>>>> @@ -681,6 +681,19 @@ void amd_init_lfence(struct cpuinfo_x86 *c)
>>>>>  			  c->x86_capability);
>>>>>  }
>>>>>
>>>>> +void detect_zen2_null_seg_behaviour(void)
>>>> This can in principle be marked __init.
>>>>
>>>>> +{
>>>>> +	uint64_t base;
>>>>> +
>>>>> +	wrmsrl(MSR_FS_BASE, 1);
>>>>> +	asm volatile ( "mov %0, %%fs" :: "rm" (0) );
>>>> While I don't strictly mind the "m" part of the constraint to remain
>>>> there (in the hope for compilers actually to support this), iirc it's
>>>> not useful to have when the value is a constant: Last time I checked,
>>>> the compiler would not instantiate an anonymous (stack) variable to
>>>> fulfill this constraint (as can be seen when dropping the "r" part of
>>>> the constraint).
>>> This is "rm" because it is what we use elsewhere in Xen for selectors,
>>> and because it is the correct constraints based on the legal instruction
>>> encodings.
>> grep-ing for "%%[defgs]s" reveals:
>>
>> efi_arch_post_exit_boot(), svm_ctxt_switch_to(),
>
> These are writing multiple selectors in one go, and a register
> constraint is the only sane option.
>
>> and do_set_segment_base() all use just "r".
>
> I had missed this one.
>
>> This grep has not produced
>> any use of "rm". What are you talking about?
>
> TRY_LOAD_SEG(), pv_emul_read_descriptor() for both lar and lsl,
> do_double_fault() for another lsl, lldt(), ltr().

TRY_LOAD_SEG() and pv_emul_read_descriptor() don't pass constants as
asm() argument values. do_double_fault()'s use of lsl is indeed an
example matching the pattern here. lldt() and ltr() are generic inline
helpers, so validly allow for both because they should not make
assumptions on what the caller passes. Plus "m" there is okay, because
if the caller passes a constant there will be a named variable (the
function parameter), i.e. the compiler does not need to instantiate any
anonymous one.

> So ok - not everything, but most.
>
>>
>>> If you want to work around what you perceive to be bugs in compilers
>>> then submit an independent change yourself.
>> I don't perceive this as a bug; perhaps a desirable feature. I also
>> did start my response with "While I don't strictly mind the "m"
>> part ..." - was this not careful enough to indicate I'm not going
>> to insist on the change, but I'd prefer it to be made?
>
> No, because a maintainer saying "I'd prefer this to be changed" is still
> an instruction to the submitter to make the change.

It was a request to _consider_ dropping the m part, yes. But (see below)
now that you've forced me to re-check (I presume you didn't check
yourself, or else I would expect you would have drawn the same
conclusion as I did), I actually feel stronger about this needing
adjustment.

> But the request is inappropriate. "Last time I checked, the compiler
> would" presumably means you've checked GCC and not Clang, and therefore
> any conclusions about the behaviour are incomplete.

Not really, no. IIRC I did check the version of clang that I have easy
access to. (For gcc I've just now re-checked with 10.x and 11.x.)

> Unless there is a real concrete compiler bug to work around, "rm" is the
> appropriate constraint to use, all other things being equal. If the
> compiler is merely doing something dumb with the flexibility it has been
> permitted, then fix the compiler and the problem will resolve itself the
> proper way.

I disagree. When an asm() constraint permits multiple kinds of values,
dropping one or more of the alternatives should IMO still yield valid
(perhaps sub-optimal) code (IOW every one of the supplied kinds should
be valid). The issue here is that it is not spelled out clearly whether
something like "m" (0) is actually legal. The error messages, however,
suggest this is intended to be illegal: gcc says "memory input ... is
not directly addressable", whereas clang says "invalid lvalue in asm
input for constraint 'm'".

IOW I do think the one case of LSL in do_double_fault() does need
adjusting.

Jan
diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 2260eef3aa..654f82e2cb 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -681,6 +681,19 @@ void amd_init_lfence(struct cpuinfo_x86 *c)
 			  c->x86_capability);
 }
 
+void detect_zen2_null_seg_behaviour(void)
+{
+	uint64_t base;
+
+	wrmsrl(MSR_FS_BASE, 1);
+	asm volatile ( "mov %0, %%fs" :: "rm" (0) );
+	rdmsrl(MSR_FS_BASE, base);
+
+	if (base == 0)
+		setup_force_cpu_cap(X86_FEATURE_NSCB);
+
+}
+
 static void init_amd(struct cpuinfo_x86 *c)
 {
 	u32 l, h;
@@ -731,6 +744,11 @@ static void init_amd(struct cpuinfo_x86 *c)
 	else /* Implicily "== 0x10 || >= 0x12" by being 64bit. */
 		amd_init_lfence(c);
 
+	/* Probe for NSCB on Zen2 CPUs when not virtualised */
+	if (!cpu_has_hypervisor && !cpu_has_nscb && c == &boot_cpu_data &&
+	    c->x86 == 0x17 && c->x86_model >= 30 && c->x86_model <= 0x5f)
+		detect_zen2_null_seg_behaviour();
+
 	/*
 	 * If the user has explicitly chosen to disable Memory Disambiguation
 	 * to mitigiate Speculative Store Bypass, poke the appropriate MSR.
diff --git a/xen/arch/x86/cpu/cpu.h b/xen/arch/x86/cpu/cpu.h
index 1ac3b2867a..0dd1b762ff 100644
--- a/xen/arch/x86/cpu/cpu.h
+++ b/xen/arch/x86/cpu/cpu.h
@@ -21,3 +21,4 @@ extern bool detect_extended_topology(struct cpuinfo_x86 *c);
 void early_init_amd(struct cpuinfo_x86 *c);
 void amd_log_freq(const struct cpuinfo_x86 *c);
 void amd_init_lfence(struct cpuinfo_x86 *c);
+void detect_zen2_null_seg_behaviour(void);
diff --git a/xen/arch/x86/cpu/hygon.c b/xen/arch/x86/cpu/hygon.c
index 67e23c5df9..232edb0c4d 100644
--- a/xen/arch/x86/cpu/hygon.c
+++ b/xen/arch/x86/cpu/hygon.c
@@ -34,6 +34,11 @@ static void init_hygon(struct cpuinfo_x86 *c)
 
 	amd_init_lfence(c);
 
+	/* Probe for NSCB on Zen2 CPUs when not virtualised */
+	if (!cpu_has_hypervisor && !cpu_has_nscb && c == &boot_cpu_data &&
+	    c->x86 == 0x18 && c->x86_model >= 4)
+		detect_zen2_null_seg_behaviour();
+
 	/*
 	 * If the user has explicitly chosen to disable Memory Disambiguation
 	 * to mitigiate Speculative Store Bypass, poke the appropriate MSR.
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 5f6b83f71c..4faf9bff29 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -146,6 +146,7 @@
 #define cpu_has_cpuid_faulting boot_cpu_has(X86_FEATURE_CPUID_FAULTING)
 #define cpu_has_aperfmperf boot_cpu_has(X86_FEATURE_APERFMPERF)
 #define cpu_has_lfence_dispatch boot_cpu_has(X86_FEATURE_LFENCE_DISPATCH)
+#define cpu_has_nscb boot_cpu_has(X86_FEATURE_NSCB)
 #define cpu_has_xen_lbr boot_cpu_has(X86_FEATURE_XEN_LBR)
 #define cpu_has_xen_shstk boot_cpu_has(X86_FEATURE_XEN_SHSTK)
Zen2 CPUs actually have this behaviour (loading a NULL selector clears
the segment base), but the CPUID bit couldn't be introduced into Zen2
due to a lack of leaves. So, it was added in a new leaf in Zen3.
Nonetheless, hypervisors can synthesize the CPUID bit in software.

So, on Zen2 hardware, Xen probes for NSCB (NullSelectorClearsBase) and
synthesizes the bit.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jane Malalane <jane.malalane@citrix.com>
---
CC: Wei Liu <wl@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: "Roger Pau Monné" <roger.pau@citrix.com>
CC: Pu Wen <puwen@hygon.cn>
CC: Andy Lutomirski <luto@kernel.org>
---
 xen/arch/x86/cpu/amd.c           | 18 ++++++++++++++++++
 xen/arch/x86/cpu/cpu.h           |  1 +
 xen/arch/x86/cpu/hygon.c         |  5 +++++
 xen/include/asm-x86/cpufeature.h |  1 +
 4 files changed, 25 insertions(+)