Message ID | 20231003062458.23552-2-xin3.li@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86: enable FRED for x86-64 | expand |
On Mon, Oct 02, 2023 at 11:24:22PM -0700, Xin Li wrote: > Subject: Re: [PATCH v12 01/37] x86/cpufeatures: Add the cpu feature bit for WRMSRNS ^^^^ For all your text: s/cpu/CPU/g > WRMSRNS is an instruction that behaves exactly like WRMSR, with > the only difference being that it is not a serializing instruction > by default. Under certain conditions, WRMSRNS may replace WRMSR to > improve performance. > > Add the CPU feature bit for WRMSRNS. > > Tested-by: Shan Kang <shan.kang@intel.com> > Signed-off-by: Xin Li <xin3.li@intel.com> > --- > arch/x86/include/asm/cpufeatures.h | 1 + > tools/arch/x86/include/asm/cpufeatures.h | 1 + > 2 files changed, 2 insertions(+) It looks to me like you can merge the first three patches into one as all they do is add that insn support. Then, further down in the patchset, it says: + if (cpu_feature_enabled(X86_FEATURE_FRED)) { + /* WRMSRNS is a baseline feature for FRED. */ but WRMSRNS is not mentioned in the FRED spec "Document Number: 346446-005US, Revision: 5.0" which, according to https://www.intel.com/content/www/us/en/content-details/780121/flexible-return-and-event-delivery-fred-specification.html is the latest. Am I looking at the wrong one? > diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h > index 58cb9495e40f..330876d34b68 100644 > --- a/arch/x86/include/asm/cpufeatures.h > +++ b/arch/x86/include/asm/cpufeatures.h > @@ -322,6 +322,7 @@ > #define X86_FEATURE_FSRS (12*32+11) /* "" Fast short REP STOSB */ > #define X86_FEATURE_FSRC (12*32+12) /* "" Fast short REP {CMPSB,SCASB} */ > #define X86_FEATURE_LKGS (12*32+18) /* "" Load "kernel" (userspace) GS */ > +#define X86_FEATURE_WRMSRNS (12*32+19) /* "" Non-Serializing Write to Model Specific Register instruction */ /* "" Non-serializing WRMSR */ is more than enough. And now I'm wondering: when you're adding a separate CPUID bit, then the above should be + if (cpu_feature_enabled(X86_FEATURE_WRMSRNS)) { + /* WRMSRNS is a baseline feature for FRED. */ I see that you're adding a dependency: + { X86_FEATURE_FRED, X86_FEATURE_WRMSRNS }, which then means you don't need the X86_FEATURE_WRMSRNS definition at all and can use X86_FEATURE_FRED only. So, what's up? Thx.
> Then, further down in the patchset, it says: > > + if (cpu_feature_enabled(X86_FEATURE_FRED)) { > + /* WRMSRNS is a baseline feature for FRED. */ > > but WRMSRNS is not mentioned in the FRED spec "Document Number: > 346446-005US, Revision: 5.0" which, according to > > https://www.intel.com/content/www/us/en/content-details/780121/flexible- > return-and-event-delivery-fred-specification.html > > is the latest. > > Am I looking at the wrong one? No. tglx asked for it: https://lkml.kernel.org/kvm/87y1h81ht4.ffs@tglx/ > > And now I'm wondering: when you're adding a separate CPUID bit, then the > above should be > > + if (cpu_feature_enabled(X86_FEATURE_WRMSRNS)) { > + /* WRMSRNS is a baseline feature for FRED. */ Because we are doing wrmsrns(MSR_IA32_FRED_RSP0, ...) here, and X86_FEATURE_WRMSRNS doesn't guarantee MSR_IA32_FRED_RSP0 exists. Or I missed something? > > I see that you're adding a dependency: > > + { X86_FEATURE_FRED, X86_FEATURE_WRMSRNS }, > > which then means you don't need the X86_FEATURE_WRMSRNS definition at all > and can use X86_FEATURE_FRED only. > > So, what's up? FRED just gets the honor to introduce WRMSRNS and its first usage: https://lkml.kernel.org/kvm/b05e3092-8ba3-f4e1-b5a3-2125944936fd@zytor.com/ Another patch set should replace WRMSR with WRMSRNS, with SERIALIZE added when needed. Sorry for the late response, it was a long weekend in the US. Thanks! Xin
On Tue, Nov 14, 2023 at 12:43:38AM +0000, Li, Xin3 wrote: > No. tglx asked for it: > https://lkml.kernel.org/kvm/87y1h81ht4.ffs@tglx/ Aha "According to the CPU folks FRED systems are guaranteed to have WRMSRNS - I asked for that :). It's just not yet documented." so I'm going to expect that to appear in the next FRED spec revision... > Because we are doing > wrmsrns(MSR_IA32_FRED_RSP0, ...) > here, and X86_FEATURE_WRMSRNS doesn't guarantee MSR_IA32_FRED_RSP0 exists. > > Or I missed something? Well, according to what I'm hearing and reading so far: FRED means WRMSRNS FRED means MSR_IA32_FRED_RSP0 and if you had to be precise, the code should do: if (cpu_feature_enabled(X86_FEATURE_FRED)) { if (cpu_feature_enabled(X86_FEATURE_WRMSRNS)) wrmsrns(MSR_IA32_FRED_RSP0, (unsigned long)task_stack_page(task) + THREAD_SIZE); else wrmsr(MSR_IA32_FRED_RSP0, (unsigned long)task_stack_page(task) + THREAD_SIZE); } but apparently FRED implies WRMSRNS - not documented anywhere currently - so you can save yourself one check. But your version checks FRED if it can do WRMSRNS while there's a separate WRMSRNS flag and that made me wonder... > Another patch set should replace WRMSR with WRMSRNS, with SERIALIZE added > when needed. I sense someone wants to optimize MSR writes ... :-) Thx.
> and if you had to be precise, the code should do: > > if (cpu_feature_enabled(X86_FEATURE_FRED)) { > if (cpu_feature_enabled(X86_FEATURE_WRMSRNS)) > wrmsrns(MSR_IA32_FRED_RSP0, (unsigned > long)task_stack_page(task) + THREAD_SIZE); > else > wrmsr(MSR_IA32_FRED_RSP0, (unsigned > long)task_stack_page(task) + THREAD_SIZE); > } This is exactly what tglx wanted to avoid. And I love the idea "baseline", especially we have a ton of CPU features. > > > Another patch set should replace WRMSR with WRMSRNS, with SERIALIZE > > added when needed. > > I sense someone wants to optimize MSR writes ... :-) :-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 58cb9495e40f..330876d34b68 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -322,6 +322,7 @@ #define X86_FEATURE_FSRS (12*32+11) /* "" Fast short REP STOSB */ #define X86_FEATURE_FSRC (12*32+12) /* "" Fast short REP {CMPSB,SCASB} */ #define X86_FEATURE_LKGS (12*32+18) /* "" Load "kernel" (userspace) GS */ +#define X86_FEATURE_WRMSRNS (12*32+19) /* "" Non-Serializing Write to Model Specific Register instruction */ #define X86_FEATURE_AMX_FP16 (12*32+21) /* "" AMX fp16 Support */ #define X86_FEATURE_AVX_IFMA (12*32+23) /* "" Support for VPMADD52[H,L]UQ */ #define X86_FEATURE_LAM (12*32+26) /* Linear Address Masking */ diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h index 798e60b5454b..1b9d86ba5bc2 100644 --- a/tools/arch/x86/include/asm/cpufeatures.h +++ b/tools/arch/x86/include/asm/cpufeatures.h @@ -318,6 +318,7 @@ #define X86_FEATURE_FSRS (12*32+11) /* "" Fast short REP STOSB */ #define X86_FEATURE_FSRC (12*32+12) /* "" Fast short REP {CMPSB,SCASB} */ #define X86_FEATURE_LKGS (12*32+18) /* "" Load "kernel" (userspace) GS */ +#define X86_FEATURE_WRMSRNS (12*32+19) /* "" Non-Serializing Write to Model Specific Register instruction */ #define X86_FEATURE_AMX_FP16 (12*32+21) /* "" AMX fp16 Support */ #define X86_FEATURE_AVX_IFMA (12*32+23) /* "" Support for VPMADD52[H,L]UQ */ #define X86_FEATURE_LAM (12*32+26) /* Linear Address Masking */