Message ID | 20200328000422.98978-1-Jason@zx2c4.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/i915: check to see if the FPU is available before using it | expand |
On Sat, Mar 28, 2020 at 1:59 AM Chris Wilson <chris@chris-wilson.co.uk> wrote: > > Quoting Jason A. Donenfeld (2020-03-28 00:04:22) > > It's not safe to just grab the FPU willy nilly without first checking to > > see if it's available. This patch adds the usual call to may_use_simd() > > and falls back to boring memcpy if it's not available. > > These instructions do not use the fpu, nor are these registers aliased > over the fpu stack. This description and the may_use_simd() do not > look like they express the right granularity as to which simd state are > included Most of the time when discussing vector instructions in the kernel with x86, "FPU" is used to denote the whole shebang, because of similar XSAVE semantics and requirements with actual floating point instructions and SIMD instructions. So when I say "grab the FPU", I'm really referring to the act of messing with any registers that aren't saved and restored by default during context switch and need the explicit marking for XSAVE to come in -- the kernel_fpu_begin/end calls that you already have. With regards to the granularity here, you are in fact touching xmm registers. That means you need kernel_fpu_begin/end, which you correctly have. However, it also means that before using those, you should check to see if that's okay by using the may_use_simd() instruction. Now you may claim that at the moment may_use_simd()-->irq_fpu_usable()-->(!in_interrupt() || interrupted_user_mode() || interrupted_kernel_fpu_idle()) always holds true, and you're a keen follower of the (recently changed) kernel fpu x86 semantics in case those conditions change, and that your driver is so strictly written that you know exactly the context this fancy memcpy will run in, always, and you'll never deviate from it, and therefore it's okay to depart from the rules and omit the check and safe fallback code. But c'mon - i915 is complex, and mixed context bugs abound, and the rules for using those registers might in fact change without you noticing. So why not apply this to have a safe fallback for when the numerous assumptions no longer hold? (If you're extra worried I suppose you could make it a `if (WARN_ON(!may_use_simd()))` instead or something, but that seems like a bit much.) > Look at caller, return the error and let them decide if they can avoid > the read from WC, which quite often they can. And no, this is not done > from interrupt context, we would be crucified if we did. Ahh, now, reading this comment here I realize maybe I've misunderstood you. Do you mean to say that checking for may_use_simd() is a thing that you'd like to do after all, but you don't like it falling back to slow memcpy. Instead, you'd like for the code to return an error value, and then caller can just optionally skip the memcpy under some complicated driver circumstances? Jason
diff --git a/drivers/gpu/drm/i915/i915_memcpy.c b/drivers/gpu/drm/i915/i915_memcpy.c index fdd550405fd3..7c0e022586bc 100644 --- a/drivers/gpu/drm/i915/i915_memcpy.c +++ b/drivers/gpu/drm/i915/i915_memcpy.c @@ -24,6 +24,7 @@ #include <linux/kernel.h> #include <asm/fpu/api.h> +#include <asm/simd.h> #include "i915_memcpy.h" @@ -38,6 +39,12 @@ static DEFINE_STATIC_KEY_FALSE(has_movntdqa); #ifdef CONFIG_AS_MOVNTDQA static void __memcpy_ntdqa(void *dst, const void *src, unsigned long len) { + if (unlikely(!may_use_simd())) { + memcpy(dst, src, len); + return; + } + + kernel_fpu_begin(); while (len >= 4) { @@ -67,6 +74,11 @@ static void __memcpy_ntdqa(void *dst, const void *src, unsigned long len) static void __memcpy_ntdqu(void *dst, const void *src, unsigned long len) { + if (unlikely(!may_use_simd())) { + memcpy(dst, src, len); + return; + } + kernel_fpu_begin(); while (len >= 4) {
It's not safe to just grab the FPU willy nilly without first checking to see if it's available. This patch adds the usual call to may_use_simd() and falls back to boring memcpy if it's not available. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> --- drivers/gpu/drm/i915/i915_memcpy.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)