Message ID | 20211011223610.828296394@linutronix.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86/fpu: Preparatory cleanups for AMX support (part 1) | expand |
On Tue, Oct 12, 2021 at 02:00:11AM +0200, Thomas Gleixner wrote: > CLONE_THREAD does not have the guarantee of a true fork to inherit all > state. Especially the FPU state is meaningless for CLONE_THREAD. > > Just wipe out the minimal required state so restore on return to user space > let's the thread start with a clean FPU. This sentence reads weird, needs massaging.
On Tue, Oct 12 2021 at 18:10, Borislav Petkov wrote: > On Tue, Oct 12, 2021 at 02:00:11AM +0200, Thomas Gleixner wrote: >> CLONE_THREAD does not have the guarantee of a true fork to inherit all >> state. Especially the FPU state is meaningless for CLONE_THREAD. >> >> Just wipe out the minimal required state so restore on return to user space >> let's the thread start with a clean FPU. > > This sentence reads weird, needs massaging. The patch is wrong and needs to be removed. I just double checked pthread_create() again and it says: The new thread inherits the calling thread's floating-point environment (fenv(3)) No idea where I was looking at a few days ago. :( Thanks, tglx
On Tue, Oct 12 2021 at 20:52, Thomas Gleixner wrote: > On Tue, Oct 12 2021 at 18:10, Borislav Petkov wrote: > >> On Tue, Oct 12, 2021 at 02:00:11AM +0200, Thomas Gleixner wrote: >>> CLONE_THREAD does not have the guarantee of a true fork to inherit all >>> state. Especially the FPU state is meaningless for CLONE_THREAD. >>> >>> Just wipe out the minimal required state so restore on return to user space >>> let's the thread start with a clean FPU. >> >> This sentence reads weird, needs massaging. > > The patch is wrong and needs to be removed. I just double checked > pthread_create() again and it says: > > The new thread inherits the calling thread's floating-point environment > (fenv(3)) > > No idea where I was looking at a few days ago. :( But fenv(3) is not the FPU state. Duh! Anyway. It's an optimization which we can do later still and not required for the cleanups here. Thanks, tglx
--- a/arch/x86/include/asm/fpu/internal.h +++ b/arch/x86/include/asm/fpu/internal.h @@ -34,7 +34,7 @@ extern int fpu__exception_code(struct f extern void fpu_sync_fpstate(struct fpu *fpu); /* Clone and exit operations */ -extern int fpu_clone(struct task_struct *dst); +extern int fpu_clone(struct task_struct *dst, unsigned long clone_flags); extern void fpu_flush_thread(void); /* --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -257,7 +257,7 @@ void fpstate_init(union fpregs_state *st EXPORT_SYMBOL_GPL(fpstate_init); /* Clone current's FPU state on fork */ -int fpu_clone(struct task_struct *dst) +int fpu_clone(struct task_struct *dst, unsigned long clone_flags) { struct fpu *src_fpu = ¤t->thread.fpu; struct fpu *dst_fpu = &dst->thread.fpu; @@ -276,9 +276,11 @@ int fpu_clone(struct task_struct *dst) /* * No FPU state inheritance for kernel threads and IO - * worker threads. + * worker threads. Neither CLONE_THREAD needs a copy + * of the FPU state. */ - if (dst->flags & (PF_KTHREAD | PF_IO_WORKER)) { + if (clone_flags & CLONE_THREAD || + dst->flags & (PF_KTHREAD | PF_IO_WORKER)) { /* Clear out the minimal state */ memcpy(&dst_fpu->state, &init_fpstate, init_fpstate_copy_size()); --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -154,7 +154,7 @@ int copy_thread(unsigned long clone_flag frame->flags = X86_EFLAGS_FIXED; #endif - fpu_clone(p); + fpu_clone(p, clone_flags); /* Kernel thread ? */ if (unlikely(p->flags & PF_KTHREAD)) {
CLONE_THREAD does not have the guarantee of a true fork to inherit all state. Especially the FPU state is meaningless for CLONE_THREAD. Just wipe out the minimal required state so restore on return to user space let's the thread start with a clean FPU. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/include/asm/fpu/internal.h | 2 +- arch/x86/kernel/fpu/core.c | 8 +++++--- arch/x86/kernel/process.c | 2 +- 3 files changed, 7 insertions(+), 5 deletions(-)