Message ID | 20190425173545.sogxyptbaqvoofm6@linutronix.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] x86/fpu: Don't unconditionally add XFEATURE_MASK_FPSSE on sigentry | expand |
On 4/25/19 10:35 AM, Sebastian Andrzej Siewior wrote: > This commit reverts commit 04944b793e18e ("x86: xsave: set FP, SSE bits > in the xsave header in the user sigcontext"). The commit claims that it > is required for legacy applications but fails to explain why this is > needed and it is not obvious to me why the application would require the > FP/SSE state in the signal handler. Any software that understands XSAVE is OK. I think the legacy software would be that which groks 'fxregs_state, and FXSAVE/FXRSTOR but does not comprehend XSAVE/XRSTOR. *That* software might change fxregs_state in the signal frame, but the lack of XFEATURE_MASK_FPSSE in xfeatures would prevent XRSTOR from restoring it. That's just a guess, though. If we care, I think we should just use XSAVE instead of XSAVEOPT and trying to reconstruct the init state in software.
On 2019-04-25 14:13:05 [-0700], Dave Hansen wrote: > On 4/25/19 10:35 AM, Sebastian Andrzej Siewior wrote: > > This commit reverts commit 04944b793e18e ("x86: xsave: set FP, SSE bits > > in the xsave header in the user sigcontext"). The commit claims that it > > is required for legacy applications but fails to explain why this is > > needed and it is not obvious to me why the application would require the > > FP/SSE state in the signal handler. > > Any software that understands XSAVE is OK. I think the legacy software > would be that which groks 'fxregs_state, and FXSAVE/FXRSTOR but does not > comprehend XSAVE/XRSTOR. *That* software might change fxregs_state in > the signal frame, but the lack of XFEATURE_MASK_FPSSE in xfeatures would > prevent XRSTOR from restoring it. but it would edit its FP state before it has been used. > That's just a guess, though. > > If we care, I think we should just use XSAVE instead of XSAVEOPT and > trying to reconstruct the init state in software. We can't use XSAVE directly in the slowpath. We need to reconstruct the init state. We have the mxcsr quirk. We would need just to extend it and set the FP area to init state if the FP state is missing like we do in fpstate_sanitize_xstate(). Sebastian
On 4/26/19 12:26 AM, Sebastian Andrzej Siewior wrote: >> That's just a guess, though. >> >> If we care, I think we should just use XSAVE instead of XSAVEOPT and >> trying to reconstruct the init state in software. > We can't use XSAVE directly in the slowpath. We need to reconstruct the > init state. We have the mxcsr quirk. We would need just to extend it and > set the FP area to init state if the FP state is missing like we do in > fpstate_sanitize_xstate(). Can you remind me why we can't use XSAVE directly in the slow path?
On 2019-04-26 09:33:28 [-0700], Dave Hansen wrote: > On 4/26/19 12:26 AM, Sebastian Andrzej Siewior wrote: > >> That's just a guess, though. > >> > >> If we care, I think we should just use XSAVE instead of XSAVEOPT and > >> trying to reconstruct the init state in software. > > We can't use XSAVE directly in the slowpath. We need to reconstruct the > > init state. We have the mxcsr quirk. We would need just to extend it and > > set the FP area to init state if the FP state is missing like we do in > > fpstate_sanitize_xstate(). > > Can you remind me why we can't use XSAVE directly in the slow path? Where to? In the fastpath we XSAVE directly to task's stack. We are in the slowpath because this failed. Task's FPU-state is using compacted form. So we use this as source and copy_to_user() to task's stack. I don't think we can XSAVE to task's FPU-state because the compacted form may need less memory than the non-compacted form. Currently I'm leaning towards cleaning the FP area so we behave like XSAVE does. Independently of that, I would like to revert that commit. Based on the comment and patch description it does not say that it fixes a real problem. It *may* fix something. Sebastian
On 4/26/19 9:50 AM, Sebastian Andrzej Siewior wrote: > On 2019-04-26 09:33:28 [-0700], Dave Hansen wrote: >> On 4/26/19 12:26 AM, Sebastian Andrzej Siewior wrote: >>>> That's just a guess, though. >>>> >>>> If we care, I think we should just use XSAVE instead of XSAVEOPT and >>>> trying to reconstruct the init state in software. >>> We can't use XSAVE directly in the slowpath. We need to reconstruct the >>> init state. We have the mxcsr quirk. We would need just to extend it and >>> set the FP area to init state if the FP state is missing like we do in >>> fpstate_sanitize_xstate(). >> >> Can you remind me why we can't use XSAVE directly in the slow path? > > Where to? In the fastpath we XSAVE directly to task's stack. We are > in the slowpath because this failed. I'm looking at the code and having a bit of a hard time connecting it to what you're saying here. We are in copy_fpstate_to_sigframe(), right? Let's assume we are using_compacted_format(). If copy_fpregs_to_sigframe() fails to copy, we return immediately and never get to the save_xstate_epilog() code in question. So, to even get to save_xstate_epilog(), we had to do a *successful* copy_fpstate_to_sigframe() which, on XSAVE systems will use copy_xregs_to_user() which already uses plain XSAVE (not XSAVEOPT). save_xstate_epilog() goes and tries to set XFEATURE_MASK_FPSSE on this XSAVE (literally the XSAVE instruction) generated header. But, if we're dealing with header.xfeatures generated by an XSAVE with the RFBM=-1, I don't understand how header.xfeatures could ever *not* have XFEATURE_MASK_FPSSE set. The only way would be if we had gotten here after using FXSAVE, but I believe that's impossible since those systems have use_xsave()==0. IOW, I think the patch is right, but I'm not sure I totally agree with the reasoning.
On 2019-04-26 12:04:55 [-0700], Dave Hansen wrote: > I'm looking at the code and having a bit of a hard time connecting it to > what you're saying here. > > We are in copy_fpstate_to_sigframe(), right? Let's assume we are > using_compacted_format(). If copy_fpregs_to_sigframe() fails to copy, > we return immediately and never get to the save_xstate_epilog() code in > question. > > So, to even get to save_xstate_epilog(), we had to do a *successful* > copy_fpstate_to_sigframe() which, on XSAVE systems will use > copy_xregs_to_user() which already uses plain XSAVE (not XSAVEOPT). We are in: copy_fpstate_to_sigframe() | copy_fpregs_to_sigframe() fails. | using_compacted_format() | copy_xstate_to_user() | __copy_xstate_to_user() the xstate_header | for (i = 0; i < XFEATURE_MAX; i++) copy only XFEATURE_SSE + XFEATURE_YMM, other aren't set. | xfeatures_mxcsr_quirk() true, we copy | __copy_xstate_to_user() the xstate_fx_sw_bytes Now. The FP part of the xsave has never been touched which means the bytes are not initialized. The are is filled with random conten on user stack. The saved xfeatures say only (XFEATURE_SSE | XFEATURE_YMM) so a XRESTOR of that gives us exactly what we stored. > save_xstate_epilog() goes and tries to set XFEATURE_MASK_FPSSE on this > XSAVE (literally the XSAVE instruction) generated header. Correct. But we never saved the "data" for XFEATURE_MASK_FP and, as noted above, the content is random garbage on the stack. Be setting XFEATURE_MASK_FPSSE we claim that suddenly that the FP part of the XSAVE area is valid which is not the case. > But, if we're dealing with header.xfeatures generated by an XSAVE with > the RFBM=-1, I don't understand how header.xfeatures could ever *not* > have XFEATURE_MASK_FPSSE set. The only way would be if we had gotten > here after using FXSAVE, but I believe that's impossible since those > systems have use_xsave()==0. Look at the code path above. XSAVES stored the state in task's &fpu->state.xsave using XSAVES and we copied the *saved* bits so it looks like XSAVE. XFEATURE_MASK_FP was not set, so we skipped that area. > IOW, I think the patch is right, but I'm not sure I totally agree with > the reasoning. As I described in the path description: XSAVE (the fastpath, the code before I started touching it) would also not set XFEATURE_MASK_FP in xfeatures *but* would set FP area to its ini-state. Therefore the unconditional OR of XFEATURE_MASK_FPSSE does not hurt because the FP area was initialized to its initstate. The following restore would load a valid FP state. Sebastian
On 4/26/19 1:05 PM, Sebastian Andrzej Siewior wrote: > > copy_fpstate_to_sigframe() > | > copy_fpregs_to_sigframe() fails. > | > using_compacted_format() Aw, crud. I was looking at Linus's tree. Sorry about that. In Linus's tree, if copy_fpregs_to_sigframe() fails, we just return from copy_fpstate_to_sigframe() immediately. In other words, either XSAVE works, and we don't have this issue, or XSAVE fails and we don't do the fixup code in question. I'll say it again, though: I think we should be using the XSAVE instruction to save this state. If we need to map some pages in order to make XSAVE work, then I think we should bring some pages in and we can do *that* in the slow path. We can even do that without taking page faults by just calling get_user_pages() for instance. Or, the slow path could be to fpregs_unlock(), zero the user area (taking and handling page faults), then fpregs_lock() and retry the copy_fpregs_to_sigframe().
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c index 7026f1c4e5e30..6c66203b19f3c 100644 --- a/arch/x86/kernel/fpu/signal.c +++ b/arch/x86/kernel/fpu/signal.c @@ -81,7 +81,6 @@ static inline int save_xstate_epilog(void __user *buf, int ia32_frame) { struct xregs_state __user *x = buf; struct _fpx_sw_bytes *sw_bytes; - u32 xfeatures; int err; /* Setup the bytes not touched by the [f]xsave and reserved for SW. */ @@ -93,28 +92,6 @@ static inline int save_xstate_epilog(void __user *buf, int ia32_frame) err |= __put_user(FP_XSTATE_MAGIC2, (__u32 __user *)(buf + fpu_user_xstate_size)); - - /* - * Read the xfeatures which we copied (directly from the cpu or - * from the state in task struct) to the user buffers. - */ - err |= __get_user(xfeatures, (__u32 __user *)&x->header.xfeatures); - - /* - * For legacy compatible, we always set FP/SSE bits in the bit - * vector while saving the state to the user context. This will - * enable us capturing any changes(during sigreturn) to - * the FP/SSE bits by the legacy applications which don't touch - * xfeatures in the xsave header. - * - * xsave aware apps can change the xfeatures in the xsave - * header as well as change any contents in the memory layout. - * xrestore as part of sigreturn will capture all the changes. - */ - xfeatures |= XFEATURE_MASK_FPSSE; - - err |= __put_user(xfeatures, (__u32 __user *)&x->header.xfeatures); - return err; }
This commit reverts commit 04944b793e18e ("x86: xsave: set FP, SSE bits in the xsave header in the user sigcontext"). The commit claims that it is required for legacy applications but fails to explain why this is needed and it is not obvious to me why the application would require the FP/SSE state in the signal handler. In the compacted form, the XSAVES may save only XMM+SSE but skip FP. This is denoted by header->xfeatures = 6. The fastpath (copy_fpregs_to_sigframe()) does that but _also_ initialises the FP state (cwd to 0x37f, mxcsr as we do, remaining fields to 0). The slowpath (copy_xstate_to_user()) leaves most of the FP untouched. Only mxcsr and mxcsr_flags are set due to xfeatures_mxcsr_quirk(). Now that XFEATURE_MASK_FP is set unconditionally we load on return from signal random garbage as the FP state. This results in SIGFPE on one particular machine (the same testcase (starting a java application) on older generations seems to work). One way to fix this is not to unconditionally set XFEATURE_MASK_FPSSE since it was never saved. This avoids loading garbage on the return path. We could also initialise the FP state in the xfeatures_mxcsr_quirk() case like xsave does. Reported-by: Kurt Kanzenbach <kurt.kanzenbach@linutronix.de> Fixes: 69277c98f5eef ("x86/fpu: Always store the registers in copy_fpstate_to_sigframe()") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- arch/x86/kernel/fpu/signal.c | 23 ----------------------- 1 file changed, 23 deletions(-)