Message ID | 20220315201706.7576-2-rick.p.edgecombe@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Regset cleanups | expand |
On Tue, Mar 15, 2022 at 01:17:04PM -0700, Rick Edgecombe wrote: > In ptrace, the x86_32_regsets and x86_64_regsets are constructed such that > there are no gaps in the arrays. This appears to be for two reasons. One, > the code in fill_thread_core_info() can't handle the gaps. This will be > addressed in a future patch. And two, not having gaps shrinks the size of > the array in memory. > > Both regset arrays draw their indices from a shared enum x86_regset, but 32 > bit and 64 bit don't all support the same regsets. In the case of > IA32_EMULATION they can be compiled in at the same time. So this enum has > to be laid out in a special way such that there are no gaps for both > x86_32_regsets and x86_64_regsets. This involves creating aliases for > enum’s that are only in one view or the other, or creating multiple > versions like in the case of REGSET_IOPERM32/REGSET_IOPERM64. > > Simplify the construction of these arrays by just fully separating out the > enums for 32 bit and 64 bit. Add some bitsize-free defines for > REGSET_GENERAL and REGSET_FP since they are the only two referred to in > bitsize generic code. > > This should have no functional change and is only changing how constants > are generated and named. The enum is local to this file, so it does not > introduce any burden on code calling from other places in the kernel now > having to worry about whether to use a 32 bit or 64 bit enum name. > > [1] https://lore.kernel.org/lkml/20180717162502.32274-1-yu-cheng.yu@intel.com/ > > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Have you verified there's no binary difference in machine code output? Reviewed-by: Kees Cook <keescook@chromium.org>
On Tue, 2022-03-15 at 13:41 -0700, Kees Cook wrote: > Have you verified there's no binary difference in machine code > output? There actually was a different in the binaries. I investigated a bit, and it seemed at least part of it was due to the line numbers changing the WARN_ON()s. But otherwise, I assumed some compiler optimization must have been bumped.
Rick Edgecombe <rick.p.edgecombe@intel.com> writes: > In ptrace, the x86_32_regsets and x86_64_regsets are constructed such that > there are no gaps in the arrays. This appears to be for two reasons. One, > the code in fill_thread_core_info() can't handle the gaps. This will be > addressed in a future patch. And two, not having gaps shrinks the size of > the array in memory. > > Both regset arrays draw their indices from a shared enum x86_regset, but 32 > bit and 64 bit don't all support the same regsets. In the case of > IA32_EMULATION they can be compiled in at the same time. So this enum has > to be laid out in a special way such that there are no gaps for both > x86_32_regsets and x86_64_regsets. This involves creating aliases for > enum’s that are only in one view or the other, or creating multiple > versions like in the case of REGSET_IOPERM32/REGSET_IOPERM64. > > Simplify the construction of these arrays by just fully separating out the > enums for 32 bit and 64 bit. Add some bitsize-free defines for > REGSET_GENERAL and REGSET_FP since they are the only two referred to in > bitsize generic code. > > This should have no functional change and is only changing how constants > are generated and named. The enum is local to this file, so it does not > introduce any burden on code calling from other places in the kernel now > having to worry about whether to use a 32 bit or 64 bit enum name. > > [1] https://lore.kernel.org/lkml/20180717162502.32274-1-yu-cheng.yu@intel.com/ > > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> > --- > arch/x86/kernel/ptrace.c | 60 ++++++++++++++++++++++++++-------------- > 1 file changed, 39 insertions(+), 21 deletions(-) > > diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c > index 8d2f2f995539..7a4988d13c43 100644 > --- a/arch/x86/kernel/ptrace.c > +++ b/arch/x86/kernel/ptrace.c > @@ -45,16 +45,34 @@ > > #include "tls.h" > > -enum x86_regset { > - REGSET_GENERAL, > - REGSET_FP, > - REGSET_XFP, > - REGSET_IOPERM64 = REGSET_XFP, > - REGSET_XSTATE, > - REGSET_TLS, > +enum x86_regset_32 { > + REGSET_GENERAL32, > + REGSET_FP32, > + REGSET_XFP32, > + REGSET_XSTATE32, > + REGSET_TLS32, > REGSET_IOPERM32, > }; > > +enum x86_regset_64 { > + REGSET_GENERAL64, > + REGSET_FP64, > + REGSET_IOPERM64, > + REGSET_XSTATE64, > +}; So I am looking at this and am wondering if the enums should be: enum x86_32_regset { REGSET32_GENERAL, REGSET32_FP, REGSET32_XFP, REGSET32_XSTATE, REGSET32_TLS, REGSET32_IOPERM32, }; enum x86_64_regset { REGSET64_GENERAL, REGSET64_FP, REGSET64_IOPERM64, REGSET64_XSTATE, }; That is named in such a way that it emphasizes that the difference is the architecture. Otherwise it reads like the difference is the size of the registers in the regset. I am pretty certain that in your REGSET_FP32 and REGSET_FP64 all of the registers are 80 bits long. Eric > + > +#define REGSET_GENERAL \ > +({ \ > + BUILD_BUG_ON((int)REGSET_GENERAL32 != (int)REGSET_GENERAL64); \ > + REGSET_GENERAL32; \ > +}) > + > +#define REGSET_FP \ > + BUILD_BUG_ON((int)REGSET_FP32 != (int)REGSET_FP64); \ > + REGSET_FP32; \ > +}) > + > struct pt_regs_offset { > const char *name; > int offset; > @@ -789,13 +807,13 @@ long arch_ptrace(struct task_struct *child, long request, > #ifdef CONFIG_X86_32 > case PTRACE_GETFPXREGS: /* Get the child extended FPU state. */ > return copy_regset_to_user(child, &user_x86_32_view, > - REGSET_XFP, > + REGSET_XFP32, > 0, sizeof(struct user_fxsr_struct), > datap) ? -EIO : 0; > > case PTRACE_SETFPXREGS: /* Set the child extended FPU state. */ > return copy_regset_from_user(child, &user_x86_32_view, > - REGSET_XFP, > + REGSET_XFP32, > 0, sizeof(struct user_fxsr_struct), > datap) ? -EIO : 0; > #endif > @@ -1087,13 +1105,13 @@ static long ia32_arch_ptrace(struct task_struct *child, compat_long_t request, > > case PTRACE_GETFPXREGS: /* Get the child extended FPU state. */ > return copy_regset_to_user(child, &user_x86_32_view, > - REGSET_XFP, 0, > + REGSET_XFP32, 0, > sizeof(struct user32_fxsr_struct), > datap); > > case PTRACE_SETFPXREGS: /* Set the child extended FPU state. */ > return copy_regset_from_user(child, &user_x86_32_view, > - REGSET_XFP, 0, > + REGSET_XFP32, 0, > sizeof(struct user32_fxsr_struct), > datap); > > @@ -1216,19 +1234,19 @@ long compat_arch_ptrace(struct task_struct *child, compat_long_t request, > #ifdef CONFIG_X86_64 > > static struct user_regset x86_64_regsets[] __ro_after_init = { > - [REGSET_GENERAL] = { > + [REGSET_GENERAL64] = { > .core_note_type = NT_PRSTATUS, > .n = sizeof(struct user_regs_struct) / sizeof(long), > .size = sizeof(long), .align = sizeof(long), > .regset_get = genregs_get, .set = genregs_set > }, > - [REGSET_FP] = { > + [REGSET_FP64] = { > .core_note_type = NT_PRFPREG, > .n = sizeof(struct fxregs_state) / sizeof(long), > .size = sizeof(long), .align = sizeof(long), > .active = regset_xregset_fpregs_active, .regset_get = xfpregs_get, .set = xfpregs_set > }, > - [REGSET_XSTATE] = { > + [REGSET_XSTATE64] = { > .core_note_type = NT_X86_XSTATE, > .size = sizeof(u64), .align = sizeof(u64), > .active = xstateregs_active, .regset_get = xstateregs_get, > @@ -1257,31 +1275,31 @@ static const struct user_regset_view user_x86_64_view = { > > #if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION > static struct user_regset x86_32_regsets[] __ro_after_init = { > - [REGSET_GENERAL] = { > + [REGSET_GENERAL32] = { > .core_note_type = NT_PRSTATUS, > .n = sizeof(struct user_regs_struct32) / sizeof(u32), > .size = sizeof(u32), .align = sizeof(u32), > .regset_get = genregs32_get, .set = genregs32_set > }, > - [REGSET_FP] = { > + [REGSET_FP32] = { > .core_note_type = NT_PRFPREG, > .n = sizeof(struct user_i387_ia32_struct) / sizeof(u32), > .size = sizeof(u32), .align = sizeof(u32), > .active = regset_fpregs_active, .regset_get = fpregs_get, .set = fpregs_set > }, > - [REGSET_XFP] = { > + [REGSET_XFP32] = { > .core_note_type = NT_PRXFPREG, > .n = sizeof(struct fxregs_state) / sizeof(u32), > .size = sizeof(u32), .align = sizeof(u32), > .active = regset_xregset_fpregs_active, .regset_get = xfpregs_get, .set = xfpregs_set > }, > - [REGSET_XSTATE] = { > + [REGSET_XSTATE32] = { > .core_note_type = NT_X86_XSTATE, > .size = sizeof(u64), .align = sizeof(u64), > .active = xstateregs_active, .regset_get = xstateregs_get, > .set = xstateregs_set > }, > - [REGSET_TLS] = { > + [REGSET_TLS32] = { > .core_note_type = NT_386_TLS, > .n = GDT_ENTRY_TLS_ENTRIES, .bias = GDT_ENTRY_TLS_MIN, > .size = sizeof(struct user_desc), > @@ -1312,10 +1330,10 @@ u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS]; > void __init update_regset_xstate_info(unsigned int size, u64 xstate_mask) > { > #ifdef CONFIG_X86_64 > - x86_64_regsets[REGSET_XSTATE].n = size / sizeof(u64); > + x86_64_regsets[REGSET_XSTATE64].n = size / sizeof(u64); > #endif > #if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION > - x86_32_regsets[REGSET_XSTATE].n = size / sizeof(u64); > + x86_32_regsets[REGSET_XSTATE32].n = size / sizeof(u64); > #endif > xstate_fx_sw_bytes[USER_XSTATE_XCR0_WORD] = xstate_mask; > }
On Tue, 2022-03-15 at 18:01 -0500, Eric W. Biederman wrote: > So I am looking at this and am wondering if the enums should be: > > enum x86_32_regset { > REGSET32_GENERAL, > REGSET32_FP, > REGSET32_XFP, > REGSET32_XSTATE, > REGSET32_TLS, > REGSET32_IOPERM32, > }; > > enum x86_64_regset { > REGSET64_GENERAL, > REGSET64_FP, > REGSET64_IOPERM64, > REGSET64_XSTATE, > }; > > > That is named in such a way that it emphasizes that the difference is > the architecture. Otherwise it reads like the difference is the size > of > the registers in the regset. I am pretty certain that in your > REGSET_FP32 and REGSET_FP64 all of the registers are 80 bits long. Yes, that makes sense. I had just copied the format of REGSET_IOPERM32/REGSET_IOPERM64, but I'll change it like you suggest here. Thanks, Rick
On Tue, Mar 15, 2022 at 09:53:13PM +0000, Edgecombe, Rick P wrote: > On Tue, 2022-03-15 at 13:41 -0700, Kees Cook wrote: > > Have you verified there's no binary difference in machine code > > output? > > There actually was a different in the binaries. I investigated a bit, > and it seemed at least part of it was due to the line numbers changing > the WARN_ON()s. But otherwise, I assumed some compiler optimization > must have been bumped. Right, you can ignore all the debugging line number changes. "diffoscope" should help see the difference by section. As long as the actual object code isn't changing, you should be good.
On Tue, 2022-03-15 at 19:48 -0700, Kees Cook wrote: > On Tue, Mar 15, 2022 at 09:53:13PM +0000, Edgecombe, Rick P wrote: > > On Tue, 2022-03-15 at 13:41 -0700, Kees Cook wrote: > > > Have you verified there's no binary difference in machine code > > > output? > > > > There actually was a different in the binaries. I investigated a > > bit, > > and it seemed at least part of it was due to the line numbers > > changing > > the WARN_ON()s. But otherwise, I assumed some compiler optimization > > must have been bumped. > > Right, you can ignore all the debugging line number changes. > "diffoscope" should help see the difference by section. As long as > the > actual object code isn't changing, you should be good. What I did originally was objdump -D ptrace.o and diff that. Then I slowly reduced changes to see what was generating the difference. When I maintained the line numbers from the original version, and simply converted the enum to defines, it still generated slightly different code in places that didn't seem to connected to the changes. So I figured the compiler was doing something, and relied on checking that the actual constants didn't change in value. This morning I tried again to figure out what was causing the difference. If I strip debug symbols, remove the BUILD_BUG_ON()s and reformat the enums such that the line numbers are the same below the enums then the objdump output is identical. I think what is happening in this debug stripped test, is that in the call's to put_user(), it calls might_fault(), which has a __LINE__. But even adding a comment to the base file has surprisingly wide effects. It caused the __bug_table section table to get code generated with different instructions, not just line numbers constants changing. So I think there should be no functional change, but the binaries are not identical.
On Wed, 2022-03-16 at 12:06 -0700, Edgecombe, Richard P wrote: > But even adding a comment to the base file has surprisingly wide > effects. It caused the __bug_table section table to get code > generated > with different instructions, not just line numbers constants > changing. Err... I suppose this is probably because they are not actually instructions and so should be expected, duh.
On Wed, Mar 16, 2022 at 07:06:48PM +0000, Edgecombe, Rick P wrote: > On Tue, 2022-03-15 at 19:48 -0700, Kees Cook wrote: > > On Tue, Mar 15, 2022 at 09:53:13PM +0000, Edgecombe, Rick P wrote: > > > On Tue, 2022-03-15 at 13:41 -0700, Kees Cook wrote: > > > > Have you verified there's no binary difference in machine code > > > > output? > > > > > > There actually was a different in the binaries. I investigated a > > > bit, > > > and it seemed at least part of it was due to the line numbers > > > changing > > > the WARN_ON()s. But otherwise, I assumed some compiler optimization > > > must have been bumped. > > > > Right, you can ignore all the debugging line number changes. > > "diffoscope" should help see the difference by section. As long as > > the > > actual object code isn't changing, you should be good. > > What I did originally was objdump -D ptrace.o and diff that. Then I > slowly reduced changes to see what was generating the difference. When > I maintained the line numbers from the original version, and simply > converted the enum to defines, it still generated slightly different > code in places that didn't seem to connected to the changes. So I > figured the compiler was doing something, and relied on checking that > the actual constants didn't change in value. > > This morning I tried again to figure out what was causing the > difference. If I strip debug symbols, remove the BUILD_BUG_ON()s and > reformat the enums such that the line numbers are the same below the > enums then the objdump output is identical. > > I think what is happening in this debug stripped test, is that in the > call's to put_user(), it calls might_fault(), which has a __LINE__. > > But even adding a comment to the base file has surprisingly wide > effects. It caused the __bug_table section table to get code generated > with different instructions, not just line numbers constants changing. > > So I think there should be no functional change, but the binaries are > not identical. Right, that's fine: the instructions should be the same, just with various different offsets.
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c index 8d2f2f995539..7a4988d13c43 100644 --- a/arch/x86/kernel/ptrace.c +++ b/arch/x86/kernel/ptrace.c @@ -45,16 +45,34 @@ #include "tls.h" -enum x86_regset { - REGSET_GENERAL, - REGSET_FP, - REGSET_XFP, - REGSET_IOPERM64 = REGSET_XFP, - REGSET_XSTATE, - REGSET_TLS, +enum x86_regset_32 { + REGSET_GENERAL32, + REGSET_FP32, + REGSET_XFP32, + REGSET_XSTATE32, + REGSET_TLS32, REGSET_IOPERM32, }; +enum x86_regset_64 { + REGSET_GENERAL64, + REGSET_FP64, + REGSET_IOPERM64, + REGSET_XSTATE64, +}; + +#define REGSET_GENERAL \ +({ \ + BUILD_BUG_ON((int)REGSET_GENERAL32 != (int)REGSET_GENERAL64); \ + REGSET_GENERAL32; \ +}) + +#define REGSET_FP \ +({ \ + BUILD_BUG_ON((int)REGSET_FP32 != (int)REGSET_FP64); \ + REGSET_FP32; \ +}) + struct pt_regs_offset { const char *name; int offset; @@ -789,13 +807,13 @@ long arch_ptrace(struct task_struct *child, long request, #ifdef CONFIG_X86_32 case PTRACE_GETFPXREGS: /* Get the child extended FPU state. */ return copy_regset_to_user(child, &user_x86_32_view, - REGSET_XFP, + REGSET_XFP32, 0, sizeof(struct user_fxsr_struct), datap) ? -EIO : 0; case PTRACE_SETFPXREGS: /* Set the child extended FPU state. */ return copy_regset_from_user(child, &user_x86_32_view, - REGSET_XFP, + REGSET_XFP32, 0, sizeof(struct user_fxsr_struct), datap) ? -EIO : 0; #endif @@ -1087,13 +1105,13 @@ static long ia32_arch_ptrace(struct task_struct *child, compat_long_t request, case PTRACE_GETFPXREGS: /* Get the child extended FPU state. */ return copy_regset_to_user(child, &user_x86_32_view, - REGSET_XFP, 0, + REGSET_XFP32, 0, sizeof(struct user32_fxsr_struct), datap); case PTRACE_SETFPXREGS: /* Set the child extended FPU state. */ return copy_regset_from_user(child, &user_x86_32_view, - REGSET_XFP, 0, + REGSET_XFP32, 0, sizeof(struct user32_fxsr_struct), datap); @@ -1216,19 +1234,19 @@ long compat_arch_ptrace(struct task_struct *child, compat_long_t request, #ifdef CONFIG_X86_64 static struct user_regset x86_64_regsets[] __ro_after_init = { - [REGSET_GENERAL] = { + [REGSET_GENERAL64] = { .core_note_type = NT_PRSTATUS, .n = sizeof(struct user_regs_struct) / sizeof(long), .size = sizeof(long), .align = sizeof(long), .regset_get = genregs_get, .set = genregs_set }, - [REGSET_FP] = { + [REGSET_FP64] = { .core_note_type = NT_PRFPREG, .n = sizeof(struct fxregs_state) / sizeof(long), .size = sizeof(long), .align = sizeof(long), .active = regset_xregset_fpregs_active, .regset_get = xfpregs_get, .set = xfpregs_set }, - [REGSET_XSTATE] = { + [REGSET_XSTATE64] = { .core_note_type = NT_X86_XSTATE, .size = sizeof(u64), .align = sizeof(u64), .active = xstateregs_active, .regset_get = xstateregs_get, @@ -1257,31 +1275,31 @@ static const struct user_regset_view user_x86_64_view = { #if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION static struct user_regset x86_32_regsets[] __ro_after_init = { - [REGSET_GENERAL] = { + [REGSET_GENERAL32] = { .core_note_type = NT_PRSTATUS, .n = sizeof(struct user_regs_struct32) / sizeof(u32), .size = sizeof(u32), .align = sizeof(u32), .regset_get = genregs32_get, .set = genregs32_set }, - [REGSET_FP] = { + [REGSET_FP32] = { .core_note_type = NT_PRFPREG, .n = sizeof(struct user_i387_ia32_struct) / sizeof(u32), .size = sizeof(u32), .align = sizeof(u32), .active = regset_fpregs_active, .regset_get = fpregs_get, .set = fpregs_set }, - [REGSET_XFP] = { + [REGSET_XFP32] = { .core_note_type = NT_PRXFPREG, .n = sizeof(struct fxregs_state) / sizeof(u32), .size = sizeof(u32), .align = sizeof(u32), .active = regset_xregset_fpregs_active, .regset_get = xfpregs_get, .set = xfpregs_set }, - [REGSET_XSTATE] = { + [REGSET_XSTATE32] = { .core_note_type = NT_X86_XSTATE, .size = sizeof(u64), .align = sizeof(u64), .active = xstateregs_active, .regset_get = xstateregs_get, .set = xstateregs_set }, - [REGSET_TLS] = { + [REGSET_TLS32] = { .core_note_type = NT_386_TLS, .n = GDT_ENTRY_TLS_ENTRIES, .bias = GDT_ENTRY_TLS_MIN, .size = sizeof(struct user_desc), @@ -1312,10 +1330,10 @@ u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS]; void __init update_regset_xstate_info(unsigned int size, u64 xstate_mask) { #ifdef CONFIG_X86_64 - x86_64_regsets[REGSET_XSTATE].n = size / sizeof(u64); + x86_64_regsets[REGSET_XSTATE64].n = size / sizeof(u64); #endif #if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION - x86_32_regsets[REGSET_XSTATE].n = size / sizeof(u64); + x86_32_regsets[REGSET_XSTATE32].n = size / sizeof(u64); #endif xstate_fx_sw_bytes[USER_XSTATE_XCR0_WORD] = xstate_mask; }
In ptrace, the x86_32_regsets and x86_64_regsets are constructed such that there are no gaps in the arrays. This appears to be for two reasons. One, the code in fill_thread_core_info() can't handle the gaps. This will be addressed in a future patch. And two, not having gaps shrinks the size of the array in memory. Both regset arrays draw their indices from a shared enum x86_regset, but 32 bit and 64 bit don't all support the same regsets. In the case of IA32_EMULATION they can be compiled in at the same time. So this enum has to be laid out in a special way such that there are no gaps for both x86_32_regsets and x86_64_regsets. This involves creating aliases for enum’s that are only in one view or the other, or creating multiple versions like in the case of REGSET_IOPERM32/REGSET_IOPERM64. Simplify the construction of these arrays by just fully separating out the enums for 32 bit and 64 bit. Add some bitsize-free defines for REGSET_GENERAL and REGSET_FP since they are the only two referred to in bitsize generic code. This should have no functional change and is only changing how constants are generated and named. The enum is local to this file, so it does not introduce any burden on code calling from other places in the kernel now having to worry about whether to use a 32 bit or 64 bit enum name. [1] https://lore.kernel.org/lkml/20180717162502.32274-1-yu-cheng.yu@intel.com/ Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> --- arch/x86/kernel/ptrace.c | 60 ++++++++++++++++++++++++++-------------- 1 file changed, 39 insertions(+), 21 deletions(-)