Message ID | 20230413184219.36404-7-minipli@grsecurity.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86: cleanups, fixes and new tests | expand |
On Thu, Apr 13, 2023, Mathias Krause wrote: > Use an array type to refer to the code label 'ret_to_kernel'. Why? Taking the address of a label when setting what is effectively the target of a branch seems more intuitive than pointing at an array (that's not an array). > No functional change. > > Signed-off-by: Mathias Krause <minipli@grsecurity.net> > --- > lib/x86/usermode.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/lib/x86/usermode.c b/lib/x86/usermode.c > index e22fb8f0132b..b976123ca753 100644 > --- a/lib/x86/usermode.c > +++ b/lib/x86/usermode.c > @@ -35,12 +35,12 @@ uint64_t run_in_user(usermode_func func, unsigned int fault_vector, > uint64_t arg1, uint64_t arg2, uint64_t arg3, > uint64_t arg4, bool *raised_vector) > { > - extern char ret_to_kernel; > + extern char ret_to_kernel[]; > uint64_t rax = 0; > static unsigned char user_stack[USERMODE_STACK_SIZE]; > > *raised_vector = 0; > - set_idt_entry(RET_TO_KERNEL_IRQ, &ret_to_kernel, 3); > + set_idt_entry(RET_TO_KERNEL_IRQ, ret_to_kernel, 3); > handle_exception(fault_vector, > restore_exec_to_jmpbuf_exception_handler); > > -- > 2.39.2 >
On 13.06.23 02:32, Sean Christopherson wrote: > On Thu, Apr 13, 2023, Mathias Krause wrote: >> Use an array type to refer to the code label 'ret_to_kernel'. > > Why? Taking the address of a label when setting what is effectively the target > of a branch seems more intuitive than pointing at an array (that's not an array). Well, the flexible array notation is what my understanding of referring to a "label" defined in ASM is. I'm probably biased, as that's a pattern used a lot in the Linux kernel but trying to look at the individual semantics may make it clearer why I prefer 'extern char sym[]' over 'extern char sym'. The current code refers to the code sequence starting at 'ret_to_kernel' by creating an alias of it's first instruction byte. However, we're not interested in the first instruction byte at all. We want a symbolic handle of the begin of that sequence, which might be an unknown number of bytes. But that's also a detail that doesn't matter. We only what to know the start. By referring to it as 'extern char' implies that there is at least a single byte. (Let's ignore the hair splitting about just taking the address vs. actually dereferencing it (which we do not).) By looking at another code example, that byte may not actually be there! From x86/vmx_tests.c: extern char vmx_mtf_pdpte_guest_begin; extern char vmx_mtf_pdpte_guest_end; asm("vmx_mtf_pdpte_guest_begin:\n\t" "mov %cr0, %rax\n\t" /* save CR0 with PG=1 */ "vmcall\n\t" /* on return from this CR0.PG=0 */ "mov %rax, %cr0\n\t" /* restore CR0.PG=1 to enter PAE mode */ "vmcall\n\t" "retq\n\t" "vmx_mtf_pdpte_guest_end:"); The byte referred to via &vmx_mtf_pdpte_guest_end may not even be mapped due to -ftoplevel-reorder possibly putting that asm block at the very end of the compilation unit. By using 'extern char []' instead this nastiness is avoided by referring to an unknown sized byte sequence starting at that symbol (with zero being a totally valid length). We don't need to know how many bytes follow the label. All we want to know is its address. And that's what an array type provides easily. But I can understand, that YMMV and you prefer it the other way around. > >> No functional change. >> >> Signed-off-by: Mathias Krause <minipli@grsecurity.net> >> --- >> lib/x86/usermode.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/lib/x86/usermode.c b/lib/x86/usermode.c >> index e22fb8f0132b..b976123ca753 100644 >> --- a/lib/x86/usermode.c >> +++ b/lib/x86/usermode.c >> @@ -35,12 +35,12 @@ uint64_t run_in_user(usermode_func func, unsigned int fault_vector, >> uint64_t arg1, uint64_t arg2, uint64_t arg3, >> uint64_t arg4, bool *raised_vector) >> { >> - extern char ret_to_kernel; >> + extern char ret_to_kernel[]; >> uint64_t rax = 0; >> static unsigned char user_stack[USERMODE_STACK_SIZE]; >> >> *raised_vector = 0; >> - set_idt_entry(RET_TO_KERNEL_IRQ, &ret_to_kernel, 3); >> + set_idt_entry(RET_TO_KERNEL_IRQ, ret_to_kernel, 3); >> handle_exception(fault_vector, >> restore_exec_to_jmpbuf_exception_handler); >> >> -- >> 2.39.2 >>
On Wed, Jun 14, 2023, Mathias Krause wrote: > On 13.06.23 02:32, Sean Christopherson wrote: > > On Thu, Apr 13, 2023, Mathias Krause wrote: > >> Use an array type to refer to the code label 'ret_to_kernel'. > > > > Why? Taking the address of a label when setting what is effectively the target > > of a branch seems more intuitive than pointing at an array (that's not an array). > > Well, the flexible array notation is what my understanding of referring > to a "label" defined in ASM is. I'm probably biased, as that's a pattern > used a lot in the Linux kernel but trying to look at the individual > semantics may make it clearer why I prefer 'extern char sym[]' over > 'extern char sym'. > > The current code refers to the code sequence starting at 'ret_to_kernel' > by creating an alias of it's first instruction byte. However, we're not > interested in the first instruction byte at all. We want a symbolic > handle of the begin of that sequence, which might be an unknown number > of bytes. But that's also a detail that doesn't matter. We only what to > know the start. By referring to it as 'extern char' implies that there > is at least a single byte. (Let's ignore the hair splitting about just > taking the address vs. actually dereferencing it (which we do not).) By > looking at another code example, that byte may not actually be there! > >From x86/vmx_tests.c: > > extern char vmx_mtf_pdpte_guest_begin; > extern char vmx_mtf_pdpte_guest_end; > > asm("vmx_mtf_pdpte_guest_begin:\n\t" > "mov %cr0, %rax\n\t" /* save CR0 with PG=1 */ > "vmcall\n\t" /* on return from this CR0.PG=0 */ > "mov %rax, %cr0\n\t" /* restore CR0.PG=1 to enter PAE mode */ > "vmcall\n\t" > "retq\n\t" > "vmx_mtf_pdpte_guest_end:"); > > The byte referred to via &vmx_mtf_pdpte_guest_end may not even be mapped > due to -ftoplevel-reorder possibly putting that asm block at the very > end of the compilation unit. > > By using 'extern char []' instead this nastiness is avoided by referring > to an unknown sized byte sequence starting at that symbol (with zero > being a totally valid length). We don't need to know how many bytes > follow the label. All we want to know is its address. And that's what an > array type provides easily. I think my hangup is that arrays make me think of data, not code. I wouldn't object if this were new code, but I dislike changing existing code to something that's arguably just as flawed. What if we declare the label as a function? That's not exactly correct either, but IMO it's closer to the truth, and let's us document that the kernel trampoline has a return value, i.e. preserves/outputs RAX. diff --git a/lib/x86/usermode.c b/lib/x86/usermode.c index c3ec0ad7..a46a369a 100644 --- a/lib/x86/usermode.c +++ b/lib/x86/usermode.c @@ -31,17 +31,18 @@ static void restore_exec_to_jmpbuf_exception_handler(struct ex_regs *regs) #endif } +uint64_t ret_to_kernel(void); + uint64_t run_in_user(usermode_func func, unsigned int fault_vector, uint64_t arg1, uint64_t arg2, uint64_t arg3, uint64_t arg4, bool *raised_vector) { - extern char ret_to_kernel; volatile uint64_t rax = 0; static unsigned char user_stack[USERMODE_STACK_SIZE]; handler old_ex; *raised_vector = 0; - set_idt_entry(RET_TO_KERNEL_IRQ, &ret_to_kernel, 3); + set_idt_entry(RET_TO_KERNEL_IRQ, ret_to_kernel, 3); old_ex = handle_exception(fault_vector, restore_exec_to_jmpbuf_exception_handler);
On 15.06.23 17:18, Sean Christopherson wrote: > On Wed, Jun 14, 2023, Mathias Krause wrote: >> On 13.06.23 02:32, Sean Christopherson wrote: >>> On Thu, Apr 13, 2023, Mathias Krause wrote: >>>> Use an array type to refer to the code label 'ret_to_kernel'. >>> >>> Why? Taking the address of a label when setting what is effectively the target >>> of a branch seems more intuitive than pointing at an array (that's not an array). >> >> Well, the flexible array notation is what my understanding of referring >> to a "label" defined in ASM is. I'm probably biased, as that's a pattern >> used a lot in the Linux kernel but trying to look at the individual >> semantics may make it clearer why I prefer 'extern char sym[]' over >> 'extern char sym'. >> >> The current code refers to the code sequence starting at 'ret_to_kernel' >> by creating an alias of it's first instruction byte. However, we're not >> interested in the first instruction byte at all. We want a symbolic >> handle of the begin of that sequence, which might be an unknown number >> of bytes. But that's also a detail that doesn't matter. We only what to >> know the start. By referring to it as 'extern char' implies that there >> is at least a single byte. (Let's ignore the hair splitting about just >> taking the address vs. actually dereferencing it (which we do not).) By >> looking at another code example, that byte may not actually be there! >> >From x86/vmx_tests.c: >> >> extern char vmx_mtf_pdpte_guest_begin; >> extern char vmx_mtf_pdpte_guest_end; >> >> asm("vmx_mtf_pdpte_guest_begin:\n\t" >> "mov %cr0, %rax\n\t" /* save CR0 with PG=1 */ >> "vmcall\n\t" /* on return from this CR0.PG=0 */ >> "mov %rax, %cr0\n\t" /* restore CR0.PG=1 to enter PAE mode */ >> "vmcall\n\t" >> "retq\n\t" >> "vmx_mtf_pdpte_guest_end:"); >> >> The byte referred to via &vmx_mtf_pdpte_guest_end may not even be mapped >> due to -ftoplevel-reorder possibly putting that asm block at the very >> end of the compilation unit. >> >> By using 'extern char []' instead this nastiness is avoided by referring >> to an unknown sized byte sequence starting at that symbol (with zero >> being a totally valid length). We don't need to know how many bytes >> follow the label. All we want to know is its address. And that's what an >> array type provides easily. > > I think my hangup is that arrays make me think of data, not code. I can say the same for 'extern char' -- that's clearly data, not even r/o data, less so .text ;) > I wouldn't > object if this were new code, but I dislike changing existing code to something > that's arguably just as flawed. Fair enough. > What if we declare the label as a function? That's not exactly correct either, > but IMO it's closer to the truth, and let's us document that the kernel trampoline > has a return value, i.e. preserves/outputs RAX. No, please don't. It's *not* a function, not at all. First thing that code block does is switching the stack, wich is already a bummer. But it also has no return instruction, making it fall-through to the middle of run_in_user() and execute code out of context. It's really just a label to mark the beginning of a code sequence we have to care about. Now, we do not want to ever call ret_to_kernel() from C, but declaring it as a function just feels way more worse than having a "wrong" extern char declaration. So, let's leave the code as-is. It's not broken, just imperfect. > > diff --git a/lib/x86/usermode.c b/lib/x86/usermode.c > index c3ec0ad7..a46a369a 100644 > --- a/lib/x86/usermode.c > +++ b/lib/x86/usermode.c > @@ -31,17 +31,18 @@ static void restore_exec_to_jmpbuf_exception_handler(struct ex_regs *regs) > #endif > } > > +uint64_t ret_to_kernel(void); > + > uint64_t run_in_user(usermode_func func, unsigned int fault_vector, > uint64_t arg1, uint64_t arg2, uint64_t arg3, > uint64_t arg4, bool *raised_vector) > { > - extern char ret_to_kernel; > volatile uint64_t rax = 0; > static unsigned char user_stack[USERMODE_STACK_SIZE]; > handler old_ex; > > *raised_vector = 0; > - set_idt_entry(RET_TO_KERNEL_IRQ, &ret_to_kernel, 3); > + set_idt_entry(RET_TO_KERNEL_IRQ, ret_to_kernel, 3); > old_ex = handle_exception(fault_vector, > restore_exec_to_jmpbuf_exception_handler); > >
diff --git a/lib/x86/usermode.c b/lib/x86/usermode.c index e22fb8f0132b..b976123ca753 100644 --- a/lib/x86/usermode.c +++ b/lib/x86/usermode.c @@ -35,12 +35,12 @@ uint64_t run_in_user(usermode_func func, unsigned int fault_vector, uint64_t arg1, uint64_t arg2, uint64_t arg3, uint64_t arg4, bool *raised_vector) { - extern char ret_to_kernel; + extern char ret_to_kernel[]; uint64_t rax = 0; static unsigned char user_stack[USERMODE_STACK_SIZE]; *raised_vector = 0; - set_idt_entry(RET_TO_KERNEL_IRQ, &ret_to_kernel, 3); + set_idt_entry(RET_TO_KERNEL_IRQ, ret_to_kernel, 3); handle_exception(fault_vector, restore_exec_to_jmpbuf_exception_handler);
Use an array type to refer to the code label 'ret_to_kernel'. No functional change. Signed-off-by: Mathias Krause <minipli@grsecurity.net> --- lib/x86/usermode.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)