diff mbox series

[kvm-unit-tests] x86/emulator: Test non-canonical memory access exceptions

Message ID 20230215142344.20200-1-minipli@grsecurity.net (mailing list archive)
State New, archived
Headers show
Series [kvm-unit-tests] x86/emulator: Test non-canonical memory access exceptions | expand

Commit Message

Mathias Krause Feb. 15, 2023, 2:23 p.m. UTC
A stack based memory access should generate a #SS(0) exception but
QEMU/TCG as of now (7.2) makes all exceptions based on a non-canonical
address generate a #GP(0) instead (issue linked below).

Add a test that will succeed when run under KVM but fail when using TCG.

Link: https://gitlab.com/qemu-project/qemu/-/issues/928
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
The non-canonical jump test is, apparently, broken under TCG as well.
It "succeeds," as in changing RIP and thereby creating a #GP loop.
I therefore put the new test in front of it to allow it to run.

 x86/emulator64.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

Comments

Sean Christopherson April 6, 2023, 3:02 a.m. UTC | #1
On Wed, Feb 15, 2023, Mathias Krause wrote:
> +static void test_reg_noncanonical(void)
> +{
> +	extern char nc_rsp_start, nc_rsp_end, nc_rbp_start, nc_rbp_end;
> +	extern char nc_rax_start, nc_rax_end;
> +	handler old_ss, old_gp;
> +
> +	old_ss = handle_exception(SS_VECTOR, advance_rip_and_note_exception);
> +	old_gp = handle_exception(GP_VECTOR, advance_rip_and_note_exception);
> +
> +	/* RAX based, should #GP(0) */
> +	exceptions = 0;
> +	rip_advance = &nc_rax_end - &nc_rax_start;
> +	asm volatile("nc_rax_start: orq $0, (%[msb]); nc_rax_end:\n\t"

Can't we use ASM_TRY() + exception_vector() + exception_error_code()?  Installing
a dedicated handler is (slowly) being phased out.  Even better, if you're willing
to take a dependency and/or wait a few weeks for my series to land[*], you should
be able to use asm_safe() to streamline this even further.

[*] https://lkml.kernel.org/r/20230406025117.738014-1-seanjc%40google.com


> +		     : : [msb]"a"(1ul << 63));

Use BIT_ULL().  Actually, scratch that, we have a NONCANONICAL macro.  It _probably_
won't matter, but who know what will happen with things like LAM and LASS.

And why hardcode use of RAX?  Won't any "r" constraint work?

E.g. I believe this can be something like:

	asm_safe_report_ex(GP_VECTOR, "orq $0, (%[noncanonical]), "r" (NONCANONICAL));
	report(!exception_error_code());

Or we could even add asm_safe_report_ex_ec(), e.g.

	asm_safe_report_ex_ec(GP_VECTOR, 0,
			      "orq $0, (%[noncanonical]), "r" (NONCANONICAL));
Mathias Krause April 6, 2023, 8:11 a.m. UTC | #2
On 06.04.23 05:02, Sean Christopherson wrote:
> On Wed, Feb 15, 2023, Mathias Krause wrote:
>> +static void test_reg_noncanonical(void)
>> +{
>> +	extern char nc_rsp_start, nc_rsp_end, nc_rbp_start, nc_rbp_end;
>> +	extern char nc_rax_start, nc_rax_end;
>> +	handler old_ss, old_gp;
>> +
>> +	old_ss = handle_exception(SS_VECTOR, advance_rip_and_note_exception);
>> +	old_gp = handle_exception(GP_VECTOR, advance_rip_and_note_exception);
>> +
>> +	/* RAX based, should #GP(0) */
>> +	exceptions = 0;
>> +	rip_advance = &nc_rax_end - &nc_rax_start;
>> +	asm volatile("nc_rax_start: orq $0, (%[msb]); nc_rax_end:\n\t"
> 
> Can't we use ASM_TRY() + exception_vector() + exception_error_code()?  Installing
> a dedicated handler is (slowly) being phased out.

Well, you may have guessed it, but I tried to be "consistent with the
existing style." Sure this code could get a lot of cleanups too, the
whole file, actually, like the externs should be 'extern char []'
instead to point out they're just "labels" and no chars. But, again, I
just did as was "prior art" in this file. But if moving forward to a
more modern version is wanted, I can adapt.

>  Even better, if you're willing
> to take a dependency and/or wait a few weeks for my series to land[*], you should
> be able to use asm_safe() to streamline this even further.
> 
> [*] https://lkml.kernel.org/r/20230406025117.738014-1-seanjc%40google.com

Looks like a nice cleanup, indeed. However, the conversion should be
straight forward, so I don't think this change has to "wait" for it.

The linked bug report turned 1 just two weeks ago. About time to get it
some more traction. :D

That said, I'll do a spring cleanup of emulator64.c along with my change
to address the points you mentioned in the other test functions as well.
But likely not before next week.

> 
> 
>> +		     : : [msb]"a"(1ul << 63));
> 
> Use BIT_ULL().  Actually, scratch that, we have a NONCANONICAL macro.  It _probably_
> won't matter, but who know what will happen with things like LAM and LASS.

Thanks, will change.

> 
> And why hardcode use of RAX?  Won't any "r" constraint work?

Unfortunately not. It must be neither rsp nor rbp and with "r" the
compiler is free to choose one of these. It'll unlikely make use of rsp,
but rbp is a valid target we need to avoid. (Yes, I saw the
-no-omit-frame-pointer handling in the Makefiles, but I dislike this
implicit dependency.)

I can change the constraints to "abcdSD" to give the compiler a little
bit more freedom, but that makes the inline asm little harder to read,
IMHO. Hardcoding rax is no real constraint to the compiler either, as
it's a volatile register anyway. The call to report() will invalidate
its old value, so I don't see the need for a change -- a comment, at
best, but that's already there ;)

> 
> E.g. I believe this can be something like:
> 
> 	asm_safe_report_ex(GP_VECTOR, "orq $0, (%[noncanonical]), "r" (NONCANONICAL));
> 	report(!exception_error_code());
> 
> Or we could even add asm_safe_report_ex_ec(), e.g.
> 
> 	asm_safe_report_ex_ec(GP_VECTOR, 0,
> 			      "orq $0, (%[noncanonical]), "r" (NONCANONICAL));

Yeah, the latter. Verifying the error code is part of the test, so that
should be preserved.

The tests as written by me also ensure that an exception actually
occurred, exactly one, actually. Maybe that should be accounted for in
asm_safe*() as well?


Thanks,
Mathias

PS: Would be nice if the entry barrier for new tests wouldn't require to
handle the accumulated technical debt of the file one's touching ;P
But I can understand that adding more code adapting to "existing style"
makes the problem only worse. So it's fine with me.
Sean Christopherson April 6, 2023, 4:43 p.m. UTC | #3
On Thu, Apr 06, 2023, Mathias Krause wrote:
> On 06.04.23 05:02, Sean Christopherson wrote:
> > On Wed, Feb 15, 2023, Mathias Krause wrote:
> >> +static void test_reg_noncanonical(void)
> >> +{
> >> +	extern char nc_rsp_start, nc_rsp_end, nc_rbp_start, nc_rbp_end;
> >> +	extern char nc_rax_start, nc_rax_end;
> >> +	handler old_ss, old_gp;
> >> +
> >> +	old_ss = handle_exception(SS_VECTOR, advance_rip_and_note_exception);
> >> +	old_gp = handle_exception(GP_VECTOR, advance_rip_and_note_exception);
> >> +
> >> +	/* RAX based, should #GP(0) */
> >> +	exceptions = 0;
> >> +	rip_advance = &nc_rax_end - &nc_rax_start;
> >> +	asm volatile("nc_rax_start: orq $0, (%[msb]); nc_rax_end:\n\t"
> > 
> > Can't we use ASM_TRY() + exception_vector() + exception_error_code()?  Installing
> > a dedicated handler is (slowly) being phased out.
> 
> Well, you may have guessed it, but I tried to be "consistent with the
> existing style." Sure this code could get a lot of cleanups too, the
> whole file, actually, like the externs should be 'extern char []'
> instead to point out they're just "labels" and no chars. But, again, I
> just did as was "prior art" in this file. But if moving forward to a
> more modern version is wanted, I can adapt.

Yes, please ignore prior art in KVM-Unit-Tests, so much of its "art" is just awful.

> > And why hardcode use of RAX?  Won't any "r" constraint work?
> 
> Unfortunately not. It must be neither rsp nor rbp and with "r" the
> compiler is free to choose one of these.

Ah, right, I forgot about RBP.

> It'll unlikely make use of rsp, but rbp is a valid target we need to avoid.
> (Yes, I saw the -no-omit-frame-pointer handling in the Makefiles, but I
> dislike this implicit dependency.)
> 
> I can change the constraints to "abcdSD" to give the compiler a little
> bit more freedom, but that makes the inline asm little harder to read,
> IMHO. Hardcoding rax is no real constraint to the compiler either, as
> it's a volatile register anyway. The call to report() will invalidate
> its old value, so I don't see the need for a change -- a comment, at
> best, but that's already there ;)

Hardcoding RAX is totally fine, I just forgot about RBP.

> > E.g. I believe this can be something like:
> > 
> > 	asm_safe_report_ex(GP_VECTOR, "orq $0, (%[noncanonical]), "r" (NONCANONICAL));
> > 	report(!exception_error_code());
> > 
> > Or we could even add asm_safe_report_ex_ec(), e.g.
> > 
> > 	asm_safe_report_ex_ec(GP_VECTOR, 0,
> > 			      "orq $0, (%[noncanonical]), "r" (NONCANONICAL));
> 
> Yeah, the latter. Verifying the error code is part of the test, so that
> should be preserved.
> 
> The tests as written by me also ensure that an exception actually
> occurred, exactly one, actually. Maybe that should be accounted for in
> asm_safe*() as well?

That's accounted for, the ASM_TRY() machinery treats "0" as no exception (we
sacrified #DE for the greater good).  Realistically, the only way to have multiple
exceptions without going into the weeds is if KVM somehow restarted the faulting
instruction.  That would essentially require very precise memory corruption to
undo the exception fixup handler's modification of RIP on the stack.  And at that
point, one could also argue that KVM could also corrupt the exception counter.

> PS: Would be nice if the entry barrier for new tests wouldn't require to
> handle the accumulated technical debt of the file one's touching ;P

Heh, and if wishes were horses we'd all be eating steak.  Just be glad I didn't
ask you to rewrite the entire test ;-)

Joking aside, coercing/extorting contributors into using new/better infrastructure
is the only feasible way to keep KVM's test infrastructure somewhat manageable.
E.g. I would love to be able to dedicate a substantial portion of my time to
cleaning up the many warts KUT, but the unfortunate reality is that test infrastructure
is always going to be lower priority than the product itself.

> But I can understand that adding more code adapting to "existing style"
> makes the problem only worse. So it's fine with me.
Mathias Krause April 13, 2023, 7:55 a.m. UTC | #4
On 06.04.23 18:43, Sean Christopherson wrote:
> On Thu, Apr 06, 2023, Mathias Krause wrote:
>> On 06.04.23 05:02, Sean Christopherson wrote:
>>> [...]
>>> E.g. I believe this can be something like:
>>>
>>> 	asm_safe_report_ex(GP_VECTOR, "orq $0, (%[noncanonical]), "r" (NONCANONICAL));
>>> 	report(!exception_error_code());
>>>
>>> Or we could even add asm_safe_report_ex_ec(), e.g.
>>>
>>> 	asm_safe_report_ex_ec(GP_VECTOR, 0,
>>> 			      "orq $0, (%[noncanonical]), "r" (NONCANONICAL));
>>
>> Yeah, the latter. Verifying the error code is part of the test, so that
>> should be preserved.
>>
>> The tests as written by me also ensure that an exception actually
>> occurred, exactly one, actually. Maybe that should be accounted for in
>> asm_safe*() as well?
> 
> That's accounted for, the ASM_TRY() machinery treats "0" as no exception (we
> sacrified #DE for the greater good).

I overlooked the GS-relative MOVL in ASM_TRY() first, which, after some
digging, turns out to be zeroing the per-cpu 'exception_data' member.
Sneaky ;)

>                                       Realistically, the only way to have multiple
> exceptions without going into the weeds is if KVM somehow restarted the faulting
> instruction.  That would essentially require very precise memory corruption to
> undo the exception fixup handler's modification of RIP on the stack.  And at that
> point, one could also argue that KVM could also corrupt the exception counter.

I wasn't concerned about memory corruption per se but more of a broken
emulation as in not generating an exception at all. But ASM_TRY()
handles this by resetting the per-cpu exception state. So yeah, looks
like I can make use of the per-cpu helpers for my purpose.

> 
>> PS: Would be nice if the entry barrier for new tests wouldn't require to
>> handle the accumulated technical debt of the file one's touching ;P
> 
> Heh, and if wishes were horses we'd all be eating steak.  Just be glad I didn't
> ask you to rewrite the entire test ;-)
> 
> Joking aside, coercing/extorting contributors into using new/better infrastructure
> is the only feasible way to keep KVM's test infrastructure somewhat manageable.

I understand that, so it's fine with me to enrich my new test case with
some cleanups along the way.

> E.g. I would love to be able to dedicate a substantial portion of my time to
> cleaning up the many warts KUT, but the unfortunate reality is that test infrastructure
> is always going to be lower priority than the product itself.

I feel your pain and in my case the product is something completely
different even, so even less time on my side :/

Thanks,
Mathias
Sean Christopherson April 13, 2023, 4:07 p.m. UTC | #5
On Thu, Apr 13, 2023, Mathias Krause wrote:
> On 06.04.23 18:43, Sean Christopherson wrote:
> > On Thu, Apr 06, 2023, Mathias Krause wrote:
> >> On 06.04.23 05:02, Sean Christopherson wrote:
> >>> [...]
> >>> E.g. I believe this can be something like:
> >>>
> >>> 	asm_safe_report_ex(GP_VECTOR, "orq $0, (%[noncanonical]), "r" (NONCANONICAL));
> >>> 	report(!exception_error_code());
> >>>
> >>> Or we could even add asm_safe_report_ex_ec(), e.g.
> >>>
> >>> 	asm_safe_report_ex_ec(GP_VECTOR, 0,
> >>> 			      "orq $0, (%[noncanonical]), "r" (NONCANONICAL));
> >>
> >> Yeah, the latter. Verifying the error code is part of the test, so that
> >> should be preserved.
> >>
> >> The tests as written by me also ensure that an exception actually
> >> occurred, exactly one, actually. Maybe that should be accounted for in
> >> asm_safe*() as well?
> > 
> > That's accounted for, the ASM_TRY() machinery treats "0" as no exception (we
> > sacrified #DE for the greater good).
> 
> I overlooked the GS-relative MOVL in ASM_TRY() first, which, after some
> digging, turns out to be zeroing the per-cpu 'exception_data' member.
> Sneaky ;)

Heh, "sneaky" is a much more polite description than I would use.  I really don't
like using per-CPU data for excpetion fixup, but having to support 32-bit builds
means our options our limited.  E.g. KVM selftests is 64-bit only and so can use
r9-r11 to communicate with the exception handler without conflicting with instructions
that have hardcoded registers (testing SYSRET isn't exactly a priority).
diff mbox series

Patch

diff --git a/x86/emulator64.c b/x86/emulator64.c
index 7f55d388c597..df2cd2c85308 100644
--- a/x86/emulator64.c
+++ b/x86/emulator64.c
@@ -2,10 +2,12 @@ 
 #define GS_BASE 0x400000
 
 static unsigned long rip_advance;
+static struct ex_regs last_ex_regs;
 
 static void advance_rip_and_note_exception(struct ex_regs *regs)
 {
 	++exceptions;
+	last_ex_regs = *regs;
 	regs->rip += rip_advance;
 }
 
@@ -347,6 +349,55 @@  static void test_jmp_noncanonical(uint64_t *mem)
 	handle_exception(GP_VECTOR, old);
 }
 
+static void test_reg_noncanonical(void)
+{
+	extern char nc_rsp_start, nc_rsp_end, nc_rbp_start, nc_rbp_end;
+	extern char nc_rax_start, nc_rax_end;
+	handler old_ss, old_gp;
+
+	old_ss = handle_exception(SS_VECTOR, advance_rip_and_note_exception);
+	old_gp = handle_exception(GP_VECTOR, advance_rip_and_note_exception);
+
+	/* RAX based, should #GP(0) */
+	exceptions = 0;
+	rip_advance = &nc_rax_end - &nc_rax_start;
+	asm volatile("nc_rax_start: orq $0, (%[msb]); nc_rax_end:\n\t"
+		     : : [msb]"a"(1ul << 63));
+	report(exceptions == 1
+	       && last_ex_regs.vector == GP_VECTOR
+	       && last_ex_regs.error_code == 0,
+	       "non-canonical memory access, should %s(0), got %s(%lu)",
+	       exception_mnemonic(GP_VECTOR),
+	       exception_mnemonic(last_ex_regs.vector), last_ex_regs.error_code);
+
+	/* RSP based, should #SS(0) */
+	exceptions = 0;
+	rip_advance = &nc_rsp_end - &nc_rsp_start;
+	asm volatile("nc_rsp_start: orq $0, (%%rsp,%[msb],1); nc_rsp_end:\n\t"
+		     : : [msb]"r"(1ul << 63));
+	report(exceptions == 1
+	       && last_ex_regs.vector == SS_VECTOR
+	       && last_ex_regs.error_code == 0,
+	       "non-canonical rsp-based access, should %s(0), got %s(%lu)",
+	       exception_mnemonic(SS_VECTOR),
+	       exception_mnemonic(last_ex_regs.vector), last_ex_regs.error_code);
+
+	/* RBP based, should #SS(0) */
+	exceptions = 0;
+	rip_advance = &nc_rbp_end - &nc_rbp_start;
+	asm volatile("nc_rbp_start: orq $0, (%%rbp,%[msb],1); nc_rbp_end:\n\t"
+		     : : [msb]"r"(1ul << 63));
+	report(exceptions == 1
+	       && last_ex_regs.vector == SS_VECTOR
+	       && last_ex_regs.error_code == 0,
+	       "non-canonical rbp-based access, should %s(0), got %s(%lu)",
+	       exception_mnemonic(SS_VECTOR),
+	       exception_mnemonic(last_ex_regs.vector), last_ex_regs.error_code);
+
+	handle_exception(SS_VECTOR, old_ss);
+	handle_exception(GP_VECTOR, old_gp);
+}
+
 static void test_movabs(uint64_t *mem)
 {
 	/* mov $0x9090909090909090, %rcx */
@@ -460,5 +511,6 @@  static void test_emulator_64(void *mem)
 
 	test_push16(mem);
 
+	test_reg_noncanonical();
 	test_jmp_noncanonical(mem);
 }