diff mbox series

x86/entry: shrink insn size for some of our EFLAGS manipulation

Message ID 0ba4903d-638d-408e-bcd4-7c13cb56e078@suse.com (mailing list archive)
State Superseded
Headers show
Series x86/entry: shrink insn size for some of our EFLAGS manipulation | expand

Commit Message

Jan Beulich March 5, 2024, 2:04 p.m. UTC
Much like was recently done for setting entry vector, and along the
lines of what we already had in handle_exception_saved, avoid 32-bit
immediates where 8-bit ones do. Reduces .text.entry size by 16 bytes in
my non-CET reference build, while in my CET reference build section size
doesn't change (there and in .text only padding space increases).

Inspired by other long->byte conversion work.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Numbers above are biased by me also having the straight-line-speculation
change in the tree, thus every JMP is followed by an INT3. Without that,
.text.entry size would also shrink by 16 bytes in the CET build.

Comments

Andrew Cooper March 6, 2024, 10:33 a.m. UTC | #1
On 05/03/2024 2:04 pm, Jan Beulich wrote:
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -52,7 +52,7 @@ UNLIKELY_END(syscall_no_callback)
>          movq  %rax,TRAPBOUNCE_eip(%rdx)
>          movb  %cl,TRAPBOUNCE_flags(%rdx)
>          call  create_bounce_frame
> -        andl  $~X86_EFLAGS_DF,UREGS_eflags(%rsp)
> +        andb  $~(X86_EFLAGS_DF >> 8), UREGS_eflags + 1(%rsp)

The other adjustments are fine, but what on earth are we doing with DF here?

This looks straight up buggy.  We've got no legitimate reason to be
playing with the guest's view of DF.

~Andrew
Jan Beulich March 6, 2024, 10:49 a.m. UTC | #2
On 06.03.2024 11:33, Andrew Cooper wrote:
> On 05/03/2024 2:04 pm, Jan Beulich wrote:
>> --- a/xen/arch/x86/x86_64/entry.S
>> +++ b/xen/arch/x86/x86_64/entry.S
>> @@ -52,7 +52,7 @@ UNLIKELY_END(syscall_no_callback)
>>          movq  %rax,TRAPBOUNCE_eip(%rdx)
>>          movb  %cl,TRAPBOUNCE_flags(%rdx)
>>          call  create_bounce_frame
>> -        andl  $~X86_EFLAGS_DF,UREGS_eflags(%rsp)
>> +        andb  $~(X86_EFLAGS_DF >> 8), UREGS_eflags + 1(%rsp)
> 
> The other adjustments are fine, but what on earth are we doing with DF here?
> 
> This looks straight up buggy.  We've got no legitimate reason to be
> playing with the guest's view of DF.

This is the PV equivalent of the SYSCALL_MASK MSR, isn't it? With it not
really being an (emulated) MSR, but an agreement that guests will only ever
care about having DF cleared (besides the requested way of dealing with the
event mask). One of the many things not written down anywhere ...

Jan
Andrew Cooper March 6, 2024, 11:14 a.m. UTC | #3
On 06/03/2024 10:49 am, Jan Beulich wrote:
> On 06.03.2024 11:33, Andrew Cooper wrote:
>> On 05/03/2024 2:04 pm, Jan Beulich wrote:
>>> --- a/xen/arch/x86/x86_64/entry.S
>>> +++ b/xen/arch/x86/x86_64/entry.S
>>> @@ -52,7 +52,7 @@ UNLIKELY_END(syscall_no_callback)
>>>          movq  %rax,TRAPBOUNCE_eip(%rdx)
>>>          movb  %cl,TRAPBOUNCE_flags(%rdx)
>>>          call  create_bounce_frame
>>> -        andl  $~X86_EFLAGS_DF,UREGS_eflags(%rsp)
>>> +        andb  $~(X86_EFLAGS_DF >> 8), UREGS_eflags + 1(%rsp)
>> The other adjustments are fine, but what on earth are we doing with DF here?
>>
>> This looks straight up buggy.  We've got no legitimate reason to be
>> playing with the guest's view of DF.
> This is the PV equivalent of the SYSCALL_MASK MSR, isn't it?

Well, is it?

It definitely never existed in 32bit builds of Xen, when the int80
direct trap existed.

I don't see any evidence of it applying anywhere for compat PV guests
either, even those with syscall enabled.

> With it not
> really being an (emulated) MSR, but an agreement that guests will only ever
> care about having DF cleared (besides the requested way of dealing with the
> event mask). One of the many things not written down anywhere ...

If it's not written down, it doesn't exist...

And even if this is supposed to be a PV-FMASK-ish thing, it's buggy
because it apples also when #UD is raised for no registered callback. 
(And yes, I realise there is a chronology issues here (the #UD check is
the newer element), but it really will corrupt state as presented in a
SIGSEGV frame.

The question we need to answer is whether there is any remotely-credible
way that a PV guest kernel author could be expecting this behaviour and
relying on it.

I honestly don't think there is.

It fails the principle of least surprise compared to native behaviour,
32bit PV behaviour, and to every non-SYSCALL based 64bit event also.

It either needs writing down somewhere (and the #UD case fixing), or it
needs deleing, because continuing to leave it in this state isn't ok.

~Andrew
Jan Beulich March 6, 2024, 11:54 a.m. UTC | #4
On 06.03.2024 12:14, Andrew Cooper wrote:
> On 06/03/2024 10:49 am, Jan Beulich wrote:
>> On 06.03.2024 11:33, Andrew Cooper wrote:
>>> On 05/03/2024 2:04 pm, Jan Beulich wrote:
>>>> --- a/xen/arch/x86/x86_64/entry.S
>>>> +++ b/xen/arch/x86/x86_64/entry.S
>>>> @@ -52,7 +52,7 @@ UNLIKELY_END(syscall_no_callback)
>>>>          movq  %rax,TRAPBOUNCE_eip(%rdx)
>>>>          movb  %cl,TRAPBOUNCE_flags(%rdx)
>>>>          call  create_bounce_frame
>>>> -        andl  $~X86_EFLAGS_DF,UREGS_eflags(%rsp)
>>>> +        andb  $~(X86_EFLAGS_DF >> 8), UREGS_eflags + 1(%rsp)
>>> The other adjustments are fine, but what on earth are we doing with DF here?
>>>
>>> This looks straight up buggy.  We've got no legitimate reason to be
>>> playing with the guest's view of DF.
>> This is the PV equivalent of the SYSCALL_MASK MSR, isn't it?
> 
> Well, is it?
> 
> It definitely never existed in 32bit builds of Xen, when the int80
> direct trap existed.
> 
> I don't see any evidence of it applying anywhere for compat PV guests
> either, even those with syscall enabled.

Neither int80 nor the 32-bit mode syscall apply the flags mask. Therefore
why would we mimic such behavior in Xen?

>> With it not
>> really being an (emulated) MSR, but an agreement that guests will only ever
>> care about having DF cleared (besides the requested way of dealing with the
>> event mask). One of the many things not written down anywhere ...
> 
> If it's not written down, it doesn't exist...

IOW the PV interface as a whole largely doesn't exist.

> And even if this is supposed to be a PV-FMASK-ish thing, it's buggy
> because it apples also when #UD is raised for no registered callback. 
> (And yes, I realise there is a chronology issues here (the #UD check is
> the newer element), but it really will corrupt state as presented in a
> SIGSEGV frame.
> 
> The question we need to answer is whether there is any remotely-credible
> way that a PV guest kernel author could be expecting this behaviour and
> relying on it.
> 
> I honestly don't think there is.

I was going to say XenoLinux and its forward ports did, but upstream
Linux does, too. The native and PV paths being largely shared, there's
no CLD in sight anywhere.

> It fails the principle of least surprise compared to native behaviour,
> 32bit PV behaviour, and to every non-SYSCALL based 64bit event also.

Why "every"? switch_to_kernel is used solely for syscall handling.

And why do you consider this behavior surprising, when really it would
be surprising for the flag to remain untouched from what user mode had,
when taking into account how the MSR is set in the native case?

> It either needs writing down somewhere (and the #UD case fixing), or it
> needs deleing, because continuing to leave it in this state isn't ok.

I can try to determine whether addressing the #UD case can be done in a
non-intrusive way. If so, I can split off that part of the change here,
and make it a separate one with the MOVL->MOVB just done on the side.
But really this is scope creep: The change as is has no (intended)
change in behavior. It ought to be fine to go in independently; the
further work you ask for could be done up front or later on.

Jan
diff mbox series

Patch

--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -52,7 +52,7 @@  UNLIKELY_END(syscall_no_callback)
         movq  %rax,TRAPBOUNCE_eip(%rdx)
         movb  %cl,TRAPBOUNCE_flags(%rdx)
         call  create_bounce_frame
-        andl  $~X86_EFLAGS_DF,UREGS_eflags(%rsp)
+        andb  $~(X86_EFLAGS_DF >> 8), UREGS_eflags + 1(%rsp)
 /* %rbx: struct vcpu */
 test_all_events:
         ASSERT_NOT_IN_ATOMIC
@@ -223,7 +223,7 @@  LABEL_LOCAL(.Lrestore_rcx_iret_exit_to_g
 /* No special register assumptions. */
 iret_exit_to_guest:
         andl  $~(X86_EFLAGS_IOPL | X86_EFLAGS_VM), EFRAME_eflags(%rsp)
-        orl   $X86_EFLAGS_IF, EFRAME_eflags(%rsp)
+        orb   $X86_EFLAGS_IF >> 8, EFRAME_eflags + 1(%rsp)
         addq  $8,%rsp
 .Lft0:  iretq
         _ASM_PRE_EXTABLE(.Lft0, handle_exception)
@@ -346,7 +346,7 @@  LABEL(sysenter_eflags_saved, 0)
 
         GET_STACK_END(bx)
         /* PUSHF above has saved EFLAGS.IF clear (the caller had it set). */
-        orl   $X86_EFLAGS_IF, UREGS_eflags(%rsp)
+        orb   $X86_EFLAGS_IF >> 8, UREGS_eflags + 1(%rsp)
         mov   STACK_CPUINFO_FIELD(xen_cr3)(%rbx), %rcx
         test  %rcx, %rcx
         jz    .Lsyse_cr3_okay
@@ -361,11 +361,11 @@  LABEL(sysenter_eflags_saved, 0)
         cmpb  $0,VCPU_sysenter_disables_events(%rbx)
         movq  VCPU_sysenter_addr(%rbx),%rax
         setne %cl
-        testl $X86_EFLAGS_NT,UREGS_eflags(%rsp)
+        testb $X86_EFLAGS_NT >> 8, UREGS_eflags + 1(%rsp)
         leaq  VCPU_trap_bounce(%rbx),%rdx
 UNLIKELY_START(nz, sysenter_nt_set)
         pushfq
-        andl  $~X86_EFLAGS_NT,(%rsp)
+        andb  $~(X86_EFLAGS_NT >> 8), 1(%rsp)
         popfq
 UNLIKELY_END(sysenter_nt_set)
         testq %rax,%rax