diff mbox

[05/39] x86/entry/32: Unshare NMI return path

Message ID 1531308586-29340-6-git-send-email-joro@8bytes.org (mailing list archive)
State New, archived
Headers show

Commit Message

Joerg Roedel July 11, 2018, 11:29 a.m. UTC
From: Joerg Roedel <jroedel@suse.de>

NMI will no longer use most of the shared return path,
because NMI needs special handling when the CR3 switches for
PTI are added. This patch prepares for that.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 arch/x86/entry/entry_32.S | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Comments

Andy Lutomirski July 12, 2018, 8:53 p.m. UTC | #1
> On Jul 11, 2018, at 4:29 AM, Joerg Roedel <joro@8bytes.org> wrote:
> 
> From: Joerg Roedel <jroedel@suse.de>
> 
> NMI will no longer use most of the shared return path,
> because NMI needs special handling when the CR3 switches for
> PTI are added.

Why?  What would go wrong?

How many return-to-usermode paths will we have?  64-bit has only one.

> This patch prepares for that.
> 
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---
> arch/x86/entry/entry_32.S | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
> index d35a69a..571209e 100644
> --- a/arch/x86/entry/entry_32.S
> +++ b/arch/x86/entry/entry_32.S
> @@ -1017,7 +1017,7 @@ ENTRY(nmi)
> 
>    /* Not on SYSENTER stack. */
>    call    do_nmi
> -    jmp    .Lrestore_all_notrace
> +    jmp    .Lnmi_return
> 
> .Lnmi_from_sysenter_stack:
>    /*
> @@ -1028,7 +1028,11 @@ ENTRY(nmi)
>    movl    PER_CPU_VAR(cpu_current_top_of_stack), %esp
>    call    do_nmi
>    movl    %ebx, %esp
> -    jmp    .Lrestore_all_notrace
> +
> +.Lnmi_return:
> +    CHECK_AND_APPLY_ESPFIX
> +    RESTORE_REGS 4
> +    jmp    .Lirq_return
> 
> #ifdef CONFIG_X86_ESPFIX32
> .Lnmi_espfix_stack:
> -- 
> 2.7.4
>
Joerg Roedel July 13, 2018, 10:05 a.m. UTC | #2
On Thu, Jul 12, 2018 at 01:53:19PM -0700, Andy Lutomirski wrote:
> > On Jul 11, 2018, at 4:29 AM, Joerg Roedel <joro@8bytes.org> wrote:
> > NMI will no longer use most of the shared return path,
> > because NMI needs special handling when the CR3 switches for
> > PTI are added.
> 
> Why?  What would go wrong?
> 
> How many return-to-usermode paths will we have?  64-bit has only one.

In the non-NMI return path we make a decission on whether we return to
user-space or kernel-space and do different things based on that. For
example, when returning to user-space we call
prepare_exit_to_usermode(). With the CR3 switches added later we also
unconditionally switch to user-cr3 when we are in the return-to-user
path.

The NMI return path does not need any of that, as it doesn't call
prepare_exit_to_usermode() even when it returns to user-space. It
doesn't even care where it returns to. It just remembers stack and cr3
on entry in callee-safed registers and restores that on exit. This works
in the NMI path because it is pretty simple and doesn't do any fancy
work on exit.

While working on a previous version I also tried to store stack and cr3
in a callee-safed register and restore that on exit again, but it didn't
work, most likley because something in-between overwrote one of the
registers. I also found it a bit fragile to make make two registers
untouchable in the whole entry-code. It doesn't make future changes
simpler or more robust.

So long story short, the NMI path can be simpler wrt. stack and cr3
handling as the other entry/exit points, and therefore it is handled
differently.

Regards,

	Joerg
Andy Lutomirski July 13, 2018, 5:26 p.m. UTC | #3
On Fri, Jul 13, 2018 at 3:05 AM, Joerg Roedel <joro@8bytes.org> wrote:
> On Thu, Jul 12, 2018 at 01:53:19PM -0700, Andy Lutomirski wrote:
>> > On Jul 11, 2018, at 4:29 AM, Joerg Roedel <joro@8bytes.org> wrote:
>> > NMI will no longer use most of the shared return path,
>> > because NMI needs special handling when the CR3 switches for
>> > PTI are added.
>>
>> Why?  What would go wrong?
>>
>> How many return-to-usermode paths will we have?  64-bit has only one.
>
> In the non-NMI return path we make a decission on whether we return to
> user-space or kernel-space and do different things based on that. For
> example, when returning to user-space we call
> prepare_exit_to_usermode(). With the CR3 switches added later we also
> unconditionally switch to user-cr3 when we are in the return-to-user
> path.
>
> The NMI return path does not need any of that, as it doesn't call
> prepare_exit_to_usermode() even when it returns to user-space. It
> doesn't even care where it returns to. It just remembers stack and cr3
> on entry in callee-safed registers and restores that on exit. This works
> in the NMI path because it is pretty simple and doesn't do any fancy
> work on exit.
>
> While working on a previous version I also tried to store stack and cr3
> in a callee-safed register and restore that on exit again, but it didn't
> work, most likley because something in-between overwrote one of the
> registers. I also found it a bit fragile to make make two registers
> untouchable in the whole entry-code. It doesn't make future changes
> simpler or more robust.
>
> So long story short, the NMI path can be simpler wrt. stack and cr3
> handling as the other entry/exit points, and therefore it is handled
> differently.
>
>

We used to do it this way on 64-bit, but I had to change it because of
a nasty case where we *fail* the return to user mode when we're
returning from an NMI.  In theory this can't happen any more due to a
bunch of tightening up of the way we handle segmentation, but it's
still quite nasty.  The whole situation on 32-bit isn't quite as
fragile because espfix32 is much more robust than espfix64.

So I suppose this is okay, but I wouldn't be totally shocked if we
need to redo it down the road.
diff mbox

Patch

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index d35a69a..571209e 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1017,7 +1017,7 @@  ENTRY(nmi)
 
 	/* Not on SYSENTER stack. */
 	call	do_nmi
-	jmp	.Lrestore_all_notrace
+	jmp	.Lnmi_return
 
 .Lnmi_from_sysenter_stack:
 	/*
@@ -1028,7 +1028,11 @@  ENTRY(nmi)
 	movl	PER_CPU_VAR(cpu_current_top_of_stack), %esp
 	call	do_nmi
 	movl	%ebx, %esp
-	jmp	.Lrestore_all_notrace
+
+.Lnmi_return:
+	CHECK_AND_APPLY_ESPFIX
+	RESTORE_REGS 4
+	jmp	.Lirq_return
 
 #ifdef CONFIG_X86_ESPFIX32
 .Lnmi_espfix_stack: