Message ID | 20200930140108.48075-1-jarkko.sakkinen@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86/vdso: Remove retpoline from SGX vDSO call | expand |
On 9/30/20 7:01 AM, Jarkko Sakkinen wrote: > The user handler, which can be optionally used to handle enclave > exceptions, is always the same global handler provided by the SGX > runtime, who wants to use such a handler instead returning on exception. > > Thus, there is no any non-deterministic branch prediction happening. > The code path is always the same and never change. Obviously, you could > change it all the time purposely but for any sane real-world use that > would not make any sense. The fundamental problem mitigated by retpolines is that indirect branch instructions themselves are non-deterministic (speculatively). This: > + call *%rax is an indirect branch instruction. That leaves me a bit confused since the changelog doesn't really match the code. Do we care about mitigating Spectre-v2-style attacks for the VDSO's indirect calls?
On Wed, Sep 30, 2020 at 07:08:58AM -0700, Dave Hansen wrote: > On 9/30/20 7:01 AM, Jarkko Sakkinen wrote: > > The user handler, which can be optionally used to handle enclave > > exceptions, is always the same global handler provided by the SGX > > runtime, who wants to use such a handler instead returning on exception. > > > > Thus, there is no any non-deterministic branch prediction happening. > > The code path is always the same and never change. Obviously, you could > > change it all the time purposely but for any sane real-world use that > > would not make any sense. > > The fundamental problem mitigated by retpolines is that indirect branch > instructions themselves are non-deterministic (speculatively). > > This: > > > + call *%rax > > is an indirect branch instruction. That leaves me a bit confused since > the changelog doesn't really match the code. > > Do we care about mitigating Spectre-v2-style attacks for the VDSO's > indirect calls? It is yes, my wording was just extremely bad. What I meant to say is that there is branch prediction happening but it is, given how runtime will use the handler, leading always unconditionally to the same destination. I asked does this have any bad mitigations yesterday: https://lkml.org/lkml/2020/9/29/2505 I'm not expert on Spectre, or any sort of security researcher, but I've read a few papers about and understand the general concept. With the constraints how the callback is used in practice, I'd *guess* it is fine to drop retpoline but I really need some feedback on this from people who understand these attacks better. I'll submit a patch with boot time patching (aka using ALTERNATE) if this does not get the buy-in. Just have to evaluate the both options before making any decisions. /Jarkko
On 9/30/20 7:20 AM, Jarkko Sakkinen wrote: > I'm not expert on Spectre, or any sort of security researcher, but I've > read a few papers about and understand the general concept. With the > constraints how the callback is used in practice, I'd *guess* it is > fine to drop retpoline but I really need some feedback on this from > people who understand these attacks better. Do you recall why you added it in the first place? What was the motivation for it? Were you responding to a review comment?
On Wed, Sep 30, 2020 at 07:33:38AM -0700, Dave Hansen wrote: > On 9/30/20 7:20 AM, Jarkko Sakkinen wrote: > > I'm not expert on Spectre, or any sort of security researcher, but I've > > read a few papers about and understand the general concept. With the > > constraints how the callback is used in practice, I'd *guess* it is > > fine to drop retpoline but I really need some feedback on this from > > people who understand these attacks better. > > Do you recall why you added it in the first place? What was the > motivation for it? Were you responding to a review comment? Absolutely cannot recall it :-) I even cannot recall the exact time when we landed the vDSO in the first place. Too much stuff has happend during the long three year upstreaming cycle. I will try to backtrack this info. /Jarkko
On Wed, Sep 30, 2020 at 06:28:06PM +0300, Jarkko Sakkinen wrote: > On Wed, Sep 30, 2020 at 07:33:38AM -0700, Dave Hansen wrote: > > On 9/30/20 7:20 AM, Jarkko Sakkinen wrote: > > > I'm not expert on Spectre, or any sort of security researcher, but I've > > > read a few papers about and understand the general concept. With the > > > constraints how the callback is used in practice, I'd *guess* it is > > > fine to drop retpoline but I really need some feedback on this from > > > people who understand these attacks better. > > > > Do you recall why you added it in the first place? What was the > > motivation for it? Were you responding to a review comment? > > Absolutely cannot recall it :-) I even cannot recall the exact time when > we landed the vDSO in the first place. Too much stuff has happend during > the long three year upstreaming cycle. I will try to backtrack this > info. It originated in a comment from Andy when we were discussing the legitimacy of the callback. From that point on it got taken as gospel that the indirect call would be implemented as a retpoline. https://lkml.kernel.org/r/CALCETrVBR+2HjTqX=W4r9GOq69Xg36v4gmCKqK0wUjzAqBJnrw@mail.gmail.com
On 9/30/20 8:43 AM, Sean Christopherson wrote: >>> Do you recall why you added it in the first place? What was the >>> motivation for it? Were you responding to a review comment? >> Absolutely cannot recall it :-) I even cannot recall the exact time when >> we landed the vDSO in the first place. Too much stuff has happend during >> the long three year upstreaming cycle. I will try to backtrack this >> info. > It originated in a comment from Andy when we were discussing the legitimacy > of the callback. From that point on it got taken as gospel that the indirect > call would be implemented as a retpoline. > > https://lkml.kernel.org/r/CALCETrVBR+2HjTqX=W4r9GOq69Xg36v4gmCKqK0wUjzAqBJnrw@mail.gmail.com OK, so that was Andy L. saying: > But I have a real argument for dropping exit_handler: in this new age > of Spectre, the indirect call is a retpoline, and it's therefore quite > slow. So I'm not saying NAK, but I do think it's unnecessary. It sounds like we were never able to jettison the indirect call. So, we've got a kernel-provided indirect call that's only ever executed by userspace. A quick grep didn't show any other indirect branches in the VDSO, so there might not be good precedent for what to do. The problem with the VDSO is that even if userspace is compiled to the gills with mitigations, this VDSO branch won't be mitigated since it comes from the kernel. So, here's the big question for me: How does a security-sensitive userspace *binary* make mitigated indirect calls these days? Seems like the kind of thing for which Intel should have a good writeup. :)
On Wed, Sep 30, 2020 at 08:43:49AM -0700, Sean Christopherson wrote: > On Wed, Sep 30, 2020 at 06:28:06PM +0300, Jarkko Sakkinen wrote: > > On Wed, Sep 30, 2020 at 07:33:38AM -0700, Dave Hansen wrote: > > > On 9/30/20 7:20 AM, Jarkko Sakkinen wrote: > > > > I'm not expert on Spectre, or any sort of security researcher, but I've > > > > read a few papers about and understand the general concept. With the > > > > constraints how the callback is used in practice, I'd *guess* it is > > > > fine to drop retpoline but I really need some feedback on this from > > > > people who understand these attacks better. > > > > > > Do you recall why you added it in the first place? What was the > > > motivation for it? Were you responding to a review comment? > > > > Absolutely cannot recall it :-) I even cannot recall the exact time when > > we landed the vDSO in the first place. Too much stuff has happend during > > the long three year upstreaming cycle. I will try to backtrack this > > info. > > It originated in a comment from Andy when we were discussing the legitimacy > of the callback. From that point on it got taken as gospel that the indirect > call would be implemented as a retpoline. > > https://lkml.kernel.org/r/CALCETrVBR+2HjTqX=W4r9GOq69Xg36v4gmCKqK0wUjzAqBJnrw@mail.gmail.com This patch from v20-v21 era is also relevant: https://lore.kernel.org/linux-sgx/ba2a51568f3adaf74994d33ea3cbee570e20c6f6.1555965327.git.cedric.xing@intel.com/ It introduced the retpoline wrapping to the patch set but unfortunately does not have any explanation for that particular detail. Neither does Andy's comment except correctly stating that retpoline is the modern standard for indirect calls but that does not get us too far. My argument, or maybe just actually a question, is essentially that given the usage pattern for this particular indirect call, do we need to actually retpoline it? Boot time patching turned out to be simple to implement but it is also yet another whistle for the already complex vDSO. /Jarkko
On 2020-09-30 18:28, Dave Hansen wrote: > On 9/30/20 8:43 AM, Sean Christopherson wrote: >>>> Do you recall why you added it in the first place? What was the >>>> motivation for it? Were you responding to a review comment? >>> Absolutely cannot recall it :-) I even cannot recall the exact time when >>> we landed the vDSO in the first place. Too much stuff has happend during >>> the long three year upstreaming cycle. I will try to backtrack this >>> info. >> It originated in a comment from Andy when we were discussing the legitimacy >> of the callback. From that point on it got taken as gospel that the indirect >> call would be implemented as a retpoline. >> >> https://lkml.kernel.org/r/CALCETrVBR+2HjTqX=W4r9GOq69Xg36v4gmCKqK0wUjzAqBJnrw@mail.gmail.com > > OK, so that was Andy L. saying: > >> But I have a real argument for dropping exit_handler: in this new age >> of Spectre, the indirect call is a retpoline, and it's therefore quite >> slow. So I'm not saying NAK, but I do think it's unnecessary. > > It sounds like we were never able to jettison the indirect call. So, > we've got a kernel-provided indirect call that's only ever executed by > userspace. A quick grep didn't show any other indirect branches in the > VDSO, so there might not be good precedent for what to do. > > The problem with the VDSO is that even if userspace is compiled to the > gills with mitigations, this VDSO branch won't be mitigated since it > comes from the kernel. > > So, here's the big question for me: How does a security-sensitive > userspace *binary* make mitigated indirect calls these days? > > Seems like the kind of thing for which Intel should have a good writeup. :) > If by "these days" you mean also protecting against LVI, then it'd be like so, I think: lfence callq *%rax (at least this is what the LLVM LVI hardening pass does) -- Jethro Beekman | Fortanix
On 30/09/2020 18:01, Jethro Beekman wrote: > On 2020-09-30 18:28, Dave Hansen wrote: >> On 9/30/20 8:43 AM, Sean Christopherson wrote: >>>>> Do you recall why you added it in the first place? What was the >>>>> motivation for it? Were you responding to a review comment? >>>> Absolutely cannot recall it :-) I even cannot recall the exact time when >>>> we landed the vDSO in the first place. Too much stuff has happend during >>>> the long three year upstreaming cycle. I will try to backtrack this >>>> info. >>> It originated in a comment from Andy when we were discussing the legitimacy >>> of the callback. From that point on it got taken as gospel that the indirect >>> call would be implemented as a retpoline. >>> >>> https://lkml.kernel.org/r/CALCETrVBR+2HjTqX=W4r9GOq69Xg36v4gmCKqK0wUjzAqBJnrw@mail.gmail.com >> OK, so that was Andy L. saying: >> >>> But I have a real argument for dropping exit_handler: in this new age >>> of Spectre, the indirect call is a retpoline, and it's therefore quite >>> slow. So I'm not saying NAK, but I do think it's unnecessary. >> It sounds like we were never able to jettison the indirect call. So, >> we've got a kernel-provided indirect call that's only ever executed by >> userspace. A quick grep didn't show any other indirect branches in the >> VDSO, so there might not be good precedent for what to do. >> >> The problem with the VDSO is that even if userspace is compiled to the >> gills with mitigations, this VDSO branch won't be mitigated since it >> comes from the kernel. >> >> So, here's the big question for me: How does a security-sensitive >> userspace *binary* make mitigated indirect calls these days? >> >> Seems like the kind of thing for which Intel should have a good writeup. :) >> > If by "these days" you mean also protecting against LVI, then it'd be like so, I think: > > lfence > callq *%rax > > (at least this is what the LLVM LVI hardening pass does) The problem is that "what to do about speculation" is exceedingly complicated. The 2017/8 code samples to beat Spectre v2 are one of: 1) Retpoline (not totally safe on Skylake uarch, but pretty good) 2) (legacy) IBRS and CALL *reg/mem (Intel) 3) LFENCE; CALL *reg/mem (AMD) Later (i.e. this year), there is finally public documentation from all vendors concerning straight-line speculation, so the code fragments now need to be: 1) Retpoline (not totally safe on Skylake uarch, but pretty good) 2) (legacy) IBRS and CALL *reg/mem; LFENCE (Intel) 3) LFENCE; CALL *reg/mem; LFENCE (AMD) to avoid potential speculative type confusion in the remainder of the basic block. Same goes for indirect JMP (Intel and AMD) and RET (AMD only), although as there is no architectural execution, you can use INT3/INT1 to halt speculation with lower code volume than LFENCE. As noted, to avoid LVI poisoning of the branch target, a leading LFENCE is needed (or one of the even more complicated variations), including ahead of the RET of a retpoline. However in practice, its only code inside the enclave which is possibly worth protecting in this way. Intel hardware mitigations, eIBRS, comes recommended with switching back to a plain CALL *reg/mem, until a recent paper where it was demonstrated that even with eIBRS active, you can still get speculative type confusion due to same-mode training. If same-mode training is a problem, the recommendation is to use eIBRS and Retpoline. AMD hardware mitigations, simply called IBRS, is programmed in the same way as eIBRS (i.e. turn it on at the start of day and leave it on), does guarantee to to protect against same-mode training. As for the VDSO, it is userspace, and there's no way that it will have WRSS generally enabled. Therefore, it cannot use a retpoline gadget even with the "fix up the shadow stack" variation when CET is in use. Any of the other variations ought to be CET-safe. What, if anything, userspace does to protect against Spectre attacks is a matter of every piece of compiled code in every library loaded, and the risk profile of the application itself. A browser with a javascript sandbox is liable to try and take more care than something like cowsay. Honestly, my advice would be to leave it unprotected for now. Anyone who managed to figure out the rest of the practical userspace issues will probably have a much better idea of what can/should be done in this case. If that doesn't sit well with people, then the next best would probably be LFENCE; CALL *reg/mem; LFENCE to cover as many of the corner cases as possible without being incompatible with CET. Its not as if this callback is the slow aspect of entering/exiting SGX mode. ~Andrew
On Wed, Sep 30, 2020 at 07:09:33PM +0100, Andrew Cooper wrote: > Honestly, my advice would be to leave it unprotected for now. Anyone > who managed to figure out the rest of the practical userspace issues > will probably have a much better idea of what can/should be done in this > case. > > If that doesn't sit well with people, then the next best would probably > be LFENCE; CALL *reg/mem; LFENCE to cover as many of the corner cases as > possible without being incompatible with CET. Its not as if this > callback is the slow aspect of entering/exiting SGX mode. > > ~Andrew I tend to agree. We cannot drive changes based on unknown unknowns. And I don't see why we could not add boot time patching of retpoline even after the code is in the mainline kernel, if something ever pushes to that direction. /Jarkko
On 9/30/2020 12:25 PM, Jarkko Sakkinen wrote: > On Wed, Sep 30, 2020 at 07:09:33PM +0100, Andrew Cooper wrote: >> Honestly, my advice would be to leave it unprotected for now. Anyone >> who managed to figure out the rest of the practical userspace issues >> will probably have a much better idea of what can/should be done in this >> case. >> >> If that doesn't sit well with people, then the next best would probably >> be LFENCE; CALL *reg/mem; LFENCE to cover as many of the corner cases as >> possible without being incompatible with CET. Its not as if this >> callback is the slow aspect of entering/exiting SGX mode. >> >> ~Andrew > > I tend to agree. We cannot drive changes based on unknown unknowns. > > And I don't see why we could not add boot time patching of retpoline > even after the code is in the mainline kernel, if something ever > pushes to that direction. > > /Jarkko > I agree. It'll be compatible with CET. The overhead of LFENCE is negligible comparing to entering/exiting SGX mode.
On Wed, Sep 30, 2020 at 01:45:52PM -0700, Xing, Cedric wrote: > On 9/30/2020 12:25 PM, Jarkko Sakkinen wrote: > > On Wed, Sep 30, 2020 at 07:09:33PM +0100, Andrew Cooper wrote: > > > Honestly, my advice would be to leave it unprotected for now. Anyone > > > who managed to figure out the rest of the practical userspace issues > > > will probably have a much better idea of what can/should be done in this > > > case. > > > > > > If that doesn't sit well with people, then the next best would probably > > > be LFENCE; CALL *reg/mem; LFENCE to cover as many of the corner cases as > > > possible without being incompatible with CET. Its not as if this > > > callback is the slow aspect of entering/exiting SGX mode. > > > > > > ~Andrew > > > > I tend to agree. We cannot drive changes based on unknown unknowns. > > > > And I don't see why we could not add boot time patching of retpoline > > even after the code is in the mainline kernel, if something ever > > pushes to that direction. > > > > /Jarkko > > > I agree. It'll be compatible with CET. The overhead of LFENCE is negligible > comparing to entering/exiting SGX mode. Andrew's advice was to do "just call" as for now. If we add also lfence, what is the real-world threat scenario that we are protecting against that exposes a real visible risk that could harm the users of these patches? Please remember that: 1. We can assume that the usage model and implementation is for the callback is sane. It is something that is contained to the run-time and there is just one instance of the callback. 2. We can always harden this more later on. I do not want to add any extra bytes to the vDSO without any practical purpose and I also need to document this choice. /Jarkko
On Thu, Oct 01, 2020 at 12:22:20AM +0300, Jarkko Sakkinen wrote: > On Wed, Sep 30, 2020 at 01:45:52PM -0700, Xing, Cedric wrote: > > On 9/30/2020 12:25 PM, Jarkko Sakkinen wrote: > > > On Wed, Sep 30, 2020 at 07:09:33PM +0100, Andrew Cooper wrote: > > > > Honestly, my advice would be to leave it unprotected for now. Anyone > > > > who managed to figure out the rest of the practical userspace issues > > > > will probably have a much better idea of what can/should be done in this > > > > case. > > > > > > > > If that doesn't sit well with people, then the next best would probably > > > > be LFENCE; CALL *reg/mem; LFENCE to cover as many of the corner cases as > > > > possible without being incompatible with CET. Its not as if this > > > > callback is the slow aspect of entering/exiting SGX mode. > > > > > > > > ~Andrew > > > > > > I tend to agree. We cannot drive changes based on unknown unknowns. > > > > > > And I don't see why we could not add boot time patching of retpoline > > > even after the code is in the mainline kernel, if something ever > > > pushes to that direction. > > > > > > /Jarkko > > > > > I agree. It'll be compatible with CET. The overhead of LFENCE is negligible > > comparing to entering/exiting SGX mode. > > Andrew's advice was to do "just call" as for now. > > If we add also lfence, what is the real-world threat scenario that we > are protecting against that exposes a real visible risk that could harm > the users of these patches? > > Please remember that: > > 1. We can assume that the usage model and implementation is for the > callback is sane. It is something that is contained to the run-time > and there is just one instance of the callback. > 2. We can always harden this more later on. > > I do not want to add any extra bytes to the vDSO without any practical > purpose and I also need to document this choice. What if we just keep everything as it is? Why boot time patching cannot be part of CET-SS patch set? Why? 1. Full reptoline is the safest alternative and we have it done already. 2. Before CET-SS there is no *functional* need to do boot time patching. The usual guideline is: do not add unused cruft to the kernel. There is too much guesswork in other alternatives. If CET-SS actually lands before SGX patches, then I'm happy to add in boot time patching. AFAIK we actually could not add boot time patching without any applications for it. That's incompatible with the common practices. I'd leaving the code as it is and remark to the changelog that CET-SS will require refining the reptoline to be optional. /Jarkko
On 9/30/20 2:36 PM, Jarkko Sakkinen wrote:
> 1. Full reptoline is the safest alternative and we have it done already.
I wouldn't feel _quite_ comfortable saying this.
LFENCEs have architecturally defined behavior. Retpolines have zero
future guarantees in the architecture. I'll take an LFENCE that (versus
a retpoline) is:
1. Less code
2. Never has to be patched
3. Never causes functional problems (like with CET)
4. Has architectural semantics
The only advantage retpolines offer is that they have a well-defined
mitigations on existing microarchitectures.
To me, an LFENCE is waaaaaaay "safer".
On Wed, Sep 30, 2020 at 02:46:05PM -0700, Dave Hansen wrote: > On 9/30/20 2:36 PM, Jarkko Sakkinen wrote: > > 1. Full reptoline is the safest alternative and we have it done already. > > I wouldn't feel _quite_ comfortable saying this. > > LFENCEs have architecturally defined behavior. Retpolines have zero > future guarantees in the architecture. I'll take an LFENCE that (versus > a retpoline) is: > > 1. Less code > 2. Never has to be patched > 3. Never causes functional problems (like with CET) > 4. Has architectural semantics > > The only advantage retpolines offer is that they have a well-defined > mitigations on existing microarchitectures. > > To me, an LFENCE is waaaaaaay "safer". This is a buy-in argument for me. We know that CET-SS breaks RETPOLINE, which can be solved by applying boot time patching. However, there could be however many other features that could conflict with it in yet non-existing microarchitectures, which would make the patching process tedious to manage over time. Essentially, we will end up maintaining the backward and forward compatibility forever in software. That does not sound too motivating, agreed. "Plain" LFENCE does not possess this issue as it is fully contained to the microarchitecture. Thus, software does not have to do anything to maintain backward and forward compatibility, which means that the SGX vDSO continues to functionally work even in the old kernel images to forseeable future. To summarize, we will use LFENCE as it has overally the best characteristics for the vDSO when balancing between security and keeping things functionally working. Do I have the correct understanding of your argument? Just sanity checking before I change any part of the code or documentation. /Jarkko
diff --git a/arch/x86/entry/vdso/vsgx.S b/arch/x86/entry/vdso/vsgx.S index 8f8190ab9ed5..5f65bb22014f 100644 --- a/arch/x86/entry/vdso/vsgx.S +++ b/arch/x86/entry/vdso/vsgx.S @@ -129,7 +129,7 @@ SYM_FUNC_START(__vdso_sgx_enter_enclave) /* Load the callback pointer to %rax and invoke it via retpoline. */ mov SGX_ENCLAVE_RUN_USER_HANDLER(%rax), %rax - call .Lretpoline + call *%rax /* Undo the post-exit %rsp adjustment. */ lea 0x10(%rsp, %rbx), %rsp @@ -143,13 +143,6 @@ SYM_FUNC_START(__vdso_sgx_enter_enclave) jle .Lout jmp .Lenter_enclave -.Lretpoline: - call 2f -1: pause - lfence - jmp 1b -2: mov %rax, (%rsp) - ret .cfi_endproc _ASM_VDSO_EXTABLE_HANDLE(.Lenclu_eenter_eresume, .Lhandle_exception)
The user handler, which can be optionally used to handle enclave exceptions, is always the same global handler provided by the SGX runtime, who wants to use such a handler instead returning on exception. Thus, there is no any non-deterministic branch prediction happening. The code path is always the same and never change. Obviously, you could change it all the time purposely but for any sane real-world use that would not make any sense. Thus, remove retpoline wrapping. Cc: x86@kernel.org Cc: Haitao Huang <haitao.huang@linux.intel.com> CC: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Cedric Xing <cedric.xing@intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> --- arch/x86/entry/vdso/vsgx.S | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-)