Message ID | 20200219000817.195049-1-samitolvanen@google.com (mailing list archive) |
---|---|
Series | add support for Clang's Shadow Call Stack |
Hi Sami,

(CC: +Marc)

On 19/02/2020 00:08, Sami Tolvanen wrote:
> This patch series adds support for Clang's Shadow Call Stack
> (SCS) mitigation, which uses a separately allocated shadow stack
> to protect against return address overwrites.

I took this for a spin on some real hardware. cpu-idle, kexec, hibernate etc. all work
great... but starting a KVM guest causes the CPU to get stuck in EL2.

With CONFIG_SHADOW_CALL_STACK disabled, this doesn't happen ... so it's something about the
feature being enabled.

I'm using clang-9 from debian bullseye/sid. (I tried to build tip of tree ... that doesn't
go so well on arm64)

KVM takes an instruction abort from EL2 to EL2, because some of the code it runs is not
mapped at EL2:

| ffffa00011588308 <__kvm_tlb_flush_local_vmid>:
| ffffa00011588308:   d10103ff   sub   sp, sp, #0x40
| ffffa0001158830c:   f90013f3   str   x19, [sp, #32]
| ffffa00011588310:   a9037bfd   stp   x29, x30, [sp, #48]
| ffffa00011588314:   9100c3fd   add   x29, sp, #0x30
| ffffa00011588318:   97ae18bf   bl    ffffa0001010e614 <__kern_hyp_va>

INSTRUCTION ABORT!

| ffffa0001158831c:   f9400000   ldr   x0, [x0]
| ffffa00011588320:   97ae18bd   bl    ffffa0001010e614 <__kern_hyp_va>
| ffffa00011588324:   aa0003f3   mov   x19, x0
| ffffa00011588328:   97ae18c1   bl    ffffa0001010e62c <has_vhe>

__kern_hyp_va() is a static-inline function that is patched wherever it appears at boot with
the EL2 ASLR values; it converts a kernel linear-map address to its EL2 KVM alias:

| ffffa0001010dc5c <__kern_hyp_va>:
| ffffa0001010dc5c:   92400000   and   x0, x0, #0x1
| ffffa0001010dc60:   93c00400   ror   x0, x0, #1
| ffffa0001010dc64:   91000000   add   x0, x0, #0x0
| ffffa0001010dc68:   91400000   add   x0, x0, #0x0, lsl #12
| ffffa0001010dc6c:   93c0fc00   ror   x0, x0, #63
| ffffa0001010dc70:   d65f03c0   ret

The problem here is where __kern_hyp_va() is. It's outside the __hyp_text section:

| morse@eglon:~/kernel/linux-pigs$ nm -s vmlinux | grep hyp_text
| ffffa0001158b800 T __hyp_text_end
| ffffa000115838a0 T __hyp_text_start

If I disable CONFIG_SHADOW_CALL_STACK in Kconfig, I get:

| ffffa00011527fe0 <__kvm_tlb_flush_local_vmid>:
| ffffa00011527fe0:   d100c3ff   sub   sp, sp, #0x30
| ffffa00011527fe4:   a9027bfd   stp   x29, x30, [sp, #32]
| ffffa00011527fe8:   910083fd   add   x29, sp, #0x20
| ffffa00011527fec:   92400000   and   x0, x0, #0x1
| ffffa00011527ff0:   93c00400   ror   x0, x0, #1
| ffffa00011527ff4:   91000000   add   x0, x0, #0x0
| ffffa00011527ff8:   91400000   add   x0, x0, #0x0, lsl #12
| ffffa00011527ffc:   93c0fc00   ror   x0, x0, #63
| ffffa00011528000:   f9400000   ldr   x0, [x0]
| ffffa00011528004:   910023e1   add   x1, sp, #0x8
| ffffa00011528008:   92400000   and   x0, x0, #0x1
| ffffa0001152800c:   93c00400   ror   x0, x0, #1
| ffffa00011528010:   91000000   add   x0, x0, #0x0
| ffffa00011528014:   91400000   add   x0, x0, #0x0, lsl #12
| ffffa00011528018:   93c0fc00   ror   x0, x0, #63
| ffffa0001152801c:   97ffff78   bl    ffffa00011527dfc <__tlb_switch_>
| ffffa00011528020:   d508871f   tlbi  vmalle1
| ffffa00011528024:   d503201f   nop

This looks like reserving x18 is causing Clang to not-inline the __kern_hyp_va() calls,
losing the vitally important section information. (I can see why the compiler thinks this
is fair)

Is this a known, er, thing, with clang-9?

From eyeballing the disassembly, __always_inline on __kern_hyp_va() is enough of a hint to
stop this, ... with this configuration of clang-9. But KVM still doesn't work, so it isn't
the only inlining decision KVM relies on that is changed by SCS.

I suspect repainting all KVM's 'inline' with __always_inline will fix it. (yuck!) I'll try
tomorrow.

I don't think keeping the compiler-flags as they are today for KVM is the right thing to
do; it could lead to x18 getting corrupted with the shared vhe/non-vhe code. Splitting
that code up would lead to duplication.

(hopefully objtool will be able to catch these at build time)

Thanks,

James

> SCS is currently supported only on arm64, where the compiler
> requires the x18 register to be reserved for holding the current
> task's shadow stack pointer.

> Changes in v8:
> - Added __noscs to __hyp_text instead of filtering SCS flags from
>   the entire arch/arm64/kvm/hyp directory
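The address comparison behind "it's outside the __hyp_text section" can be spelled out as a small range check. This is only an illustrative sketch: the bounds and the two addresses are copied from the nm and disassembly output quoted in the message above, while the helper name in_hyp_text() and the macro names are hypothetical and not part of the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Section bounds copied from the `nm -s vmlinux | grep hyp_text` output above. */
#define SKETCH_HYP_TEXT_START 0xffffa000115838a0ULL
#define SKETCH_HYP_TEXT_END   0xffffa0001158b800ULL

/*
 * Hypothetical helper (not a kernel function): returns true when addr
 * lies inside [__hyp_text_start, __hyp_text_end), i.e. in the range
 * of code that the message describes as mapped at EL2.
 */
bool in_hyp_text(uint64_t addr)
{
	return addr >= SKETCH_HYP_TEXT_START && addr < SKETCH_HYP_TEXT_END;
}
```

With these numbers, __kvm_tlb_flush_local_vmid at ffffa00011588308 falls inside the range, while the bl target ffffa0001010e614 does not, which matches the reported instruction abort at the first call.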
On Wed, 19 Feb 2020 at 19:38, James Morse <james.morse@arm.com> wrote:
>
> [...]
>
> This looks like reserving x18 is causing Clang to not-inline the __kern_hyp_va() calls,
> losing the vitally important section information. (I can see why the compiler thinks this
> is fair)
>
> Is this a known, er, thing, with clang-9?
>
> From eyeballing the disassembly __always_inline on __kern_hyp_va() is enough of a hint to
> stop this, ... with this configuration of clang-9. But KVM still doesn't work, so it isn't
> the only inlining decision KVM relies on that is changed by SCS.
>
> I suspect repainting all KVM's 'inline' with __always_inline will fix it. (yuck!) I'll try
> tomorrow.
>

If we are relying on the inlining for correctness, these should have
been __always_inline to begin with, and yuckness aside, I don't think
there's anything wrong with that.

> I don't think keeping the compiler-flags as they are today for KVM is the right thing to
> do, it could lead to x18 getting corrupted with the shared vhe/non-vhe code. Splitting
> that code up would lead to duplication.
>
> (hopefully objtool will be able to catch these at build time)
>

I don't see why we should selectively en/disable the reservation of
x18 (as I argued in the context of the EFI libstub patch as well).
Just reserving it everywhere shouldn't hurt performance, and removes
the need to prove that we reserved it in all the right places.
On Wed, Feb 19, 2020 at 10:38 AM James Morse <james.morse@arm.com> wrote:
> This looks like reserving x18 is causing Clang to not-inline the __kern_hyp_va() calls,
> losing the vitally important section information. (I can see why the compiler thinks this
> is fair)

Thanks for catching this. This doesn't appear to be caused by
reserving x18, it looks like SCS itself is causing clang to avoid
inlining these. If I add __noscs to __kern_hyp_va(), clang inlines the
function again. __always_inline also works, as you pointed out.

> Is this a known, er, thing, with clang-9?

I can reproduce this with ToT clang as well.

> I suspect repainting all KVM's 'inline' with __always_inline will fix it. (yuck!) I'll try
> tomorrow.

I think switching to __always_inline is the correct solution here.

Sami
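For readers following along, the difference between the two spellings can be sketched in plain C. Everything below is an assumption made for illustration: the sketch_* macros, the section name, the mask value and the function names are hypothetical stand-ins, and the arithmetic is not the real __kern_hyp_va() (which, as described earlier in the thread, is patched at boot). The only point is that plain 'inline' leaves the decision to the compiler, so an out-of-line copy may be emitted in the default text section, while always_inline forces the body into the caller and therefore into the caller's section.

```c
#include <stdint.h>

/* Hypothetical stand-ins; the kernel provides its own __always_inline and __hyp_text. */
#define sketch_always_inline inline __attribute__((__always_inline__))
#ifdef __ELF__
#define sketch_hyp_text __attribute__((__section__(".hyp.text.sketch")))
#else
#define sketch_hyp_text /* section placement is only sketched for ELF targets */
#endif

/* Placeholder mask, not the value the kernel patches in at boot. */
#define SKETCH_VA_MASK 0x0000ffffffffffffULL

/* 'inline' is a hint: the compiler may still emit a call to an
 * out-of-line copy, which would live outside the caller's section. */
static inline uint64_t va_hint(uint64_t v)
{
	return v & SKETCH_VA_MASK;
}

/* always_inline: the body is placed in every caller, so it ends up
 * in whatever section the caller was assigned to. */
static sketch_always_inline uint64_t va_forced(uint64_t v)
{
	return v & SKETCH_VA_MASK;
}

/* Both callers are placed in the sketch section; only the second one
 * is guaranteed to contain the address conversion itself. */
sketch_hyp_text uint64_t flush_with_hint(uint64_t addr)
{
	return va_hint(addr);
}

sketch_hyp_text uint64_t flush_with_forced(uint64_t addr)
{
	return va_forced(addr);
}
```

Both functions compute the same value; the difference only shows up in where the generated instructions live, which is what the disassembly earlier in the thread illustrates.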
On 2020-02-19 18:53, Ard Biesheuvel wrote:
> On Wed, 19 Feb 2020 at 19:38, James Morse <james.morse@arm.com> wrote:
>>
>> [...]
>>
>> I suspect repainting all KVM's 'inline' with __always_inline will fix
>> it. (yuck!) I'll try tomorrow.
>>
>
> If we are relying on the inlining for correctness, these should have
> been __always_inline to begin with, and yuckness aside, I don't think
> there's anything wrong with that.

Agreed. Not having __always_inline is definitely an oversight, and we
should fix it ASAP (hell knows what another compiler could produce...).
And the whole EL2 aliasing is utter yuck already, this isn't going to
make things much worse...

I can queue something today for __kern_hyp_va(), but I'd like to make
sure there aren't other silly mistakes like this one somewhere...

>> I don't think keeping the compiler-flags as they are today for KVM is
>> the right thing to do, it could lead to x18 getting corrupted with the
>> shared vhe/non-vhe code. Splitting that code up would lead to
>> duplication.
>>
>> (hopefully objtool will be able to catch these at build time)
>>
>
> I don't see why we should selectively en/disable the reservation of
> x18 (as I argued in the context of the EFI libstub patch as well).
> Just reserving it everywhere shouldn't hurt performance, and removes
> the need to prove that we reserved it in all the right places.

I'd certainly like to keep things simple if we can.

M.