diff mbox

arm64: kvm: use -fno-jump-tables with clang

Message ID 20180518170202.11458-1-samitolvanen@google.com (mailing list archive)
State New, archived
Headers show

Commit Message

Sami Tolvanen May 18, 2018, 5:02 p.m. UTC
Starting with LLVM r308050, clang generates a jump table with EL1
virtual addresses in __init_stage2_translation, which results in a
kernel panic when booting at EL2:

  Kernel panic - not syncing: HYP panic:
  PS:800003c9 PC:ffff0000089e6fd8 ESR:86000004
  FAR:ffff0000089e6fd8 HPFAR:0000000009825000 PAR:0000000000000000
  VCPU:000804fc20001221

  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0-rc7-dirty #3
  Hardware name: ARM Juno development board (r1) (DT)
  Call trace:
  [<ffff000008088ea4>] dump_backtrace+0x0/0x34c
  [<ffff000008089208>] show_stack+0x18/0x20
  [<ffff0000089c73ec>] dump_stack+0xc4/0xfc
  [<ffff0000080c8e1c>] panic+0x138/0x2b4
  [<ffff0000080c8ce4>] panic+0x0/0x2b4
  SMP: stopping secondary CPUs
  SMP: failed to stop secondary CPUs 0-3,5
  Kernel Offset: disabled
  CPU features: 0x002086
  Memory Limit: none
  ---[ end Kernel panic - not syncing: HYP panic:
  PS:800003c9 PC:ffff0000089e6fd8 ESR:86000004
  FAR:ffff0000089e6fd8 HPFAR:0000000009825000 PAR:0000000000000000
  VCPU:000804fc20001221

This change adds -fno-jump-tables to arm64/hyp to work around the
bug.

Suggested-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 arch/arm64/kvm/hyp/Makefile | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Marc Zyngier May 18, 2018, 6:13 p.m. UTC | #1
On 18/05/18 18:56, Nick Desaulniers wrote:
> + Andrey
> 
> On Fri, May 18, 2018 at 10:45 AM Marc Zyngier <marc.zyngier@arm.com> wrote:
>> On 18/05/18 18:40, Nick Desaulniers wrote:
>>> On Fri, May 18, 2018 at 10:30 AM Marc Zyngier <marc.zyngier@arm.com>
> wrote:
>>>> I'm going to ask the question I've asked before when this patch cropped
>>>> up (must be the 4th time now):
> 
> The previous threads for context:
> https://patchwork.kernel.org/patch/10060381/
> https://lkml.org/lkml/2018/3/16/434
> 
>>>> Is it guaranteed that this is the only case where LLVM/clang is going
> to
>>>> generate absolute addresses instead of using relative addressing?
>>>
>>> It seems like if there's requirements that only relative addressing be
>>> used, then the compiler should be told explicitly about this
> restriction,
>>> no?
> 
>> Certainly. What's the rune?
> 
> It seems like -fno-jump-tables covers all known issues and unblocks people
> from doing further work.  It sounds like you'd like some kind of stronger
> guarantee? Wont those cases still "crop up" as far as needing to annotate
> either the code, or build scripts?

What I'd really like is to apply that patch knowing that:

- you have checked that with a released version of the compiler, you
don't observe any absolute address in any of the objects that are going
to be executed at EL2 on a mainline kernel,

- you have successfully run guests with a mainline kernel,

- it works for a reasonable set of common kernel configurations
(defconfig and some of the most useful debug options),

- I can reproduce your findings with the same released compiler.

Is that the case? I don't think any of the above is completely outlandish.

	M.
Nick Desaulniers May 18, 2018, 6:31 p.m. UTC | #2
On Fri, May 18, 2018 at 11:13 AM Marc Zyngier <marc.zyngier@arm.com> wrote:
> What I'd really like is to apply that patch knowing that:

> - you have checked that with a released version of the compiler, you
> don't observe any absolute address in any of the objects that are going
> to be executed at EL2 on a mainline kernel,

To verify, we should disassemble objects from arch/arm64/kvm/hyp/*.o and
make sure we don't see absolute addresses?  I can work with Sami to get a
sense of what the before and after of this patch looks like in disassembly,
then verify those changes are pervasive.

> - you have successfully run guests with a mainline kernel,

I believe Andrey has already done this.  If he can verify (maybe during
working hours next week), then maybe we can add his Tested-by to this
patches commit message?

> - it works for a reasonable set of common kernel configurations
> (defconfig and some of the most useful debug options),

It's easy for us to test our kernel configs for Android, ChromeOS, and
defconfig.  I'd be curious to know the shortlist of "most useful debug
options" just to be a better kernel developer, personally.

> - I can reproduce your findings with the same released compiler.

Lets wait for Andrey to confirm his test setup.  On the Android side, I
think you should be able to get by with a released version, but I'd be
curious to hear from Andrey.

> Is that the case? I don't think any of the above is completely outlandish.

These are all reasonable. Thanks for the feedback.
Marc Zyngier May 19, 2018, 10:44 a.m. UTC | #3
On Fri, 18 May 2018 19:31:50 +0100,
Nick Desaulniers wrote:
> 
> On Fri, May 18, 2018 at 11:13 AM Marc Zyngier <marc.zyngier@arm.com> wrote:
> > What I'd really like is to apply that patch knowing that:
> 
> > - you have checked that with a released version of the compiler, you
> > don't observe any absolute address in any of the objects that are going
> > to be executed at EL2 on a mainline kernel,
> 
> To verify, we should disassemble objects from arch/arm64/kvm/hyp/*.o and
> make sure we don't see absolute addresses?  I can work with Sami to get a
> sense of what the before and after of this patch looks like in disassembly,
> then verify those changes are pervasive.

That seems sensible. You definitely want to look for things stored in
constant pools and subsequently used as an address. Also, you may have
to look at the .hyp.text section of the vmlinux binary, rather than
the individual *.o files, as the linker will likely rewrite things
(the compiler doesn't know about the kernel link address).

> > - you have successfully run guests with a mainline kernel,
> 
> I believe Andrey has already done this.  If he can verify (maybe
> during working hours next week), then maybe we can add his Tested-by
> to this patches commit message?

That would definitely be the right thing to do. Make sure you (or
Andrey tests with the latest released mainline kernel (4.16 for now)
or (even better) the tip of Linus' tree.

> > - it works for a reasonable set of common kernel configurations
> > (defconfig and some of the most useful debug options),
> 
> It's easy for us to test our kernel configs for Android, ChromeOS,
> and defconfig.  I'd be curious to know the shortlist of "most useful
> debug options" just to be a better kernel developer, personally.

Activate the various sanitizers, and all the tracing options, for a
start. They are the most likely to do weird things...

> > - I can reproduce your findings with the same released compiler.
> 
> Lets wait for Andrey to confirm his test setup.  On the Android side, I
> think you should be able to get by with a released version, but I'd be
> curious to hear from Andrey.

Android has all kind of additional patches, and I'm solely concerned
with mainline. If it is what Andrey runs, that's great.

> > Is that the case? I don't think any of the above is completely outlandish.
> 
> These are all reasonable. Thanks for the feedback.

Cheers,

	M.
Andrey Konovalov May 22, 2018, 5:58 p.m. UTC | #4
On Sat, May 19, 2018 at 12:44 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> That would definitely be the right thing to do. Make sure you (or
> Andrey tests with the latest released mainline kernel (4.16 for now)
> or (even better) the tip of Linus' tree.

Hi!

I can confirm that after applying this patch onto 4.17-rc4 kernel the
Odroid C2 board that I have boots (it doesn't without the patch).

This is the result of running KVM tests that I was able to find [1]:

root@odroid64:/home/odroid/kvm-unit-tests# ./run_tests.sh
PASS selftest-setup (2 tests)
PASS selftest-vectors-kernel (2 tests)
PASS selftest-vectors-user (2 tests)
PASS selftest-smp (5 tests)
PASS pci-test (1 tests)
FAIL pmu (3 tests, 3 unexpected failures)
PASS gicv2-ipi (3 tests)
SKIP gicv3-ipi (qemu-system-aarch64: Initialization of device
kvm-arm-gicv3 failed: error creating in-kernel VGIC: No such device)
PASS gicv2-active (1 tests)
SKIP gicv3-active (qemu-system-aarch64: Initialization of device
kvm-arm-gicv3 failed: error creating in-kernel VGIC: No such device)
PASS psci (4 tests)
PASS timer (8 tests)

Here is the result of running the same tests on GCC compiled kernel
(looks the same):

root@odroid64:/home/odroid/kvm-unit-tests# ./run_tests.sh
PASS selftest-setup (2 tests)
PASS selftest-vectors-kernel (2 tests)
PASS selftest-vectors-user (2 tests)
PASS selftest-smp (5 tests)
PASS pci-test (1 tests)
FAIL pmu (3 tests, 3 unexpected failures)
PASS gicv2-ipi (3 tests)
SKIP gicv3-ipi (qemu-system-aarch64: Initialization of device
kvm-arm-gicv3 failed: error creating in-kernel VGIC: No such device)
PASS gicv2-active (1 tests)
SKIP gicv3-active (qemu-system-aarch64: Initialization of device
kvm-arm-gicv3 failed: error creating in-kernel VGIC: No such device)
PASS psci (4 tests)
PASS timer (8 tests)

Tested-by: Andrey Konovalov <andreyknvl@google.com>

Thanks!

[1] https://www.linux-kvm.org/page/KVM-unit-tests
Nick Desaulniers May 22, 2018, 6:28 p.m. UTC | #5
On Fri, May 18, 2018 at 11:13 AM Marc Zyngier <marc.zyngier@arm.com> wrote:
> > - you have checked that with a released version of the compiler, you

On Tue, May 22, 2018 at 10:58 AM Andrey Konovalov <andreyknvl@google.com>
wrote:
> Tested-by: Andrey Konovalov <andreyknvl@google.com>

Hi Andrey,
Thank you very much for this report.  Can you confirm as well the version
of Clang that you were using? If it's not a binary release (built from
source), would you be able to re-confirm with a released version?
Andrey Konovalov May 23, 2018, 11:54 a.m. UTC | #6
On Tue, May 22, 2018 at 8:28 PM, Nick Desaulniers
<ndesaulniers@google.com> wrote:
> On Fri, May 18, 2018 at 11:13 AM Marc Zyngier <marc.zyngier@arm.com> wrote:
>> > - you have checked that with a released version of the compiler, you
>
> On Tue, May 22, 2018 at 10:58 AM Andrey Konovalov <andreyknvl@google.com>
> wrote:
>> Tested-by: Andrey Konovalov <andreyknvl@google.com>
>
> Hi Andrey,
> Thank you very much for this report.  Can you confirm as well the version
> of Clang that you were using?

I'm on 86852a40 ("[InstCombine] Calloc-ed strings optimizations").

> If it's not a binary release (built from
> source), would you be able to re-confirm with a released version?

Sure. Which release should I try and how do I get it?
Nick Desaulniers May 23, 2018, 5:47 p.m. UTC | #7
On Wed, May 23, 2018 at 4:54 AM Andrey Konovalov <andreyknvl@google.com>
wrote:
> On Tue, May 22, 2018 at 8:28 PM, Nick Desaulniers
> <ndesaulniers@google.com> wrote:
> > On Fri, May 18, 2018 at 11:13 AM Marc Zyngier <marc.zyngier@arm.com>
wrote:
> >> > - you have checked that with a released version of the compiler, you
> >
> > On Tue, May 22, 2018 at 10:58 AM Andrey Konovalov <andreyknvl@google.com

> > wrote:
> >> Tested-by: Andrey Konovalov <andreyknvl@google.com>
> >
> > Hi Andrey,
> > Thank you very much for this report.  Can you confirm as well the
version
> > of Clang that you were using?

> I'm on 86852a40 ("[InstCombine] Calloc-ed strings optimizations").

> > If it's not a binary release (built from
> > source), would you be able to re-confirm with a released version?

> Sure. Which release should I try and how do I get it?

Maybe clang-6.0 as the latest release (though I suspect you may run into
the recently-fixed-in-clang-7.0 "S" constraint bug that you reported).

I've had luck on debian based distributions installing from:
http://apt.llvm.org/

(These can be added to your /etc/apt/sources.list, then a `sudo apt update`
and `sudo apt install clang-6.0`)

If you're not able to add remote repositories (some employers block this ;)
), then you can find releases for download for a few different platforms:
https://releases.llvm.org/

For example, a quick:
$ mkdir llvm-6.0
$ cd !$
$ wget
https://releases.llvm.org/6.0.0/clang+llvm-6.0.0-x86_64-linux-gnu-debian8.tar.xz
$ tar xvf clang+llvm-6.0.0-x86_64-linux-gnu-debian8.tar.xz
$ ./clang+llvm-6.0.0-x86_64-linux-gnu-debian8/bin/clang-6.0 -v
clang version 6.0.0 (tags/RELEASE_600/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: .../llvm-6.0/./clang+llvm-6.0.0-x86_64-linux-gnu-debian8/bin
Found candidate GCC installation: ...
Candidate multilib: .;@m64
Selected multilib: .;@m64

Seems to work.
Andrey Konovalov May 23, 2018, 6:57 p.m. UTC | #8
On Wed, May 23, 2018 at 7:47 PM, Nick Desaulniers
<ndesaulniers@google.com> wrote:
> On Wed, May 23, 2018 at 4:54 AM Andrey Konovalov <andreyknvl@google.com>
> wrote:
>> On Tue, May 22, 2018 at 8:28 PM, Nick Desaulniers
>> <ndesaulniers@google.com> wrote:
>> > On Fri, May 18, 2018 at 11:13 AM Marc Zyngier <marc.zyngier@arm.com>
> wrote:
>> >> > - you have checked that with a released version of the compiler, you
>> >
>> > On Tue, May 22, 2018 at 10:58 AM Andrey Konovalov <andreyknvl@google.com
>
>> > wrote:
>> >> Tested-by: Andrey Konovalov <andreyknvl@google.com>
>> >
>> > Hi Andrey,
>> > Thank you very much for this report.  Can you confirm as well the
> version
>> > of Clang that you were using?
>
>> I'm on 86852a40 ("[InstCombine] Calloc-ed strings optimizations").
>
>> > If it's not a binary release (built from
>> > source), would you be able to re-confirm with a released version?
>
>> Sure. Which release should I try and how do I get it?
>
> Maybe clang-6.0 as the latest release (though I suspect you may run into
> the recently-fixed-in-clang-7.0 "S" constraint bug that you reported).

Yes, and also into the "support for "r" prefixed variables in ARM
inline assembly" issue.

Tested on upstream commit ded4c39e (before both issues were
introduced) with -fno-jump-tables patch applied using clang 6.0.

Same result, the patch helps.

>
> I've had luck on debian based distributions installing from:
> http://apt.llvm.org/
>
> (These can be added to your /etc/apt/sources.list, then a `sudo apt update`
> and `sudo apt install clang-6.0`)
>
> If you're not able to add remote repositories (some employers block this ;)
> ), then you can find releases for download for a few different platforms:
> https://releases.llvm.org/
>
> For example, a quick:
> $ mkdir llvm-6.0
> $ cd !$
> $ wget
> https://releases.llvm.org/6.0.0/clang+llvm-6.0.0-x86_64-linux-gnu-debian8.tar.xz
> $ tar xvf clang+llvm-6.0.0-x86_64-linux-gnu-debian8.tar.xz
> $ ./clang+llvm-6.0.0-x86_64-linux-gnu-debian8/bin/clang-6.0 -v
> clang version 6.0.0 (tags/RELEASE_600/final)
> Target: x86_64-unknown-linux-gnu
> Thread model: posix
> InstalledDir: .../llvm-6.0/./clang+llvm-6.0.0-x86_64-linux-gnu-debian8/bin
> Found candidate GCC installation: ...
> Candidate multilib: .;@m64
> Selected multilib: .;@m64
>
> Seems to work.
> --
> Thanks,
> ~Nick Desaulniers
diff mbox

Patch

diff --git a/arch/arm64/kvm/hyp/Makefile b/arch/arm64/kvm/hyp/Makefile
index 4313f74753332..745b3255e7832 100644
--- a/arch/arm64/kvm/hyp/Makefile
+++ b/arch/arm64/kvm/hyp/Makefile
@@ -5,6 +5,10 @@ 
 
 ccflags-y += -fno-stack-protector -DDISABLE_BRANCH_PROFILING
 
+ifeq ($(cc-name),clang)
+ccflags-y += -fno-jump-tables
+endif
+
 KVM=../../../../virt/kvm
 
 obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v3-sr.o