[v2,00/10] Independent per-CPU data section for nVHE

Message ID 20200903091712.46456-1-dbrazdil@google.com (mailing list archive)

Message

David Brazdil Sept. 3, 2020, 9:17 a.m. UTC
Introduce '.hyp.data..percpu' as part of ongoing effort to make nVHE
hyp code self-contained and independent of the rest of the kernel.

The series builds on top of the "Split off nVHE hyp code" series which
used objcopy to rename '.text' to '.hyp.text' and prefix all ELF
symbols with '__kvm_nvhe' for all object files under kvm/hyp/nvhe.
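
For illustration only (this is not code from the series), the effect of the
symbol prefixing can be sketched in plain C. The kvm_nvhe_sym() macro is the
one used in arch/arm64; the variable 'demo_counter' and the accessor function
are hypothetical, and the prefix that objcopy would normally add is written
out by hand:

```c
/* Same token-pasting helper as in the arm64 KVM headers. */
#define kvm_nvhe_sym(sym)	__kvm_nvhe_##sym

/*
 * Stand-in for a variable defined in a file under kvm/hyp/nvhe. In the real
 * build the prefix is added by objcopy; here it is spelled out manually.
 * 'demo_counter' is a hypothetical name used only for this sketch.
 */
int __kvm_nvhe_demo_counter = 42;

/* Kernel-proper side: refer to the nVHE symbol through the macro. */
int read_nvhe_demo_counter(void)
{
	return kvm_nvhe_sym(demo_counter);
}
```

The point of the macro is that kernel proper never spells the prefix itself,
so the naming scheme can change in one place.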

The series is structured as follows:

 - patch 1: Modify generic PERCPU_* linker script macros to make it
     possible to define multiple per-CPU ELF sections with prefixed
     section and symbol names.

 - patch 2: Improve existing hyp build rules. This could be sent and
     merged independently of the per-CPU changes, but this series
     builds on it.

 - patches 3-4: Replace hyp helpers for accessing per-CPU variables
     with common helpers modified to work correctly in hyp. Per-CPU
     variables can now be accessed with one API anywhere.

 - patches 5-7: Where VHE and nVHE use per-CPU variables defined in
     kernel proper, move their definitions to hyp/ where they are
     duplicated and owned by VHE/nVHE, respectively. Non-VHE hyp code
     now refers only to per-CPU variables defined in its source files.
     Helpers are added so that kernel proper can continue to access
     nVHE hyp variables, same way as it does with other nVHE symbols.

 - patches 8-10: Introduce '.hyp.data..percpu' ELF section and allocate
     memory for every CPU core during KVM init. All nVHE per-CPU state
     is now grouped together in ELF and in memory. Introducing a new
     per-CPU variable does not require adding new memory mappings any
     more. nVHE hyp code cannot accidentally refer to kernel-proper
     per-CPU data as it only has the pointer to its own per-CPU memory.
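
As a rough user-space sketch of the scheme in patches 8-10 (not the kernel
implementation): one template region is copied once per CPU at init time, and
a per-CPU access is "that CPU's base pointer plus the variable's offset". All
names below (NR_CPUS_DEMO, the struct and its fields, the demo_* functions)
are hypothetical and exist only for this example:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NR_CPUS_DEMO	4	/* hypothetical CPU count for the sketch */

/* Stand-in for the contents of the '.hyp.data..percpu' template section. */
struct hyp_percpu_template {
	unsigned long host_ctxt;	/* hypothetical per-CPU variables */
	int ssbd_required;
};

static const struct hyp_percpu_template percpu_template = {
	.host_ctxt = 0,
	.ssbd_required = 1,
};

/* One private copy of the template per CPU, filled in at init time. */
static void *percpu_base[NR_CPUS_DEMO];

/* Allocate and initialise every CPU's copy; returns 0 on success. */
int demo_init_percpu(void)
{
	for (int cpu = 0; cpu < NR_CPUS_DEMO; cpu++) {
		percpu_base[cpu] = malloc(sizeof(percpu_template));
		if (!percpu_base[cpu]) {
			/* Undo the allocations made so far. */
			while (cpu-- > 0) {
				free(percpu_base[cpu]);
				percpu_base[cpu] = NULL;
			}
			return -1;
		}
		memcpy(percpu_base[cpu], &percpu_template,
		       sizeof(percpu_template));
	}
	return 0;
}

/* A per-CPU access is "this CPU's base + offset of the variable". */
int *demo_ssbd_required(int cpu)
{
	return (int *)((char *)percpu_base[cpu] +
		       offsetof(struct hyp_percpu_template, ssbd_required));
}
```

Because code holding only its own base pointer can reach nothing outside its
copy, a write through demo_ssbd_required(1) leaves CPU 0's value untouched.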

Patches are rebased on v5.9-rc3 and available in branch 'topic/percpu-v2' at:
    https://android-kvm.googlesource.com/linux

Changes v1 -> v2:
  * 5.9-rc3 base
  * partially link hyp code, add linker script

David Brazdil (10):
  Macros to override naming of percpu symbols and sections
  kvm: arm64: Partially link nVHE hyp code, simplify HYPCOPY
  kvm: arm64: Remove __hyp_this_cpu_read
  kvm: arm64: Remove hyp_adr/ldr_this_cpu
  kvm: arm64: Add helpers for accessing nVHE hyp per-cpu vars
  kvm: arm64: Duplicate arm64_ssbd_callback_required for nVHE hyp
  kvm: arm64: Create separate instances of kvm_host_data for VHE/nVHE
  kvm: arm64: Mark hyp stack pages reserved
  kvm: arm64: Set up hyp percpu data for nVHE
  kvm: arm64: Remove unnecessary hyp mappings

 arch/arm64/include/asm/assembler.h        |  27 ++++--
 arch/arm64/include/asm/kvm_asm.h          |  74 ++++++++-------
 arch/arm64/include/asm/kvm_host.h         |   2 +-
 arch/arm64/include/asm/kvm_mmu.h          |  23 ++---
 arch/arm64/include/asm/percpu.h           |  33 ++++++-
 arch/arm64/include/asm/sections.h         |   1 +
 arch/arm64/kernel/image-vars.h            |   2 -
 arch/arm64/kernel/vmlinux.lds.S           |  10 ++
 arch/arm64/kvm/arm.c                      | 110 ++++++++++++++++++----
 arch/arm64/kvm/hyp/hyp-entry.S            |   2 +-
 arch/arm64/kvm/hyp/include/hyp/debug-sr.h |   4 +-
 arch/arm64/kvm/hyp/include/hyp/switch.h   |   8 +-
 arch/arm64/kvm/hyp/nvhe/Makefile          |  56 +++++------
 arch/arm64/kvm/hyp/nvhe/hyp.lds.S         |  19 ++++
 arch/arm64/kvm/hyp/nvhe/switch.c          |   8 +-
 arch/arm64/kvm/hyp/vhe/switch.c           |   5 +-
 arch/arm64/kvm/hyp/vhe/sysreg-sr.c        |   4 +-
 arch/arm64/kvm/pmu.c                      |  13 ++-
 include/asm-generic/vmlinux.lds.h         |  40 +++++---
 19 files changed, 304 insertions(+), 137 deletions(-)
 create mode 100644 arch/arm64/kvm/hyp/nvhe/hyp.lds.S

--
2.28.0.402.g5ffc5be6b7-goog

Comments

Will Deacon Sept. 14, 2020, 5:40 p.m. UTC | #1
Hi David,

On Thu, Sep 03, 2020 at 11:17:02AM +0200, David Brazdil wrote:
> Introduce '.hyp.data..percpu' as part of ongoing effort to make nVHE
> hyp code self-contained and independent of the rest of the kernel.
> 
> The series builds on top of the "Split off nVHE hyp code" series which
> used objcopy to rename '.text' to '.hyp.text' and prefix all ELF
> symbols with '__kvm_nvhe' for all object files under kvm/hyp/nvhe.

I've been playing around with this series this afternoon, trying to see
if we can reduce the coupling between the nVHE code and the core code. I've
ended up with the diff below on top of your series, but I think it actually
removes the need to change the core code at all. The idea is to collapse
the percpu sections during prelink, and then we can just deal with the
resulting data section a bit like we do for .hyp.text already.

Have I missed something critical?

Cheers,

Will

--->8

diff --git a/arch/arm64/include/asm/hyp_image.h b/arch/arm64/include/asm/hyp_image.h
new file mode 100644
index 000000000000..40bbf2ddb50f
--- /dev/null
+++ b/arch/arm64/include/asm/hyp_image.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ASM_HYP_IMAGE_H
+#define __ASM_HYP_IMAGE_H
+
+/*
+ * KVM nVHE code has its own symbol namespace prefixed with __kvm_nvhe_, to
+ * separate it from the kernel proper.
+ */
+#define kvm_nvhe_sym(sym)	__kvm_nvhe_##sym
+
+#ifdef LINKER_SCRIPT
+/*
+ * Defines an ELF hyp section from input section @NAME and its subsections.
+ */
+#define HYP_SECTION(NAME)	.hyp ## NAME : { *(NAME NAME ## .*) }
+#define KVM_NVHE_ALIAS(sym)	kvm_nvhe_sym(sym) = sym;
+#endif	/* LINKER_SCRIPT */
+
+#endif	/* __ASM_HYP_IMAGE_H */
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index c87111c25d9e..e0e1e404f6eb 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -7,6 +7,7 @@
 #ifndef __ARM_KVM_ASM_H__
 #define __ARM_KVM_ASM_H__
 
+#include <asm/hyp_image.h>
 #include <asm/virt.h>
 
 #define	VCPU_WORKAROUND_2_FLAG_SHIFT	0
@@ -42,13 +43,6 @@
 
 #include <linux/mm.h>
 
-/*
- * Translate name of a symbol defined in nVHE hyp to the name seen
- * by kernel proper. All nVHE symbols are prefixed by the build system
- * to avoid clashes with the VHE variants.
- */
-#define kvm_nvhe_sym(sym)	__kvm_nvhe_##sym
-
 #define DECLARE_KVM_VHE_SYM(sym)	extern char sym[]
 #define DECLARE_KVM_NVHE_SYM(sym)	extern char kvm_nvhe_sym(sym)[]
 
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 21307e2db3fc..f16205300dbc 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -54,15 +54,11 @@ __efistub__ctype		= _ctype;
 #ifdef CONFIG_KVM
 
 /*
- * KVM nVHE code has its own symbol namespace prefixed with __kvm_nvhe_, to
- * separate it from the kernel proper. The following symbols are legally
- * accessed by it, therefore provide aliases to make them linkable.
- * Do not include symbols which may not be safely accessed under hypervisor
- * memory mappings.
+ * The following symbols are legally accessed by the KVM nVHE code, therefore
+ * provide aliases to make them linkable. Do not include symbols which may not
+ * be safely accessed under hypervisor memory mappings.
  */
 
-#define KVM_NVHE_ALIAS(sym) __kvm_nvhe_##sym = sym;
-
 /* Alternative callbacks for init-time patching of nVHE hyp code. */
 KVM_NVHE_ALIAS(arm64_enable_wa2_handling);
 KVM_NVHE_ALIAS(kvm_patch_vector_branch);
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 5904a4de9f40..c06e6860adfd 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -9,27 +9,37 @@
 
 #include <asm-generic/vmlinux.lds.h>
 #include <asm/cache.h>
+#include <asm/hyp_image.h>
 #include <asm/kernel-pgtable.h>
 #include <asm/memory.h>
 #include <asm/page.h>
 
 #include "image.h"
 
-#define __CONCAT3(x, y, z) x ## y ## z
-#define CONCAT3(x, y, z) __CONCAT3(x, y, z)
-
 OUTPUT_ARCH(aarch64)
 ENTRY(_text)
 
 jiffies = jiffies_64;
 
-
+#ifdef CONFIG_KVM
 #define HYPERVISOR_EXTABLE					\
 	. = ALIGN(SZ_8);					\
 	__start___kvm_ex_table = .;				\
 	*(__kvm_ex_table)					\
 	__stop___kvm_ex_table = .;
 
+#define HYPERVISOR_PERCPU_SECTION			\
+	. = ALIGN(PAGE_SIZE);				\
+	.hyp.data..percpu : {				\
+		kvm_nvhe_sym(__per_cpu_start) = .;	\
+		*(.hyp.data..percpu)			\
+		kvm_nvhe_sym(__per_cpu_end) = .;	\
+	}
+#else
+#define HYPERVISOR_EXTABLE
+#define HYPERVISOR_PERCPU_SECTION
+#endif
+
 #define HYPERVISOR_TEXT					\
 	/*						\
 	 * Align to 4 KB so that			\
@@ -193,13 +203,7 @@ SECTIONS
 	}
 
 	PERCPU_SECTION(L1_CACHE_BYTES)
-
-	/* KVM nVHE per-cpu section */
-	#undef PERCPU_SECTION_NAME
-	#undef PERCPU_SYMBOL_NAME
-	#define PERCPU_SECTION_NAME(suffix)	CONCAT3(.hyp, PERCPU_SECTION_BASE_NAME, suffix)
-	#define PERCPU_SYMBOL_NAME(name)	__kvm_nvhe_ ## name
-	PERCPU_SECTION(L1_CACHE_BYTES)
+	HYPERVISOR_PERCPU_SECTION
 
 	.rela.dyn : ALIGN(8) {
 		*(.rela .rela*)
diff --git a/arch/arm64/kvm/hyp/nvhe/.gitignore b/arch/arm64/kvm/hyp/nvhe/.gitignore
new file mode 100644
index 000000000000..695d73d0249e
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+hyp.lds
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 1b2fbb19f3e8..decc2373aa6c 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -33,8 +33,8 @@ $(obj)/hyp.lds: $(src)/hyp.lds.S FORCE
 
 # 3) Partially link all '.hyp.o' files and apply the linker script.
 #    Prefixes names of ELF sections with '.hyp', eg. '.hyp.text'.
-LDFLAGS_hyp.tmp.o := -r -T $(obj)/hyp.lds
-$(obj)/hyp.tmp.o: $(addprefix $(obj)/,$(hyp-obj)) $(obj)/hyp.lds FORCE
+LDFLAGS_hyp.tmp.o := -r -T
+$(obj)/hyp.tmp.o: $(obj)/hyp.lds $(addprefix $(obj)/,$(hyp-obj)) FORCE
 	$(call if_changed,ld)
 
 # 4) Produce the final 'hyp.o', ready to be linked into 'vmlinux'.
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
index 7d8c3fa004f4..8121f2a6aedf 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
@@ -4,16 +4,9 @@
  * Written by David Brazdil <dbrazdil@google.com>
  */
 
-/*
- * Defines an ELF hyp section from input section @NAME and its subsections.
- */
-#define HYP_SECTION(NAME) .hyp##NAME : { *(NAME NAME##.[0-9a-zA-Z_]*) }
+#include <asm/hyp_image.h>
 
 SECTIONS {
 	HYP_SECTION(.text)
 	HYP_SECTION(.data..percpu)
-	HYP_SECTION(.data..percpu..first)
-	HYP_SECTION(.data..percpu..page_aligned)
-	HYP_SECTION(.data..percpu..read_mostly)
-	HYP_SECTION(.data..percpu..shared_aligned)
 }

David Brazdil Sept. 16, 2020, 11:54 a.m. UTC | #2
Hi Will,

On Mon, Sep 14, 2020 at 06:40:09PM +0100, Will Deacon wrote:
> Hi David,
> 
> On Thu, Sep 03, 2020 at 11:17:02AM +0200, David Brazdil wrote:
> > Introduce '.hyp.data..percpu' as part of ongoing effort to make nVHE
> > hyp code self-contained and independent of the rest of the kernel.
> > 
> > The series builds on top of the "Split off nVHE hyp code" series which
> > used objcopy to rename '.text' to '.hyp.text' and prefix all ELF
> > symbols with '__kvm_nvhe' for all object files under kvm/hyp/nvhe.
> 
> I've been playing around with this series this afternoon, trying to see
> if we can reduce the coupling between the nVHE code and the core code. I've
> ended up with the diff below on top of your series, but I think it actually
> removes the need to change the core code at all. The idea is to collapse
> the percpu sections during prelink, and then we can just deal with the
> resulting data section a bit like we do for .hyp.text already.
> 
> Have I missed something critical?

I was wondering whether this approach would be sufficient as well because of
the simplicity. We'd just need to be careful about correctly preserving the
semantics of the different .data..percpu..* sections.

For instance, I've noticed you make .hyp.data..percpu page-aligned rather than
cacheline-aligned. We need that for stage-2 unmapping but it also happens to
correctly align DEFINE_PER_CPU_PAGE_ALIGNED variables when collapsed into the
single hyp section. The reason why I ended up reusing the global macro was to
avoid introducing subtleties like that into the arm64 linker script. Do you
think it's a worthwhile trade off?
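
To make the alignment point concrete, here is a small C sketch of the
arithmetic behind '. = ALIGN(PAGE_SIZE)'. The 4 KiB page size and 64-byte
cache line are assumptions for the example, and the demo_* helpers are
hypothetical, not kernel functions:

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE		4096u	/* assumed 4 KiB pages */
#define DEMO_L1_CACHE_BYTES	64u	/* assumed cache line size */

/* Round addr up to a power-of-two alignment, like ALIGN() in a linker script. */
uint64_t demo_align_up(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/* Non-zero if addr is a multiple of the power-of-two alignment. */
int demo_is_aligned(uint64_t addr, uint64_t align)
{
	return (addr & (align - 1)) == 0;
}
```

Any address returned by demo_align_up(addr, DEMO_PAGE_SIZE) also passes
demo_is_aligned() for DEMO_L1_CACHE_BYTES, since the page size is a multiple
of the cache line size. The reverse does not hold, which is the subtlety.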

One place where this approach doesn't work is DEFINE_PER_CPU_FIRST. But I'm
guessing that's something we can live without.

I was also wondering about another approach - using the PERCPU_SECTION macro
unchanged in the hyp linker script. It would lay out a single .data..percpu and
we would then prefix it with .hyp and the symbols with __kvm_nvhe_ as with
everything else. WDYT? Haven't tried that yet, could be a naive idea. 

Thanks for reviewing,
David

David Brazdil Sept. 16, 2020, 12:24 p.m. UTC | #3
> I was also wondering about another approach - using the PERCPU_SECTION macro
> unchanged in the hyp linker script. It would lay out a single .data..percpu and
> we would then prefix it with .hyp and the symbols with __kvm_nvhe_ as with
> everything else. WDYT? Haven't tried that yet, could be a naive idea. 

Seems to work. Can't use PERCPU_SECTION directly because then we couldn't
rename it in the same linker script, but if we just unwrap that one layer
we can use PERCPU_INPUT. No global macro changes needed.

Let me know what you think.

------8<------
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 5904a4de9f40..9e6bf21268f1 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -195,11 +195,9 @@ SECTIONS
        PERCPU_SECTION(L1_CACHE_BYTES)

        /* KVM nVHE per-cpu section */
-       #undef PERCPU_SECTION_NAME
-       #undef PERCPU_SYMBOL_NAME
-       #define PERCPU_SECTION_NAME(suffix)     CONCAT3(.hyp, PERCPU_SECTION_BASE_NAME, suffix)
-       #define PERCPU_SYMBOL_NAME(name)        __kvm_nvhe_ ## name
-       PERCPU_SECTION(L1_CACHE_BYTES)
+       . = ALIGN(PAGE_SIZE);
+       .hyp.data..percpu : { *(.hyp.data..percpu) }
+       . = ALIGN(PAGE_SIZE);

        .rela.dyn : ALIGN(8) {
                *(.rela .rela*)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
index 7d8c3fa004f4..1d8e4f7edc29 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
@@ -4,6 +4,10 @@
  * Written by David Brazdil <dbrazdil@google.com>
  */

+#include <asm-generic/vmlinux.lds.h>
+#include <asm/cache.h>
+#include <asm/memory.h>
+
 /*
  * Defines an ELF hyp section from input section @NAME and its subsections.
  */
@@ -11,9 +15,9 @@

 SECTIONS {
        HYP_SECTION(.text)
-       HYP_SECTION(.data..percpu)
-       HYP_SECTION(.data..percpu..first)
-       HYP_SECTION(.data..percpu..page_aligned)
-       HYP_SECTION(.data..percpu..read_mostly)
-       HYP_SECTION(.data..percpu..shared_aligned)
+
+       .hyp..data..percpu : {
+               __per_cpu_load = .;
+               PERCPU_INPUT(L1_CACHE_BYTES)
+       }
 }
-----8<------

David

Will Deacon Sept. 16, 2020, 12:39 p.m. UTC | #4
On Wed, Sep 16, 2020 at 01:24:12PM +0100, David Brazdil wrote:
> > I was also wondering about another approach - using the PERCPU_SECTION macro
> > unchanged in the hyp linker script. It would lay out a single .data..percpu and
> > we would then prefix it with .hyp and the symbols with __kvm_nvhe_ as with
> > everything else. WDYT? Haven't tried that yet, could be a naive idea. 
> 
> Seems to work. Can't use PERCPU_SECTION directly because then we couldn't
> rename it in the same linker script, but if we just unwrap that one layer
> we can use PERCPU_INPUT. No global macro changes needed.
> 
> Let me know what you think.
> 
> ------8<------
> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> index 5904a4de9f40..9e6bf21268f1 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -195,11 +195,9 @@ SECTIONS
>         PERCPU_SECTION(L1_CACHE_BYTES)
> 
>         /* KVM nVHE per-cpu section */
> -       #undef PERCPU_SECTION_NAME
> -       #undef PERCPU_SYMBOL_NAME
> -       #define PERCPU_SECTION_NAME(suffix)     CONCAT3(.hyp, PERCPU_SECTION_BASE_NAME, suffix)
> -       #define PERCPU_SYMBOL_NAME(name)        __kvm_nvhe_ ## name
> -       PERCPU_SECTION(L1_CACHE_BYTES)
> +       . = ALIGN(PAGE_SIZE);
> +       .hyp.data..percpu : { *(.hyp.data..percpu) }
> +       . = ALIGN(PAGE_SIZE);
> 
>         .rela.dyn : ALIGN(8) {
>                 *(.rela .rela*)
> diff --git a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
> index 7d8c3fa004f4..1d8e4f7edc29 100644
> --- a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
> +++ b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
> @@ -4,6 +4,10 @@
>   * Written by David Brazdil <dbrazdil@google.com>
>   */
> 
> +#include <asm-generic/vmlinux.lds.h>
> +#include <asm/cache.h>
> +#include <asm/memory.h>
> +
>  /*
>   * Defines an ELF hyp section from input section @NAME and its subsections.
>   */
> @@ -11,9 +15,9 @@
> 
>  SECTIONS {
>         HYP_SECTION(.text)
> -       HYP_SECTION(.data..percpu)
> -       HYP_SECTION(.data..percpu..first)
> -       HYP_SECTION(.data..percpu..page_aligned)
> -       HYP_SECTION(.data..percpu..read_mostly)
> -       HYP_SECTION(.data..percpu..shared_aligned)
> +
> +       .hyp..data..percpu : {

Too many '.'s here?

> +               __per_cpu_load = .;

I don't think we need this symbol.

Otherwise, idea looks good to me. Can you respin like this, but also
incorporating some of the cleanup in the diff I posted, please?

Will

David Brazdil Sept. 16, 2020, 12:40 p.m. UTC | #5
> >  SECTIONS {
> >         HYP_SECTION(.text)
> > -       HYP_SECTION(.data..percpu)
> > -       HYP_SECTION(.data..percpu..first)
> > -       HYP_SECTION(.data..percpu..page_aligned)
> > -       HYP_SECTION(.data..percpu..read_mostly)
> > -       HYP_SECTION(.data..percpu..shared_aligned)
> > +
> > +       .hyp..data..percpu : {
> 
> Too many '.'s here?
Oops

> 
> > +               __per_cpu_load = .;
> 
> I don't think we need this symbol.
True

> 
> Otherwise, idea looks good to me. Can you respin like this, but also
> incorporating some of the cleanup in the diff I posted, please?

On it! :)

David