Message ID | 1584200119-18594-2-git-send-email-mikelley@microsoft.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Subject: Enable Linux guests on Hyper-V on ARM64 | expand |
On Sat, 14 Mar 2020 15:35:10 +0000, Hi Michael, Michael Kelley <mikelley@microsoft.com> wrote: > > hyperv-tlfs.h defines Hyper-V interfaces from the Hyper-V Top Level > Functional Spec (TLFS). The TLFS is distinctly oriented to x86/x64, > and Hyper-V has not separated out the architecture-dependent parts into > x86/x64 vs. ARM64. So hyperv-tlfs.h includes information for ARM64 > that is not yet formally published. The TLFS is available here: > > docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs > > mshyperv.h defines Linux-specific structures and routines for > interacting with Hyper-V on ARM64, and #includes the architecture- > independent part of mshyperv.h in include/asm-generic. > > Signed-off-by: Michael Kelley <mikelley@microsoft.com> > --- > MAINTAINERS | 2 + > arch/arm64/include/asm/hyperv-tlfs.h | 413 +++++++++++++++++++++++++++++++++++ > arch/arm64/include/asm/mshyperv.h | 115 ++++++++++ > 3 files changed, 530 insertions(+) > create mode 100644 arch/arm64/include/asm/hyperv-tlfs.h > create mode 100644 arch/arm64/include/asm/mshyperv.h So this is a pretty large patch, mostly containing constants and other data structures that don't necessarily make sense immediately (or at least, I can't make sense of it without reading all the other 9 patches and going back to patch #1). Could you please consider splitting this into more discreet bits that get added as required by the supporting code? So here's only a few sparse comments: > > diff --git a/MAINTAINERS b/MAINTAINERS > index 58bb5c4..398cfdb 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -7809,6 +7809,8 @@ F: arch/x86/include/asm/trace/hyperv.h > F: arch/x86/include/asm/hyperv-tlfs.h > F: arch/x86/kernel/cpu/mshyperv.c > F: arch/x86/hyperv > +F: arch/arm64/include/asm/hyperv-tlfs.h > +F: arch/arm64/include/asm/mshyperv.h > F: drivers/clocksource/hyperv_timer.c > F: drivers/hid/hid-hyperv.c > F: drivers/hv/ > diff --git a/arch/arm64/include/asm/hyperv-tlfs.h b/arch/arm64/include/asm/hyperv-tlfs.h > new file mode 100644 > index 0000000..5e6a087 > --- /dev/null > +++ b/arch/arm64/include/asm/hyperv-tlfs.h > @@ -0,0 +1,413 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > + > +/* > + * This file contains definitions from the Hyper-V Hypervisor Top-Level > + * Functional Specification (TLFS): > + * https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs > + * > + * Copyright (C) 2019, Microsoft, Inc. > + * > + * Author : Michael Kelley <mikelley@microsoft.com> > + */ > + > +#ifndef _ASM_HYPERV_TLFS_H > +#define _ASM_HYPERV_TLFS_H > + > +#include <linux/types.h> > + > +/* > + * All data structures defined in the TLFS that are shared between Hyper-V > + * and a guest VM use Little Endian byte ordering. This matches the default > + * byte ordering of Linux running on ARM64, so no special handling is required. > + */ > + > + > +/* > + * While not explicitly listed in the TLFS, Hyper-V always runs with a page > + * size of 4096. These definitions are used when communicating with Hyper-V > + * using guest physical pages and guest physical page addresses, since the > + * guest page size may not be 4096 on ARM64. > + */ > +#define HV_HYP_PAGE_SHIFT 12 > +#define HV_HYP_PAGE_SIZE (1 << HV_HYP_PAGE_SHIFT) Probably worth writing this as 1UL to be on the safe side. > +#define HV_HYP_PAGE_MASK (~(HV_HYP_PAGE_SIZE - 1)) > + > +/* > + * These Hyper-V registers provide information equivalent to the CPUID > + * instruction on x86/x64. > + */ > +#define HV_REGISTER_HYPERVISOR_VERSION 0x00000100 /*CPUID 0x40000002 */ > +#define HV_REGISTER_PRIVILEGES_AND_FEATURES 0x00000200 /*CPUID 0x40000003 */ > +#define HV_REGISTER_FEATURES 0x00000201 /*CPUID 0x40000004 */ > +#define HV_REGISTER_IMPLEMENTATION_LIMITS 0x00000202 /*CPUID 0x40000005 */ > +#define HV_ARM64_REGISTER_INTERFACE_VERSION 0x00090006 /*CPUID 0x40000001 */ > + > +/* > + * Feature identification. HvRegisterPrivilegesAndFeaturesInfo returns a > + * 128-bit value with flags indicating which features are available to the > + * partition based upon the current partition privileges. The 128-bit > + * value is broken up with different portions stored in different 32-bit > + * fields in the ms_hyperv structure. > + */ > + > +/* Partition Reference Counter available*/ > +#define HV_MSR_TIME_REF_COUNT_AVAILABLE BIT(1) > + > +/* Synthetic Timers available */ > +#define HV_MSR_SYNTIMER_AVAILABLE BIT(3) > + > +/* Reference TSC available */ > +#define HV_MSR_REFERENCE_TSC_AVAILABLE BIT(9) > + > + > +/* > + * This group of flags is in the high order 64-bits of the returned > + * 128-bit value. Note that this set of bit positions differs from what > + * is used on x86/x64 architecture. > + */ > + > +/* Crash MSRs available */ > +#define HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE BIT(8) It is confusing that you don't have a single bit space for all these flags. It'd probably help if you had a structure describing this 128bit value in multiple 32bit or 64bit words, and indicating which field the bit position is relevant to. [...] > +/* Define the hypercall status result */ > + > +union hv_hypercall_status { > + u64 as_uint64; nit: it'd be more consistent if as_uint64 was actually a uint64 type. > + struct { > + u16 status; > + u16 reserved; > + u16 reps_completed; /* Low 12 bits */ > + u16 reserved2; > + }; > +}; > + > +/* hypercall status code */ > +#define HV_STATUS_SUCCESS 0 > +#define HV_STATUS_INVALID_HYPERCALL_CODE 2 > +#define HV_STATUS_INVALID_HYPERCALL_INPUT 3 > +#define HV_STATUS_INVALID_ALIGNMENT 4 > +#define HV_STATUS_INSUFFICIENT_MEMORY 11 > +#define HV_STATUS_INVALID_CONNECTION_ID 18 > +#define HV_STATUS_INSUFFICIENT_BUFFERS 19 > + > +/* Define input and output layout for Get VP Register hypercall */ > +struct hv_get_vp_register_input { > + u64 partitionid; > + u32 vpindex; > + u8 inputvtl; > + u8 padding[3]; > + u32 name0; > + u32 name1; > +} __packed; > + > +struct hv_get_vp_register_output { > + u64 registervaluelow; > + u64 registervaluehigh; > +} __packed; > + > +#define HV_FLUSH_ALL_PROCESSORS BIT(0) > +#define HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES BIT(1) > +#define HV_FLUSH_NON_GLOBAL_MAPPINGS_ONLY BIT(2) I"m curious: Are these supposed to be PV'd TLB invalidation operations? > +#define HV_FLUSH_USE_EXTENDED_RANGE_FORMAT BIT(3) > + > +enum HV_GENERIC_SET_FORMAT { > + HV_GENERIC_SET_SPARSE_4K, > + HV_GENERIC_SET_ALL, > +}; > + > +/* > + * The Hyper-V TimeRefCount register and the TSC > + * page provide a guest VM clock with 100ns tick rate > + */ > +#define HV_CLOCK_HZ (NSEC_PER_SEC/100) > + > +/* > + * The fields in this structure are set by Hyper-V and read > + * by the Linux guest. They should be accessed with READ_ONCE() > + * so the compiler doesn't optimize in a way that will cause > + * problems. The union pads the size out to the page size > + * used to communicate with Hyper-V. > + */ > +struct ms_hyperv_tsc_page { > + union { > + struct { > + u32 tsc_sequence; > + u32 reserved1; > + u64 tsc_scale; > + s64 tsc_offset; > + } __packed; > + u8 reserved2[HV_HYP_PAGE_SIZE]; > + }; > +}; > + > +/* Define the number of synthetic interrupt sources. */ > +#define HV_SYNIC_SINT_COUNT (16) > +/* Define the expected SynIC version. */ > +#define HV_SYNIC_VERSION_1 (0x1) > + > +#define HV_SYNIC_CONTROL_ENABLE (1ULL << 0) > +#define HV_SYNIC_SIMP_ENABLE (1ULL << 0) > +#define HV_SYNIC_SIEFP_ENABLE (1ULL << 0) > +#define HV_SYNIC_SINT_MASKED (1ULL << 16) > +#define HV_SYNIC_SINT_AUTO_EOI (1ULL << 17) > +#define HV_SYNIC_SINT_VECTOR_MASK (0xFF) Let's me guess: a PV interrupt controller? Do you really need this? Specially as I don't see any PV irqchip driver in this submission... [...] > diff --git a/arch/arm64/include/asm/mshyperv.h b/arch/arm64/include/asm/mshyperv.h > new file mode 100644 > index 0000000..60b3f68 > --- /dev/null > +++ b/arch/arm64/include/asm/mshyperv.h > @@ -0,0 +1,115 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > + > +/* > + * Linux-specific definitions for managing interactions with Microsoft's > + * Hyper-V hypervisor. The definitions in this file are specific to > + * the ARM64 architecture. See include/asm-generic/mshyperv.h for > + * definitions are that architecture independent. > + * > + * Definitions that are specified in the Hyper-V Top Level Functional > + * Spec (TLFS) should not go in this file, but should instead go in > + * hyperv-tlfs.h. > + * > + * Copyright (C) 2019, Microsoft, Inc. > + * > + * Author : Michael Kelley <mikelley@microsoft.com> > + */ > + > +#ifndef _ASM_MSHYPERV_H > +#define _ASM_MSHYPERV_H > + > +#include <linux/types.h> > +#include <linux/interrupt.h> > +#include <linux/clocksource.h> > +#include <linux/irq.h> > +#include <linux/irqdesc.h> > +#include <linux/arm-smccc.h> > +#include <asm/hyperv-tlfs.h> > + > +/* > + * Define the IRQ numbers/vectors used by Hyper-V VMbus interrupts > + * and by STIMER0 Direct Mode interrupts. Hyper-V should be supplying > + * these values through ACPI, but there are no other interrupting > + * devices in a Hyper-V VM on ARM64, so it's OK to hard code for now. I'm not convinced it is OK. If you don't try to do the right thing now, what is the incentive to do it later? Starting to hard code things is akin to going back to the ARM board files of old. Been there, managed to fix it, not going back to it again anytime soon. > + * The "CALLBACK_VECTOR" terminology is a left-over from the x86/x64 > + * world that is used in architecture independent Hyper-V code. > + */ > +#define HYPERVISOR_CALLBACK_VECTOR 16 > +#define HV_STIMER0_IRQNR 17 > + > +extern u64 hv_do_hvc(u64 control, ...); > +extern u64 hv_do_hvc_fast_get(u64 control, u64 input1, u64 input2, u64 input3, > + struct hv_get_vp_register_output *output); > + > +/* > + * Declare calls to get and set Hyper-V VP register values on ARM64, which > + * requires a hypercall. > + */ > +extern void hv_set_vpreg(u32 reg, u64 value); > +extern u64 hv_get_vpreg(u32 reg); > +extern void hv_get_vpreg_128(u32 reg, struct hv_get_vp_register_output *result); > + > +/* > + * Use the Hyper-V provided stimer0 as the timer that is made > + * available to the architecture independent Hyper-V drivers. > + */ > +#define hv_init_timer(timer, tick) \ > + hv_set_vpreg(HV_REGISTER_STIMER0_COUNT + (2*timer), tick) > +#define hv_init_timer_config(timer, val) \ > + hv_set_vpreg(HV_REGISTER_STIMER0_CONFIG + (2*timer), val) > +#define hv_get_current_tick(tick) \ > + (tick = hv_get_vpreg(HV_REGISTER_TIME_REFCOUNT)) > + > +#define hv_get_simp(val) (val = hv_get_vpreg(HV_REGISTER_SIPP)) > +#define hv_set_simp(val) hv_set_vpreg(HV_REGISTER_SIPP, val) > + > +#define hv_get_siefp(val) (val = hv_get_vpreg(HV_REGISTER_SIFP)) > +#define hv_set_siefp(val) hv_set_vpreg(HV_REGISTER_SIFP, val) > + > +#define hv_get_synic_state(val) (val = hv_get_vpreg(HV_REGISTER_SCONTROL)) > +#define hv_set_synic_state(val) hv_set_vpreg(HV_REGISTER_SCONTROL, val) > + > +#define hv_get_vp_index(index) (index = hv_get_vpreg(HV_REGISTER_VPINDEX)) > + > +#define hv_signal_eom() hv_set_vpreg(HV_REGISTER_EOM, 0) > + > +/* > + * Hyper-V SINT registers are numbered sequentially, so we can just > + * add the SINT number to the register number of SINT0 > + */ > +#define hv_get_synint_state(sint_num, val) \ > + (val = hv_get_vpreg(HV_REGISTER_SINT0 + sint_num)) > +#define hv_set_synint_state(sint_num, val) \ > + hv_set_vpreg(HV_REGISTER_SINT0 + sint_num, val) > + > +#define hv_get_crash_ctl(val) \ > + (val = hv_get_vpreg(HV_REGISTER_CRASH_CTL)) > +#define hv_get_time_ref_count(val) \ > + (val = hv_get_vpreg(HV_REGISTER_TIME_REFCOUNT)) > +#define hv_get_reference_tsc(val) \ > + (val = hv_get_vpreg(HV_REGISTER_REFERENCE_TSC)) > +#define hv_set_reference_tsc(val) \ > + hv_set_vpreg(HV_REGISTER_REFERENCE_TSC, val) > +#define hv_enable_vdso_clocksource() > +#define hv_set_clocksource_vdso(val) \ > + ((val).vdso_clock_mode = VDSO_CLOCKMODE_NONE) > + > +#if IS_ENABLED(CONFIG_HYPERV) I don't think this guards anything useful. > +#define hv_enable_stimer0_percpu_irq(irq) enable_percpu_irq(irq, 0) > +#define hv_disable_stimer0_percpu_irq(irq) disable_percpu_irq(irq) and this looks pretty premature. > +#endif > + > +/* ARM64 specific code to read the hardware clock */ > +#define hv_get_raw_timer() arch_timer_read_counter() > + > +/* SMCCC hypercall parameters */ > +#define HV_SMCCC_FUNC_NUMBER 1 > +#define HV_FUNC_ID ARM_SMCCC_CALL_VAL( \ > + ARM_SMCCC_STD_CALL, \ > + ARM_SMCCC_SMC_64, \ > + ARM_SMCCC_OWNER_VENDOR_HYP, \ This is only defined in patch #2... > + HV_SMCCC_FUNC_NUMBER) > + > +#include <asm-generic/mshyperv.h> > + > +#endif Thanks, M.
On Sat, Mar 14, 2020 at 4:36 PM Michael Kelley <mikelley@microsoft.com> wrote: > + > +/* Define input and output layout for Get VP Register hypercall */ > +struct hv_get_vp_register_input { > + u64 partitionid; > + u32 vpindex; > + u8 inputvtl; > + u8 padding[3]; > + u32 name0; > + u32 name1; > +} __packed; Are you sure these need to be made byte-aligned according to the specification? If the structure itself is aligned to 64 bit, better mark only the individual fields that are misaligned as __packed. If the structure is aligned to only 32-bit addresses instead of 64-bit, mark it as "__packed __aligned(4)" to let the compiler generate better code for accessing it. Also, in order to write portable code, it would be helpful to mark all the fields as explicitly little-endian, and use __le32_to_cpu() etc for accessing them. > +/* Define synthetic interrupt controller message flags. */ > +union hv_message_flags { > + __u8 asu8; > + struct { > + __u8 msg_pending:1; > + __u8 reserved:7; > + } __packed; > +}; For similar reasons, please avoid bit fields and just use a bit mask on the first member of the union. The __packed annotation here is not meaningful, as the total size is already only a single byte. > +/* Define port identifier type. */ > +union hv_port_id { > + __u32 asu32; > + struct { > + __u32 id:24; > + __u32 reserved:8; > + } __packed u; > +}; Here, the __packed annotation is inconsistent with the use in the rest of the file: marking only one member of the union as __packed means that the union itself is still expected to be 4 byte aligned. I would expect that either all of these structures have a sensible alignment, or they are all completely unaligned. > + * Use the Hyper-V provided stimer0 as the timer that is made > + * available to the architecture independent Hyper-V drivers. > + */ > +#define hv_init_timer(timer, tick) \ > + hv_set_vpreg(HV_REGISTER_STIMER0_COUNT + (2*timer), tick) > +#define hv_init_timer_config(timer, val) \ > + hv_set_vpreg(HV_REGISTER_STIMER0_CONFIG + (2*timer), val) > +#define hv_get_current_tick(tick) \ > + (tick = hv_get_vpreg(HV_REGISTER_TIME_REFCOUNT)) In general, we prefer inline functions over macros in header files. > +#if IS_ENABLED(CONFIG_HYPERV) > +#define hv_enable_stimer0_percpu_irq(irq) enable_percpu_irq(irq, 0) > +#define hv_disable_stimer0_percpu_irq(irq) disable_percpu_irq(irq) > +#endif Should there be an #else definition here? It helps readability to have the two versions (with and without hyperv support) close together rather than in different files. If there is no other definition, just drop the #if. Arnd
From: Marc Zyngier <maz@kernel.org> Sent: Sunday, March 15, 2020 10:31 AM > > On Sat, 14 Mar 2020 15:35:10 +0000, > Hi Michael, > > Michael Kelley <mikelley@microsoft.com> wrote: > > > > hyperv-tlfs.h defines Hyper-V interfaces from the Hyper-V Top Level > > Functional Spec (TLFS). The TLFS is distinctly oriented to x86/x64, > > and Hyper-V has not separated out the architecture-dependent parts into > > x86/x64 vs. ARM64. So hyperv-tlfs.h includes information for ARM64 > > that is not yet formally published. The TLFS is available here: > > > > docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs > > > > mshyperv.h defines Linux-specific structures and routines for > > interacting with Hyper-V on ARM64, and #includes the architecture- > > independent part of mshyperv.h in include/asm-generic. > > > > Signed-off-by: Michael Kelley <mikelley@microsoft.com> > > --- > > MAINTAINERS | 2 + > > arch/arm64/include/asm/hyperv-tlfs.h | 413 > +++++++++++++++++++++++++++++++++++ > > arch/arm64/include/asm/mshyperv.h | 115 ++++++++++ > > 3 files changed, 530 insertions(+) > > create mode 100644 arch/arm64/include/asm/hyperv-tlfs.h > > create mode 100644 arch/arm64/include/asm/mshyperv.h > > So this is a pretty large patch, mostly containing constants and other > data structures that don't necessarily make sense immediately (or at > least, I can't make sense of it without reading all the other 9 > patches and going back to patch #1). > > Could you please consider splitting this into more discreet bits that > get added as required by the supporting code? Yes, I'll do this in the next version. > > So here's only a few sparse comments: > > > > > diff --git a/MAINTAINERS b/MAINTAINERS > > index 58bb5c4..398cfdb 100644 > > --- a/MAINTAINERS > > +++ b/MAINTAINERS > > @@ -7809,6 +7809,8 @@ F: arch/x86/include/asm/trace/hyperv.h > > F: arch/x86/include/asm/hyperv-tlfs.h > > F: arch/x86/kernel/cpu/mshyperv.c > > F: arch/x86/hyperv > > +F: arch/arm64/include/asm/hyperv-tlfs.h > > +F: arch/arm64/include/asm/mshyperv.h > > F: drivers/clocksource/hyperv_timer.c > > F: drivers/hid/hid-hyperv.c > > F: drivers/hv/ > > diff --git a/arch/arm64/include/asm/hyperv-tlfs.h b/arch/arm64/include/asm/hyperv- > tlfs.h > > new file mode 100644 > > index 0000000..5e6a087 > > --- /dev/null > > +++ b/arch/arm64/include/asm/hyperv-tlfs.h > > @@ -0,0 +1,413 @@ > > +/* SPDX-License-Identifier: GPL-2.0 */ > > + > > +/* > > + * This file contains definitions from the Hyper-V Hypervisor Top-Level > > + * Functional Specification (TLFS): > > + * https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs > > + * > > + * Copyright (C) 2019, Microsoft, Inc. > > + * > > + * Author : Michael Kelley <mikelley@microsoft.com> > > + */ > > + > > +#ifndef _ASM_HYPERV_TLFS_H > > +#define _ASM_HYPERV_TLFS_H > > + > > +#include <linux/types.h> > > + > > +/* > > + * All data structures defined in the TLFS that are shared between Hyper-V > > + * and a guest VM use Little Endian byte ordering. This matches the default > > + * byte ordering of Linux running on ARM64, so no special handling is required. > > + */ > > + > > + > > +/* > > + * While not explicitly listed in the TLFS, Hyper-V always runs with a page > > + * size of 4096. These definitions are used when communicating with Hyper-V > > + * using guest physical pages and guest physical page addresses, since the > > + * guest page size may not be 4096 on ARM64. > > + */ > > +#define HV_HYP_PAGE_SHIFT 12 > > +#define HV_HYP_PAGE_SIZE (1 << HV_HYP_PAGE_SHIFT) > > Probably worth writing this as 1UL to be on the safe side. Agreed. > > > +#define HV_HYP_PAGE_MASK (~(HV_HYP_PAGE_SIZE - 1)) > > + > > +/* > > + * These Hyper-V registers provide information equivalent to the CPUID > > + * instruction on x86/x64. > > + */ > > +#define HV_REGISTER_HYPERVISOR_VERSION 0x00000100 /*CPUID > 0x40000002 */ > > +#define HV_REGISTER_PRIVILEGES_AND_FEATURES 0x00000200 /*CPUID > 0x40000003 */ > > +#define HV_REGISTER_FEATURES 0x00000201 /*CPUID > 0x40000004 */ > > +#define HV_REGISTER_IMPLEMENTATION_LIMITS 0x00000202 /*CPUID > 0x40000005 */ > > +#define HV_ARM64_REGISTER_INTERFACE_VERSION 0x00090006 /*CPUID > 0x40000001 */ > > + > > +/* > > + * Feature identification. HvRegisterPrivilegesAndFeaturesInfo returns a > > + * 128-bit value with flags indicating which features are available to the > > + * partition based upon the current partition privileges. The 128-bit > > + * value is broken up with different portions stored in different 32-bit > > + * fields in the ms_hyperv structure. > > + */ > > + > > +/* Partition Reference Counter available*/ > > +#define HV_MSR_TIME_REF_COUNT_AVAILABLE BIT(1) > > + > > +/* Synthetic Timers available */ > > +#define HV_MSR_SYNTIMER_AVAILABLE BIT(3) > > + > > +/* Reference TSC available */ > > +#define HV_MSR_REFERENCE_TSC_AVAILABLE BIT(9) > > + > > + > > +/* > > + * This group of flags is in the high order 64-bits of the returned > > + * 128-bit value. Note that this set of bit positions differs from what > > + * is used on x86/x64 architecture. > > + */ > > + > > +/* Crash MSRs available */ > > +#define HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE BIT(8) > > It is confusing that you don't have a single bit space for all these > flags. It'd probably help if you had a structure describing this > 128bit value in multiple 32bit or 64bit words, and indicating which > field the bit position is relevant to. I'll add this in the next version. > > [...] > > > +/* Define the hypercall status result */ > > + > > +union hv_hypercall_status { > > + u64 as_uint64; > > nit: it'd be more consistent if as_uint64 was actually a uint64 type. Agreed. > > > + struct { > > + u16 status; > > + u16 reserved; > > + u16 reps_completed; /* Low 12 bits */ > > + u16 reserved2; > > + }; > > +}; > > + > > +/* hypercall status code */ > > +#define HV_STATUS_SUCCESS 0 > > +#define HV_STATUS_INVALID_HYPERCALL_CODE 2 > > +#define HV_STATUS_INVALID_HYPERCALL_INPUT 3 > > +#define HV_STATUS_INVALID_ALIGNMENT 4 > > +#define HV_STATUS_INSUFFICIENT_MEMORY 11 > > +#define HV_STATUS_INVALID_CONNECTION_ID 18 > > +#define HV_STATUS_INSUFFICIENT_BUFFERS 19 > > + > > +/* Define input and output layout for Get VP Register hypercall */ > > +struct hv_get_vp_register_input { > > + u64 partitionid; > > + u32 vpindex; > > + u8 inputvtl; > > + u8 padding[3]; > > + u32 name0; > > + u32 name1; > > +} __packed; > > + > > +struct hv_get_vp_register_output { > > + u64 registervaluelow; > > + u64 registervaluehigh; > > +} __packed; > > + > > +#define HV_FLUSH_ALL_PROCESSORS BIT(0) > > +#define HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES BIT(1) > > +#define HV_FLUSH_NON_GLOBAL_MAPPINGS_ONLY BIT(2) > > I"m curious: Are these supposed to be PV'd TLB invalidation > operations? Yes, they are. Hyper-V provides PV TLB flush hypercalls on ARM64, but this patch set doesn't use those hypercalls. I'll remove the definitions. > > > +#define HV_FLUSH_USE_EXTENDED_RANGE_FORMAT BIT(3) > > + > > +enum HV_GENERIC_SET_FORMAT { > > + HV_GENERIC_SET_SPARSE_4K, > > + HV_GENERIC_SET_ALL, > > +}; > > + > > +/* > > + * The Hyper-V TimeRefCount register and the TSC > > + * page provide a guest VM clock with 100ns tick rate > > + */ > > +#define HV_CLOCK_HZ (NSEC_PER_SEC/100) > > + > > +/* > > + * The fields in this structure are set by Hyper-V and read > > + * by the Linux guest. They should be accessed with READ_ONCE() > > + * so the compiler doesn't optimize in a way that will cause > > + * problems. The union pads the size out to the page size > > + * used to communicate with Hyper-V. > > + */ > > +struct ms_hyperv_tsc_page { > > + union { > > + struct { > > + u32 tsc_sequence; > > + u32 reserved1; > > + u64 tsc_scale; > > + s64 tsc_offset; > > + } __packed; > > + u8 reserved2[HV_HYP_PAGE_SIZE]; > > + }; > > +}; > > + > > +/* Define the number of synthetic interrupt sources. */ > > +#define HV_SYNIC_SINT_COUNT (16) > > +/* Define the expected SynIC version. */ > > +#define HV_SYNIC_VERSION_1 (0x1) > > + > > +#define HV_SYNIC_CONTROL_ENABLE (1ULL << 0) > > +#define HV_SYNIC_SIMP_ENABLE (1ULL << 0) > > +#define HV_SYNIC_SIEFP_ENABLE (1ULL << 0) > > +#define HV_SYNIC_SINT_MASKED (1ULL << 16) > > +#define HV_SYNIC_SINT_AUTO_EOI (1ULL << 17) > > +#define HV_SYNIC_SINT_VECTOR_MASK (0xFF) > > Let's me guess: a PV interrupt controller? Do you really need this? > Specially as I don't see any PV irqchip driver in this submission... > No, the above definitions aren't needed. I'll remove them. Hyper-V does provide a limited synthetic interrupt controller that's implemented entirely in architecture independent code and has been used on the x86 side since the beginning of Hyper-V support. It pre-dates me by a lot of years, and I've never considered whether it should be modeled as an irqchip. > [...] > > > diff --git a/arch/arm64/include/asm/mshyperv.h b/arch/arm64/include/asm/mshyperv.h > > new file mode 100644 > > index 0000000..60b3f68 > > --- /dev/null > > +++ b/arch/arm64/include/asm/mshyperv.h > > @@ -0,0 +1,115 @@ > > +/* SPDX-License-Identifier: GPL-2.0 */ > > + > > +/* > > + * Linux-specific definitions for managing interactions with Microsoft's > > + * Hyper-V hypervisor. The definitions in this file are specific to > > + * the ARM64 architecture. See include/asm-generic/mshyperv.h for > > + * definitions are that architecture independent. > > + * > > + * Definitions that are specified in the Hyper-V Top Level Functional > > + * Spec (TLFS) should not go in this file, but should instead go in > > + * hyperv-tlfs.h. > > + * > > + * Copyright (C) 2019, Microsoft, Inc. > > + * > > + * Author : Michael Kelley <mikelley@microsoft.com> > > + */ > > + > > +#ifndef _ASM_MSHYPERV_H > > +#define _ASM_MSHYPERV_H > > + > > +#include <linux/types.h> > > +#include <linux/interrupt.h> > > +#include <linux/clocksource.h> > > +#include <linux/irq.h> > > +#include <linux/irqdesc.h> > > +#include <linux/arm-smccc.h> > > +#include <asm/hyperv-tlfs.h> > > + > > +/* > > + * Define the IRQ numbers/vectors used by Hyper-V VMbus interrupts > > + * and by STIMER0 Direct Mode interrupts. Hyper-V should be supplying > > + * these values through ACPI, but there are no other interrupting > > + * devices in a Hyper-V VM on ARM64, so it's OK to hard code for now. > > I'm not convinced it is OK. If you don't try to do the right thing > now, what is the incentive to do it later? Starting to hard code > things is akin to going back to the ARM board files of old. Been > there, managed to fix it, not going back to it again anytime soon. I'll check back with the Hyper-V guys on getting appropriate values assigned in ACPI. > > > + * The "CALLBACK_VECTOR" terminology is a left-over from the x86/x64 > > + * world that is used in architecture independent Hyper-V code. > > + */ > > +#define HYPERVISOR_CALLBACK_VECTOR 16 > > +#define HV_STIMER0_IRQNR 17 > > + > > +extern u64 hv_do_hvc(u64 control, ...); > > +extern u64 hv_do_hvc_fast_get(u64 control, u64 input1, u64 input2, u64 input3, > > + struct hv_get_vp_register_output *output); > > + > > +/* > > + * Declare calls to get and set Hyper-V VP register values on ARM64, which > > + * requires a hypercall. > > + */ > > +extern void hv_set_vpreg(u32 reg, u64 value); > > +extern u64 hv_get_vpreg(u32 reg); > > +extern void hv_get_vpreg_128(u32 reg, struct hv_get_vp_register_output *result); > > + > > +/* > > + * Use the Hyper-V provided stimer0 as the timer that is made > > + * available to the architecture independent Hyper-V drivers. > > + */ > > +#define hv_init_timer(timer, tick) \ > > + hv_set_vpreg(HV_REGISTER_STIMER0_COUNT + (2*timer), tick) > > +#define hv_init_timer_config(timer, val) \ > > + hv_set_vpreg(HV_REGISTER_STIMER0_CONFIG + (2*timer), val) > > +#define hv_get_current_tick(tick) \ > > + (tick = hv_get_vpreg(HV_REGISTER_TIME_REFCOUNT)) > > + > > +#define hv_get_simp(val) (val = hv_get_vpreg(HV_REGISTER_SIPP)) > > +#define hv_set_simp(val) hv_set_vpreg(HV_REGISTER_SIPP, val) > > + > > +#define hv_get_siefp(val) (val = hv_get_vpreg(HV_REGISTER_SIFP)) > > +#define hv_set_siefp(val) hv_set_vpreg(HV_REGISTER_SIFP, val) > > + > > +#define hv_get_synic_state(val) (val = hv_get_vpreg(HV_REGISTER_SCONTROL)) > > +#define hv_set_synic_state(val) hv_set_vpreg(HV_REGISTER_SCONTROL, val) > > + > > +#define hv_get_vp_index(index) (index = hv_get_vpreg(HV_REGISTER_VPINDEX)) > > + > > +#define hv_signal_eom() hv_set_vpreg(HV_REGISTER_EOM, 0) > > + > > +/* > > + * Hyper-V SINT registers are numbered sequentially, so we can just > > + * add the SINT number to the register number of SINT0 > > + */ > > +#define hv_get_synint_state(sint_num, val) \ > > + (val = hv_get_vpreg(HV_REGISTER_SINT0 + sint_num)) > > +#define hv_set_synint_state(sint_num, val) \ > > + hv_set_vpreg(HV_REGISTER_SINT0 + sint_num, val) > > + > > +#define hv_get_crash_ctl(val) \ > > + (val = hv_get_vpreg(HV_REGISTER_CRASH_CTL)) > > +#define hv_get_time_ref_count(val) \ > > + (val = hv_get_vpreg(HV_REGISTER_TIME_REFCOUNT)) > > +#define hv_get_reference_tsc(val) \ > > + (val = hv_get_vpreg(HV_REGISTER_REFERENCE_TSC)) > > +#define hv_set_reference_tsc(val) \ > > + hv_set_vpreg(HV_REGISTER_REFERENCE_TSC, val) > > +#define hv_enable_vdso_clocksource() > > +#define hv_set_clocksource_vdso(val) \ > > + ((val).vdso_clock_mode = VDSO_CLOCKMODE_NONE) > > + > > +#if IS_ENABLED(CONFIG_HYPERV) > > I don't think this guards anything useful. You are probably right. I'll double-check. > > > +#define hv_enable_stimer0_percpu_irq(irq) enable_percpu_irq(irq, 0) > > +#define hv_disable_stimer0_percpu_irq(irq) disable_percpu_irq(irq) > > and this looks pretty premature. I'm not sure I understand your comment about "premature". Could you clarify? > > > +#endif > > + > > +/* ARM64 specific code to read the hardware clock */ > > +#define hv_get_raw_timer() arch_timer_read_counter() > > + > > +/* SMCCC hypercall parameters */ > > +#define HV_SMCCC_FUNC_NUMBER 1 > > +#define HV_FUNC_ID ARM_SMCCC_CALL_VAL( \ > > + ARM_SMCCC_STD_CALL, \ > > + ARM_SMCCC_SMC_64, \ > > + ARM_SMCCC_OWNER_VENDOR_HYP, \ > > This is only defined in patch #2... Indeed. :-( I'll fix as part of breaking up this patch into smaller pieces. Michael > > > + HV_SMCCC_FUNC_NUMBER) > > + > > +#include <asm-generic/mshyperv.h> > > + > > +#endif > > Thanks, > > M. > > -- > Jazz is not dead, it just smells funny.
From: Arnd Bergmann <arnd@arndb.de> Sent: Monday, March 16, 2020 1:48 AM > > On Sat, Mar 14, 2020 at 4:36 PM Michael Kelley <mikelley@microsoft.com> wrote: > > > + > > +/* Define input and output layout for Get VP Register hypercall */ > > +struct hv_get_vp_register_input { > > + u64 partitionid; > > + u32 vpindex; > > + u8 inputvtl; > > + u8 padding[3]; > > + u32 name0; > > + u32 name1; > > +} __packed; > > Are you sure these need to be made byte-aligned according to the > specification? If the structure itself is aligned to 64 bit, better mark only > the individual fields that are misaligned as __packed. > > If the structure is aligned to only 32-bit addresses instead of > 64-bit, mark it as "__packed __aligned(4)" to let the compiler > generate better code for accessing it. None of the fields are misaligned and it will always be aligned to 64-bit addresses, so there should be no padding needed or added. There was a discussion of __packed and the Hyper-V data structures in general on LKML here: https://lkml.org/lkml/2018/11/30/848. Adding __packed was done as a preventative measure, not because anything was actually broken. Marking as __aligned(8) here would indicate the correct intent, though the use of the structure always ensures 64-bit alignment. > > Also, in order to write portable code, it would be helpful to mark > all the fields as explicitly little-endian, and use __le32_to_cpu() > etc for accessing them. There's an opening comment in this file stating that all data structures shared between Hyper-V and a guest VM are little endian. Is there some other marking to consider using? We have definitely not allowed for the case of Hyper-V running on a big endian architecture. There are a *lot* of messages and data structures passed between the guest and Hyper-V, and coding to handle either endianness is a big project. I'm doubtful of the value until and unless we actually have a need for it. > > > +/* Define synthetic interrupt controller message flags. */ > > +union hv_message_flags { > > + __u8 asu8; > > + struct { > > + __u8 msg_pending:1; > > + __u8 reserved:7; > > + } __packed; > > +}; > > For similar reasons, please avoid bit fields and just use a > bit mask on the first member of the union. Unfortunately, changing to a bit mask ripples into architecture independent code and into the x86 implementation. I'd prefer not to drag that complexity into this patch set. > > The __packed annotation here is not meaningful, > as the total size is already only a single byte. Agreed. > > > +/* Define port identifier type. */ > > +union hv_port_id { > > + __u32 asu32; > > + struct { > > + __u32 id:24; > > + __u32 reserved:8; > > + } __packed u; > > +}; > > Here, the __packed annotation is inconsistent with the use > in the rest of the file: marking only one member of the union > as __packed means that the union itself is still expected to > be 4 byte aligned. I would expect that either all of these > structures have a sensible alignment, or they are all > completely unaligned. Agreed. Looks like it is wrong on the x86 side too. > > > + * Use the Hyper-V provided stimer0 as the timer that is made > > + * available to the architecture independent Hyper-V drivers. > > + */ > > +#define hv_init_timer(timer, tick) \ > > + hv_set_vpreg(HV_REGISTER_STIMER0_COUNT + (2*timer), tick) > > +#define hv_init_timer_config(timer, val) \ > > + hv_set_vpreg(HV_REGISTER_STIMER0_CONFIG + (2*timer), val) > > +#define hv_get_current_tick(tick) \ > > + (tick = hv_get_vpreg(HV_REGISTER_TIME_REFCOUNT)) > > In general, we prefer inline functions over macros in header files. I can change the "set" calls to inline functions. As you can see, the "get" functions are coded and used in architecture independent code and on the x86 side in a way that won't convert to inline functions. > > > +#if IS_ENABLED(CONFIG_HYPERV) > > +#define hv_enable_stimer0_percpu_irq(irq) enable_percpu_irq(irq, 0) > > +#define hv_disable_stimer0_percpu_irq(irq) disable_percpu_irq(irq) > > +#endif > > Should there be an #else definition here? It helps readability > to have the two versions (with and without hyperv support) close > together rather than in different files. If there is no other > definition, just drop the #if. Agreed. I'll figure out a way to handle this better. > > Arnd
On Wed, Mar 18, 2020 at 1:12 AM Michael Kelley <mikelley@microsoft.com> wrote: > From: Arnd Bergmann <arnd@arndb.de> Sent: Monday, March 16, 2020 1:48 AM > > On Sat, Mar 14, 2020 at 4:36 PM Michael Kelley <mikelley@microsoft.com> wrote: > > > > > + > > > +/* Define input and output layout for Get VP Register hypercall */ > > > +struct hv_get_vp_register_input { > > > + u64 partitionid; > > > + u32 vpindex; > > > + u8 inputvtl; > > > + u8 padding[3]; > > > + u32 name0; > > > + u32 name1; > > > +} __packed; > > > > Are you sure these need to be made byte-aligned according to the > > specification? If the structure itself is aligned to 64 bit, better mark only > > the individual fields that are misaligned as __packed. > > > > If the structure is aligned to only 32-bit addresses instead of > > 64-bit, mark it as "__packed __aligned(4)" to let the compiler > > generate better code for accessing it. > > None of the fields are misaligned and it will always be aligned to 64-bit > addresses, so there should be no padding needed or added. There was > a discussion of __packed and the Hyper-V data structures in general on > LKML here: https://lkml.org/lkml/2018/11/30/848. Adding __packed was > done as a preventative measure, not because anything was actually > broken. Marking as __aligned(8) here would indicate the correct intent, > though the use of the structure always ensures 64-bit alignment. Just drop the __packed annotations then, they just confuse the compiler in this case. In particular, when the compiler thinks that a structure is misaligned, it tries to avoid using load/store instructions on it that are inefficient or trap with misaligned code, so having default alignment produces better object code. > > Also, in order to write portable code, it would be helpful to mark > > all the fields as explicitly little-endian, and use __le32_to_cpu() > > etc for accessing them. > > There's an opening comment in this file stating that all data > structures shared between Hyper-V and a guest VM are little > endian. Is there some other marking to consider using? Yes, device drivers should generally define data structures using the __le32, __le64 etc types, and use the conversion functions to access them. Building with 'make C=1' usually tells you when you have mismatched annotations. > We have definitely not allowed for the case of Hyper-V running on > a big endian architecture. There are a *lot* of messages and data > structures passed between the guest and Hyper-V, and coding > to handle either endianness is a big project. I'm doubtful > of the value until and unless we actually have a need for it. In general, the use of big-endian software on Linux is declining, however - arm64 as an architecture is meant to support both endian types, and we still try to ensure it works either way as long as there are users that depend on it. - The remaining users of big-endian software are probably more likely to run on virtual machines than on real hardware - Any device driver should generally be written against portable interfaces, even if you think you know how it will be used. As driver writers tend to look at existing code for new drivers, it's better to have them all be portable. (This is a similar argument to the irqchip interface). Even if you don't convert any of the existing architecture independent code to run both ways, I see no reason to not do it for new drivers. > > > +/* Define synthetic interrupt controller message flags. */ > > > +union hv_message_flags { > > > + __u8 asu8; > > > + struct { > > > + __u8 msg_pending:1; > > > + __u8 reserved:7; > > > + } __packed; > > > +}; > > > > For similar reasons, please avoid bit fields and just use a > > bit mask on the first member of the union. > > Unfortunately, changing to a bit mask ripples into > architecture independent code and into the x86 > implementation. I'd prefer not to drag that complexity > into this patch set. How so? If this file is arm64 specific, there should be no need to make x86 do the same change. > > > + * Use the Hyper-V provided stimer0 as the timer that is made > > > + * available to the architecture independent Hyper-V drivers. > > > + */ > > > +#define hv_init_timer(timer, tick) \ > > > + hv_set_vpreg(HV_REGISTER_STIMER0_COUNT + (2*timer), tick) > > > +#define hv_init_timer_config(timer, val) \ > > > + hv_set_vpreg(HV_REGISTER_STIMER0_CONFIG + (2*timer), val) > > > +#define hv_get_current_tick(tick) \ > > > + (tick = hv_get_vpreg(HV_REGISTER_TIME_REFCOUNT)) > > > > In general, we prefer inline functions over macros in header files. > > I can change the "set" calls to inline functions. As you can see, the "get" > functions are coded and used in architecture independent code and on > the x86 side in a way that won't convert to inline functions. Ok. Arnd
From: Arnd Bergmann <arnd@arndb.de> Sent: Wednesday, March 18, 2020 3:10 AM > > On Wed, Mar 18, 2020 at 1:12 AM Michael Kelley <mikelley@microsoft.com> wrote: > > From: Arnd Bergmann <arnd@arndb.de> Sent: Monday, March 16, 2020 1:48 AM > > > On Sat, Mar 14, 2020 at 4:36 PM Michael Kelley <mikelley@microsoft.com> wrote: > > > > > > > + > > > > +/* Define input and output layout for Get VP Register hypercall */ > > > > +struct hv_get_vp_register_input { > > > > + u64 partitionid; > > > > + u32 vpindex; > > > > + u8 inputvtl; > > > > + u8 padding[3]; > > > > + u32 name0; > > > > + u32 name1; > > > > +} __packed; > > > > > > Are you sure these need to be made byte-aligned according to the > > > specification? If the structure itself is aligned to 64 bit, better mark only > > > the individual fields that are misaligned as __packed. > > > > > > If the structure is aligned to only 32-bit addresses instead of > > > 64-bit, mark it as "__packed __aligned(4)" to let the compiler > > > generate better code for accessing it. > > > > None of the fields are misaligned and it will always be aligned to 64-bit > > addresses, so there should be no padding needed or added. There was > > a discussion of __packed and the Hyper-V data structures in general on > > LKML here: https://lkml.org/lkml/2018/11/30/848 Adding __packed was > > done as a preventative measure, not because anything was actually > > broken. Marking as __aligned(8) here would indicate the correct intent, > > though the use of the structure always ensures 64-bit alignment. > > Just drop the __packed annotations then, they just confuse the compiler > in this case. In particular, when the compiler thinks that a structure is > misaligned, it tries to avoid using load/store instructions on it that are > inefficient or trap with misaligned code, so having default alignment > produces better object code. So I'm confused a bit. Were the original concerns in the above LKML discussion bogus? Is it legal for the compiler to reorder fields or add padding, even if the layout of fields in the structure doesn't require it? If the compiler *could* do such, then it seems like keeping the __packed would be appropriate per the LKML discussion. > > > > Also, in order to write portable code, it would be helpful to mark > > > all the fields as explicitly little-endian, and use __le32_to_cpu() > > > etc for accessing them. > > > > There's an opening comment in this file stating that all data > > structures shared between Hyper-V and a guest VM are little > > endian. Is there some other marking to consider using? > > Yes, device drivers should generally define data structures using > the __le32, __le64 etc types, and use the conversion functions > to access them. Building with 'make C=1' usually tells you when > you have mismatched annotations. > > > We have definitely not allowed for the case of Hyper-V running on > > a big endian architecture. There are a *lot* of messages and data > > structures passed between the guest and Hyper-V, and coding > > to handle either endianness is a big project. I'm doubtful > > of the value until and unless we actually have a need for it. > > In general, the use of big-endian software on Linux is declining, however > > - arm64 as an architecture is meant to support both endian types, > and we still try to ensure it works either way as long as there are > users that depend on it. > > - The remaining users of big-endian software are probably > more likely to run on virtual machines than on real hardware > > - Any device driver should generally be written against portable > interfaces, even if you think you know how it will be used. As > driver writers tend to look at existing code for new drivers, it's > better to have them all be portable. (This is a similar argument > to the irqchip interface). > > Even if you don't convert any of the existing architecture independent > code to run both ways, I see no reason to not do it for new drivers. OK, let me look into this. Given how the major Linux distros on ARM64 have all gone little-endian, I'm a bit skeptical of the value for the big server environments in which Hyper-V would be used. > > > > > +/* Define synthetic interrupt controller message flags. */ > > > > +union hv_message_flags { > > > > + __u8 asu8; > > > > + struct { > > > > + __u8 msg_pending:1; > > > > + __u8 reserved:7; > > > > + } __packed; > > > > +}; > > > > > > For similar reasons, please avoid bit fields and just use a > > > bit mask on the first member of the union. > > > > Unfortunately, changing to a bit mask ripples into > > architecture independent code and into the x86 > > implementation. I'd prefer not to drag that complexity > > into this patch set. > > How so? If this file is arm64 specific, there should be no need to make > x86 do the same change. This file, hyperv-tlfs.h, is duplicating some definitions on the x86 and ARM64 sides that are used by arch independent code, and this is one of those definitions. I had held off on breaking the file into arch independent and arch specific portions because the Hyper-V team has left some gray areas for functionality that isn't yet used on the ARM64 side. So in some cases, it's hard to know what functionality to put into the arch independent portion. But I think I'll go ahead and make the separation with reasonably good accuracy, and update the x86 side accordingly. That will reduce the size of this patch set to contain only the things that we know are ARM64 specific and which are actually used by the ARM64 code. Things like the hv_message_flags will go into the arch independent portion so that they can be used by the arch independent code without cluttering up the arch specific code. Making the change will help reduce any confusion about what is ARM64-specific. The other core #include file, mshyperv.h, has already been done this way. Michael > > > > > + * Use the Hyper-V provided stimer0 as the timer that is made > > > > + * available to the architecture independent Hyper-V drivers. > > > > + */ > > > > +#define hv_init_timer(timer, tick) \ > > > > + hv_set_vpreg(HV_REGISTER_STIMER0_COUNT + (2*timer), tick) > > > > +#define hv_init_timer_config(timer, val) \ > > > > + hv_set_vpreg(HV_REGISTER_STIMER0_CONFIG + (2*timer), val) > > > > +#define hv_get_current_tick(tick) \ > > > > + (tick = hv_get_vpreg(HV_REGISTER_TIME_REFCOUNT)) > > > > > > In general, we prefer inline functions over macros in header files. > > > > I can change the "set" calls to inline functions. As you can see, the "get" > > functions are coded and used in architecture independent code and on > > the x86 side in a way that won't convert to inline functions. > > Ok. > > Arnd
On Thu, Mar 19, 2020 at 10:31 PM Michael Kelley <mikelley@microsoft.com> wrote: > From: Arnd Bergmann <arnd@arndb.de> Sent: Wednesday, March 18, 2020 3:10 AM > > Just drop the __packed annotations then, they just confuse the compiler > > in this case. In particular, when the compiler thinks that a structure is > > misaligned, it tries to avoid using load/store instructions on it that are > > inefficient or trap with misaligned code, so having default alignment > > produces better object code. > > So I'm confused a bit. Were the original concerns in the above LKML > discussion bogus? Is it legal for the compiler to reorder fields or add > padding, even if the layout of fields in the structure doesn't require it? > If the compiler *could* do such, then it seems like keeping the __packed > would be appropriate per the LKML discussion. The padding is defined in the ELF psABI document for a particular architecture. In theory an architecture might require padding around smaller members, but they generally don't when you look at the ones that Linux runs on. The few odd ones are those that require less padding, only aligning members to 16 or 32 bit rather than natural alignment, or padding the size of the structure to 32 bit even if it only contains 8-bit or 16-bit members. When you have structures in which every member is naturally aligned and the size it a multiple of 32 bit, better leave out the __packed. Aside from generating sub-optimal code, the __packed annotation can also lead to undefined behavior, if you pass a pointer to an unaligned member into a function call that takes an aligned pointer. Newer compilers warn about this. > > > Unfortunately, changing to a bit mask ripples into > > > architecture independent code and into the x86 > > > implementation. I'd prefer not to drag that complexity > > > into this patch set. > > > > How so? If this file is arm64 specific, there should be no need to make > > x86 do the same change. > > This file, hyperv-tlfs.h, is duplicating some definitions on the x86 and > ARM64 sides that are used by arch independent code, and this is one > of those definitions. I had held off on breaking the file into arch > independent and arch specific portions because the Hyper-V team has > left some gray areas for functionality that isn't yet used on the ARM64 > side. So in some cases, it's hard to know what functionality to put > into the arch independent portion. > > But I think I'll go ahead and make the separation with reasonably good > accuracy, and update the x86 side accordingly. That will reduce the size > of this patch set to contain only the things that we know are ARM64 > specific and which are actually used by the ARM64 code. Things like the > hv_message_flags will go into the arch independent portion so that > they can be used by the arch independent code without cluttering up > the arch specific code. Making the change will help reduce any > confusion about what is ARM64-specific. The other core #include file, > mshyperv.h, has already been done this way. Ok, sounds good. Arnd
diff --git a/MAINTAINERS b/MAINTAINERS index 58bb5c4..398cfdb 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -7809,6 +7809,8 @@ F: arch/x86/include/asm/trace/hyperv.h F: arch/x86/include/asm/hyperv-tlfs.h F: arch/x86/kernel/cpu/mshyperv.c F: arch/x86/hyperv +F: arch/arm64/include/asm/hyperv-tlfs.h +F: arch/arm64/include/asm/mshyperv.h F: drivers/clocksource/hyperv_timer.c F: drivers/hid/hid-hyperv.c F: drivers/hv/ diff --git a/arch/arm64/include/asm/hyperv-tlfs.h b/arch/arm64/include/asm/hyperv-tlfs.h new file mode 100644 index 0000000..5e6a087 --- /dev/null +++ b/arch/arm64/include/asm/hyperv-tlfs.h @@ -0,0 +1,413 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * This file contains definitions from the Hyper-V Hypervisor Top-Level + * Functional Specification (TLFS): + * https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs + * + * Copyright (C) 2019, Microsoft, Inc. + * + * Author : Michael Kelley <mikelley@microsoft.com> + */ + +#ifndef _ASM_HYPERV_TLFS_H +#define _ASM_HYPERV_TLFS_H + +#include <linux/types.h> + +/* + * All data structures defined in the TLFS that are shared between Hyper-V + * and a guest VM use Little Endian byte ordering. This matches the default + * byte ordering of Linux running on ARM64, so no special handling is required. + */ + + +/* + * While not explicitly listed in the TLFS, Hyper-V always runs with a page + * size of 4096. These definitions are used when communicating with Hyper-V + * using guest physical pages and guest physical page addresses, since the + * guest page size may not be 4096 on ARM64. + */ +#define HV_HYP_PAGE_SHIFT 12 +#define HV_HYP_PAGE_SIZE (1 << HV_HYP_PAGE_SHIFT) +#define HV_HYP_PAGE_MASK (~(HV_HYP_PAGE_SIZE - 1)) + +/* + * These Hyper-V registers provide information equivalent to the CPUID + * instruction on x86/x64. + */ +#define HV_REGISTER_HYPERVISOR_VERSION 0x00000100 /*CPUID 0x40000002 */ +#define HV_REGISTER_PRIVILEGES_AND_FEATURES 0x00000200 /*CPUID 0x40000003 */ +#define HV_REGISTER_FEATURES 0x00000201 /*CPUID 0x40000004 */ +#define HV_REGISTER_IMPLEMENTATION_LIMITS 0x00000202 /*CPUID 0x40000005 */ +#define HV_ARM64_REGISTER_INTERFACE_VERSION 0x00090006 /*CPUID 0x40000001 */ + +/* + * Feature identification. HvRegisterPrivilegesAndFeaturesInfo returns a + * 128-bit value with flags indicating which features are available to the + * partition based upon the current partition privileges. The 128-bit + * value is broken up with different portions stored in different 32-bit + * fields in the ms_hyperv structure. + */ + +/* Partition Reference Counter available*/ +#define HV_MSR_TIME_REF_COUNT_AVAILABLE BIT(1) + +/* Synthetic Timers available */ +#define HV_MSR_SYNTIMER_AVAILABLE BIT(3) + +/* Reference TSC available */ +#define HV_MSR_REFERENCE_TSC_AVAILABLE BIT(9) + + +/* + * This group of flags is in the high order 64-bits of the returned + * 128-bit value. Note that this set of bit positions differs from what + * is used on x86/x64 architecture. + */ + +/* Crash MSRs available */ +#define HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE BIT(8) + +/* STIMER direct mode is available */ +#define HV_STIMER_DIRECT_MODE_AVAILABLE BIT(13) + +/* + * Implementation recommendations in register + * HvRegisterFeaturesInfo. Indicates which behaviors the hypervisor + * recommends the OS implement for optimal performance. + */ + +/* + * Recommend not using Auto EOI + */ +#define HV_DEPRECATING_AEOI_RECOMMENDED BIT(9) + +/* + * Synthetic register definitions equivalent to MSRs on x86/x64 + */ +#define HV_REGISTER_CRASH_P0 0x00000210 +#define HV_REGISTER_CRASH_P1 0x00000211 +#define HV_REGISTER_CRASH_P2 0x00000212 +#define HV_REGISTER_CRASH_P3 0x00000213 +#define HV_REGISTER_CRASH_P4 0x00000214 +#define HV_REGISTER_CRASH_CTL 0x00000215 + +#define HV_REGISTER_GUEST_OSID 0x00090002 +#define HV_REGISTER_VPINDEX 0x00090003 +#define HV_REGISTER_TIME_REFCOUNT 0x00090004 +#define HV_REGISTER_REFERENCE_TSC 0x00090017 + +#define HV_REGISTER_SINT0 0x000A0000 +#define HV_REGISTER_SINT1 0x000A0001 +#define HV_REGISTER_SINT2 0x000A0002 +#define HV_REGISTER_SINT3 0x000A0003 +#define HV_REGISTER_SINT4 0x000A0004 +#define HV_REGISTER_SINT5 0x000A0005 +#define HV_REGISTER_SINT6 0x000A0006 +#define HV_REGISTER_SINT7 0x000A0007 +#define HV_REGISTER_SINT8 0x000A0008 +#define HV_REGISTER_SINT9 0x000A0009 +#define HV_REGISTER_SINT10 0x000A000A +#define HV_REGISTER_SINT11 0x000A000B +#define HV_REGISTER_SINT12 0x000A000C +#define HV_REGISTER_SINT13 0x000A000D +#define HV_REGISTER_SINT14 0x000A000E +#define HV_REGISTER_SINT15 0x000A000F +#define HV_REGISTER_SCONTROL 0x000A0010 +#define HV_REGISTER_SVERSION 0x000A0011 +#define HV_REGISTER_SIFP 0x000A0012 +#define HV_REGISTER_SIPP 0x000A0013 +#define HV_REGISTER_EOM 0x000A0014 +#define HV_REGISTER_SIRBP 0x000A0015 + +#define HV_REGISTER_STIMER0_CONFIG 0x000B0000 +#define HV_REGISTER_STIMER0_COUNT 0x000B0001 +#define HV_REGISTER_STIMER1_CONFIG 0x000B0002 +#define HV_REGISTER_STIMER1_COUNT 0x000B0003 +#define HV_REGISTER_STIMER2_CONFIG 0x000B0004 +#define HV_REGISTER_STIMER2_COUNT 0x000B0005 +#define HV_REGISTER_STIMER3_CONFIG 0x000B0006 +#define HV_REGISTER_STIMER3_COUNT 0x000B0007 + +/* + * Crash notification flags. + */ +#define HV_CRASH_CTL_CRASH_NOTIFY_MSG BIT_ULL(62) +#define HV_CRASH_CTL_CRASH_NOTIFY BIT_ULL(63) + +/* + * The guest OS needs to register the guest ID with the hypervisor. + * The guest ID is a 64 bit entity and the structure of this ID is + * specified in the Hyper-V TLFS. + */ +#define HV_LINUX_VENDOR_ID 0x8100 + +/* Declare the various hypercall operations. */ +#define HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE 0x0002 +#define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST 0x0003 +#define HVCALL_NOTIFY_LONG_SPIN_WAIT 0x0008 +#define HVCALL_SEND_IPI 0x000b +#define HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX 0x0013 +#define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX 0x0014 +#define HVCALL_SEND_IPI_EX 0x0015 +#define HVCALL_GET_VP_REGISTERS 0x0050 +#define HVCALL_SET_VP_REGISTERS 0x0051 +#define HVCALL_POST_MESSAGE 0x005c +#define HVCALL_SIGNAL_EVENT 0x005d +#define HVCALL_RETARGET_INTERRUPT 0x007e +#define HVCALL_START_VIRTUAL_PROCESSOR 0x0099 + +/* Declare standard hypercall field values. */ +#define HV_PARTITION_ID_SELF ((u64)-1) +#define HV_VP_INDEX_SELF ((u32)-2) + +#define HV_HYPERCALL_FAST_BIT BIT(16) +#define HV_HYPERCALL_REP_COUNT_1 BIT_ULL(32) +#define HV_HYPERCALL_RESULT_MASK GENMASK_ULL(15, 0) + +/* Define the hypercall status result */ + +union hv_hypercall_status { + u64 as_uint64; + struct { + u16 status; + u16 reserved; + u16 reps_completed; /* Low 12 bits */ + u16 reserved2; + }; +}; + +/* hypercall status code */ +#define HV_STATUS_SUCCESS 0 +#define HV_STATUS_INVALID_HYPERCALL_CODE 2 +#define HV_STATUS_INVALID_HYPERCALL_INPUT 3 +#define HV_STATUS_INVALID_ALIGNMENT 4 +#define HV_STATUS_INSUFFICIENT_MEMORY 11 +#define HV_STATUS_INVALID_CONNECTION_ID 18 +#define HV_STATUS_INSUFFICIENT_BUFFERS 19 + +/* Define input and output layout for Get VP Register hypercall */ +struct hv_get_vp_register_input { + u64 partitionid; + u32 vpindex; + u8 inputvtl; + u8 padding[3]; + u32 name0; + u32 name1; +} __packed; + +struct hv_get_vp_register_output { + u64 registervaluelow; + u64 registervaluehigh; +} __packed; + +#define HV_FLUSH_ALL_PROCESSORS BIT(0) +#define HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES BIT(1) +#define HV_FLUSH_NON_GLOBAL_MAPPINGS_ONLY BIT(2) +#define HV_FLUSH_USE_EXTENDED_RANGE_FORMAT BIT(3) + +enum HV_GENERIC_SET_FORMAT { + HV_GENERIC_SET_SPARSE_4K, + HV_GENERIC_SET_ALL, +}; + +/* + * The Hyper-V TimeRefCount register and the TSC + * page provide a guest VM clock with 100ns tick rate + */ +#define HV_CLOCK_HZ (NSEC_PER_SEC/100) + +/* + * The fields in this structure are set by Hyper-V and read + * by the Linux guest. They should be accessed with READ_ONCE() + * so the compiler doesn't optimize in a way that will cause + * problems. The union pads the size out to the page size + * used to communicate with Hyper-V. + */ +struct ms_hyperv_tsc_page { + union { + struct { + u32 tsc_sequence; + u32 reserved1; + u64 tsc_scale; + s64 tsc_offset; + } __packed; + u8 reserved2[HV_HYP_PAGE_SIZE]; + }; +}; + +/* Define the number of synthetic interrupt sources. */ +#define HV_SYNIC_SINT_COUNT (16) +/* Define the expected SynIC version. */ +#define HV_SYNIC_VERSION_1 (0x1) + +#define HV_SYNIC_CONTROL_ENABLE (1ULL << 0) +#define HV_SYNIC_SIMP_ENABLE (1ULL << 0) +#define HV_SYNIC_SIEFP_ENABLE (1ULL << 0) +#define HV_SYNIC_SINT_MASKED (1ULL << 16) +#define HV_SYNIC_SINT_AUTO_EOI (1ULL << 17) +#define HV_SYNIC_SINT_VECTOR_MASK (0xFF) + +#define HV_SYNIC_STIMER_COUNT (4) + +/* Define synthetic interrupt controller message constants. */ +#define HV_MESSAGE_SIZE (256) +#define HV_MESSAGE_PAYLOAD_BYTE_COUNT (240) +#define HV_MESSAGE_PAYLOAD_QWORD_COUNT (30) + +/* Define hypervisor message types. */ +enum hv_message_type { + HVMSG_NONE = 0x00000000, + + /* Memory access messages. */ + HVMSG_UNMAPPED_GPA = 0x80000000, + HVMSG_GPA_INTERCEPT = 0x80000001, + + /* Timer notification messages. */ + HVMSG_TIMER_EXPIRED = 0x80000010, + + /* Error messages. */ + HVMSG_INVALID_VP_REGISTER_VALUE = 0x80000020, + HVMSG_UNRECOVERABLE_EXCEPTION = 0x80000021, + HVMSG_UNSUPPORTED_FEATURE = 0x80000022, + + /* Trace buffer complete messages. */ + HVMSG_EVENTLOG_BUFFERCOMPLETE = 0x80000040, +}; + +/* Define synthetic interrupt controller message flags. */ +union hv_message_flags { + __u8 asu8; + struct { + __u8 msg_pending:1; + __u8 reserved:7; + } __packed; +}; + +/* Define port identifier type. */ +union hv_port_id { + __u32 asu32; + struct { + __u32 id:24; + __u32 reserved:8; + } __packed u; +}; + +/* Define synthetic interrupt controller message header. */ +struct hv_message_header { + __u32 message_type; + __u8 payload_size; + union hv_message_flags message_flags; + __u8 reserved[2]; + union { + __u64 sender; + union hv_port_id port; + }; +} __packed; + +/* Define synthetic interrupt controller message format. */ +struct hv_message { + struct hv_message_header header; + union { + __u64 payload[HV_MESSAGE_PAYLOAD_QWORD_COUNT]; + } u; +} __packed; + +/* Define the synthetic interrupt message page layout. */ +struct hv_message_page { + struct hv_message sint_message[HV_SYNIC_SINT_COUNT]; +} __packed; + +/* Define timer message payload structure. */ +struct hv_timer_message_payload { + __u32 timer_index; + __u32 reserved; + __u64 expiration_time; /* When the timer expired */ + __u64 delivery_time; /* When the message was delivered */ +} __packed; + +#define HV_STIMER_ENABLE (1ULL << 0) +#define HV_STIMER_PERIODIC (1ULL << 1) +#define HV_STIMER_LAZY (1ULL << 2) +#define HV_STIMER_AUTOENABLE (1ULL << 3) +#define HV_STIMER_SINT(config) (__u8)(((config) >> 16) & 0x0F) + + +/* Define synthetic interrupt controller flag constants. */ +#define HV_EVENT_FLAGS_COUNT (256 * 8) +#define HV_EVENT_FLAGS_LONG_COUNT (256 / sizeof(unsigned long)) + +/* + * Timer configuration register. + */ +union hv_stimer_config { + u64 as_uint64; + struct { + u64 enable:1; + u64 periodic:1; + u64 lazy:1; + u64 auto_enable:1; + u64 apic_vector:8; + u64 direct_mode:1; + u64 reserved_z0:3; + u64 sintx:4; + u64 reserved_z1:44; + } __packed; +}; + + +/* Define the synthetic interrupt controller event flags format. */ +union hv_synic_event_flags { + unsigned long flags[HV_EVENT_FLAGS_LONG_COUNT]; +}; + +/* Define SynIC control register. */ +union hv_synic_scontrol { + u64 as_uint64; + struct { + u64 enable:1; + u64 reserved:63; + } __packed; +}; + +/* Define synthetic interrupt source. */ +union hv_synic_sint { + u64 as_uint64; + struct { + u64 vector:8; + u64 reserved1:8; + u64 masked:1; + u64 auto_eoi:1; + u64 reserved2:46; + } __packed; +}; + +/* Define the format of the SIMP register */ +union hv_synic_simp { + u64 as_uint64; + struct { + u64 simp_enabled:1; + u64 preserved:11; + u64 base_simp_gpa:52; + } __packed; +}; + +/* Define the format of the SIEFP register */ +union hv_synic_siefp { + u64 as_uint64; + struct { + u64 siefp_enabled:1; + u64 preserved:11; + u64 base_siefp_gpa:52; + } __packed; +}; + +struct hv_vpset { + u64 format; + u64 valid_bank_mask; + u64 bank_contents[]; +} __packed; + + +#endif diff --git a/arch/arm64/include/asm/mshyperv.h b/arch/arm64/include/asm/mshyperv.h new file mode 100644 index 0000000..60b3f68 --- /dev/null +++ b/arch/arm64/include/asm/mshyperv.h @@ -0,0 +1,115 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * Linux-specific definitions for managing interactions with Microsoft's + * Hyper-V hypervisor. The definitions in this file are specific to + * the ARM64 architecture. See include/asm-generic/mshyperv.h for + * definitions are that architecture independent. + * + * Definitions that are specified in the Hyper-V Top Level Functional + * Spec (TLFS) should not go in this file, but should instead go in + * hyperv-tlfs.h. + * + * Copyright (C) 2019, Microsoft, Inc. + * + * Author : Michael Kelley <mikelley@microsoft.com> + */ + +#ifndef _ASM_MSHYPERV_H +#define _ASM_MSHYPERV_H + +#include <linux/types.h> +#include <linux/interrupt.h> +#include <linux/clocksource.h> +#include <linux/irq.h> +#include <linux/irqdesc.h> +#include <linux/arm-smccc.h> +#include <asm/hyperv-tlfs.h> + +/* + * Define the IRQ numbers/vectors used by Hyper-V VMbus interrupts + * and by STIMER0 Direct Mode interrupts. Hyper-V should be supplying + * these values through ACPI, but there are no other interrupting + * devices in a Hyper-V VM on ARM64, so it's OK to hard code for now. + * The "CALLBACK_VECTOR" terminology is a left-over from the x86/x64 + * world that is used in architecture independent Hyper-V code. + */ +#define HYPERVISOR_CALLBACK_VECTOR 16 +#define HV_STIMER0_IRQNR 17 + +extern u64 hv_do_hvc(u64 control, ...); +extern u64 hv_do_hvc_fast_get(u64 control, u64 input1, u64 input2, u64 input3, + struct hv_get_vp_register_output *output); + +/* + * Declare calls to get and set Hyper-V VP register values on ARM64, which + * requires a hypercall. + */ +extern void hv_set_vpreg(u32 reg, u64 value); +extern u64 hv_get_vpreg(u32 reg); +extern void hv_get_vpreg_128(u32 reg, struct hv_get_vp_register_output *result); + +/* + * Use the Hyper-V provided stimer0 as the timer that is made + * available to the architecture independent Hyper-V drivers. + */ +#define hv_init_timer(timer, tick) \ + hv_set_vpreg(HV_REGISTER_STIMER0_COUNT + (2*timer), tick) +#define hv_init_timer_config(timer, val) \ + hv_set_vpreg(HV_REGISTER_STIMER0_CONFIG + (2*timer), val) +#define hv_get_current_tick(tick) \ + (tick = hv_get_vpreg(HV_REGISTER_TIME_REFCOUNT)) + +#define hv_get_simp(val) (val = hv_get_vpreg(HV_REGISTER_SIPP)) +#define hv_set_simp(val) hv_set_vpreg(HV_REGISTER_SIPP, val) + +#define hv_get_siefp(val) (val = hv_get_vpreg(HV_REGISTER_SIFP)) +#define hv_set_siefp(val) hv_set_vpreg(HV_REGISTER_SIFP, val) + +#define hv_get_synic_state(val) (val = hv_get_vpreg(HV_REGISTER_SCONTROL)) +#define hv_set_synic_state(val) hv_set_vpreg(HV_REGISTER_SCONTROL, val) + +#define hv_get_vp_index(index) (index = hv_get_vpreg(HV_REGISTER_VPINDEX)) + +#define hv_signal_eom() hv_set_vpreg(HV_REGISTER_EOM, 0) + +/* + * Hyper-V SINT registers are numbered sequentially, so we can just + * add the SINT number to the register number of SINT0 + */ +#define hv_get_synint_state(sint_num, val) \ + (val = hv_get_vpreg(HV_REGISTER_SINT0 + sint_num)) +#define hv_set_synint_state(sint_num, val) \ + hv_set_vpreg(HV_REGISTER_SINT0 + sint_num, val) + +#define hv_get_crash_ctl(val) \ + (val = hv_get_vpreg(HV_REGISTER_CRASH_CTL)) +#define hv_get_time_ref_count(val) \ + (val = hv_get_vpreg(HV_REGISTER_TIME_REFCOUNT)) +#define hv_get_reference_tsc(val) \ + (val = hv_get_vpreg(HV_REGISTER_REFERENCE_TSC)) +#define hv_set_reference_tsc(val) \ + hv_set_vpreg(HV_REGISTER_REFERENCE_TSC, val) +#define hv_enable_vdso_clocksource() +#define hv_set_clocksource_vdso(val) \ + ((val).vdso_clock_mode = VDSO_CLOCKMODE_NONE) + +#if IS_ENABLED(CONFIG_HYPERV) +#define hv_enable_stimer0_percpu_irq(irq) enable_percpu_irq(irq, 0) +#define hv_disable_stimer0_percpu_irq(irq) disable_percpu_irq(irq) +#endif + +/* ARM64 specific code to read the hardware clock */ +#define hv_get_raw_timer() arch_timer_read_counter() + +/* SMCCC hypercall parameters */ +#define HV_SMCCC_FUNC_NUMBER 1 +#define HV_FUNC_ID ARM_SMCCC_CALL_VAL( \ + ARM_SMCCC_STD_CALL, \ + ARM_SMCCC_SMC_64, \ + ARM_SMCCC_OWNER_VENDOR_HYP, \ + HV_SMCCC_FUNC_NUMBER) + +#include <asm-generic/mshyperv.h> + +#endif
hyperv-tlfs.h defines Hyper-V interfaces from the Hyper-V Top Level Functional Spec (TLFS). The TLFS is distinctly oriented to x86/x64, and Hyper-V has not separated out the architecture-dependent parts into x86/x64 vs. ARM64. So hyperv-tlfs.h includes information for ARM64 that is not yet formally published. The TLFS is available here: docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs mshyperv.h defines Linux-specific structures and routines for interacting with Hyper-V on ARM64, and #includes the architecture- independent part of mshyperv.h in include/asm-generic. Signed-off-by: Michael Kelley <mikelley@microsoft.com> --- MAINTAINERS | 2 + arch/arm64/include/asm/hyperv-tlfs.h | 413 +++++++++++++++++++++++++++++++++++ arch/arm64/include/asm/mshyperv.h | 115 ++++++++++ 3 files changed, 530 insertions(+) create mode 100644 arch/arm64/include/asm/hyperv-tlfs.h create mode 100644 arch/arm64/include/asm/mshyperv.h