Message ID | 1584200119-18594-5-git-send-email-mikelley@microsoft.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Subject: Enable Linux guests on Hyper-V on ARM64 | expand |
On Sat, Mar 14, 2020 at 4:36 PM Michael Kelley <mikelley@microsoft.com> wrote: > /* > + * Functions for allocating and freeing memory with size and > + * alignment HV_HYP_PAGE_SIZE. These functions are needed because > + * the guest page size may not be the same as the Hyper-V page > + * size. We depend upon kmalloc() aligning power-of-two size > + * allocations to the allocation size boundary, so that the > + * allocated memory appears to Hyper-V as a page of the size > + * it expects. > + * > + * These functions are used by arm64 specific code as well as > + * arch independent Hyper-V drivers. > + */ > + > +void *hv_alloc_hyperv_page(void) > +{ > + BUILD_BUG_ON(PAGE_SIZE < HV_HYP_PAGE_SIZE); > + return kmalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL); > +} > +EXPORT_SYMBOL_GPL(hv_alloc_hyperv_page); I don't think there is any guarantee that kmalloc() returns page-aligned allocations in general. How about using get_free_pages() to implement this? Arnd
On Mon, Mar 16, 2020 at 09:22:43AM +0100, Arnd Bergmann wrote: > On Sat, Mar 14, 2020 at 4:36 PM Michael Kelley <mikelley@microsoft.com> wrote: > > /* > > + * Functions for allocating and freeing memory with size and > > + * alignment HV_HYP_PAGE_SIZE. These functions are needed because > > + * the guest page size may not be the same as the Hyper-V page > > + * size. We depend upon kmalloc() aligning power-of-two size > > + * allocations to the allocation size boundary, so that the > > + * allocated memory appears to Hyper-V as a page of the size > > + * it expects. > > + * > > + * These functions are used by arm64 specific code as well as > > + * arch independent Hyper-V drivers. > > + */ > > + > > +void *hv_alloc_hyperv_page(void) > > +{ > > + BUILD_BUG_ON(PAGE_SIZE < HV_HYP_PAGE_SIZE); > > + return kmalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL); > > +} > > +EXPORT_SYMBOL_GPL(hv_alloc_hyperv_page); > > I don't think there is any guarantee that kmalloc() returns page-aligned > allocations in general. How about using get_free_pages() > to implement this? Even if it was guaranteed, a pointless wrapper like this is not needed or ok, and shouldn't be created, just use kmalloc. thanks, greg k-h
On 2020-03-16 08:22, Arnd Bergmann wrote: > On Sat, Mar 14, 2020 at 4:36 PM Michael Kelley <mikelley@microsoft.com> > wrote: >> /* >> + * Functions for allocating and freeing memory with size and >> + * alignment HV_HYP_PAGE_SIZE. These functions are needed because >> + * the guest page size may not be the same as the Hyper-V page >> + * size. We depend upon kmalloc() aligning power-of-two size >> + * allocations to the allocation size boundary, so that the >> + * allocated memory appears to Hyper-V as a page of the size >> + * it expects. >> + * >> + * These functions are used by arm64 specific code as well as >> + * arch independent Hyper-V drivers. >> + */ >> + >> +void *hv_alloc_hyperv_page(void) >> +{ >> + BUILD_BUG_ON(PAGE_SIZE < HV_HYP_PAGE_SIZE); >> + return kmalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL); >> +} >> +EXPORT_SYMBOL_GPL(hv_alloc_hyperv_page); > > I don't think there is any guarantee that kmalloc() returns > page-aligned > allocations in general. I believe that guarantee came with 59bb47985c1db ("mm, sl[aou]b: guarantee natural alignment for kmalloc(power-of-two)"). > How about using get_free_pages() to implement this? This would certainly work, at the expense of a lot of wasted memory when PAGE_SIZE isn't 4k. Thanks, M.
On Mon, Mar 16, 2020 at 9:30 AM Marc Zyngier <maz@kernel.org> wrote: > On 2020-03-16 08:22, Arnd Bergmann wrote: > > On Sat, Mar 14, 2020 at 4:36 PM Michael Kelley <mikelley@microsoft.com> > > wrote: > >> /* > >> + * Functions for allocating and freeing memory with size and > >> + * alignment HV_HYP_PAGE_SIZE. These functions are needed because > >> + * the guest page size may not be the same as the Hyper-V page > >> + * size. We depend upon kmalloc() aligning power-of-two size > >> + * allocations to the allocation size boundary, so that the > >> + * allocated memory appears to Hyper-V as a page of the size > >> + * it expects. > >> + * > >> + * These functions are used by arm64 specific code as well as > >> + * arch independent Hyper-V drivers. > >> + */ > >> + > >> +void *hv_alloc_hyperv_page(void) > >> +{ > >> + BUILD_BUG_ON(PAGE_SIZE < HV_HYP_PAGE_SIZE); > >> + return kmalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL); > >> +} > >> +EXPORT_SYMBOL_GPL(hv_alloc_hyperv_page); > > > > I don't think there is any guarantee that kmalloc() returns > > page-aligned > > allocations in general. > > I believe that guarantee came with 59bb47985c1db ("mm, sl[aou]b: > guarantee > natural alignment for kmalloc(power-of-two)"). > > > How about using get_free_pages() to implement this? > > This would certainly work, at the expense of a lot of wasted memory when > PAGE_SIZE isn't 4k. I'm sure this is the least of your problems when the guest runs with a large base page size, you've already wasted most of your memory otherwise then. Arnd
From: Arnd Bergmann <arnd@arndb.de> Sent: Monday, March 16, 2020 1:33 AM > > On Mon, Mar 16, 2020 at 9:30 AM Marc Zyngier <maz@kernel.org> wrote: > > On 2020-03-16 08:22, Arnd Bergmann wrote: > > > On Sat, Mar 14, 2020 at 4:36 PM Michael Kelley <mikelley@microsoft.com> > > > wrote: > > >> /* > > >> + * Functions for allocating and freeing memory with size and > > >> + * alignment HV_HYP_PAGE_SIZE. These functions are needed because > > >> + * the guest page size may not be the same as the Hyper-V page > > >> + * size. We depend upon kmalloc() aligning power-of-two size > > >> + * allocations to the allocation size boundary, so that the > > >> + * allocated memory appears to Hyper-V as a page of the size > > >> + * it expects. > > >> + * > > >> + * These functions are used by arm64 specific code as well as > > >> + * arch independent Hyper-V drivers. > > >> + */ > > >> + > > >> +void *hv_alloc_hyperv_page(void) > > >> +{ > > >> + BUILD_BUG_ON(PAGE_SIZE < HV_HYP_PAGE_SIZE); > > >> + return kmalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL); > > >> +} > > >> +EXPORT_SYMBOL_GPL(hv_alloc_hyperv_page); > > > > > > I don't think there is any guarantee that kmalloc() returns > > > page-aligned > > > allocations in general. > > > > I believe that guarantee came with 59bb47985c1db ("mm, sl[aou]b: > > guarantee > > natural alignment for kmalloc(power-of-two)"). > > > > > How about using get_free_pages() to implement this? > > > > This would certainly work, at the expense of a lot of wasted memory when > > PAGE_SIZE isn't 4k. > > I'm sure this is the least of your problems when the guest runs with > a large base page size, you've already wasted most of your memory > otherwise then. > I think there's value in keeping these functions. There are 8 uses in architecture independent code at the moment, which admittedly saves only ~1/2 Mbyte of memory with a 64K page size, but we will have additional uses with more memory savings as we get all of the Hyper-V synthetic drivers to work with 64K page size. Furthermore, there's coming work that will require additional steps to share a page between a guest and the Hyper-V host. These functions are the right place to put the code for the additional sharing steps. Removing them now in favor of a bare kmalloc() and then adding them back doesn't seem worthwhile. Michael
On Wed, Mar 18, 2020 at 1:15 AM Michael Kelley <mikelley@microsoft.com> wrote: > From: Arnd Bergmann <arnd@arndb.de> Sent: Monday, March 16, 2020 1:33 AM > > > > On Mon, Mar 16, 2020 at 9:30 AM Marc Zyngier <maz@kernel.org> wrote: > > > On 2020-03-16 08:22, Arnd Bergmann wrote: > > > > On Sat, Mar 14, 2020 at 4:36 PM Michael Kelley <mikelley@microsoft.com> > > > > wrote: > > > >> /* > > > >> + * Functions for allocating and freeing memory with size and > > > >> + * alignment HV_HYP_PAGE_SIZE. These functions are needed because > > > >> + * the guest page size may not be the same as the Hyper-V page > > > >> + * size. We depend upon kmalloc() aligning power-of-two size > > > >> + * allocations to the allocation size boundary, so that the > > > >> + * allocated memory appears to Hyper-V as a page of the size > > > >> + * it expects. > > > >> + * > > > >> + * These functions are used by arm64 specific code as well as > > > >> + * arch independent Hyper-V drivers. > > > >> + */ > > > >> + > > > >> +void *hv_alloc_hyperv_page(void) > > > >> +{ > > > >> + BUILD_BUG_ON(PAGE_SIZE < HV_HYP_PAGE_SIZE); > > > >> + return kmalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL); > > > >> +} > > > >> +EXPORT_SYMBOL_GPL(hv_alloc_hyperv_page); > > > > > > > > I don't think there is any guarantee that kmalloc() returns > > > > page-aligned > > > > allocations in general. > > > > > > I believe that guarantee came with 59bb47985c1db ("mm, sl[aou]b: > > > guarantee > > > natural alignment for kmalloc(power-of-two)"). > > > > > > > How about using get_free_pages() to implement this? > > > > > > This would certainly work, at the expense of a lot of wasted memory when > > > PAGE_SIZE isn't 4k. > > > > I'm sure this is the least of your problems when the guest runs with > > a large base page size, you've already wasted most of your memory > > otherwise then. > > > > I think there's value in keeping these functions. There are 8 uses in > architecture independent code at the moment, which admittedly saves > only ~1/2 Mbyte of memory with a 64K page size, but we will have > additional uses with more memory savings as we get all of the > Hyper-V synthetic drivers to work with 64K page size. Furthermore, > there's coming work that will require additional steps to share a page > between a guest and the Hyper-V host. These functions are the right > place to put the code for the additional sharing steps. Removing them > now in favor of a bare kmalloc() and then adding them back doesn't > seem worthwhile. My point was to keep the functions but use alloc_pages() internally, so you can deal with the hypervisor having a larger page size than the guest, which seems to be a more important scenario if I correctly understand the differences between the way Windows and Linux deal with page cache. As far as I understand, using 64kb pages on Windows is generally a win as the VFS code already deals with units of that size, while on Linux the 4kb page size is too deeply entrenched within the file system code to mess with it: Whatever you gain in terms of TLB pressure on workloads that cannot use huge pages is all lost again through extra I/O and wasted physical memory. Arnd
From: Arnd Bergmann <arnd@arndb.de> Sent: Wednesday, March 18, 2020 2:58 AM > > On Wed, Mar 18, 2020 at 1:15 AM Michael Kelley <mikelley@microsoft.com> wrote: > > From: Arnd Bergmann <arnd@arndb.de> Sent: Monday, March 16, 2020 1:33 AM > > > > > > On Mon, Mar 16, 2020 at 9:30 AM Marc Zyngier <maz@kernel.org> wrote: > > > > On 2020-03-16 08:22, Arnd Bergmann wrote: > > > > > On Sat, Mar 14, 2020 at 4:36 PM Michael Kelley <mikelley@microsoft.com> > > > > > wrote: > > > > >> /* > > > > >> + * Functions for allocating and freeing memory with size and > > > > >> + * alignment HV_HYP_PAGE_SIZE. These functions are needed because > > > > >> + * the guest page size may not be the same as the Hyper-V page > > > > >> + * size. We depend upon kmalloc() aligning power-of-two size > > > > >> + * allocations to the allocation size boundary, so that the > > > > >> + * allocated memory appears to Hyper-V as a page of the size > > > > >> + * it expects. > > > > >> + * > > > > >> + * These functions are used by arm64 specific code as well as > > > > >> + * arch independent Hyper-V drivers. > > > > >> + */ > > > > >> + > > > > >> +void *hv_alloc_hyperv_page(void) > > > > >> +{ > > > > >> + BUILD_BUG_ON(PAGE_SIZE < HV_HYP_PAGE_SIZE); > > > > >> + return kmalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL); > > > > >> +} > > > > >> +EXPORT_SYMBOL_GPL(hv_alloc_hyperv_page); > > > > > > > > > > I don't think there is any guarantee that kmalloc() returns > > > > > page-aligned > > > > > allocations in general. > > > > > > > > I believe that guarantee came with 59bb47985c1db ("mm, sl[aou]b: > > > > guarantee > > > > natural alignment for kmalloc(power-of-two)"). > > > > > > > > > How about using get_free_pages() to implement this? > > > > > > > > This would certainly work, at the expense of a lot of wasted memory when > > > > PAGE_SIZE isn't 4k. > > > > > > I'm sure this is the least of your problems when the guest runs with > > > a large base page size, you've already wasted most of your memory > > > otherwise then. > > > > > > > I think there's value in keeping these functions. There are 8 uses in > > architecture independent code at the moment, which admittedly saves > > only ~1/2 Mbyte of memory with a 64K page size, but we will have > > additional uses with more memory savings as we get all of the > > Hyper-V synthetic drivers to work with 64K page size. Furthermore, > > there's coming work that will require additional steps to share a page > > between a guest and the Hyper-V host. These functions are the right > > place to put the code for the additional sharing steps. Removing them > > now in favor of a bare kmalloc() and then adding them back doesn't > > seem worthwhile. > > My point was to keep the functions but use alloc_pages() internally, > so you can deal with the hypervisor having a larger page size than > the guest, which seems to be a more important scenario if I correctly > understand the differences between the way Windows and Linux > deal with page cache. OK, I see now what you are getting at. I should spell out my assumption, which is the opposite. Hyper-V won't have a page size other than 4K anytime in the foreseeable future. The code is too wedded to the x86 4K page size, and the host-guest interfaces have a lot of implicit assumptions that the page size is 4K (unfortunately). But the last time I looked, RHEL for ARM64 is delivered with a 64K page size. So my assumption is that the only combination that really matters is the guest page size being larger than the Hyper-V page size. The other way around just won't happen without a major overhaul to Hyper-V, including a rework of the guest-host interface. Michael > > As far as I understand, using 64kb pages on Windows is generally > a win as the VFS code already deals with units of that size, while > on Linux the 4kb page size is too deeply entrenched within the > file system code to mess with it: Whatever you gain in terms of > TLB pressure on workloads that cannot use huge pages is all lost > again through extra I/O and wasted physical memory. > > Arnd
On Thu, Mar 19, 2020 at 10:43 PM Michael Kelley <mikelley@microsoft.com> wrote: > From: Arnd Bergmann <arnd@arndb.de> Sent: Wednesday, March 18, 2020 2:58 AM > > On Wed, Mar 18, 2020 at 1:15 AM Michael Kelley <mikelley@microsoft.com> wrote: > > My point was to keep the functions but use alloc_pages() internally, > > so you can deal with the hypervisor having a larger page size than > > the guest, which seems to be a more important scenario if I correctly > > understand the differences between the way Windows and Linux > > deal with page cache. > > OK, I see now what you are getting at. I should spell out my > assumption, which is the opposite. Hyper-V won't have a page > size other than 4K anytime in the foreseeable future. The > code is too wedded to the x86 4K page size, and the host-guest > interfaces have a lot of implicit assumptions that the page size is > 4K (unfortunately). But the last time I looked, RHEL for ARM64 is > delivered with a 64K page size. So my assumption is that the only > combination that really matters is the guest page size being larger > than the Hyper-V page size. The other way around just won't > happen without a major overhaul to Hyper-V, including a rework > of the guest-host interface. Ok, got it. We should really ask Red Hat to change the page size, but as long as you care existing systems, and you expect this to result in gigabytes of allocation on future systems, having the wrapper seems reasonable. Maybe you could fall back to alloc_page when PAGE_SIZE equals HV_HYP_PAGE_SIZE though? I'm not sure what the tradeoff between kmalloc and alloc_page is these days, other than the feeling that alloc_page is the more obvious choice when you know you always want exactly a page here. Arnd
From: Arnd Bergmann <arnd@arndb.de> Sent: Friday, March 20, 2020 9:28 AM > > On Thu, Mar 19, 2020 at 10:43 PM Michael Kelley <mikelley@microsoft.com> wrote: > > From: Arnd Bergmann <arnd@arndb.de> Sent: Wednesday, March 18, 2020 2:58 AM > > > On Wed, Mar 18, 2020 at 1:15 AM Michael Kelley <mikelley@microsoft.com> wrote: > > > My point was to keep the functions but use alloc_pages() internally, > > > so you can deal with the hypervisor having a larger page size than > > > the guest, which seems to be a more important scenario if I correctly > > > understand the differences between the way Windows and Linux > > > deal with page cache. > > > > OK, I see now what you are getting at. I should spell out my > > assumption, which is the opposite. Hyper-V won't have a page > > size other than 4K anytime in the foreseeable future. The > > code is too wedded to the x86 4K page size, and the host-guest > > interfaces have a lot of implicit assumptions that the page size is > > 4K (unfortunately). But the last time I looked, RHEL for ARM64 is > > delivered with a 64K page size. So my assumption is that the only > > combination that really matters is the guest page size being larger > > than the Hyper-V page size. The other way around just won't > > happen without a major overhaul to Hyper-V, including a rework > > of the guest-host interface. > > Ok, got it. We should really ask Red Hat to change the page size, > but as long as you care existing systems, and you expect this to > result in gigabytes of allocation on future systems, having the > wrapper seems reasonable. > > Maybe you could fall back to alloc_page when PAGE_SIZE equals > HV_HYP_PAGE_SIZE though? I'm not sure what the tradeoff > between kmalloc and alloc_page is these days, other than the > feeling that alloc_page is the more obvious choice when you know > you always want exactly a page here. > Yes, that works for me. Michael
diff --git a/arch/arm64/hyperv/hv_core.c b/arch/arm64/hyperv/hv_core.c index 272c348..4aa6b8f 100644 --- a/arch/arm64/hyperv/hv_core.c +++ b/arch/arm64/hyperv/hv_core.c @@ -17,12 +17,46 @@ #include <linux/slab.h> #include <linux/hyperv.h> #include <linux/arm-smccc.h> +#include <linux/string.h> #include <asm-generic/bug.h> #include <asm/hyperv-tlfs.h> #include <asm/mshyperv.h> /* + * Functions for allocating and freeing memory with size and + * alignment HV_HYP_PAGE_SIZE. These functions are needed because + * the guest page size may not be the same as the Hyper-V page + * size. We depend upon kmalloc() aligning power-of-two size + * allocations to the allocation size boundary, so that the + * allocated memory appears to Hyper-V as a page of the size + * it expects. + * + * These functions are used by arm64 specific code as well as + * arch independent Hyper-V drivers. + */ + +void *hv_alloc_hyperv_page(void) +{ + BUILD_BUG_ON(PAGE_SIZE < HV_HYP_PAGE_SIZE); + return kmalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL); +} +EXPORT_SYMBOL_GPL(hv_alloc_hyperv_page); + +void *hv_alloc_hyperv_zeroed_page(void) +{ + return kzalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL); +} +EXPORT_SYMBOL_GPL(hv_alloc_hyperv_zeroed_page); + +void hv_free_hyperv_page(unsigned long addr) +{ + kfree((void *)addr); +} +EXPORT_SYMBOL_GPL(hv_free_hyperv_page); + + +/* * hv_do_hypercall- Invoke the specified hypercall */ u64 hv_do_hypercall(u64 control, void *input, void *output) diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h index b3f1082..1a7c4c5 100644 --- a/include/asm-generic/mshyperv.h +++ b/include/asm-generic/mshyperv.h @@ -99,6 +99,11 @@ static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg_type) void hv_setup_crash_handler(void (*handler)(struct pt_regs *regs)); void hv_remove_crash_handler(void); +void *hv_alloc_hyperv_page(void); +void *hv_alloc_hyperv_zeroed_page(void); +void hv_free_hyperv_page(unsigned long addr); + + #if IS_ENABLED(CONFIG_HYPERV) /* * Hypervisor's notion of virtual processor ID is different from
Add ARM64-specific code to allocate memory with HV_HYP_PAGE_SIZE size and alignment. These are for use when pages need to be shared with Hyper-V. Separate functions are needed as the page size used by Hyper-V may not be the same as the guest page size. This code is built only when CONFIG_HYPERV is enabled. Signed-off-by: Michael Kelley <mikelley@microsoft.com> --- arch/arm64/hyperv/hv_core.c | 34 ++++++++++++++++++++++++++++++++++ include/asm-generic/mshyperv.h | 5 +++++ 2 files changed, 39 insertions(+)