Message ID | 20241025141453.1210600-4-david@redhat.com (mailing list archive) |
---|---|
State | New |
Series | virtio-mem: s390 support |
On Fri, Oct 25, 2024 at 04:14:48PM +0200, David Hildenbrand wrote:
> To support memory devices under QEMU/KVM, such as virtio-mem,
> we have to prepare our kernel virtual address space accordingly and
> have to know the highest possible physical memory address we might see
> later: the storage limit. The good old SCLP interface is not suitable for
> this use case.
>
> In particular, memory owned by memory devices has no relationship to
> storage increments, it is always detected using the device driver, and
> unaware OSes (no driver) must never try making use of that memory.
> Consequently this memory is located outside of the "maximum storage
> increment"-indicated memory range.
>
> Let's use our new diag500 STORAGE_LIMIT subcode to query this storage
> limit that can exceed the "maximum storage increment", and use the
> existing interfaces (i.e., SCLP) to obtain information about the initial
> memory that is not owned+managed by memory devices.
>
> If a hypervisor does not support such memory devices, the address exposed
> through diag500 STORAGE_LIMIT will correspond to the maximum storage
> increment exposed through SCLP.
>
> To teach kdump on s390 to include memory owned by memory devices, there
> will be ways to query the relevant memory ranges from the device via a
> driver running in special kdump mode (like virtio-mem already implements
> to filter /proc/vmcore access so we don't end up reading from unplugged
> device blocks).
>
> Update setup_ident_map_size(), to clarify that there can be more than
> just online and standby memory.
>
> Tested-by: Mario Casquero <mcasquer@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  arch/s390/boot/physmem_info.c        | 47 +++++++++++++++++++++++++++-
>  arch/s390/boot/startup.c             |  7 +++--
>  arch/s390/include/asm/physmem_info.h |  3 ++
>  3 files changed, 54 insertions(+), 3 deletions(-)

Looks like I couldn't convince you to implement a query subcode.
But anyway, let's move on.

Acked-by: Heiko Carstens <hca@linux.ibm.com>

However, I would like to see an Ack or review from Alexander Gordeev
or Vasily Gorbik for this patch.

On 30.10.24 10:23, Heiko Carstens wrote:
> On Fri, Oct 25, 2024 at 04:14:48PM +0200, David Hildenbrand wrote:
>> [...]
>
> Looks like I couldn't convince you to implement a query subcode.

Well, you convinced me that it might be useful, but after waiting on
feedback from the KVM folks ... which didn't happen I moved on. In the cover
letter I have "No query function for diag500 for now."

My thinking was that if we go for a query subcode, maybe we'd start "anew"
with a new diag and use "0=query" like all similar instructions I am aware
of. And that is then a bigger rework ...

... and I am not particularly interested in extra work without a clear
statement from KVM people what (a) if that work is required and; (b) what it
should look like.

Thanks for the review Heiko!

On Wed, Oct 30, 2024 at 10:42:05AM +0100, David Hildenbrand wrote:
> On 30.10.24 10:23, Heiko Carstens wrote:
> > Looks like I couldn't convince you to implement a query subcode.
>
> Well, you convinced me that it might be useful, but after waiting on
> feedback from the KVM folks ... which didn't happen I moved on. In the cover
> letter I have "No query function for diag500 for now."
>
> My thinking was that if we go for a query subcode, maybe we'd start "anew"
> with a new diag and use "0=query" like all similar instructions I am aware
> of. And that is then a bigger rework ...
>
> ... and I am not particularly interested in extra work without a clear
> statement from KVM people what (a) if that work is required and; (b) what it
> should look like.

Yes, it is all good. Let's just move on.

On Fri, Oct 25, 2024 at 04:14:48PM +0200, David Hildenbrand wrote:
> [...]

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>

On 30.10.24 15:32, Alexander Gordeev wrote:
> On Fri, Oct 25, 2024 at 04:14:48PM +0200, David Hildenbrand wrote:
>> [...]
>
> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>

Thanks Alexander!

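For readers following the thread, the address layout described in the commit message can be sketched in plain C. This is only an illustration under stated assumptions: `classify_addr()`, `enum addr_class` and its values are hypothetical names that do not appear in the patch, `sclp_end` stands for the exclusive end reported via SCLP, and `storage_limit` for the inclusive limit reported via diag500 STORAGE_LIMIT.

```c
#include <assert.h>

/* Hypothetical classification of a guest physical address. */
enum addr_class {
	ADDR_INITIAL,	/* below the SCLP-reported end: initial memory */
	ADDR_DEVICE,	/* above the SCLP end, within the storage limit:
			 * area reserved for memory devices (e.g. virtio-mem) */
	ADDR_BEYOND	/* above the storage limit: never expected to be seen */
};

/*
 * sclp_end:      exclusive end of the "maximum storage increment" range
 * storage_limit: inclusive highest physical address (diag500 STORAGE_LIMIT)
 */
static enum addr_class classify_addr(unsigned long addr,
				     unsigned long sclp_end,
				     unsigned long storage_limit)
{
	/* Assumption: the storage limit never lies below the SCLP range. */
	assert(sclp_end == 0 || sclp_end - 1 <= storage_limit);

	if (addr < sclp_end)
		return ADDR_INITIAL;
	if (addr <= storage_limit)
		return ADDR_DEVICE;
	return ADDR_BEYOND;
}
```

When the hypervisor does not support memory devices, `storage_limit + 1` equals `sclp_end`, so no address is ever classified as `ADDR_DEVICE`.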
diff --git a/arch/s390/boot/physmem_info.c b/arch/s390/boot/physmem_info.c
index 1d131a81cb8b..f3ea5dbff10b 100644
--- a/arch/s390/boot/physmem_info.c
+++ b/arch/s390/boot/physmem_info.c
@@ -109,6 +109,42 @@ static int diag260(void)
 	return 0;
 }
 
+#define DIAG500_SC_STOR_LIMIT 4
+
+static int diag500_storage_limit(unsigned long *max_physmem_end)
+{
+	unsigned long storage_limit;
+	unsigned long reg1, reg2;
+	psw_t old;
+
+	asm volatile(
+		"	mvc	0(16,%[psw_old]),0(%[psw_pgm])\n"
+		"	epsw	%[reg1],%[reg2]\n"
+		"	st	%[reg1],0(%[psw_pgm])\n"
+		"	st	%[reg2],4(%[psw_pgm])\n"
+		"	larl	%[reg1],1f\n"
+		"	stg	%[reg1],8(%[psw_pgm])\n"
+		"	lghi	1,%[subcode]\n"
+		"	lghi	2,0\n"
+		"	diag	2,4,0x500\n"
+		"1:	mvc	0(16,%[psw_pgm]),0(%[psw_old])\n"
+		"	lgr	%[slimit],2\n"
+		: [reg1] "=&d" (reg1),
+		  [reg2] "=&a" (reg2),
+		  [slimit] "=d" (storage_limit),
+		  "=Q" (get_lowcore()->program_new_psw),
+		  "=Q" (old)
+		: [psw_old] "a" (&old),
+		  [psw_pgm] "a" (&get_lowcore()->program_new_psw),
+		  [subcode] "i" (DIAG500_SC_STOR_LIMIT)
+		: "memory", "1", "2");
+	if (!storage_limit)
+		return -EINVAL;
+	/* Convert inclusive end to exclusive end */
+	*max_physmem_end = storage_limit + 1;
+	return 0;
+}
+
 static int tprot(unsigned long addr)
 {
 	unsigned long reg1, reg2;
@@ -157,7 +193,9 @@ unsigned long detect_max_physmem_end(void)
 {
 	unsigned long max_physmem_end = 0;
 
-	if (!sclp_early_get_memsize(&max_physmem_end)) {
+	if (!diag500_storage_limit(&max_physmem_end)) {
+		physmem_info.info_source = MEM_DETECT_DIAG500_STOR_LIMIT;
+	} else if (!sclp_early_get_memsize(&max_physmem_end)) {
 		physmem_info.info_source = MEM_DETECT_SCLP_READ_INFO;
 	} else {
 		max_physmem_end = search_mem_end();
@@ -170,6 +208,13 @@ void detect_physmem_online_ranges(unsigned long max_physmem_end)
 {
 	if (!sclp_early_read_storage_info()) {
 		physmem_info.info_source = MEM_DETECT_SCLP_STOR_INFO;
+	} else if (physmem_info.info_source == MEM_DETECT_DIAG500_STOR_LIMIT) {
+		unsigned long online_end;
+
+		if (!sclp_early_get_memsize(&online_end)) {
+			physmem_info.info_source = MEM_DETECT_SCLP_READ_INFO;
+			add_physmem_online_range(0, online_end);
+		}
 	} else if (!diag260()) {
 		physmem_info.info_source = MEM_DETECT_DIAG260;
 	} else if (max_physmem_end) {
diff --git a/arch/s390/boot/startup.c b/arch/s390/boot/startup.c
index c8f149ad77e5..76c33c7442df 100644
--- a/arch/s390/boot/startup.c
+++ b/arch/s390/boot/startup.c
@@ -182,12 +182,15 @@ static void kaslr_adjust_got(unsigned long offset)
  * Merge information from several sources into a single ident_map_size value.
  * "ident_map_size" represents the upper limit of physical memory we may ever
  * reach. It might not be all online memory, but also include standby (offline)
- * memory. "ident_map_size" could be lower then actual standby or even online
+ * memory or memory areas reserved for other means (e.g., memory devices such as
+ * virtio-mem).
+ *
+ * "ident_map_size" could be lower then actual standby/reserved or even online
  * memory present, due to limiting factors. We should never go above this limit.
  * It is the size of our identity mapping.
  *
  * Consider the following factors:
- * 1. max_physmem_end - end of physical memory online or standby.
+ * 1. max_physmem_end - end of physical memory online, standby or reserved.
  *    Always >= end of the last online memory range (get_physmem_online_end()).
  * 2. CONFIG_MAX_PHYSMEM_BITS - the maximum size of physical memory the
  *    kernel is able to support.
diff --git a/arch/s390/include/asm/physmem_info.h b/arch/s390/include/asm/physmem_info.h
index f45cfc8bc233..51b68a43e195 100644
--- a/arch/s390/include/asm/physmem_info.h
+++ b/arch/s390/include/asm/physmem_info.h
@@ -9,6 +9,7 @@ enum physmem_info_source {
 	MEM_DETECT_NONE = 0,
 	MEM_DETECT_SCLP_STOR_INFO,
 	MEM_DETECT_DIAG260,
+	MEM_DETECT_DIAG500_STOR_LIMIT,
 	MEM_DETECT_SCLP_READ_INFO,
 	MEM_DETECT_BIN_SEARCH
 };
@@ -107,6 +108,8 @@ static inline const char *get_physmem_info_source(void)
 		return "sclp storage info";
 	case MEM_DETECT_DIAG260:
 		return "diag260";
+	case MEM_DETECT_DIAG500_STOR_LIMIT:
+		return "diag500 storage limit";
 	case MEM_DETECT_SCLP_READ_INFO:
 		return "sclp read info";
 	case MEM_DETECT_BIN_SEARCH:
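
The fallback order that detect_max_physmem_end() follows after this patch can be modelled in userspace C. The sketch below is not kernel code: `struct probe`, `struct detect_result`, `limit_to_end()`, `pick_max_physmem_end()` and the `SRC_*` values are hypothetical stand-ins, and each probe result is passed in as data instead of being obtained via diag500, SCLP or a memory search. It assumes, as the patch suggests, that a reported limit of 0 means no usable limit was returned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the MEM_DETECT_* values in the patch. */
enum info_source {
	SRC_DIAG500_STOR_LIMIT,
	SRC_SCLP_READ_INFO,
	SRC_BIN_SEARCH
};

/* Outcome of one probe: rc == 0 means success, value is what it reported. */
struct probe {
	int rc;
	unsigned long value;
};

struct detect_result {
	unsigned long max_physmem_end;	/* exclusive end of physical memory */
	enum info_source source;
};

/* Same conversion as in diag500_storage_limit(): inclusive to exclusive. */
static int limit_to_end(unsigned long storage_limit, unsigned long *end)
{
	assert(end != NULL);
	if (!storage_limit)
		return -EINVAL;	/* assumed: 0 means no limit was reported */
	*end = storage_limit + 1;
	return 0;
}

/* Prefer diag500, then SCLP, then the result of a memory search. */
static struct detect_result pick_max_physmem_end(struct probe diag500,
						 struct probe sclp,
						 unsigned long bin_search_end)
{
	struct detect_result res = { 0, SRC_BIN_SEARCH };

	if (!diag500.rc && !limit_to_end(diag500.value, &res.max_physmem_end)) {
		res.source = SRC_DIAG500_STOR_LIMIT;
	} else if (!sclp.rc) {
		res.max_physmem_end = sclp.value;
		res.source = SRC_SCLP_READ_INFO;
	} else {
		res.max_physmem_end = bin_search_end;
	}
	return res;
}
```

Keeping the diag500 probe first means the larger, device-aware limit wins whenever the hypervisor provides it, while older hypervisors fall through to the existing SCLP path unchanged.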