Message ID | 20231107130309.3257776-1-kraxel@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v5] limit physical address space size | expand |
On Tue, Nov 07, 2023 at 02:03:09PM +0100, Gerd Hoffmann wrote: > For better compatibility with old linux kernels, > see source code comment. > > Related (same problem in ovmf): > https://github.com/tianocore/edk2/commit/c1e853769046 Thanks. I'll defer to your judgement on this. It does seem a little odd to alter the CPUPhysBits variable itself instead of adding additional checking to the pciinit.c code that uses CPUPhysBits. (That is, if there are old Linux kernels that can't handle a very high PCI space, then a workaround in the PCI code might make it more clear what is occurring.) Cheers, -Kevin > > Cc: Claudio Fontana <cfontana@suse.de> > Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> > --- > src/fw/paravirt.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/src/fw/paravirt.c b/src/fw/paravirt.c > index e5d4eca0cb5a..a2c9c64d5e78 100644 > --- a/src/fw/paravirt.c > +++ b/src/fw/paravirt.c > @@ -182,6 +182,14 @@ static void physbits(int qemu_quirk) > __func__, signature, pae ? "yes" : "no", CPULongMode ? "yes" : "no", > physbits, valid ? "yes" : "no"); > > + if (valid && physbits > 46) { > + // Old linux kernels have trouble dealing with more than 46 > + // phys-bits, so avoid that for now. Seems to be a bug in the > + // virtio-pci driver. Reported: centos-7, ubuntu-18.04 > + dprintf(1, "%s: using phys-bits=46 (old linux kernel compatibility)\n", __func__); > + physbits = 46; > + } > + > if (valid) > CPUPhysBits = physbits; > } > -- > 2.41.0 > > _______________________________________________ > SeaBIOS mailing list -- seabios@seabios.org > To unsubscribe send an email to seabios-leave@seabios.org
On 11/8/23 19:35, Kevin O'Connor wrote: > On Tue, Nov 07, 2023 at 02:03:09PM +0100, Gerd Hoffmann wrote: >> For better compatibility with old linux kernels, >> see source code comment. >> >> Related (same problem in ovmf): >> https://github.com/tianocore/edk2/commit/c1e853769046 > > Thanks. I'll defer to your judgement on this. It does seem a little > odd to alter the CPUPhysBits variable itself instead of adding > additional checking to the pciinit.c code that uses CPUPhysBits. > (That is, if there are old Linux kernels that can't handle a very high > PCI space, then a workaround in the PCI code might make it more clear > what is occurring.) > > Cheers, > -Kevin > > >> >> Cc: Claudio Fontana <cfontana@suse.de> >> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Hi, I thought about this, and I am not sure it's the right call though. The change breaking old guests (CentOS7, Ubuntu 18.04 are the known ones, but presumably others as well) in QEMU is: commit 784155cdcb02ffaae44afecab93861070e7d652d Author: Gerd Hoffmann <kraxel@redhat.com> Date: Mon Sep 11 17:20:26 2023 +0200 seabios: update submodule to git snapshot git shortlog ------------ Gerd Hoffmann (7): disable array bounds warning better kvm detection detect physical address space size move 64bit pci window to end of address space be less conservative with the 64bit pci io window qemu: log reservations in fw_cfg e820 table check for e820 conflict The reasoning for the change is according to: https://mail.coreboot.org/hyperkitty/list/seabios@seabios.org/message/BHK67ULKJTLJT676J4D5C2ND53CFEL3Q/ "It makes seabios follow a common pattern. real mode address space has io resources mapped high (below 1M). 32-bit address space has io resources mapped high too (below 4G). This does the same for 64-bit resources." So it seems to be an almost aesthetic choice, the way I read the "common pattern", one that ends up actually breaking existing workloads though. Now the correction to that that you propose in SeaBIOS is: >> --- >> src/fw/paravirt.c | 8 ++++++++ >> 1 file changed, 8 insertions(+) >> >> diff --git a/src/fw/paravirt.c b/src/fw/paravirt.c >> index e5d4eca0cb5a..a2c9c64d5e78 100644 >> --- a/src/fw/paravirt.c >> +++ b/src/fw/paravirt.c >> @@ -182,6 +182,14 @@ static void physbits(int qemu_quirk) >> __func__, signature, pae ? "yes" : "no", CPULongMode ? "yes" : "no", >> physbits, valid ? "yes" : "no"); >> >> + if (valid && physbits > 46) { >> + // Old linux kernels have trouble dealing with more than 46 >> + // phys-bits, so avoid that for now. Seems to be a bug in the >> + // virtio-pci driver. Reported: centos-7, ubuntu-18.04 >> + dprintf(1, "%s: using phys-bits=46 (old linux kernel compatibility)\n", __func__); >> + physbits = 46; >> + } >> + >> if (valid) >> CPUPhysBits = physbits; >> } >> -- >> 2.41.0 but to me this is potentially breaking the situation for another set of users, those that are passing through 48 physical bits from their host cpu to the guest, and expect it to actually happen. Would it not be a better solution to instead either revert the original change, or to patch it to find a new range that better satisfies code consistency/aesthetic requirements, but also does not break any running workload? I could help with the testing for this. Or we could add some extra workarounds in the stack elsewhere in the management tools to try to detect older guests (not ideal either), but this seems like adding breakage on top of breakage to me. Might be wrong though, looking forward for opinions across Seabios and QEMU. Thanks, Claudio
On Fri, Nov 10, 2023 at 09:36:02AM +0100, Claudio Fontana wrote: > On 11/8/23 19:35, Kevin O'Connor wrote: > > On Tue, Nov 07, 2023 at 02:03:09PM +0100, Gerd Hoffmann wrote: > >> For better compatibility with old linux kernels, > >> see source code comment. > >> > >> Related (same problem in ovmf): > >> https://github.com/tianocore/edk2/commit/c1e853769046 > > > > Thanks. I'll defer to your judgement on this. It does seem a little > > odd to alter the CPUPhysBits variable itself instead of adding > > additional checking to the pciinit.c code that uses CPUPhysBits. > > (That is, if there are old Linux kernels that can't handle a very high > > PCI space, then a workaround in the PCI code might make it more clear > > what is occurring.) > > > > Cheers, > > -Kevin > > > > > >> > >> Cc: Claudio Fontana <cfontana@suse.de> > >> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> > > Hi, > > I thought about this, and I am not sure it's the right call though. > > The change breaking old guests (CentOS7, Ubuntu 18.04 are the known ones, but presumably others as well) in QEMU is: > > commit 784155cdcb02ffaae44afecab93861070e7d652d > Author: Gerd Hoffmann <kraxel@redhat.com> > Date: Mon Sep 11 17:20:26 2023 +0200 > > seabios: update submodule to git snapshot > > git shortlog > ------------ > > Gerd Hoffmann (7): > disable array bounds warning > better kvm detection > detect physical address space size > move 64bit pci window to end of address space > be less conservative with the 64bit pci io window > qemu: log reservations in fw_cfg e820 table > check for e820 conflict > > The reasoning for the change is according to: > > https://mail.coreboot.org/hyperkitty/list/seabios@seabios.org/message/BHK67ULKJTLJT676J4D5C2ND53CFEL3Q/ > > "It makes seabios follow a common pattern. real mode address space > has io resources mapped high (below 1M). 32-bit address space has > io resources mapped high too (below 4G). This does the same for > 64-bit resources." > > So it seems to be an almost aesthetic choice, the way I read the > "common pattern", one that ends up actually breaking existing > workloads though. It's not about aesthetics. It's about handing out more resources to the 64bit mmio window and also to the pci bridge windows if we have enough address space for that. This is important because not only main memory sizes grow, but also device memory grows. Especially GPUs and AI accelerators can have rather big PCI memory bars these days. If you want support hotplugging them you need pcie root ports with big enough bridge windows. The "common pattern" refers to OVMF doing the same for the same reason. > Now the correction to that that you propose in SeaBIOS is: > > >> + if (valid && physbits > 46) { > >> + // Old linux kernels have trouble dealing with more than 46 > >> + // phys-bits, so avoid that for now. Seems to be a bug in the > >> + // virtio-pci driver. Reported: centos-7, ubuntu-18.04 > >> + dprintf(1, "%s: using phys-bits=46 (old linux kernel compatibility)\n", __func__); > >> + physbits = 46; > >> + } > but to me this is potentially breaking the situation for another set > of users, those that are passing through 48 physical bits from their > host cpu to the guest, and expect it to actually happen. This doesn't change the address space size the guest does see. seabios does not have the power to change the information returned by the cpuid instruction. This only changes the placement of the PCI bars. The pci setup code is the only consumer of that variable, guess it makes sense to move the quirk to the pci code (as suggested by Kevin) to clarify this. > Would it not be a better solution to instead either revert the > original change, No (see above). > or to patch it to find a new range that better satisfies code > consistency/aesthetic requirements, but also does not break any > running workload? Upstream OVMF does the same thing for roughly one year, the old linux kernel issue is the only one which showed up so far. OVMF wouldn't see *really* old guests (without UEFI support) though. The pci setup code already has a bunch of conditions for activating the 64bit mmio window, to avoid breaking really old guests. It wouldn't happen on CPUs without long mode. It also wouldn't happen if guests don't have memory above 4G. I'm open to suggestions to refine this. > Or we could add some extra workarounds in the stack elsewhere in the > management tools to try to detect older guests (not ideal either), but > this seems like adding breakage on top of breakage to me. We already have libosinfo which records guest capabilities, which exists to solve exactly that problem: Allow new guests use new features by default, without breaking old guests. Extending libosinfo looks like a perfectly valid approach to me, especially considering that we probably want make the 46 phys-bits limit conditional some day in the future, to allow even larger guests. take care, Gerd
Hi, > This only changes the placement of the PCI bars. The pci setup code is > the only consumer of that variable, guess it makes sense to move the > quirk to the pci code (as suggested by Kevin) to clarify this. i.e. like this: From d538dc7d4316e557ae302464252444d09de0681d Mon Sep 17 00:00:00 2001 From: Gerd Hoffmann <kraxel@redhat.com> Date: Tue, 7 Nov 2023 13:49:31 +0100 Subject: [PATCH] limit physical address space size For better compatibility with old linux kernels, see source code comment. Related (same problem in ovmf): https://github.com/tianocore/edk2/commit/c1e853769046 Reported-by: Claudio Fontana <cfontana@suse.de> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> --- src/fw/pciinit.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index c7084f5e397e..7aeea61bfd05 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -52,6 +52,7 @@ u64 pcimem64_start = BUILD_PCIMEM64_START; u64 pcimem64_end = BUILD_PCIMEM64_END; u64 pci_io_low_end = 0xa000; u32 pci_use_64bit = 0; +u32 pci_phys_bits = 0; struct pci_region_entry { struct pci_device *dev; @@ -967,7 +968,7 @@ static int pci_bios_check_devices(struct pci_bus *busses) if (hotplug_support == HOTPLUG_PCIE) resource_optional = pcie_cap && (type == PCI_REGION_TYPE_IO); if (hotplug_support && pci_use_64bit && is64 && (type == PCI_REGION_TYPE_PREFMEM)) - align = (u64)1 << (CPUPhysBits - 11); + align = (u64)1 << (pci_phys_bits - 11); if (align > sum && hotplug_support && !resource_optional) sum = align; /* reserve min size for hot-plug */ if (size > sum) { @@ -1131,12 +1132,12 @@ static void pci_bios_map_devices(struct pci_bus *busses) r64_mem.base = le64_to_cpu(romfile_loadint("etc/reserved-memory-end", 0)); if (r64_mem.base < 0x100000000LL + RamSizeOver4G) r64_mem.base = 0x100000000LL + RamSizeOver4G; - if (CPUPhysBits) { - u64 top = 1LL << CPUPhysBits; + if (pci_phys_bits) { + u64 top = 1LL << pci_phys_bits; u64 size = (ALIGN(sum_mem, (1LL<<30)) + ALIGN(sum_pref, (1LL<<30))); if (pci_use_64bit) - size = ALIGN(size, (1LL<<(CPUPhysBits-3))); + size = ALIGN(size, (1LL<<(pci_phys_bits-3))); if (r64_mem.base < top - size) { r64_mem.base = top - size; } @@ -1181,8 +1182,16 @@ pci_setup(void) dprintf(3, "pci setup\n"); - if (CPUPhysBits >= 36 && CPULongMode && RamSizeOver4G) + if (CPUPhysBits >= 36 && CPULongMode && RamSizeOver4G) { pci_use_64bit = 1; + pci_phys_bits = CPUPhysBits; + if (pci_phys_bits > 46) { + // Old linux kernels have trouble dealing with more than 46 + // phys-bits, so avoid that for now. Seems to be a bug in the + // virtio-pci driver. Reported: centos-7, ubuntu-18.04 + pci_phys_bits = 46; + } + } dprintf(1, "=== PCI bus & bridge init ===\n"); if (pci_probe_host() != 0) {
On Fri, Nov 10, 2023 at 12:04:24PM +0100, Gerd Hoffmann wrote: > Hi, > > > This only changes the placement of the PCI bars. The pci setup code is > > the only consumer of that variable, guess it makes sense to move the > > quirk to the pci code (as suggested by Kevin) to clarify this. > > i.e. like this: > > From d538dc7d4316e557ae302464252444d09de0681d Mon Sep 17 00:00:00 2001 > From: Gerd Hoffmann <kraxel@redhat.com> > Date: Tue, 7 Nov 2023 13:49:31 +0100 > Subject: [PATCH] limit physical address space size > > For better compatibility with old linux kernels, > see source code comment. > > Related (same problem in ovmf): > https://github.com/tianocore/edk2/commit/c1e853769046 > > Reported-by: Claudio Fontana <cfontana@suse.de> > Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> > --- > src/fw/pciinit.c | 19 ++++++++++++++----- > 1 file changed, 14 insertions(+), 5 deletions(-) > > diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c > index c7084f5e397e..7aeea61bfd05 100644 > --- a/src/fw/pciinit.c > +++ b/src/fw/pciinit.c > @@ -52,6 +52,7 @@ u64 pcimem64_start = BUILD_PCIMEM64_START; > u64 pcimem64_end = BUILD_PCIMEM64_END; > u64 pci_io_low_end = 0xa000; > u32 pci_use_64bit = 0; > +u32 pci_phys_bits = 0; FWIW, all these flags are getting a bit confusing. Maybe do something like the patch below. > > struct pci_region_entry { > struct pci_device *dev; > @@ -967,7 +968,7 @@ static int pci_bios_check_devices(struct pci_bus *busses) > if (hotplug_support == HOTPLUG_PCIE) > resource_optional = pcie_cap && (type == PCI_REGION_TYPE_IO); > if (hotplug_support && pci_use_64bit && is64 && (type == PCI_REGION_TYPE_PREFMEM)) > - align = (u64)1 << (CPUPhysBits - 11); > + align = (u64)1 << (pci_phys_bits - 11); > if (align > sum && hotplug_support && !resource_optional) > sum = align; /* reserve min size for hot-plug */ > if (size > sum) { > @@ -1131,12 +1132,12 @@ static void pci_bios_map_devices(struct pci_bus *busses) > r64_mem.base = le64_to_cpu(romfile_loadint("etc/reserved-memory-end", 0)); > if (r64_mem.base < 0x100000000LL + RamSizeOver4G) > r64_mem.base = 0x100000000LL + RamSizeOver4G; > - if (CPUPhysBits) { > - u64 top = 1LL << CPUPhysBits; > + if (pci_phys_bits) { FYI, this is a change in behavior - previously this condition would have been taken even if CPULongMode or RamSizeOver4G is false. I'm not sure if this change is intentional. (My example patch below follows your lead here.) > + u64 top = 1LL << pci_phys_bits; > u64 size = (ALIGN(sum_mem, (1LL<<30)) + > ALIGN(sum_pref, (1LL<<30))); > if (pci_use_64bit) > - size = ALIGN(size, (1LL<<(CPUPhysBits-3))); > + size = ALIGN(size, (1LL<<(pci_phys_bits-3))); > if (r64_mem.base < top - size) { > r64_mem.base = top - size; > } > @@ -1181,8 +1182,16 @@ pci_setup(void) > > dprintf(3, "pci setup\n"); > > - if (CPUPhysBits >= 36 && CPULongMode && RamSizeOver4G) > + if (CPUPhysBits >= 36 && CPULongMode && RamSizeOver4G) { > pci_use_64bit = 1; > + pci_phys_bits = CPUPhysBits; > + if (pci_phys_bits > 46) { > + // Old linux kernels have trouble dealing with more than 46 > + // phys-bits, so avoid that for now. Seems to be a bug in the > + // virtio-pci driver. Reported: centos-7, ubuntu-18.04 > + pci_phys_bits = 46; > + } > + } > > dprintf(1, "=== PCI bus & bridge init ===\n"); > if (pci_probe_host() != 0) { > -- > 2.41.0 > Possible alternate variable naming: diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index c7084f5..cd64d6e 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -46,12 +46,16 @@ static const char *region_type_name[] = { [ PCI_REGION_TYPE_PREFMEM ] = "prefmem", }; +// Memory ranges exported to legacy ACPI type table generation u64 pcimem_start = BUILD_PCIMEM_START; u64 pcimem_end = BUILD_PCIMEM_END; u64 pcimem64_start = BUILD_PCIMEM64_START; u64 pcimem64_end = BUILD_PCIMEM64_END; -u64 pci_io_low_end = 0xa000; -u32 pci_use_64bit = 0; + +// Memory resource allocation tracking +static u64 pci_io_low_end = 0xa000; +static u32 pci_use_64bit = 0; +static u64 pci_mem64_top = 0; struct pci_region_entry { struct pci_device *dev; @@ -966,8 +970,9 @@ static int pci_bios_check_devices(struct pci_bus *busses) int resource_optional = 0; if (hotplug_support == HOTPLUG_PCIE) resource_optional = pcie_cap && (type == PCI_REGION_TYPE_IO); - if (hotplug_support && pci_use_64bit && is64 && (type == PCI_REGION_TYPE_PREFMEM)) - align = (u64)1 << (CPUPhysBits - 11); + if (hotplug_support && pci_use_64bit && is64 + && (type == PCI_REGION_TYPE_PREFMEM)) + align = pci_mem64_top >> 11; if (align > sum && hotplug_support && !resource_optional) sum = align; /* reserve min size for hot-plug */ if (size > sum) { @@ -1131,14 +1136,13 @@ static void pci_bios_map_devices(struct pci_bus *busses) r64_mem.base = le64_to_cpu(romfile_loadint("etc/reserved-memory-end", 0)); if (r64_mem.base < 0x100000000LL + RamSizeOver4G) r64_mem.base = 0x100000000LL + RamSizeOver4G; - if (CPUPhysBits) { - u64 top = 1LL << CPUPhysBits; + if (pci_mem64_top) { u64 size = (ALIGN(sum_mem, (1LL<<30)) + ALIGN(sum_pref, (1LL<<30))); if (pci_use_64bit) - size = ALIGN(size, (1LL<<(CPUPhysBits-3))); - if (r64_mem.base < top - size) { - r64_mem.base = top - size; + size = ALIGN(size, pci_mem64_top >> 3); + if (r64_mem.base < pci_mem64_top - size) { + r64_mem.base = pci_mem64_top - size; } if (e820_is_used(r64_mem.base, size)) r64_mem.base -= size; @@ -1181,8 +1185,16 @@ pci_setup(void) dprintf(3, "pci setup\n"); - if (CPUPhysBits >= 36 && CPULongMode && RamSizeOver4G) + if (CPUPhysBits >= 36 && CPULongMode && RamSizeOver4G) { pci_use_64bit = 1; + pci_mem64_top = 1LL << CPUPhysBits; + if (CPUPhysBits > 46) { + // Old linux kernels have trouble dealing with more than 46 + // phys-bits, so avoid that for now. Seems to be a bug in the + // virtio-pci driver. Reported: centos-7, ubuntu-18.04 + pci_mem64_top = 1LL << 46; + } + } dprintf(1, "=== PCI bus & bridge init ===\n"); if (pci_probe_host() != 0) { Cheers, -Kevin
On Fri, Nov 10, 2023 at 11:44:48AM -0500, Kevin O'Connor wrote: > On Fri, Nov 10, 2023 at 12:04:24PM +0100, Gerd Hoffmann wrote: > > - if (CPUPhysBits) { > > - u64 top = 1LL << CPUPhysBits; > > + if (pci_phys_bits) { > > FYI, this is a change in behavior - previously this condition would > have been taken even if CPULongMode or RamSizeOver4G is false. I'm > not sure if this change is intentional. (My example patch below > follows your lead here.) > On closer inspection, I think this change in behavior was not intended. How about variable names like the below instead. -Kevin diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index c7084f5..6b13cd5 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -46,12 +46,16 @@ static const char *region_type_name[] = { [ PCI_REGION_TYPE_PREFMEM ] = "prefmem", }; +// Memory ranges exported to legacy ACPI type table generation u64 pcimem_start = BUILD_PCIMEM_START; u64 pcimem_end = BUILD_PCIMEM_END; u64 pcimem64_start = BUILD_PCIMEM64_START; u64 pcimem64_end = BUILD_PCIMEM64_END; -u64 pci_io_low_end = 0xa000; -u32 pci_use_64bit = 0; + +// Resource allocation limits +static u64 pci_io_low_end = 0xa000; +static u64 pci_mem64_top = 0; +static u32 pci_pad_mem64 = 0; struct pci_region_entry { struct pci_device *dev; @@ -966,8 +970,9 @@ static int pci_bios_check_devices(struct pci_bus *busses) int resource_optional = 0; if (hotplug_support == HOTPLUG_PCIE) resource_optional = pcie_cap && (type == PCI_REGION_TYPE_IO); - if (hotplug_support && pci_use_64bit && is64 && (type == PCI_REGION_TYPE_PREFMEM)) - align = (u64)1 << (CPUPhysBits - 11); + if (hotplug_support && pci_pad_mem64 && is64 + && (type == PCI_REGION_TYPE_PREFMEM)) + align = pci_mem64_top >> 11; if (align > sum && hotplug_support && !resource_optional) sum = align; /* reserve min size for hot-plug */ if (size > sum) { @@ -1111,7 +1116,7 @@ static void pci_bios_map_devices(struct pci_bus *busses) panic("PCI: out of I/O address space\n"); dprintf(1, "PCI: 32: %016llx - %016llx\n", pcimem_start, pcimem_end); - if (pci_use_64bit || pci_bios_init_root_regions_mem(busses)) { + if (pci_pad_mem64 || pci_bios_init_root_regions_mem(busses)) { struct pci_region r64_mem, r64_pref; r64_mem.list.first = NULL; r64_pref.list.first = NULL; @@ -1131,14 +1136,13 @@ static void pci_bios_map_devices(struct pci_bus *busses) r64_mem.base = le64_to_cpu(romfile_loadint("etc/reserved-memory-end", 0)); if (r64_mem.base < 0x100000000LL + RamSizeOver4G) r64_mem.base = 0x100000000LL + RamSizeOver4G; - if (CPUPhysBits) { - u64 top = 1LL << CPUPhysBits; + if (pci_mem64_top) { u64 size = (ALIGN(sum_mem, (1LL<<30)) + ALIGN(sum_pref, (1LL<<30))); - if (pci_use_64bit) - size = ALIGN(size, (1LL<<(CPUPhysBits-3))); - if (r64_mem.base < top - size) { - r64_mem.base = top - size; + if (pci_pad_mem64) + size = ALIGN(size, pci_mem64_top >> 3); + if (r64_mem.base < pci_mem64_top - size) { + r64_mem.base = pci_mem64_top - size; } if (e820_is_used(r64_mem.base, size)) r64_mem.base -= size; @@ -1181,8 +1185,18 @@ pci_setup(void) dprintf(3, "pci setup\n"); + if (CPUPhysBits) { + pci_mem64_top = 1LL << CPUPhysBits; + if (CPUPhysBits > 46) { + // Old linux kernels have trouble dealing with more than 46 + // phys-bits, so avoid that for now. Seems to be a bug in the + // virtio-pci driver. Reported: centos-7, ubuntu-18.04 + pci_mem64_top = 1LL << 46; + } + } + if (CPUPhysBits >= 36 && CPULongMode && RamSizeOver4G) - pci_use_64bit = 1; + pci_pad_mem64 = 1; dprintf(1, "=== PCI bus & bridge init ===\n"); if (pci_probe_host() != 0) {
diff --git a/src/fw/paravirt.c b/src/fw/paravirt.c index e5d4eca0cb5a..a2c9c64d5e78 100644 --- a/src/fw/paravirt.c +++ b/src/fw/paravirt.c @@ -182,6 +182,14 @@ static void physbits(int qemu_quirk) __func__, signature, pae ? "yes" : "no", CPULongMode ? "yes" : "no", physbits, valid ? "yes" : "no"); + if (valid && physbits > 46) { + // Old linux kernels have trouble dealing with more than 46 + // phys-bits, so avoid that for now. Seems to be a bug in the + // virtio-pci driver. Reported: centos-7, ubuntu-18.04 + dprintf(1, "%s: using phys-bits=46 (old linux kernel compatibility)\n", __func__); + physbits = 46; + } + if (valid) CPUPhysBits = physbits; }
For better compatibility with old linux kernels, see source code comment. Related (same problem in ovmf): https://github.com/tianocore/edk2/commit/c1e853769046 Cc: Claudio Fontana <cfontana@suse.de> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> --- src/fw/paravirt.c | 8 ++++++++ 1 file changed, 8 insertions(+)