Message ID | 20250303091908.38846-1-roger.pau@citrix.com |
---|---|
State | New |
Series | [v2] x86/msr: expose MSR_FAM10H_MMIO_CONF_BASE on AMD |
On 03/03/2025 9:19 am, Roger Pau Monne wrote:
> The MMIO_CONF_BASE reports the base of the MCFG range on AMD systems.
> Currently Linux is unconditionally attempting to read the MSR without a
> safe MSR accessor, and since Xen doesn't allow access to it Linux reports
> the following error:
>
> unchecked MSR access error: RDMSR from 0xc0010058 at rIP: 0xffffffff8101d19f (xen_do_read_msr+0x7f/0xa0)
> Call Trace:
> <TASK>
> ? ex_handler_msr+0x11e/0x150
> ? fixup_exception+0x81/0x300
> ? exc_general_protection+0x138/0x410
> ? asm_exc_general_protection+0x22/0x30
> ? xen_do_read_msr+0x7f/0xa0
> xen_read_msr+0x1e/0x30
> amd_get_mmconfig_range+0x2b/0x80
> quirk_amd_mmconfig_area+0x28/0x100
> ? quirk_system_pci_resources+0x2b/0x150
> pnp_fixup_device+0x39/0x50
> __pnp_add_device+0xf/0x150
> pnp_add_device+0x3d/0x100
> ? __pfx_pnpacpi_allocated_resource+0x10/0x10
> ? __pfx_pnpacpi_allocated_resource+0x10/0x10
> ? acpi_walk_resources+0xbb/0xd0
> pnpacpi_add_device_handler+0x1f9/0x280
> acpi_ns_get_device_callback+0x104/0x1c0
> ? _raw_spin_unlock_irqrestore+0x18/0x20
> ? down_timeout+0x3a/0x60
> ? _raw_spin_lock_irqsave+0x14/0x40
> acpi_ns_walk_namespace+0x1d0/0x260
> ? _raw_spin_unlock_irqrestore+0x18/0x20
> ? __pfx_acpi_ns_get_device_callback+0x10/0x10
> acpi_get_devices+0x8a/0xb0
> ? __pfx_pnpacpi_add_device_handler+0x10/0x10
> ? __pfx_pnpacpi_init+0x10/0x10
> pnpacpi_init+0x50/0x80
> do_one_initcall+0x46/0x2e0
> kernel_init_freeable+0x1da/0x2f0
> ? __pfx_kernel_init+0x10/0x10
> kernel_init+0x16/0x1b0
> ret_from_fork+0x30/0x50
> ? __pfx_kernel_init+0x10/0x10
> ret_from_fork_asm+0x1b/0x30
> </TASK>
>
> Such access is conditional to the presence of a device with PnP ID
> "PNP0c01", which triggers the execution of the quirk_amd_mmconfig_area()
> function. Note that prior to commit 3fac3734c43a MSR accesses when running
> as a PV guest would always use the safe variant, and thus silently handle
> the #GP.
>
> Fix by allowing access to the MSR on AMD systems, returning 0 for
> unprivileged domains (MMIO configuration space disabled), and the native
> value for the hardware domain.
>
> The non hardware domain logic will need to be adjusted if in the future we
> expose an MCFG region to such domains.
>
> Write attempts to the MSR will still result in #GP for all domain types.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Changes since v2:
> - Expand commit message to note which device triggers the MSR read.
> ---
> xen/arch/x86/msr.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
> index 1550fd9ec9f3..c1e616a3a757 100644
> --- a/xen/arch/x86/msr.c
> +++ b/xen/arch/x86/msr.c
> @@ -318,6 +318,21 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
>          *val = 0;
>          break;
>
> +    case MSR_FAM10H_MMIO_CONF_BASE:
> +        if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
> +            goto gp_fault;
> +
> +        /*
> +         * Report MMIO configuration space is disabled unconditionally for
> +         * domUs, as the emulated chipset doesn't support ECAM. For dom0
> +         * return the hardware value.
> +         */
> +        *val = 0;
> +        if ( is_hardware_domain(d) && rdmsr_safe(msr, *val) )
> +            goto gp_fault;
> +
> +        break;

It doesn't matter right now, but reporting MMCFG disable is likely to
interfere with Q35 support when we do present such a range.

For PVH dom0, do we guarantee this range is identity mapped?  Or at
least, doesn't have plain RAM in?

~Andrew
On Mon, Mar 03, 2025 at 01:41:15PM +0000, Andrew Cooper wrote:
> On 03/03/2025 9:19 am, Roger Pau Monne wrote:
> > The MMIO_CONF_BASE reports the base of the MCFG range on AMD systems.
> > Currently Linux is unconditionally attempting to read the MSR without a
> > safe MSR accessor, and since Xen doesn't allow access to it Linux reports
> > the following error:
> >
> > unchecked MSR access error: RDMSR from 0xc0010058 at rIP: 0xffffffff8101d19f (xen_do_read_msr+0x7f/0xa0)
> > Call Trace:
> > <TASK>
> > ? ex_handler_msr+0x11e/0x150
> > ? fixup_exception+0x81/0x300
> > ? exc_general_protection+0x138/0x410
> > ? asm_exc_general_protection+0x22/0x30
> > ? xen_do_read_msr+0x7f/0xa0
> > xen_read_msr+0x1e/0x30
> > amd_get_mmconfig_range+0x2b/0x80
> > quirk_amd_mmconfig_area+0x28/0x100
> > ? quirk_system_pci_resources+0x2b/0x150
> > pnp_fixup_device+0x39/0x50
> > __pnp_add_device+0xf/0x150
> > pnp_add_device+0x3d/0x100
> > ? __pfx_pnpacpi_allocated_resource+0x10/0x10
> > ? __pfx_pnpacpi_allocated_resource+0x10/0x10
> > ? acpi_walk_resources+0xbb/0xd0
> > pnpacpi_add_device_handler+0x1f9/0x280
> > acpi_ns_get_device_callback+0x104/0x1c0
> > ? _raw_spin_unlock_irqrestore+0x18/0x20
> > ? down_timeout+0x3a/0x60
> > ? _raw_spin_lock_irqsave+0x14/0x40
> > acpi_ns_walk_namespace+0x1d0/0x260
> > ? _raw_spin_unlock_irqrestore+0x18/0x20
> > ? __pfx_acpi_ns_get_device_callback+0x10/0x10
> > acpi_get_devices+0x8a/0xb0
> > ? __pfx_pnpacpi_add_device_handler+0x10/0x10
> > ? __pfx_pnpacpi_init+0x10/0x10
> > pnpacpi_init+0x50/0x80
> > do_one_initcall+0x46/0x2e0
> > kernel_init_freeable+0x1da/0x2f0
> > ? __pfx_kernel_init+0x10/0x10
> > kernel_init+0x16/0x1b0
> > ret_from_fork+0x30/0x50
> > ? __pfx_kernel_init+0x10/0x10
> > ret_from_fork_asm+0x1b/0x30
> > </TASK>
> >
> > Such access is conditional to the presence of a device with PnP ID
> > "PNP0c01", which triggers the execution of the quirk_amd_mmconfig_area()
> > function. Note that prior to commit 3fac3734c43a MSR accesses when running
> > as a PV guest would always use the safe variant, and thus silently handle
> > the #GP.
> >
> > Fix by allowing access to the MSR on AMD systems, returning 0 for
> > unprivileged domains (MMIO configuration space disabled), and the native
> > value for the hardware domain.
> >
> > The non hardware domain logic will need to be adjusted if in the future we
> > expose an MCFG region to such domains.
> >
> > Write attempts to the MSR will still result in #GP for all domain types.
> >
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> > Changes since v2:
> > - Expand commit message to note which device triggers the MSR read.
> > ---
> > xen/arch/x86/msr.c | 15 +++++++++++++++
> > 1 file changed, 15 insertions(+)
> >
> > diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
> > index 1550fd9ec9f3..c1e616a3a757 100644
> > --- a/xen/arch/x86/msr.c
> > +++ b/xen/arch/x86/msr.c
> > @@ -318,6 +318,21 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
> >          *val = 0;
> >          break;
> >
> > +    case MSR_FAM10H_MMIO_CONF_BASE:
> > +        if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
> > +            goto gp_fault;
> > +
> > +        /*
> > +         * Report MMIO configuration space is disabled unconditionally for
> > +         * domUs, as the emulated chipset doesn't support ECAM. For dom0
> > +         * return the hardware value.
> > +         */
> > +        *val = 0;
> > +        if ( is_hardware_domain(d) && rdmsr_safe(msr, *val) )
> > +            goto gp_fault;
> > +
> > +        break;
>
> It doesn't matter right now, but reporting MMCFG disable is likely to
> interfere with Q35 support when we do present such a range.

Yup, that's why I mention that this will likely need to be adjusted.
However Linux only reads MMIO_CONF_BASE if a PnP device with ID
"PNP0c01" is present, not sure whether that's (or will be the case)
with Q35 support.

> For PVH dom0, do we guarantee this range is identity mapped?  Or at
> least, doesn't have plain RAM in?

Yes, for PVH dom0 ECAM range(s) are always identity mapped.

Thanks, Roger.
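
For readers following the "MMCFG disabled" point in the exchange above, the sketch below shows why a read value of 0 is interpreted that way. It decodes MSR 0xc0010058 using the field layout Linux uses for this register (enable in bit 0, bus range in bits 5:2, base address in bits 47:20). The struct and function names are hypothetical and exist only for illustration; reserved bus range encodings (above 8) are not rejected here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Field layout of MSR 0xc0010058 (MMIO_CONF_BASE). */
#define MMIO_CONF_ENABLE          (1ULL << 0)
#define MMIO_CONF_BUSRANGE_SHIFT  2
#define MMIO_CONF_BUSRANGE_MASK   0xfULL
#define MMIO_CONF_BASE_SHIFT      20
#define MMIO_CONF_BASE_MASK       0xfffffffULL

/* Hypothetical helper type, not taken from Xen or Linux. */
struct mmcfg_range {
    uint64_t base;       /* physical base of the ECAM window */
    uint64_t size;       /* window size in bytes */
    unsigned int buses;  /* number of PCI buses covered */
};

/*
 * Decode a raw MSR value.  Returns false when the enable bit is clear,
 * which is what a guest sees when the MSR reads as 0.
 */
static bool decode_mmio_conf_base(uint64_t msr_val, struct mmcfg_range *out)
{
    unsigned int busrange;

    assert(out != NULL);

    if ( !(msr_val & MMIO_CONF_ENABLE) )
        return false;

    busrange = (msr_val >> MMIO_CONF_BUSRANGE_SHIFT) & MMIO_CONF_BUSRANGE_MASK;

    out->buses = 1u << busrange;
    out->base = msr_val & (MMIO_CONF_BASE_MASK << MMIO_CONF_BASE_SHIFT);
    out->size = (uint64_t)out->buses << 20;  /* ECAM uses 1 MiB per bus */

    return true;
}
```

With this layout, a value of 0 decodes as "no MCFG window", while an illustrative value such as 0xe0000021 decodes as a 256-bus window at 0xe0000000.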
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 1550fd9ec9f3..c1e616a3a757 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -318,6 +318,21 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
         *val = 0;
         break;
 
+    case MSR_FAM10H_MMIO_CONF_BASE:
+        if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
+            goto gp_fault;
+
+        /*
+         * Report MMIO configuration space is disabled unconditionally for
+         * domUs, as the emulated chipset doesn't support ECAM. For dom0
+         * return the hardware value.
+         */
+        *val = 0;
+        if ( is_hardware_domain(d) && rdmsr_safe(msr, *val) )
+            goto gp_fault;
+
+        break;
+
     case MSR_VIRT_SPEC_CTRL:
         if ( !cp->extd.virt_ssbd )
             goto gp_fault;
The MMIO_CONF_BASE reports the base of the MCFG range on AMD systems.
Currently Linux is unconditionally attempting to read the MSR without a
safe MSR accessor, and since Xen doesn't allow access to it Linux reports
the following error:

unchecked MSR access error: RDMSR from 0xc0010058 at rIP: 0xffffffff8101d19f (xen_do_read_msr+0x7f/0xa0)
Call Trace:
<TASK>
? ex_handler_msr+0x11e/0x150
? fixup_exception+0x81/0x300
? exc_general_protection+0x138/0x410
? asm_exc_general_protection+0x22/0x30
? xen_do_read_msr+0x7f/0xa0
xen_read_msr+0x1e/0x30
amd_get_mmconfig_range+0x2b/0x80
quirk_amd_mmconfig_area+0x28/0x100
? quirk_system_pci_resources+0x2b/0x150
pnp_fixup_device+0x39/0x50
__pnp_add_device+0xf/0x150
pnp_add_device+0x3d/0x100
? __pfx_pnpacpi_allocated_resource+0x10/0x10
? __pfx_pnpacpi_allocated_resource+0x10/0x10
? acpi_walk_resources+0xbb/0xd0
pnpacpi_add_device_handler+0x1f9/0x280
acpi_ns_get_device_callback+0x104/0x1c0
? _raw_spin_unlock_irqrestore+0x18/0x20
? down_timeout+0x3a/0x60
? _raw_spin_lock_irqsave+0x14/0x40
acpi_ns_walk_namespace+0x1d0/0x260
? _raw_spin_unlock_irqrestore+0x18/0x20
? __pfx_acpi_ns_get_device_callback+0x10/0x10
acpi_get_devices+0x8a/0xb0
? __pfx_pnpacpi_add_device_handler+0x10/0x10
? __pfx_pnpacpi_init+0x10/0x10
pnpacpi_init+0x50/0x80
do_one_initcall+0x46/0x2e0
kernel_init_freeable+0x1da/0x2f0
? __pfx_kernel_init+0x10/0x10
kernel_init+0x16/0x1b0
ret_from_fork+0x30/0x50
? __pfx_kernel_init+0x10/0x10
ret_from_fork_asm+0x1b/0x30
</TASK>

Such access is conditional to the presence of a device with PnP ID
"PNP0c01", which triggers the execution of the quirk_amd_mmconfig_area()
function. Note that prior to commit 3fac3734c43a MSR accesses when running
as a PV guest would always use the safe variant, and thus silently handle
the #GP.

Fix by allowing access to the MSR on AMD systems, returning 0 for
unprivileged domains (MMIO configuration space disabled), and the native
value for the hardware domain.

The non hardware domain logic will need to be adjusted if in the future we
expose an MCFG region to such domains.

Write attempts to the MSR will still result in #GP for all domain types.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v2:
- Expand commit message to note which device triggers the MSR read.
---
xen/arch/x86/msr.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
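
The access policy described in the commit message can be summarised with a small standalone model. This is only a sketch and not Xen code: the vendor flags, the result enum, `struct domain_model` and both functions are hypothetical names, and the model assumes the hardware read succeeds (the real patch also raises #GP when `rdmsr_safe()` fails for the hardware domain).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical vendor flags, standing in for Xen's X86_VENDOR_* bits. */
enum vendor {
    VENDOR_INTEL = 1 << 0,
    VENDOR_AMD   = 1 << 1,
    VENDOR_HYGON = 1 << 2,
};

enum access_result { ACCESS_OK, ACCESS_GP_FAULT };

/* Hypothetical stand-in for the bits of domain state the policy needs. */
struct domain_model {
    unsigned int vendor;      /* CPU vendor reported in the domain's policy */
    bool is_hardware_domain;  /* true for dom0 */
};

/* Read policy: #GP on non-AMD/Hygon, 0 for domUs, native value for dom0. */
static enum access_result model_rdmsr_mmio_conf_base(
    const struct domain_model *d, uint64_t hw_value, uint64_t *val)
{
    assert(d != NULL && val != NULL);

    if ( !(d->vendor & (VENDOR_AMD | VENDOR_HYGON)) )
        return ACCESS_GP_FAULT;

    /* Assumes the underlying hardware read cannot fault. */
    *val = d->is_hardware_domain ? hw_value : 0;

    return ACCESS_OK;
}

/* Write policy: always #GP, regardless of domain type. */
static enum access_result model_wrmsr_mmio_conf_base(
    const struct domain_model *d, uint64_t val)
{
    (void)d;
    (void)val;

    return ACCESS_GP_FAULT;
}
```

In this model an AMD dom0 reads back the hardware value, an AMD domU reads 0, and any write or any read on a non-AMD/Hygon policy faults, which matches the behaviour the commit message states.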