| Message ID | 20211027082533.1406015-1-andr2000@gmail.com (mailing list archive) |
|---|---|
| State | Superseded |
| Series | xen/arm: fix SBDF calculation for vPCI MMIO handlers |
On Wed, Oct 27, 2021 at 11:25:33AM +0300, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>
> While in vPCI MMIO trap handlers for the guest PCI host bridge it is not
> enough for SBDF translation to simply call VPCI_ECAM_BDF(info->gpa), as
> the base address may not be aligned in a way that the translation
> always works.

I think I've already raised this during review [0]. But this is only a
problem if you change the current value of GUEST_VPCI_ECAM_BASE AFAICT,
as the current value has bits [0,27] clear.

I assume this is a problem for the hardware domain that needs to trap
random base addresses as present on hardware, but that code hasn't been
committed yet.

If that's indeed the case, please expand the commit message to contain
this information.

Thanks, Roger.

[0] https://lore.kernel.org/xen-devel/YWlnc3b0sj4akUWa@MacBook-Air-de-Roger.local/
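As background on why the alignment matters: in ECAM, the BDF of an access is
encoded in bits 12-27 of the offset from the window base, so a mask-based
VPCI_ECAM_BDF() only yields the right BDF from a raw guest address when the
base itself has bits 0-27 clear. A rough sketch of the decoding (standard
ECAM bit layout; the actual macro in Xen may differ):

    /*
     * ECAM offset layout: bus[27:20] dev[19:15] fn[14:12] reg[11:0].
     * Decoding must operate on the offset from the window base; any
     * set bits 0-27 of an unsubtracted base leak into the decoded BDF.
     */
    #define ECAM_BUS(off)  (((off) >> 20) & 0xff)
    #define ECAM_DEV(off)  (((off) >> 15) & 0x1f)
    #define ECAM_FN(off)   (((off) >> 12) & 0x07)

    /* e.g. off = info->gpa - GUEST_VPCI_ECAM_BASE; */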
Hi, Roger!

On 27.10.21 11:59, Roger Pau Monné wrote:
> On Wed, Oct 27, 2021 at 11:25:33AM +0300, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> While in vPCI MMIO trap handlers for the guest PCI host bridge it is not
>> enough for SBDF translation to simply call VPCI_ECAM_BDF(info->gpa), as
>> the base address may not be aligned in a way that the translation
>> always works.
> I think I've already raised this during review [0]. But this is only a
> problem if you change the current value of GUEST_VPCI_ECAM_BASE
> AFAICT, as the current value has bits [0,27] clear.
Exactly, so we were just lucky not to hit this before.
>
> I assume this is a problem for the hardware domain that needs to trap
> random base addresses as present on hardware, but that code hasn't
> been committed yet.
Yes, I am facing this on the real HW when Dom0's access is trapped
and the base is not taken into account. So, I have a patch for the
future upstream which subtracts the relevant base from the gpa,
e.g. either GUEST_VPCI_ECAM_BASE or bridge->cfg->phys_addr.
>
> If that's indeed the case, please expand the commit message to contain
> this information.
I can only mention "the current value of GUEST_VPCI_ECAM_BASE has
bits [0,27] clear" as of now, because Dom0 traps are not yet there.
>
> Thanks, Roger.
>
> [0] https://lore.kernel.org/xen-devel/YWlnc3b0sj4akUWa@MacBook-Air-de-Roger.local/

Thank you,
Oleksandr

P.S. Sorry I failed to mark this patch as a fix for 4.16 and explain
why it is a good candidate for 4.16 inclusion.
On Wed, Oct 27, 2021 at 09:04:39AM +0000, Oleksandr Andrushchenko wrote:
> Hi, Roger!
>
> On 27.10.21 11:59, Roger Pau Monné wrote:
> > I think I've already raised this during review [0]. But this is only a
> > problem if you change the current value of GUEST_VPCI_ECAM_BASE
> > AFAICT, as the current value has bits [0,27] clear.
> Exactly, so we were just lucky not to hit this before.
> >
> > I assume this is a problem for the hardware domain that needs to trap
> > random base addresses as present on hardware, but that code hasn't
> > been committed yet.
> Yes, I am facing this on the real HW when Dom0's access is trapped
> and the base is not taken into account. So, I have a patch for the
> future upstream which subtracts the relevant base from the gpa,
> e.g. either GUEST_VPCI_ECAM_BASE or bridge->cfg->phys_addr.
> >
> > If that's indeed the case, please expand the commit message to contain
> > this information.
> I can only mention "the current value of GUEST_VPCI_ECAM_BASE has
> bits [0,27] clear" as of now, because Dom0 traps are not yet there.

Indeed, I would rather mention both cases, ie:

"This change is not strictly required given the current value of
GUEST_VPCI_ECAM_BASE which has bits 0 to 27 clear, but could cause
issues if such value is changed, or when handlers for dom0 ECAM
regions are added as those will be mapped over existing hardware
regions that could use non-aligned base addresses."

Regards, Roger.
On 27.10.21 12:23, Roger Pau Monné wrote:
> On Wed, Oct 27, 2021 at 09:04:39AM +0000, Oleksandr Andrushchenko wrote:
>> I can only mention "the current value of GUEST_VPCI_ECAM_BASE has
>> bits [0,27] clear" as of now, because Dom0 traps are not yet there.
> Indeed, I would rather mention both cases, ie:
>
> "This change is not strictly required given the current value of
> GUEST_VPCI_ECAM_BASE which has bits 0 to 27 clear, but could cause
> issues if such value is changed, or when handlers for dom0 ECAM
> regions are added as those will be mapped over existing hardware
> regions that could use non-aligned base addresses."
Sounds good, will add.
>
> Regards, Roger.

Thank you,
Oleksandr
Oleksandr Andrushchenko writes ("Re: [PATCH] xen/arm: fix SBDF calculation for vPCI MMIO handlers"): > P.S. Sorry I failed to mark this patch as a fix for 4.16 and explain why it is > a good candidate for 4.16 inclusion Hello :-). Um, can you explain what the practical impact is of not taking this patch for 4.16 ? As I understand it vpci for ARM is non-functional in 4.16 and this is not expected to change ? So there would be no benefit to users, and taking the patch would add small but nonzero risk ? I think according to the freeze policy I set this can go in if the maintainers feel it is a "straightforward bugfix", but provided it goes in by the end of this coming Friday. After that it will need a release-ack and right now, unless I am mistaken (which may well be the case) it can just as well wait ? Thanks, Ian.
Hi Oleksandr,

On 27/10/2021 09:25, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>
> While in vPCI MMIO trap handlers for the guest PCI host bridge it is not
> enough for SBDF translation to simply call VPCI_ECAM_BDF(info->gpa), as
> the base address may not be aligned in a way that the translation
> always works. If not adjusted with respect to the base address it may
> not be able to properly convert SBDF and crashes:
>
> (XEN) vpci_mmio_read 0000:65:1a.0 reg 8bc gpa e65d08bc

I can't find a printk() that may output this message. Where does this
come from?

Anyway, IIUC the guest physical address is 0xe65d08bc which, if I am
not mistaken, doesn't belong to the range advertised for
GUEST_VPCI_ECAM.

IMHO, the stack trace should come from upstream Xen or needs some
information to explain how this was reproduced.

> (XEN) Data Abort Trap. Syndrome=0x6
> (XEN) Walking Hypervisor VA 0x467a28bc on CPU0 via TTBR 0x00000000481d5000

I can understand that if we don't subtract GUEST_VPCI_ECAM, we would
(in theory) not get the correct BDF. But... I don't understand how this
would result in a data abort in the hypervisor.

In fact, I think the vPCI code should be resilient enough to not crash
if we pass the wrong BDF.

When there is a data abort in Xen, you should get a stack trace from
where this comes from. Can you paste it here?

> Fix this by adjusting the gpa with respect to the host bridge base address
> in a way as it is done for x86.
>
> Fixes: d59168dc05a5 ("xen/arm: Enable the existing x86 virtual PCI support for ARM")
>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> ---
>  xen/arch/arm/vpci.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
> index 8f40a0dec6d2..23f45386f4b3 100644
> --- a/xen/arch/arm/vpci.c
> +++ b/xen/arch/arm/vpci.c
> @@ -24,7 +24,7 @@ static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>      unsigned long data;
>
>      /* We ignore segment part and always handle segment 0 */
> -    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa);
> +    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE);

Looking at the rest of the series, it seems that
1) the issue is latent as the bits 0-27 are clear
2) this will need to be modified to take into account dom0.

So I would prefer if the base address is passed differently (maybe in
priv?) from the start. This will avoid extra modification that you
already plan to have in a follow-up series.

>      if ( vpci_ecam_read(sbdf, ECAM_REG_OFFSET(info->gpa),
>                          1U << info->dabt.size, &data) )
> @@ -44,7 +44,7 @@ static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
>      pci_sbdf_t sbdf;
>
>      /* We ignore segment part and always handle segment 0 */
> -    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa);
> +    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE);
>
>      return vpci_ecam_write(sbdf, ECAM_REG_OFFSET(info->gpa),
>                             1U << info->dabt.size, r);

Cheers,
Hi, Julien!

On 27.10.21 20:35, Julien Grall wrote:
> I can't find a printk() that may output this message. Where does this
> come from?
That was a debug print. I shouldn't have used that in the patch
description, but probably after "---" to better explain what's
happening.
> Anyway, IIUC the guest physical address is 0xe65d08bc which, if I am
> not mistaken, doesn't belong to the range advertised for
> GUEST_VPCI_ECAM.
This is from dom0 I am working on now.
> I can understand that if we don't subtract GUEST_VPCI_ECAM, we would
> (in theory) not get the correct BDF. But... I don't understand how
> this would result in a data abort in the hypervisor.
>
> In fact, I think the vPCI code should be resilient enough to not crash
> if we pass the wrong BDF.
Well, there is no (?) easy way to validate SBDF. And this could be a
problem if we have a misbehaving guest which may force Xen to access
the memory beyond that of the PCI host bridge.
> When there is a data abort in Xen, you should get a stack trace from
> where this comes from. Can you paste it here?

(XEN) Data Abort Trap. Syndrome=0x6
(XEN) Walking Hypervisor VA 0x467a28bc on CPU0 via TTBR 0x00000000481d5000
(XEN) 0TH[0x0] = 0x00000000481d4f7f
(XEN) 1ST[0x1] = 0x00000000481d2f7f
(XEN) 2ND[0x33] = 0x0000000000000000
(XEN) CPU0: Unexpected Trap: Data Abort
(XEN) ----[ Xen-4.16-unstable arm64 debug=y Not tainted ]----
(XEN) CPU: 0
(XEN) PC: 000000000026d3d4 pci_generic_config_read+0x88/0x9c
(XEN) LR: 000000000026d36c
(XEN) SP: 000080007ff97c00
(XEN) CPSR: 0000000060400249 MODE:64-bit EL2h (Hypervisor, handler)
(XEN) X0: 00000000467a28bc X1: 00000000065d08bc X2: 00000000000008bc
(XEN) X3: 000000000000000c X4: 000080007ffc6fd0 X5: 0000000000000000
(XEN) X6: 0000000000000014 X7: ffff800011a58000 X8: ffff0000225a0380
(XEN) X9: 0000000000000000 X10: 0101010101010101 X11: 0000000000000028
(XEN) X12: 0101010101010101 X13: 0000000000000020 X14: ffffffffffffffff
(XEN) X15: 0000000000000001 X16: ffff800010da6708 X17: 0000000000000020
(XEN) X18: 0000000000000002 X19: 0000000000000004 X20: 000080007ff97c5c
(XEN) X21: 00000000000008bc X22: 00000000000008bc X23: 0000000000000004
(XEN) X24: 0000000000000000 X25: 00000000000008bc X26: 00000000000065d0
(XEN) X27: 000080007ffb9010 X28: 0000000000000000 FP: 000080007ff97c00
(XEN)
(XEN) VTCR_EL2: 00000000800a3558
(XEN) VTTBR_EL2: 00010000bffba000
(XEN)
(XEN) SCTLR_EL2: 0000000030cd183d
(XEN) HCR_EL2: 00000000807c663f
(XEN) TTBR0_EL2: 00000000481d5000
(XEN)
(XEN) ESR_EL2: 0000000096000006
(XEN) HPFAR_EL2: 0000000000e65d00
(XEN) FAR_EL2: 00000000467a28bc
(XEN)
[snip]
(XEN) Xen call trace:
(XEN)    [<000000000026d3d4>] pci_generic_config_read+0x88/0x9c (PC)
(XEN)    [<000000000026d36c>] pci_generic_config_read+0x20/0x9c (LR)
(XEN)    [<000000000026d2c8>] pci-access.c#pci_config_read+0x60/0x84
(XEN)    [<000000000026d4a8>] pci_conf_read32+0x10/0x18
(XEN)    [<000000000024dcf8>] vpci.c#vpci_read_hw+0x48/0xb8
(XEN)    [<000000000024e3e4>] vpci_read+0xac/0x24c
(XEN)    [<000000000024e934>] vpci_ecam_read+0x78/0xa8
(XEN)    [<000000000026e368>] vpci.c#vpci_mmio_read+0x44/0x7c
(XEN)    [<0000000000275054>] try_handle_mmio+0x1ec/0x264
(XEN)    [<000000000027ea50>] traps.c#do_trap_stage2_abort_guest+0x18c/0x2d8
(XEN)    [<000000000027f088>] do_trap_guest_sync+0xf0/0x618
(XEN)    [<0000000000269c58>] entry.o#guest_sync_slowpath+0xa4/0xd4
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) CPU0: Unexpected Trap: Data Abort
(XEN) ****************************************

> Looking at the rest of the series, it seems that
> 1) the issue is latent as the bits 0-27 are clear
> 2) this will need to be modified to take into account dom0.
>
> So I would prefer if the base address is passed differently (maybe in
> priv?) from the start. This will avoid extra modification that you
> already plan to have in a follow-up series.
I was thinking about the same, but the future code will use priv for
other purposes:

static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
                          register_t *r, void *p)
{
    struct pci_host_bridge *bridge = p;
    pci_sbdf_t sbdf;
    /* data is needed to prevent a pointer cast on 32bit */
    unsigned long data;

    if ( bridge )
    {
        sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - bridge->cfg->phys_addr);
        sbdf.seg = bridge->segment;
    }
    else
        sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE);

Thank you,
Oleksandr
On 28/10/2021 13:09, Oleksandr Andrushchenko wrote:
> Hi, Julien!

Hello,

> On 27.10.21 20:35, Julien Grall wrote:
>> In fact, I think the vPCI code should be resilient enough to not
>> crash if we pass the wrong BDF.
> Well, there is no (?) easy way to validate SBDF.

AFAICT pci_ecam_map_bus() is already doing some validation for the bus
number. So...

> And this could be a problem if we have a misbehaving guest which may
> force Xen to access the memory beyond that of the PCI host bridge.
>
>> When there is a data abort in Xen, you should get a stack trace from
>> where this comes from. Can you paste it here?
> (XEN) Data Abort Trap. Syndrome=0x6
> (XEN) Walking Hypervisor VA 0x467a28bc on CPU0 via TTBR 0x00000000481d5000
> (XEN) 0TH[0x0] = 0x00000000481d4f7f
> (XEN) 1ST[0x1] = 0x00000000481d2f7f
> (XEN) 2ND[0x33] = 0x0000000000000000
> (XEN) CPU0: Unexpected Trap: Data Abort

... I am getting quite confused why this is crashing. Are we validating
the access correctly?

> [snip]
>
>> So I would prefer if the base address is passed differently (maybe in
>> priv?) from the start. This will avoid extra modification that you
>> already plan to have in a follow-up series.
> I was thinking about the same, but the future code will use priv for
> other purposes:
>
> static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>                           register_t *r, void *p)
> {
>     struct pci_host_bridge *bridge = p;
>     pci_sbdf_t sbdf;
>     /* data is needed to prevent a pointer cast on 32bit */
>     unsigned long data;
>
>     if ( bridge )
>     {
>         sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - bridge->cfg->phys_addr);
>         sbdf.seg = bridge->segment;
>     }
>     else
>         sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE);

Is it the only place you are going to use bridge? If so, then I think
we can simply have a structure that would contain phys_addr and
segment.

This would be included in the bridge for dom0 and for guests this
could be a static global variable for now.

Cheers,
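A minimal sketch of the structure suggested here; the two fields come
from the thread, while the type name and wiring are assumptions:

    /* Hypothetical type holding just what the MMIO handlers need. */
    struct vpci_ecam_window {
        paddr_t phys_addr;    /* base of the trapped ECAM region */
        uint16_t segment;     /* PCI segment the window belongs to */
    };

    /*
     * dom0: one instance embedded per host bridge; guests: a single
     * static instance, e.g. { GUEST_VPCI_ECAM_BASE, 0 }, passed as
     * the priv pointer when registering the MMIO handler.
     */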
On Thu, Oct 28, 2021 at 12:09:23PM +0000, Oleksandr Andrushchenko wrote:
> On 27.10.21 20:35, Julien Grall wrote:
> > In fact, I think the vPCI code should be resilient enough to not
> > crash if we pass the wrong BDF.
> Well, there is no (?) easy way to validate SBDF. And this could be a
> problem if we have a misbehaving guest which may force Xen to access
> the memory beyond that of the PCI host bridge.

How could that be? The ECAM region exposed to the guest should be the
same as the physical one for dom0?

And for domUs you really need to fix vpci_{read,write} to not
passthrough accesses not explicitly handled.

> > When there is a data abort in Xen, you should get a stack trace from
> > where this comes from. Can you paste it here?
> [snip]

Are you exposing an ECAM region to the guest bigger than the underlying
one, and that's why you get crashes? (because you get out of the
hardware range)

I would assume physical accesses to the ECAM area reported by the
hardware don't trigger traps?

Roger.
On 28.10.21 16:22, Julien Grall wrote:
> On 28/10/2021 13:09, Oleksandr Andrushchenko wrote:
>> Well, there is no (?) easy way to validate SBDF.
>
> AFAICT pci_ecam_map_bus() is already doing some validation for the bus
> number. So...
what it does is not enough as...

    if ( busn < cfg->busn_start || busn > cfg->busn_end )
        return NULL;

    busn -= cfg->busn_start;
    base = cfg->win + (busn << ops->bus_shift);

    return base + (PCI_DEVFN2(sbdf.bdf) << devfn_shift) + where;

this can still overrun.

> ... I am getting quite confused why this is crashing. Are we
> validating the access correctly?
See above. If provided with a big enough SBDF we can end up getting out
of the window.

> Is it the only place you are going to use bridge? If so, then I think
> we can simply have a structure that would contain phys_addr and
> segment.
>
> This would be included in the bridge for dom0 and for guests this
> could be a static global variable for now.
Hm. I don't think a global is any better than using
info->gpa - GUEST_VPCI_ECAM_BASE. But I am fine with the structure:
please let me know your preference, so I can have an acceptable fix.

Thank you,
Oleksandr
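For reference, one shape the extra validation could take inside
pci_ecam_map_bus(); a sketch only, assuming struct pci_config_window
exposes the size of the mapped window as cfg->size:

    /*
     * Sketch: reject any access whose final address would fall outside
     * the mapped config window, rather than relying on the bus-number
     * check alone.
     */
    void __iomem *addr;

    if ( busn < cfg->busn_start || busn > cfg->busn_end )
        return NULL;

    busn -= cfg->busn_start;
    addr = cfg->win + (busn << ops->bus_shift) +
           (PCI_DEVFN2(sbdf.bdf) << devfn_shift) + where;

    if ( addr >= cfg->win + cfg->size )
        return NULL;

    return addr;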
On 28.10.21 16:36, Roger Pau Monné wrote:
> On Thu, Oct 28, 2021 at 12:09:23PM +0000, Oleksandr Andrushchenko wrote:
>> Well, there is no (?) easy way to validate SBDF. And this could be a
>> problem if we have a misbehaving guest which may force Xen to access
>> the memory beyond that of the PCI host bridge.
> How could that be? The ECAM region exposed to the guest should be the
> same as the physical one for dom0?
Ok, I have a Designware PCI host which has 2 ECAM regions (I am
starting to implement the driver for it, so I can be wrong here):
- Root Complex ECAM area ("dbi"), it is something like 0x3000 bytes long
- "Client" ECAM area ("config")
So from Dom0 POV we have 2 ECAM regions and for the guest we always
emulate a single big region:

/*
 * 256 MB is reserved for VPCI configuration space based on calculation
 * 256 buses x 32 devices x 8 functions x 4 KB = 256 MB
 */
#define GUEST_VPCI_ECAM_BASE    xen_mk_ullong(0x10000000)
#define GUEST_VPCI_ECAM_SIZE    xen_mk_ullong(0x10000000)

So, we have the base address and size of the emulated ECAM space not
connected to the real host bridge.

> And for domUs you really need to fix vpci_{read,write} to not
> passthrough accesses not explicitly handled.
Do you mean that we need to validate SBDFs there? This can be tricky if
we have a use-case when a PCI device being passed through is not put at
0000:00:0.0, but requested to be, for example, 0000:0d:0.0. So, we need
to go over the list of virtual devices and see if the SBDF the guest is
trying to access is a valid SBDF. Is this what you mean?

> Are you exposing an ECAM region to the guest bigger than the
> underlying one, and that's why you get crashes? (because you get out
> of the hardware range)
Please see above.

> I would assume physical accesses to the ECAM area reported by the
> hardware don't trigger traps?
No.

Thank you,
Oleksandr
Hi,

On 28/10/2021 15:16, Oleksandr Andrushchenko wrote:
> On 28.10.21 16:22, Julien Grall wrote:
>> AFAICT pci_ecam_map_bus() is already doing some validation for the
>> bus number. So...
> what it does is not enough as...
>     if ( busn < cfg->busn_start || busn > cfg->busn_end )
>         return NULL;
>
>     busn -= cfg->busn_start;
>     base = cfg->win + (busn << ops->bus_shift);
>
>     return base + (PCI_DEVFN2(sbdf.bdf) << devfn_shift) + where;
> this can still overrun.

Thanks, I guessed this was not enough... What I am trying to understand
is *why* this is not enough *and* whether we need to add more
validation.

>> ... I am getting quite confused why this is crashing. Are we
>> validating the access correctly?
> See above. If provided with a big enough SBDF we can end up getting
> out of the window.

Shouldn't we validate that we are still in the window?

>> Is it the only place you are going to use bridge? If so, then I think
>> we can simply have a structure that would contain phys_addr and
>> segment.
>>
>> This would be included in the bridge for dom0 and for guests this
>> could be a static global variable for now.
> Hm. I don't think a global is any better than using
> info->gpa - GUEST_VPCI_ECAM_BASE. But I am fine with the structure:
> please let me know your preference, so I can have an acceptable fix.

The difference is you don't duplicate the same check in two places.
Alternatively, I would be happy to consider a helper that is used in
both places.

Cheers,
On 28.10.21 17:28, Julien Grall wrote:
> Thanks, I guessed this was not enough... What I am trying to
> understand is *why* this is not enough *and* whether we need to add
> more validation.
>
>> See above. If provided with a big enough SBDF we can end up getting
>> out of the window.
>
> Shouldn't we validate that we are still in the window?
Yes, this will work if the check is implemented to respect the bridge's
config window.

> The difference is you don't duplicate the same check in two places.
> Alternatively, I would be happy to consider a helper that is used in
> both places.
But then we duplicate data inside "struct pci_host_bridge *bridge"
because we need to allocate (have it embedded) such a structure per
bridge, e.g. we have

    bridge->segment
    bridge->cfg->phys_addr

and then we add a structure which contains the same:

    bridge->new_struct.segment == bridge->segment
    bridge->new_struct.phys_addr == bridge->cfg->phys_addr

This is so that bridge->new_struct can be passed as a private to
register_mmio_handler. And the above seems to be not so bright compared
to just passing bridge as private and using
info->gpa - GUEST_VPCI_ECAM_BASE... So, I would stay with the simpler

    if ( bridge )
    {
        sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - bridge->cfg->phys_addr);
        sbdf.seg = bridge->segment;
    }
    else
        sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE);

Thank you,
Oleksandr
Hi, On 28/10/2021 16:27, Oleksandr Andrushchenko wrote: > bridge as private and using info->gpa - GUEST_VPCI_ECAM_BASE... > So, I would stay with simpler > > if ( bridge ) > { > sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - bridge->cfg->phys_addr); > sbdf.seg = bridge->segment; > } > else > sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE); I am fine with that so long as this is part of a helper (maybe vpci_sbdf_from_addr()) rather than duplicated. Cheers,
Julien Grall writes ("Re: [PATCH] xen/arm: fix SBDF calculation for vPCI MMIO handlers"): > On 28/10/2021 16:27, Oleksandr Andrushchenko wrote: > > bridge as private and using info->gpa - GUEST_VPCI_ECAM_BASE... > > So, I would stay with simpler > > > > if ( bridge ) > > { > > sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - bridge->cfg->phys_addr); > > sbdf.seg = bridge->segment; > > } > > else > > sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE); > > I am fine with that so long this is part of an helper (maybe > vpci_sbdf_from_addr()) rather than duplicated. There are a number of patches that I'm getting CC'd on related to ARM and vpci (according to the Subject). Are these targeted for 4.16 ? Most of them don't have 4.16 Subject tags. Am I getting these in my capacity as a REST maintainer ? I am finding it difficult to see the wood for the trees. It would be really helpful if these vpci fixes were collected together into a series. Thanks, Ian. (writing as Xen 4.16 Release Manager)
On Thu, Oct 28, 2021 at 02:23:34PM +0000, Oleksandr Andrushchenko wrote: > > > On 28.10.21 16:36, Roger Pau Monné wrote: > > On Thu, Oct 28, 2021 at 12:09:23PM +0000, Oleksandr Andrushchenko wrote: > >> Hi, Julien! > >> > >> On 27.10.21 20:35, Julien Grall wrote: > >>> Hi Oleksandr, > >>> > >>> On 27/10/2021 09:25, Oleksandr Andrushchenko wrote: > >>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> > >>>> > >>>> While in vPCI MMIO trap handlers for the guest PCI host bridge it is not > >>>> enough for SBDF translation to simply call VPCI_ECAM_BDF(info->gpa) as > >>>> the base address may not be aligned in the way that the translation > >>>> always work. If not adjusted with respect to the base address it may not be > >>>> able to properly convert SBDF and crashes: > >>>> > >>>> (XEN) vpci_mmio_read 0000:65:1a.0 reg 8bc gpa e65d08bc > >>> I can't find a printk() that may output this message. Where does this comes from? > >> That was a debug print. I shouldn't have used that in the patch description, but > >> probably after "---" to better explain what's happening > >>> Anyway, IIUC the guest physical address is 0xe65d08bc which, if I am not mistaken, doesn't belong to the range advertised for GUEST_VPCI_ECAM. > >> This is from dom0 I am working on now. > >>> IMHO, the stack trace should come from usptream Xen or need some information to explain how this was reproduced. > >>> > >>>> (XEN) Data Abort Trap. Syndrome=0x6 > >>>> (XEN) Walking Hypervisor VA 0x467a28bc on CPU0 via TTBR 0x00000000481d5000 > >>> I can understnad that if we don't substract GUEST_VPCI_ECAM, we would (in theory) not get the correct BDF. But... I don't understand how this would result to a data abort in the hypervisor. > >>> > >>> In fact, I think the vPCI code should be resilient enough to not crash if we pass the wrong BDF. > >> Well, there is no (?) easy way to validate SBDF. And this could be a problem if we have a misbehaving > >> guest which may force Xen to access the memory beyond that of PCI host bridge > > How could that be? The ECAM region exposed to the guest you should be > > the same as the physical one for dom0? > Ok, I have a Designware PCI hist which has 2 ECAM regions (I am starting to > implement the driver for it, so I can be wrong here): > - Root Complex ECAM area ("dbi"), it is something like 0x3000 bytes long > - "Client" ECAM area ("config") > So from Dom0 POV we have 2 ECAM regions and for the guest > we always emulate a single big region: You need support for multiple ECAM regions. That's how we do it on x86 PVH dom0. See register_vpci_mmcfg_handler and related machinery. > /* > * 256 MB is reserved for VPCI configuration space based on calculation > * 256 buses x 32 devices x 8 functions x 4 KB = 256 MB > */ > #define GUEST_VPCI_ECAM_BASE xen_mk_ullong(0x10000000) > #define GUEST_VPCI_ECAM_SIZE xen_mk_ullong(0x10000000) > > So, we have the base address and size of the emulated ECAM space > not connected to the real host bridge > > > > And for domUs you really need to fix vpci_{read,write} to not > > passthrough accesses not explicitly handled. > Do you mean that we need to validate SBDFs there? > This can be tricky if we have a use-case when a PCI device being > passed through if not put at 0000:00:0.0, but requested to be, for > example, 0000:0d:0.0. So, we need to go over the list of virtual > devices and see if SBDF the guest is trying to access is a valid SBDF. > Is this what you mean? No, you need to prevent accesses to registers not explicitly handled by vpci. 
Ie: do not forward unhandled accesses to vpci_{read,write}_hw(). Regards, Roger.
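A sketch of the policy Roger describes might look as follows. vpci_find_register() and the struct field layout are assumptions standing in for whatever lookup the real code would grow; is_hardware_domain() and vpci_read_hw() are the existing primitives named in the thread:

/* Sketch only: service registered ranges, never forward the rest for domUs. */
static uint32_t vpci_read_sketch(const struct pci_dev *pdev, unsigned int reg,
                                 unsigned int size)
{
    /* Hypothetical helper returning the handler covering [reg, reg + size). */
    const struct vpci_register *r = vpci_find_register(pdev->vpci, reg, size);

    if ( r )
        return r->read(pdev, reg, r->private);

    /* Only the hardware domain keeps raw access to unhandled registers. */
    if ( is_hardware_domain(pdev->domain) )
        return vpci_read_hw(pdev->sbdf, reg, size);

    return ~(uint32_t)0; /* domU reads of unhandled registers see all ones */
}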
On 28.10.21 19:03, Roger Pau Monné wrote: > On Thu, Oct 28, 2021 at 02:23:34PM +0000, Oleksandr Andrushchenko wrote: >> >> On 28.10.21 16:36, Roger Pau Monné wrote: >>> On Thu, Oct 28, 2021 at 12:09:23PM +0000, Oleksandr Andrushchenko wrote: >>>> Hi, Julien! >>>> >>>> On 27.10.21 20:35, Julien Grall wrote: >>>>> Hi Oleksandr, >>>>> >>>>> On 27/10/2021 09:25, Oleksandr Andrushchenko wrote: >>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> >>>>>> >>>>>> While in vPCI MMIO trap handlers for the guest PCI host bridge it is not >>>>>> enough for SBDF translation to simply call VPCI_ECAM_BDF(info->gpa) as >>>>>> the base address may not be aligned in the way that the translation >>>>>> always work. If not adjusted with respect to the base address it may not be >>>>>> able to properly convert SBDF and crashes: >>>>>> >>>>>> (XEN) vpci_mmio_read 0000:65:1a.0 reg 8bc gpa e65d08bc >>>>> I can't find a printk() that may output this message. Where does this comes from? >>>> That was a debug print. I shouldn't have used that in the patch description, but >>>> probably after "---" to better explain what's happening >>>>> Anyway, IIUC the guest physical address is 0xe65d08bc which, if I am not mistaken, doesn't belong to the range advertised for GUEST_VPCI_ECAM. >>>> This is from dom0 I am working on now. >>>>> IMHO, the stack trace should come from usptream Xen or need some information to explain how this was reproduced. >>>>> >>>>>> (XEN) Data Abort Trap. Syndrome=0x6 >>>>>> (XEN) Walking Hypervisor VA 0x467a28bc on CPU0 via TTBR 0x00000000481d5000 >>>>> I can understnad that if we don't substract GUEST_VPCI_ECAM, we would (in theory) not get the correct BDF. But... I don't understand how this would result to a data abort in the hypervisor. >>>>> >>>>> In fact, I think the vPCI code should be resilient enough to not crash if we pass the wrong BDF. >>>> Well, there is no (?) easy way to validate SBDF. And this could be a problem if we have a misbehaving >>>> guest which may force Xen to access the memory beyond that of PCI host bridge >>> How could that be? The ECAM region exposed to the guest you should be >>> the same as the physical one for dom0? >> Ok, I have a Designware PCI hist which has 2 ECAM regions (I am starting to >> implement the driver for it, so I can be wrong here): >> - Root Complex ECAM area ("dbi"), it is something like 0x3000 bytes long >> - "Client" ECAM area ("config") >> So from Dom0 POV we have 2 ECAM regions and for the guest >> we always emulate a single big region: > You need support for multiple ECAM regions. That's how we do it on x86 > PVH dom0. See register_vpci_mmcfg_handler and related machinery. Is it common for a PCI host bridge to have multiple ECAM regions? Currently on Arm we were about to support "pci-host-ecam-generic" [1], e.g. generic ECAM host bridge which normally (?) has a single ECAM region [2]. But the host bridge I want to support has multiple, so strictly speaking it is not the one that we implement. Arm folks, do we want this generalization at this moment to align with x86 with this respect? We can live with the current approach and when I have my driver implemented I can send patches to make that generalization. 
> >> /* >> * 256 MB is reserved for VPCI configuration space based on calculation >> * 256 buses x 32 devices x 8 functions x 4 KB = 256 MB >> */ >> #define GUEST_VPCI_ECAM_BASE xen_mk_ullong(0x10000000) >> #define GUEST_VPCI_ECAM_SIZE xen_mk_ullong(0x10000000) >> >> So, we have the base address and size of the emulated ECAM space >> not connected to the real host bridge >>> And for domUs you really need to fix vpci_{read,write} to not >>> passthrough accesses not explicitly handled. >> Do you mean that we need to validate SBDFs there? >> This can be tricky if we have a use-case when a PCI device being >> passed through if not put at 0000:00:0.0, but requested to be, for >> example, 0000:0d:0.0. So, we need to go over the list of virtual >> devices and see if SBDF the guest is trying to access is a valid SBDF. >> Is this what you mean? > No, you need to prevent accesses to registers not explicitly handled > by vpci. Ie: do not forward unhandled accesses to > vpci_{read,wrie}_hw). I see, so those which have no handlers are not passed to the hardware. I need to see how to do that > > Regards, Roger. Thank you, Oleksandr [1] https://elixir.bootlin.com/linux/v5.15-rc7/source/drivers/pci/controller/pci-host-generic.c [2] https://elixir.bootlin.com/linux/v5.15-rc7/source/drivers/pci/controller/pci-host-common.c#L23
On 28.10.21 18:35, Julien Grall wrote: > Hi, > > On 28/10/2021 16:27, Oleksandr Andrushchenko wrote: >> bridge as private and using info->gpa - GUEST_VPCI_ECAM_BASE... >> So, I would stay with simpler >> >> if ( bridge ) >> { >> sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - bridge->cfg->phys_addr); >> sbdf.seg = bridge->segment; >> } >> else >> sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE); > > I am fine with that so long this is part of an helper (maybe vpci_sbdf_from_addr()) rather than duplicated. > Ok, I will implement a helper then: pci_sbdf_t vpci_sbdf_from_gpa(const struct pci_host_bridge *bridge, paddr_t gpa); > Cheers, > Thank you, Oleksandr
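Based on the snippet and the signature above, the agreed helper would presumably end up as something like the following sketch (final naming and placement may of course differ):

static pci_sbdf_t vpci_sbdf_from_gpa(const struct pci_host_bridge *bridge,
                                     paddr_t gpa)
{
    pci_sbdf_t sbdf;

    if ( bridge )
    {
        /* Dom0: adjust by the bridge's physical ECAM base. */
        sbdf.sbdf = VPCI_ECAM_BDF(gpa - bridge->cfg->phys_addr);
        sbdf.seg = bridge->segment;
    }
    else
        /* Guest: a single emulated ECAM window, segment implicitly 0. */
        sbdf.sbdf = VPCI_ECAM_BDF(gpa - GUEST_VPCI_ECAM_BASE);

    return sbdf;
}

Both vpci_mmio_read() and vpci_mmio_write() could then simply do sbdf = vpci_sbdf_from_gpa(bridge, info->gpa);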
On Thu, Oct 28, 2021 at 05:55:25PM +0000, Oleksandr Andrushchenko wrote: > > > On 28.10.21 19:03, Roger Pau Monné wrote: > > On Thu, Oct 28, 2021 at 02:23:34PM +0000, Oleksandr Andrushchenko wrote: > >> > >> On 28.10.21 16:36, Roger Pau Monné wrote: > >>> On Thu, Oct 28, 2021 at 12:09:23PM +0000, Oleksandr Andrushchenko wrote: > >>>> Hi, Julien! > >>>> > >>>> On 27.10.21 20:35, Julien Grall wrote: > >>>>> Hi Oleksandr, > >>>>> > >>>>> On 27/10/2021 09:25, Oleksandr Andrushchenko wrote: > >>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> > >>>>>> > >>>>>> While in vPCI MMIO trap handlers for the guest PCI host bridge it is not > >>>>>> enough for SBDF translation to simply call VPCI_ECAM_BDF(info->gpa) as > >>>>>> the base address may not be aligned in the way that the translation > >>>>>> always work. If not adjusted with respect to the base address it may not be > >>>>>> able to properly convert SBDF and crashes: > >>>>>> > >>>>>> (XEN) vpci_mmio_read 0000:65:1a.0 reg 8bc gpa e65d08bc > >>>>> I can't find a printk() that may output this message. Where does this comes from? > >>>> That was a debug print. I shouldn't have used that in the patch description, but > >>>> probably after "---" to better explain what's happening > >>>>> Anyway, IIUC the guest physical address is 0xe65d08bc which, if I am not mistaken, doesn't belong to the range advertised for GUEST_VPCI_ECAM. > >>>> This is from dom0 I am working on now. > >>>>> IMHO, the stack trace should come from usptream Xen or need some information to explain how this was reproduced. > >>>>> > >>>>>> (XEN) Data Abort Trap. Syndrome=0x6 > >>>>>> (XEN) Walking Hypervisor VA 0x467a28bc on CPU0 via TTBR 0x00000000481d5000 > >>>>> I can understnad that if we don't substract GUEST_VPCI_ECAM, we would (in theory) not get the correct BDF. But... I don't understand how this would result to a data abort in the hypervisor. > >>>>> > >>>>> In fact, I think the vPCI code should be resilient enough to not crash if we pass the wrong BDF. > >>>> Well, there is no (?) easy way to validate SBDF. And this could be a problem if we have a misbehaving > >>>> guest which may force Xen to access the memory beyond that of PCI host bridge > >>> How could that be? The ECAM region exposed to the guest you should be > >>> the same as the physical one for dom0? > >> Ok, I have a Designware PCI hist which has 2 ECAM regions (I am starting to > >> implement the driver for it, so I can be wrong here): > >> - Root Complex ECAM area ("dbi"), it is something like 0x3000 bytes long > >> - "Client" ECAM area ("config") > >> So from Dom0 POV we have 2 ECAM regions and for the guest > >> we always emulate a single big region: > > You need support for multiple ECAM regions. That's how we do it on x86 > > PVH dom0. See register_vpci_mmcfg_handler and related machinery. > Is it common for a PCI host bridge to have multiple ECAM regions? > Currently on Arm we were about to support "pci-host-ecam-generic" [1], > e.g. generic ECAM host bridge which normally (?) has a single ECAM > region [2]. But the host bridge I want to support has multiple, so > strictly speaking it is not the one that we implement. It's possible on x86 to have multiple ECAM regions, whether that means multiple host bridges, or host bridges having multiple ECAM regions is unknown to me. 
It's all reported in the MCFG ACPI table (see PCI Firmware document for the detailed description of MCFG) using the "Configuration Space Base Address Allocation Structure", and there can be multiple of those structures. > Arm folks, do we want this generalization at this moment to align with x86 > with this respect? > > We can live with the current approach and when I have my driver implemented > I can send patches to make that generalization. > > > >> /* > >> * 256 MB is reserved for VPCI configuration space based on calculation > >> * 256 buses x 32 devices x 8 functions x 4 KB = 256 MB > >> */ > >> #define GUEST_VPCI_ECAM_BASE xen_mk_ullong(0x10000000) > >> #define GUEST_VPCI_ECAM_SIZE xen_mk_ullong(0x10000000) > >> > >> So, we have the base address and size of the emulated ECAM space > >> not connected to the real host bridge > >>> And for domUs you really need to fix vpci_{read,write} to not > >>> passthrough accesses not explicitly handled. > >> Do you mean that we need to validate SBDFs there? > >> This can be tricky if we have a use-case when a PCI device being > >> passed through if not put at 0000:00:0.0, but requested to be, for > >> example, 0000:0d:0.0. So, we need to go over the list of virtual > >> devices and see if SBDF the guest is trying to access is a valid SBDF. > >> Is this what you mean? > > No, you need to prevent accesses to registers not explicitly handled > > by vpci. Ie: do not forward unhandled accesses to > > vpci_{read,wrie}_hw). > I see, so those which have no handlers are not passed to the hardware. > I need to see how to do that Indeed. Without fixing that passthrough to domUs is completely unsafe, as you allow domUs full access to registers not explicitly handled by current vPCI code. Regards, Roger.
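For illustration, multiple ECAM windows on Arm could be wired up roughly as below, loosely mirroring what register_vpci_mmcfg_handler() does per MCFG entry on x86. register_mmio_handler() and the vpci_mmio_{read,write} handlers exist today; the nr_ecam/ecam[] fields in struct pci_host_bridge are assumptions made purely for the sake of the sketch:

static const struct mmio_handler_ops vpci_mmio_handler = {
    .read  = vpci_mmio_read,
    .write = vpci_mmio_write,
};

void domain_vpci_register_bridge(struct domain *d,
                                 struct pci_host_bridge *bridge)
{
    unsigned int i;

    /* One trap handler per ECAM window, with the bridge passed as priv. */
    for ( i = 0; i < bridge->nr_ecam; i++ )
        register_mmio_handler(d, &vpci_mmio_handler,
                              bridge->ecam[i].phys_addr,
                              bridge->ecam[i].size, bridge);
}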
Hi Ian, On 28/10/2021 16:54, Ian Jackson wrote: > Julien Grall writes ("Re: [PATCH] xen/arm: fix SBDF calculation for vPCI MMIO handlers"): >> On 28/10/2021 16:27, Oleksandr Andrushchenko wrote: >>> bridge as private and using info->gpa - GUEST_VPCI_ECAM_BASE... >>> So, I would stay with simpler >>> >>> if ( bridge ) >>> { >>> sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - bridge->cfg->phys_addr); >>> sbdf.seg = bridge->segment; >>> } >>> else >>> sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE); >> >> I am fine with that so long this is part of an helper (maybe >> vpci_sbdf_from_addr()) rather than duplicated. > > There are a number of patches that I'm getting CC'd on related to ARM > and vpci (according to the Subject). Are these targeted for 4.16 ? > Most of them don't have 4.16 Subject tags. Oleksandr wants this patch to be included for 4.16 but forgot to tag it properly. Cheers,
On 29.10.21 10:33, Roger Pau Monné wrote: > On Thu, Oct 28, 2021 at 05:55:25PM +0000, Oleksandr Andrushchenko wrote: >> >> On 28.10.21 19:03, Roger Pau Monné wrote: >>> On Thu, Oct 28, 2021 at 02:23:34PM +0000, Oleksandr Andrushchenko wrote: >>>> On 28.10.21 16:36, Roger Pau Monné wrote: >>>>> On Thu, Oct 28, 2021 at 12:09:23PM +0000, Oleksandr Andrushchenko wrote: >>>>>> Hi, Julien! >>>>>> >>>>>> On 27.10.21 20:35, Julien Grall wrote: >>>>>>> Hi Oleksandr, >>>>>>> >>>>>>> On 27/10/2021 09:25, Oleksandr Andrushchenko wrote: >>>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> >>>>>>>> >>>>>>>> While in vPCI MMIO trap handlers for the guest PCI host bridge it is not >>>>>>>> enough for SBDF translation to simply call VPCI_ECAM_BDF(info->gpa) as >>>>>>>> the base address may not be aligned in the way that the translation >>>>>>>> always work. If not adjusted with respect to the base address it may not be >>>>>>>> able to properly convert SBDF and crashes: >>>>>>>> >>>>>>>> (XEN) vpci_mmio_read 0000:65:1a.0 reg 8bc gpa e65d08bc >>>>>>> I can't find a printk() that may output this message. Where does this comes from? >>>>>> That was a debug print. I shouldn't have used that in the patch description, but >>>>>> probably after "---" to better explain what's happening >>>>>>> Anyway, IIUC the guest physical address is 0xe65d08bc which, if I am not mistaken, doesn't belong to the range advertised for GUEST_VPCI_ECAM. >>>>>> This is from dom0 I am working on now. >>>>>>> IMHO, the stack trace should come from usptream Xen or need some information to explain how this was reproduced. >>>>>>> >>>>>>>> (XEN) Data Abort Trap. Syndrome=0x6 >>>>>>>> (XEN) Walking Hypervisor VA 0x467a28bc on CPU0 via TTBR 0x00000000481d5000 >>>>>>> I can understnad that if we don't substract GUEST_VPCI_ECAM, we would (in theory) not get the correct BDF. But... I don't understand how this would result to a data abort in the hypervisor. >>>>>>> >>>>>>> In fact, I think the vPCI code should be resilient enough to not crash if we pass the wrong BDF. >>>>>> Well, there is no (?) easy way to validate SBDF. And this could be a problem if we have a misbehaving >>>>>> guest which may force Xen to access the memory beyond that of PCI host bridge >>>>> How could that be? The ECAM region exposed to the guest you should be >>>>> the same as the physical one for dom0? >>>> Ok, I have a Designware PCI hist which has 2 ECAM regions (I am starting to >>>> implement the driver for it, so I can be wrong here): >>>> - Root Complex ECAM area ("dbi"), it is something like 0x3000 bytes long >>>> - "Client" ECAM area ("config") >>>> So from Dom0 POV we have 2 ECAM regions and for the guest >>>> we always emulate a single big region: >>> You need support for multiple ECAM regions. That's how we do it on x86 >>> PVH dom0. See register_vpci_mmcfg_handler and related machinery. >> Is it common for a PCI host bridge to have multiple ECAM regions? >> Currently on Arm we were about to support "pci-host-ecam-generic" [1], >> e.g. generic ECAM host bridge which normally (?) has a single ECAM >> region [2]. But the host bridge I want to support has multiple, so >> strictly speaking it is not the one that we implement. > It's possible on x86 to have multiple ECAM regions, whether that means > multiple host bridges, or host bridges having multiple ECAM regions is > unknown to me. 
It's all reported in the MCFG ACPI table (see PCI > Firmware document for the detailed description of MCFG) using the > "Configuration Space Base Address Allocation Structure", and there can > be multiple of those structures. As we are currently supporting a generic ECAM host bridge which has a single ECAM region, I think the existing code we have and are about to upstream is ok as is for now. I own a bridge which has 2 ECAM regions, so I will work towards adding its support soon. > >> Arm folks, do we want this generalization at this moment to align with x86 >> with this respect? >> >> We can live with the current approach and when I have my driver implemented >> I can send patches to make that generalization. >>>> /* >>>> * 256 MB is reserved for VPCI configuration space based on calculation >>>> * 256 buses x 32 devices x 8 functions x 4 KB = 256 MB >>>> */ >>>> #define GUEST_VPCI_ECAM_BASE xen_mk_ullong(0x10000000) >>>> #define GUEST_VPCI_ECAM_SIZE xen_mk_ullong(0x10000000) >>>> >>>> So, we have the base address and size of the emulated ECAM space >>>> not connected to the real host bridge >>>>> And for domUs you really need to fix vpci_{read,write} to not >>>>> passthrough accesses not explicitly handled. >>>> Do you mean that we need to validate SBDFs there? >>>> This can be tricky if we have a use-case when a PCI device being >>>> passed through if not put at 0000:00:0.0, but requested to be, for >>>> example, 0000:0d:0.0. So, we need to go over the list of virtual >>>> devices and see if SBDF the guest is trying to access is a valid SBDF. >>>> Is this what you mean? >>> No, you need to prevent accesses to registers not explicitly handled >>> by vpci. Ie: do not forward unhandled accesses to >>> vpci_{read,wrie}_hw). >> I see, so those which have no handlers are not passed to the hardware. >> I need to see how to do that > Indeed. Without fixing that passthrough to domUs is completely unsafe, > as you allow domUs full access to registers not explicitly handled by > current vPCI code. Well, my understanding is: we can let the guest access whatever registers it wants with the following exceptions: - "special" registers we already trap in vPCI, e.g. command, BARs - we must not let the guest go out of the configuration space of a specific PCI device, e.g. prevent it from accessing configuration spaces of other devices. The rest of the accesses seem ok to me, as: - we do not really want to have handlers and emulate all possible registers - we do not want the guest to fail if it accesses a valid register which we do not emulate. > > Regards, Roger. > Thanks, Oleksandr
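The 256 MB figure in the quoted comment follows directly from the ECAM offset layout; a minimal compile-checkable restatement:

#include <assert.h>
#include <stdint.h>

/* ECAM offset layout behind the "256 buses x 32 devices x 8 functions x 4 KB"
 * calculation quoted above. */
#define ECAM_OFFSET(bus, dev, fn, reg) \
    (((uint64_t)(bus) << 20) | ((uint64_t)(dev) << 15) | \
     ((uint64_t)(fn) << 12) | (uint64_t)(reg))

int main(void)
{
    /* The last byte of config space: bus 255, device 31, function 7,
     * register 0xfff sits at offset 0x0fffffff == 256 MB - 1,
     * i.e. exactly GUEST_VPCI_ECAM_SIZE - 1. */
    assert(ECAM_OFFSET(255, 31, 7, 0xfff) == 0x0fffffffULL);
    return 0;
}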
Julien Grall writes ("Re: [PATCH] xen/arm: fix SBDF calculation for vPCI MMIO handlers"): > On 28/10/2021 16:54, Ian Jackson wrote: > > There are a number of patches that I'm getting CC'd on related to ARM > > and vpci (according to the Subject). Are these targeted for 4.16 ? > > Most of them don't have 4.16 Subject tags. > > Oleksandr wants this patch to be included for 4.16 but forgot to tag it > properly. Oh yes. However, 1. I also wrote this: > > I am finding it difficult to see the wood for the trees. > > It would be really helpful if these vpci fixes were collected > > together into a series. Can someone please confirm whether this is the only vpci-related patch that ought to be on my radar for 4.16 ? 2. I have not had a reply to my question on Wednesday in <24953.34635.645112.279110@mariner.uk.xensource.com>: Um, can you explain what the practical impact is of not taking this patch for 4.16 ? As I understand it vpci for ARM is non-functional in 4.16 and this is not expected to change ? So there would be no benefit to users, and taking the patch would add small but nonzero risk ? I need this information to decide whether a release-ack is appropriate. Note that we are in code freeze so all patches, including bugfixes, need my ack. Thanks, Ian.
On Mon, 1 Nov 2021, Ian Jackson wrote: > Julien Grall writes ("Re: [PATCH] xen/arm: fix SBDF calculation for vPCI MMIO handlers"): > > On 28/10/2021 16:54, Ian Jackson wrote: > > > There are a number of patches that I'm getting CC'd on related to ARM > > > and vpci (according to the Subject). Are these targeted for 4.16 ? > > > Most of them don't have 4.16 Subject tags. > > > > Oleksandr wants this patch to be included for 4.16 but forgot to tag it > > properly. > > Oh yes. However, > > 1. I also wrote this: > > > > I am finding it difficult to see the wood for the trees. > > > It would be really helpful if these vpci fixes were collected > > > together into a series. > > Can someone please confirm whether this is the only vpci-related patch > that ought to be on my radar for 4.16 ? > > 2. I have not had a reply to my question on Wednesday in > <24953.34635.645112.279110@mariner.uk.xensource.com>: > > Um, can you explain what the practical impact is of not taking this > patch for 4.16 ? As I understand it vpci for ARM is non-functional in > 4.16 and this is not expected to change ? So there would be no > benefit to users, and taking the patch would add small but nonzero > risk ? > > I need this information to decide whether a release-ack is > appropriate. > > Note that we are in code freeze so all patches, including bugfixes, > need my ack. Hi Ian, This patch [1] is a straightforward 2 lines fix for vpci on ARM. There is no risk for the release as the source file affected only builds when CONFIG_HAS_VPCI is enabled, and it is currently disabled on ARM. At the same time, as we know vpci is not complete in 4.16 anyway, so the counter argument is that we don't need to fix it. Given how trivial the fix is, and that it cannot break the build or runtime, I would take it. Cheers, Stefano [1] https://marc.info/?l=xen-devel&m=163532307715435
On 01.11.21 23:06, Stefano Stabellini wrote: > On Mon, 1 Nov 2021, Ian Jackson wrote: >> Julien Grall writes ("Re: [PATCH] xen/arm: fix SBDF calculation for vPCI MMIO handlers"): >>> On 28/10/2021 16:54, Ian Jackson wrote: >>>> There are a number of patches that I'm getting CC'd on related to ARM >>>> and vpci (according to the Subject). Are these targeted for 4.16 ? >>>> Most of them don't have 4.16 Subject tags. >>> Oleksandr wants this patch to be included for 4.16 but forgot to tag it >>> properly. >> Oh yes. However, >> >> 1. I also wrote this: >> >>>> I am finding it difficult to see the wood for the trees. >>>> It would be really helpful if these vpci fixes were collected >>>> together into a series. >> Can someone please confirm whether this is the only vpci-related patch >> that ought to be on my radar for 4.16 ? >> >> 2. I have not had a reply to my question on Wednesday in >> <24953.34635.645112.279110@mariner.uk.xensource.com>: >> >> Um, can you explain what the practical impact is of not taking this >> patch for 4.16 ? As I understand it vpci for ARM is non-functional in >> 4.16 and this is not expected to change ? So there would be no >> benefit to users, and taking the patch would add small but nonzero >> risk ? >> >> I need this information to decide whether a release-ack is >> appropriate. >> >> Note that we are in code freeze so all patches, including bugfixes, >> need my ack. > Hi Ian, > > This patch [1] is a straightforward 2 lines fix for vpci on ARM. There > is no risk for the release as the source file affected only builds when > CONFIG_HAS_VPCI is enabled, and it is currently disabled on ARM. > > At the same time, as we know vpci is not complete in 4.16 anyway, so the > counter argument is that we don't need to fix it. > > Given how trivial the fix is, and that it cannot break the build or > runtime, I would take it. Thank you, I can re-send the patch with the updated commit message (Julien), but I still have no R-b's for the patch, so not sure if it is worth it > > Cheers, > > Stefano > > > [1] https://marc.info/?l=xen-devel&m=163532307715435 >
Hi Oleksandr, On 2021/11/1 14:14, Oleksandr Andrushchenko wrote: > > > On 29.10.21 10:33, Roger Pau Monné wrote: >> On Thu, Oct 28, 2021 at 05:55:25PM +0000, Oleksandr Andrushchenko wrote: >>> >>> On 28.10.21 19:03, Roger Pau Monné wrote: >>>> On Thu, Oct 28, 2021 at 02:23:34PM +0000, Oleksandr Andrushchenko wrote: >>>>> On 28.10.21 16:36, Roger Pau Monné wrote: >>>>>> On Thu, Oct 28, 2021 at 12:09:23PM +0000, Oleksandr Andrushchenko wrote: >>>>>>> Hi, Julien! >>>>>>> >>>>>>> On 27.10.21 20:35, Julien Grall wrote: >>>>>>>> Hi Oleksandr, >>>>>>>> >>>>>>>> On 27/10/2021 09:25, Oleksandr Andrushchenko wrote: >>>>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> >>>>>>>>> >>>>>>>>> While in vPCI MMIO trap handlers for the guest PCI host bridge it is not >>>>>>>>> enough for SBDF translation to simply call VPCI_ECAM_BDF(info->gpa) as >>>>>>>>> the base address may not be aligned in the way that the translation >>>>>>>>> always work. If not adjusted with respect to the base address it may not be >>>>>>>>> able to properly convert SBDF and crashes: >>>>>>>>> >>>>>>>>> (XEN) vpci_mmio_read 0000:65:1a.0 reg 8bc gpa e65d08bc >>>>>>>> I can't find a printk() that may output this message. Where does this comes from? >>>>>>> That was a debug print. I shouldn't have used that in the patch description, but >>>>>>> probably after "---" to better explain what's happening >>>>>>>> Anyway, IIUC the guest physical address is 0xe65d08bc which, if I am not mistaken, doesn't belong to the range advertised for GUEST_VPCI_ECAM. >>>>>>> This is from dom0 I am working on now. >>>>>>>> IMHO, the stack trace should come from usptream Xen or need some information to explain how this was reproduced. >>>>>>>> >>>>>>>>> (XEN) Data Abort Trap. Syndrome=0x6 >>>>>>>>> (XEN) Walking Hypervisor VA 0x467a28bc on CPU0 via TTBR 0x00000000481d5000 >>>>>>>> I can understnad that if we don't substract GUEST_VPCI_ECAM, we would (in theory) not get the correct BDF. But... I don't understand how this would result to a data abort in the hypervisor. >>>>>>>> >>>>>>>> In fact, I think the vPCI code should be resilient enough to not crash if we pass the wrong BDF. >>>>>>> Well, there is no (?) easy way to validate SBDF. And this could be a problem if we have a misbehaving >>>>>>> guest which may force Xen to access the memory beyond that of PCI host bridge >>>>>> How could that be? The ECAM region exposed to the guest you should be >>>>>> the same as the physical one for dom0? >>>>> Ok, I have a Designware PCI hist which has 2 ECAM regions (I am starting to >>>>> implement the driver for it, so I can be wrong here): >>>>> - Root Complex ECAM area ("dbi"), it is something like 0x3000 bytes long >>>>> - "Client" ECAM area ("config") >>>>> So from Dom0 POV we have 2 ECAM regions and for the guest >>>>> we always emulate a single big region: >>>> You need support for multiple ECAM regions. That's how we do it on x86 >>>> PVH dom0. See register_vpci_mmcfg_handler and related machinery. >>> Is it common for a PCI host bridge to have multiple ECAM regions? >>> Currently on Arm we were about to support "pci-host-ecam-generic" [1], >>> e.g. generic ECAM host bridge which normally (?) has a single ECAM >>> region [2]. But the host bridge I want to support has multiple, so >>> strictly speaking it is not the one that we implement. >> It's possible on x86 to have multiple ECAM regions, whether that means >> multiple host bridges, or host bridges having multiple ECAM regions is >> unknown to me. 
It's all reported in the MCFG ACPI table (see PCI >> Firmware document for the detailed description of MCFG) using the >> "Configuration Space Base Address Allocation Structure", and there can >> be multiple of those structures. > As we are currently supporting generic ECAM host bridge which > has a single ECAM region I think the existing code we have and > about to upstream is ok as is for now. > I own a bridge which has 2 ECAM regions, so I will work towards > adding its support soon. >> >>> Arm folks, do we want this generalization at this moment to align with x86 >>> with this respect? >>> >>> We can live with the current approach and when I have my driver implemented >>> I can send patches to make that generalization. >>>>> /* >>>>> * 256 MB is reserved for VPCI configuration space based on calculation >>>>> * 256 buses x 32 devices x 8 functions x 4 KB = 256 MB >>>>> */ >>>>> #define GUEST_VPCI_ECAM_BASE xen_mk_ullong(0x10000000) >>>>> #define GUEST_VPCI_ECAM_SIZE xen_mk_ullong(0x10000000) >>>>> >>>>> So, we have the base address and size of the emulated ECAM space >>>>> not connected to the real host bridge >>>>>> And for domUs you really need to fix vpci_{read,write} to not >>>>>> passthrough accesses not explicitly handled. >>>>> Do you mean that we need to validate SBDFs there? >>>>> This can be tricky if we have a use-case when a PCI device being >>>>> passed through if not put at 0000:00:0.0, but requested to be, for >>>>> example, 0000:0d:0.0. So, we need to go over the list of virtual >>>>> devices and see if SBDF the guest is trying to access is a valid SBDF. >>>>> Is this what you mean? >>>> No, you need to prevent accesses to registers not explicitly handled >>>> by vpci. Ie: do not forward unhandled accesses to >>>> vpci_{read,wrie}_hw). >>> I see, so those which have no handlers are not passed to the hardware. >>> I need to see how to do that >> Indeed. Without fixing that passthrough to domUs is completely unsafe, >> as you allow domUs full access to registers not explicitly handled by >> current vPCI code. > Well, my understanding is: we can let the guest access whatever > registers it wants with the following exceptions: > - "special" registers we already trap in vPCI, e.g. command, BARs > - we must not let the guest go out of the configuration space of a > specific PCI device, e.g. prevent it from accessing configuration > spaces of other devices. > The rest accesses seem to be ok to me as we do not really want: > - have handlers and emulate all possible registers > - we do not want the guest to fail if it accesses a valid register which > we do not emulate. I am trying to review your patch; please point out if there is anything wrong. IIUC, vPCI only emulates some registers and forwards unhandled accesses to the physical device configuration space (if the accesses pass validation)? Does that make the context inconsistent in the physical device's configuration space? For example, one register in the physical device config space is related to another register, but we just emulate only one in vPCI? >> >> Regards, Roger. >> > Thanks, > Oleksandr >
Hi, On 02.11.21 09:37, Wei Chen wrote: > Hi Oleksandr, > > On 2021/11/1 14:14, Oleksandr Andrushchenko wrote: >> >> >> On 29.10.21 10:33, Roger Pau Monné wrote: >>> On Thu, Oct 28, 2021 at 05:55:25PM +0000, Oleksandr Andrushchenko wrote: >>>> >>>> On 28.10.21 19:03, Roger Pau Monné wrote: >>>>> On Thu, Oct 28, 2021 at 02:23:34PM +0000, Oleksandr Andrushchenko wrote: >>>>>> On 28.10.21 16:36, Roger Pau Monné wrote: >>>>>>> On Thu, Oct 28, 2021 at 12:09:23PM +0000, Oleksandr Andrushchenko wrote: >>>>>>>> Hi, Julien! >>>>>>>> >>>>>>>> On 27.10.21 20:35, Julien Grall wrote: >>>>>>>>> Hi Oleksandr, >>>>>>>>> >>>>>>>>> On 27/10/2021 09:25, Oleksandr Andrushchenko wrote: >>>>>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> >>>>>>>>>> >>>>>>>>>> While in vPCI MMIO trap handlers for the guest PCI host bridge it is not >>>>>>>>>> enough for SBDF translation to simply call VPCI_ECAM_BDF(info->gpa) as >>>>>>>>>> the base address may not be aligned in the way that the translation >>>>>>>>>> always work. If not adjusted with respect to the base address it may not be >>>>>>>>>> able to properly convert SBDF and crashes: >>>>>>>>>> >>>>>>>>>> (XEN) vpci_mmio_read 0000:65:1a.0 reg 8bc gpa e65d08bc >>>>>>>>> I can't find a printk() that may output this message. Where does this comes from? >>>>>>>> That was a debug print. I shouldn't have used that in the patch description, but >>>>>>>> probably after "---" to better explain what's happening >>>>>>>>> Anyway, IIUC the guest physical address is 0xe65d08bc which, if I am not mistaken, doesn't belong to the range advertised for GUEST_VPCI_ECAM. >>>>>>>> This is from dom0 I am working on now. >>>>>>>>> IMHO, the stack trace should come from usptream Xen or need some information to explain how this was reproduced. >>>>>>>>> >>>>>>>>>> (XEN) Data Abort Trap. Syndrome=0x6 >>>>>>>>>> (XEN) Walking Hypervisor VA 0x467a28bc on CPU0 via TTBR 0x00000000481d5000 >>>>>>>>> I can understnad that if we don't substract GUEST_VPCI_ECAM, we would (in theory) not get the correct BDF. But... I don't understand how this would result to a data abort in the hypervisor. >>>>>>>>> >>>>>>>>> In fact, I think the vPCI code should be resilient enough to not crash if we pass the wrong BDF. >>>>>>>> Well, there is no (?) easy way to validate SBDF. And this could be a problem if we have a misbehaving >>>>>>>> guest which may force Xen to access the memory beyond that of PCI host bridge >>>>>>> How could that be? The ECAM region exposed to the guest you should be >>>>>>> the same as the physical one for dom0? >>>>>> Ok, I have a Designware PCI hist which has 2 ECAM regions (I am starting to >>>>>> implement the driver for it, so I can be wrong here): >>>>>> - Root Complex ECAM area ("dbi"), it is something like 0x3000 bytes long >>>>>> - "Client" ECAM area ("config") >>>>>> So from Dom0 POV we have 2 ECAM regions and for the guest >>>>>> we always emulate a single big region: >>>>> You need support for multiple ECAM regions. That's how we do it on x86 >>>>> PVH dom0. See register_vpci_mmcfg_handler and related machinery. >>>> Is it common for a PCI host bridge to have multiple ECAM regions? >>>> Currently on Arm we were about to support "pci-host-ecam-generic" [1], >>>> e.g. generic ECAM host bridge which normally (?) has a single ECAM >>>> region [2]. But the host bridge I want to support has multiple, so >>>> strictly speaking it is not the one that we implement. 
>>> It's possible on x86 to have multiple ECAM regions, whether that means >>> multiple host bridges, or host bridges having multiple ECAM regions is >>> unknown to me. It's all reported in the MCFG ACPI table (see PCI >>> Firmware document for the detailed description of MCFG) using the >>> "Configuration Space Base Address Allocation Structure", and there can >>> be multiple of those structures. >> As we are currently supporting generic ECAM host bridge which >> has a single ECAM region I think the existing code we have and >> about to upstream is ok as is for now. >> I own a bridge which has 2 ECAM regions, so I will work towards >> adding its support soon. >>> >>>> Arm folks, do we want this generalization at this moment to align with x86 >>>> with this respect? >>>> >>>> We can live with the current approach and when I have my driver implemented >>>> I can send patches to make that generalization. >>>>>> /* >>>>>> * 256 MB is reserved for VPCI configuration space based on calculation >>>>>> * 256 buses x 32 devices x 8 functions x 4 KB = 256 MB >>>>>> */ >>>>>> #define GUEST_VPCI_ECAM_BASE xen_mk_ullong(0x10000000) >>>>>> #define GUEST_VPCI_ECAM_SIZE xen_mk_ullong(0x10000000) >>>>>> >>>>>> So, we have the base address and size of the emulated ECAM space >>>>>> not connected to the real host bridge >>>>>>> And for domUs you really need to fix vpci_{read,write} to not >>>>>>> passthrough accesses not explicitly handled. >>>>>> Do you mean that we need to validate SBDFs there? >>>>>> This can be tricky if we have a use-case when a PCI device being >>>>>> passed through if not put at 0000:00:0.0, but requested to be, for >>>>>> example, 0000:0d:0.0. So, we need to go over the list of virtual >>>>>> devices and see if SBDF the guest is trying to access is a valid SBDF. >>>>>> Is this what you mean? >>>>> No, you need to prevent accesses to registers not explicitly handled >>>>> by vpci. Ie: do not forward unhandled accesses to >>>>> vpci_{read,wrie}_hw). >>>> I see, so those which have no handlers are not passed to the hardware. >>>> I need to see how to do that >>> Indeed. Without fixing that passthrough to domUs is completely unsafe, >>> as you allow domUs full access to registers not explicitly handled by >>> current vPCI code. >> Well, my understanding is: we can let the guest access whatever >> registers it wants with the following exceptions: >> - "special" registers we already trap in vPCI, e.g. command, BARs >> - we must not let the guest go out of the configuration space of a >> specific PCI device, e.g. prevent it from accessing configuration >> spaces of other devices. >> The rest accesses seem to be ok to me as we do not really want: >> - have handlers and emulate all possible registers >> - we do not want the guest to fail if it accesses a valid register which >> we do not emulate. > > I am tring to review your patch, please point out if there is anything > wrong. IIUC, vPCI only emulates some registers, and forward unhandled > accesses to physical device configuration space (if the accesses passed the validate.)? Right > Does that make the context inconsistent in physical device's configuration space? It is always consistent for the hardware domain and some parts of it are emulated for guests > For example, one register in physical device > config space is related to another register. But we just emulate > only one in vPCI? So, we trap for all domains and emulate for guests, e.g. hardware domain's view on the registers is consistent. 
For guests we emulate: - PCI_COMMAND - not to allow INTx as we do not support that on Arm - BARs - we emulate guest's view on these according to the memory spaces of the emulated host bridge, so the real BARs still have physical values, but guests see emulated ones Hope this helps > > >>> >>> Regards, Roger. >>> >> Thanks, >> Oleksandr >>
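A sketch of how such per-register emulation is typically hooked up, modelled on what xen/drivers/vpci/header.c does with vpci_add_register(); the guest_* handler names are assumptions standing in for the Arm-specific handlers described above:

static int init_guest_header(struct pci_dev *pdev)
{
    int rc;

    /* Trap PCI_COMMAND so that e.g. INTx cannot be enabled by the guest. */
    rc = vpci_add_register(pdev->vpci, guest_cmd_read, guest_cmd_write,
                           PCI_COMMAND, 2, pdev);
    if ( rc )
        return rc;

    /* BAR0: the guest sees addresses within the emulated bridge's windows,
     * while the physical BAR keeps its real value. */
    return vpci_add_register(pdev->vpci, guest_bar_read, guest_bar_write,
                             PCI_BASE_ADDRESS_0, 4, pdev);
}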
Hi Oleksandr, > -----Original Message----- > From: Oleksandr Andrushchenko <Oleksandr_Andrushchenko@epam.com> > Sent: 2021年11月2日 15:47 > To: Wei Chen <Wei.Chen@arm.com>; Roger Pau Monné <roger.pau@citrix.com> > Cc: Julien Grall <julien@xen.org>; Bertrand Marquis > <Bertrand.Marquis@arm.com>; sstabellini@kernel.org; Rahul Singh > <Rahul.Singh@arm.com>; xen-devel@lists.xenproject.org > Subject: Re: [PATCH] xen/arm: fix SBDF calculation for vPCI MMIO handlers > > Hi, > > On 02.11.21 09:37, Wei Chen wrote: > > Hi Oleksandr, > > > > On 2021/11/1 14:14, Oleksandr Andrushchenko wrote: > >> > >> > >> On 29.10.21 10:33, Roger Pau Monné wrote: > >>> On Thu, Oct 28, 2021 at 05:55:25PM +0000, Oleksandr Andrushchenko > wrote: > >>>> > >>>> On 28.10.21 19:03, Roger Pau Monné wrote: > >>>>> On Thu, Oct 28, 2021 at 02:23:34PM +0000, Oleksandr Andrushchenko > wrote: > >>>>>> On 28.10.21 16:36, Roger Pau Monné wrote: > >>>>>>> On Thu, Oct 28, 2021 at 12:09:23PM +0000, Oleksandr Andrushchenko > wrote: > >>>>>>>> Hi, Julien! > >>>>>>>> > >>>>>>>> On 27.10.21 20:35, Julien Grall wrote: > >>>>>>>>> Hi Oleksandr, > >>>>>>>>> > >>>>>>>>> On 27/10/2021 09:25, Oleksandr Andrushchenko wrote: > >>>>>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> > >>>>>>>>>> > >>>>>>>>>> While in vPCI MMIO trap handlers for the guest PCI host bridge > it is not > >>>>>>>>>> enough for SBDF translation to simply call VPCI_ECAM_BDF(info- > >gpa) as > >>>>>>>>>> the base address may not be aligned in the way that the > translation > >>>>>>>>>> always work. If not adjusted with respect to the base address > it may not be > >>>>>>>>>> able to properly convert SBDF and crashes: > >>>>>>>>>> > >>>>>>>>>> (XEN) vpci_mmio_read 0000:65:1a.0 reg 8bc gpa e65d08bc > >>>>>>>>> I can't find a printk() that may output this message. Where does > this comes from? > >>>>>>>> That was a debug print. I shouldn't have used that in the patch > description, but > >>>>>>>> probably after "---" to better explain what's happening > >>>>>>>>> Anyway, IIUC the guest physical address is 0xe65d08bc which, if > I am not mistaken, doesn't belong to the range advertised for > GUEST_VPCI_ECAM. > >>>>>>>> This is from dom0 I am working on now. > >>>>>>>>> IMHO, the stack trace should come from usptream Xen or need some > information to explain how this was reproduced. > >>>>>>>>> > >>>>>>>>>> (XEN) Data Abort Trap. Syndrome=0x6 > >>>>>>>>>> (XEN) Walking Hypervisor VA 0x467a28bc on CPU0 via TTBR > 0x00000000481d5000 > >>>>>>>>> I can understnad that if we don't substract GUEST_VPCI_ECAM, we > would (in theory) not get the correct BDF. But... I don't understand how > this would result to a data abort in the hypervisor. > >>>>>>>>> > >>>>>>>>> In fact, I think the vPCI code should be resilient enough to not > crash if we pass the wrong BDF. > >>>>>>>> Well, there is no (?) easy way to validate SBDF. And this could > be a problem if we have a misbehaving > >>>>>>>> guest which may force Xen to access the memory beyond that of PCI > host bridge > >>>>>>> How could that be? The ECAM region exposed to the guest you should > be > >>>>>>> the same as the physical one for dom0? 
> >>>>>> Ok, I have a Designware PCI hist which has 2 ECAM regions (I am > starting to > >>>>>> implement the driver for it, so I can be wrong here): > >>>>>> - Root Complex ECAM area ("dbi"), it is something like 0x3000 bytes > long > >>>>>> - "Client" ECAM area ("config") > >>>>>> So from Dom0 POV we have 2 ECAM regions and for the guest > >>>>>> we always emulate a single big region: > >>>>> You need support for multiple ECAM regions. That's how we do it on > x86 > >>>>> PVH dom0. See register_vpci_mmcfg_handler and related machinery. > >>>> Is it common for a PCI host bridge to have multiple ECAM regions? > >>>> Currently on Arm we were about to support "pci-host-ecam-generic" [1], > >>>> e.g. generic ECAM host bridge which normally (?) has a single ECAM > >>>> region [2]. But the host bridge I want to support has multiple, so > >>>> strictly speaking it is not the one that we implement. > >>> It's possible on x86 to have multiple ECAM regions, whether that means > >>> multiple host bridges, or host bridges having multiple ECAM regions is > >>> unknown to me. It's all reported in the MCFG ACPI table (see PCI > >>> Firmware document for the detailed description of MCFG) using the > >>> "Configuration Space Base Address Allocation Structure", and there can > >>> be multiple of those structures. > >> As we are currently supporting generic ECAM host bridge which > >> has a single ECAM region I think the existing code we have and > >> about to upstream is ok as is for now. > >> I own a bridge which has 2 ECAM regions, so I will work towards > >> adding its support soon. > >>> > >>>> Arm folks, do we want this generalization at this moment to align > with x86 > >>>> with this respect? > >>>> > >>>> We can live with the current approach and when I have my driver > implemented > >>>> I can send patches to make that generalization. > >>>>>> /* > >>>>>> * 256 MB is reserved for VPCI configuration space based on > calculation > >>>>>> * 256 buses x 32 devices x 8 functions x 4 KB = 256 MB > >>>>>> */ > >>>>>> #define GUEST_VPCI_ECAM_BASE xen_mk_ullong(0x10000000) > >>>>>> #define GUEST_VPCI_ECAM_SIZE xen_mk_ullong(0x10000000) > >>>>>> > >>>>>> So, we have the base address and size of the emulated ECAM space > >>>>>> not connected to the real host bridge > >>>>>>> And for domUs you really need to fix vpci_{read,write} to not > >>>>>>> passthrough accesses not explicitly handled. > >>>>>> Do you mean that we need to validate SBDFs there? > >>>>>> This can be tricky if we have a use-case when a PCI device being > >>>>>> passed through if not put at 0000:00:0.0, but requested to be, for > >>>>>> example, 0000:0d:0.0. So, we need to go over the list of virtual > >>>>>> devices and see if SBDF the guest is trying to access is a valid > SBDF. > >>>>>> Is this what you mean? > >>>>> No, you need to prevent accesses to registers not explicitly handled > >>>>> by vpci. Ie: do not forward unhandled accesses to > >>>>> vpci_{read,wrie}_hw). > >>>> I see, so those which have no handlers are not passed to the hardware. > >>>> I need to see how to do that > >>> Indeed. Without fixing that passthrough to domUs is completely unsafe, > >>> as you allow domUs full access to registers not explicitly handled by > >>> current vPCI code. > >> Well, my understanding is: we can let the guest access whatever > >> registers it wants with the following exceptions: > >> - "special" registers we already trap in vPCI, e.g. 
command, BARs > >> - we must not let the guest go out of the configuration space of a > >> specific PCI device, e.g. prevent it from accessing configuration > >> spaces of other devices. > >> The rest accesses seem to be ok to me as we do not really want: > >> - have handlers and emulate all possible registers > >> - we do not want the guest to fail if it accesses a valid register > which > >> we do not emulate. > > > > I am tring to review your patch, please point out if there is anything > > wrong. IIUC, vPCI only emulates some registers, and forward unhandled > > accesses to physical device configuration space (if the accesses passed > the validate.)? > Right > > Does that make the context inconsistent in physical device's > configuration space? > It is always consistent for the hardware domain and some parts of it are > emulated > for guests > > For example, one register in physical device > > config space is related to another register. But we just emulate > > only one in vPCI? > So, we trap for all domains and emulate for guests, e.g. hardware domain's > view on the > registers is consistent. For guests we emulate: > - PCI_COMMAND - not to allow INTx as we do not support that on Arm > - BARs - we emulate guest's view on these according to the memory spaces > of the emulated host bridge, so the real BARs still have physical values, > but > guests see emulated ones > > Hope this helps Thanks, it's very helpful! > > > > > >>> > >>> Regards, Roger. > >>> > >> Thanks, > >> Oleksandr > >>
On Mon, Nov 01, 2021 at 06:14:40AM +0000, Oleksandr Andrushchenko wrote: > > > On 29.10.21 10:33, Roger Pau Monné wrote: > > On Thu, Oct 28, 2021 at 05:55:25PM +0000, Oleksandr Andrushchenko wrote: > >> > >> On 28.10.21 19:03, Roger Pau Monné wrote: > >>> On Thu, Oct 28, 2021 at 02:23:34PM +0000, Oleksandr Andrushchenko wrote: > >>>> On 28.10.21 16:36, Roger Pau Monné wrote: > >>>>> And for domUs you really need to fix vpci_{read,write} to not > >>>>> passthrough accesses not explicitly handled. > >>>> Do you mean that we need to validate SBDFs there? > >>>> This can be tricky if we have a use-case when a PCI device being > >>>> passed through if not put at 0000:00:0.0, but requested to be, for > >>>> example, 0000:0d:0.0. So, we need to go over the list of virtual > >>>> devices and see if SBDF the guest is trying to access is a valid SBDF. > >>>> Is this what you mean? > >>> No, you need to prevent accesses to registers not explicitly handled > >>> by vpci. Ie: do not forward unhandled accesses to > >>> vpci_{read,wrie}_hw). > >> I see, so those which have no handlers are not passed to the hardware. > >> I need to see how to do that > > Indeed. Without fixing that passthrough to domUs is completely unsafe, > > as you allow domUs full access to registers not explicitly handled by > > current vPCI code. > Well, my understanding is: we can let the guest access whatever > registers it wants with the following exceptions: > - "special" registers we already trap in vPCI, e.g. command, BARs > - we must not let the guest go out of the configuration space of a > specific PCI device, e.g. prevent it from accessing configuration > spaces of other devices. > The rest accesses seem to be ok to me as we do not really want: > - have handlers and emulate all possible registers > - we do not want the guest to fail if it accesses a valid register which > we do not emulate. IMO that's not good from a security PoV. Xen needs to be sure that every register a guest accesses is not going to cause the system to malfunction, so Xen needs to keep a list of the registers it's safe for a guest to access. For example, we should only expose the PCI capabilities that we know are safe for a guest to use, ie: MSI and MSI-X initially. The rest of the capabilities should be blocked from guest access, unless we audit them and declare them safe for a guest to access. As a reference you might want to look at the approach currently used by QEMU in order to do PCI passthrough. A very limited set of PCI capabilities known to be safe for untrusted access is exposed to the guest, and registers need to be explicitly handled or else access is rejected. We need a fairly similar model in vPCI or else none of this will be safe for unprivileged access. Regards, Roger.
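For comparison, the QEMU-style model Roger mentions boils down to an allowlist walk of the capability chain; a sketch (illustrative only, not existing vPCI code) using Xen's pci_conf_read8() and the standard PCI_CAP_* constants:

static bool cap_allowed(uint8_t id)
{
    /* Only capabilities audited as safe for unprivileged use. */
    return id == PCI_CAP_ID_MSI || id == PCI_CAP_ID_MSIX;
}

/* Walk the chain starting from pci_conf_read8(sbdf, PCI_CAPABILITY_LIST)
 * and return the next guest-visible capability, or 0 to end the list. */
static uint8_t next_allowed_cap(pci_sbdf_t sbdf, uint8_t pos)
{
    while ( pos )
    {
        uint8_t id = pci_conf_read8(sbdf, pos + PCI_CAP_LIST_ID);

        if ( cap_allowed(id) )
            return pos;
        /* Low two bits of the next pointer are reserved, mask them off. */
        pos = pci_conf_read8(sbdf, pos + PCI_CAP_LIST_NEXT) & ~3;
    }

    return 0; /* terminate the guest-visible capability list */
}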
On 02.11.21 10:48, Roger Pau Monné wrote: > On Mon, Nov 01, 2021 at 06:14:40AM +0000, Oleksandr Andrushchenko wrote: >> >> On 29.10.21 10:33, Roger Pau Monné wrote: >>> On Thu, Oct 28, 2021 at 05:55:25PM +0000, Oleksandr Andrushchenko wrote: >>>> On 28.10.21 19:03, Roger Pau Monné wrote: >>>>> On Thu, Oct 28, 2021 at 02:23:34PM +0000, Oleksandr Andrushchenko wrote: >>>>>> On 28.10.21 16:36, Roger Pau Monné wrote: >>>>>>> And for domUs you really need to fix vpci_{read,write} to not >>>>>>> passthrough accesses not explicitly handled. >>>>>> Do you mean that we need to validate SBDFs there? >>>>>> This can be tricky if we have a use-case when a PCI device being >>>>>> passed through if not put at 0000:00:0.0, but requested to be, for >>>>>> example, 0000:0d:0.0. So, we need to go over the list of virtual >>>>>> devices and see if SBDF the guest is trying to access is a valid SBDF. >>>>>> Is this what you mean? >>>>> No, you need to prevent accesses to registers not explicitly handled >>>>> by vpci. Ie: do not forward unhandled accesses to >>>>> vpci_{read,wrie}_hw). >>>> I see, so those which have no handlers are not passed to the hardware. >>>> I need to see how to do that >>> Indeed. Without fixing that passthrough to domUs is completely unsafe, >>> as you allow domUs full access to registers not explicitly handled by >>> current vPCI code. >> Well, my understanding is: we can let the guest access whatever >> registers it wants with the following exceptions: >> - "special" registers we already trap in vPCI, e.g. command, BARs >> - we must not let the guest go out of the configuration space of a >> specific PCI device, e.g. prevent it from accessing configuration >> spaces of other devices. >> The rest accesses seem to be ok to me as we do not really want: >> - have handlers and emulate all possible registers >> - we do not want the guest to fail if it accesses a valid register which >> we do not emulate. > IMO that's not good from a security PoV. Xen needs to be sure that > every registers a guest accesses is not going to cause the system to > malfunction, so Xen needs to keep a list of the registers it's safe > for a guest to access. > > For example we should only expose the PCI capabilities that we know > are safe for a guest to use, ie: MSI and MSI-X initially. The rest of > the capabilities should be blocked from guest access, unless we audit > them and declare safe for a guest to access. > > As a reference you might want to look at the approach currently used > by QEMU in order to do PCI passthrough. A very limited set of PCI > capabilities known to be safe for untrusted access are exposed to the > guest, and registers need to be explicitly handled or else access is > rejected. We need a fairly similar model in vPCI or else none of this > will be safe for unprivileged access. I do agree with this. But at the moment we only emulate some of them, so in the future we will need to revisit the emulation and put many more registers under Xen's control. > > Regards, Roger.
On Tue, Nov 02, 2021 at 09:07:56AM +0000, Oleksandr Andrushchenko wrote: > > > On 02.11.21 10:48, Roger Pau Monné wrote: > > On Mon, Nov 01, 2021 at 06:14:40AM +0000, Oleksandr Andrushchenko wrote: > >> > >> On 29.10.21 10:33, Roger Pau Monné wrote: > >>> On Thu, Oct 28, 2021 at 05:55:25PM +0000, Oleksandr Andrushchenko wrote: > >>>> On 28.10.21 19:03, Roger Pau Monné wrote: > >>>>> On Thu, Oct 28, 2021 at 02:23:34PM +0000, Oleksandr Andrushchenko wrote: > >>>>>> On 28.10.21 16:36, Roger Pau Monné wrote: > >>>>>>> And for domUs you really need to fix vpci_{read,write} to not > >>>>>>> passthrough accesses not explicitly handled. > >>>>>> Do you mean that we need to validate SBDFs there? > >>>>>> This can be tricky if we have a use-case when a PCI device being > >>>>>> passed through if not put at 0000:00:0.0, but requested to be, for > >>>>>> example, 0000:0d:0.0. So, we need to go over the list of virtual > >>>>>> devices and see if SBDF the guest is trying to access is a valid SBDF. > >>>>>> Is this what you mean? > >>>>> No, you need to prevent accesses to registers not explicitly handled > >>>>> by vpci. Ie: do not forward unhandled accesses to > >>>>> vpci_{read,wrie}_hw). > >>>> I see, so those which have no handlers are not passed to the hardware. > >>>> I need to see how to do that > >>> Indeed. Without fixing that passthrough to domUs is completely unsafe, > >>> as you allow domUs full access to registers not explicitly handled by > >>> current vPCI code. > >> Well, my understanding is: we can let the guest access whatever > >> registers it wants with the following exceptions: > >> - "special" registers we already trap in vPCI, e.g. command, BARs > >> - we must not let the guest go out of the configuration space of a > >> specific PCI device, e.g. prevent it from accessing configuration > >> spaces of other devices. > >> The rest accesses seem to be ok to me as we do not really want: > >> - have handlers and emulate all possible registers > >> - we do not want the guest to fail if it accesses a valid register which > >> we do not emulate. > > IMO that's not good from a security PoV. Xen needs to be sure that > > every registers a guest accesses is not going to cause the system to > > malfunction, so Xen needs to keep a list of the registers it's safe > > for a guest to access. > > > > For example we should only expose the PCI capabilities that we know > > are safe for a guest to use, ie: MSI and MSI-X initially. The rest of > > the capabilities should be blocked from guest access, unless we audit > > them and declare safe for a guest to access. > > > > As a reference you might want to look at the approach currently used > > by QEMU in order to do PCI passthrough. A very limited set of PCI > > capabilities known to be safe for untrusted access are exposed to the > > guest, and registers need to be explicitly handled or else access is > > rejected. We need a fairly similar model in vPCI or else none of this > > will be safe for unprivileged access. > I do agree with this. But at the moment we only emulate some of them, > so in the future we will need revisiting the emulation and put many > more registers under Xen's control Indeed. That's my main point - there's still a lot of work to do internally in vPCI in order to be safe for unprivileged guests to use. Thanks, Roger.
Hi Oleksandr,

On 02/11/2021 07:16, Oleksandr Andrushchenko wrote:
>
>
> On 01.11.21 23:06, Stefano Stabellini wrote:
>> On Mon, 1 Nov 2021, Ian Jackson wrote:
>>> Julien Grall writes ("Re: [PATCH] xen/arm: fix SBDF calculation for vPCI MMIO handlers"):
>>>> On 28/10/2021 16:54, Ian Jackson wrote:
>>>>> There are a number of patches that I'm getting CC'd on related to ARM
>>>>> and vpci (according to the Subject). Are these targeted for 4.16?
>>>>> Most of them don't have 4.16 Subject tags.
>>>> Oleksandr wants this patch to be included for 4.16 but forgot to tag it
>>>> properly.
>>> Oh yes. However,
>>>
>>> 1. I also wrote this:
>>>
>>>>> I am finding it difficult to see the wood for the trees.
>>>>> It would be really helpful if these vpci fixes were collected
>>>>> together into a series.
>>> Can someone please confirm whether this is the only vpci-related patch
>>> that ought to be on my radar for 4.16?
>>>
>>> 2. I have not had a reply to my question on Wednesday in
>>> <24953.34635.645112.279110@mariner.uk.xensource.com>:
>>>
>>> Um, can you explain what the practical impact is of not taking this
>>> patch for 4.16? As I understand it, vpci for ARM is non-functional in
>>> 4.16 and this is not expected to change? So there would be no
>>> benefit to users, and taking the patch would add a small but nonzero
>>> risk?
>>>
>>> I need this information to decide whether a release-ack is
>>> appropriate.
>>>
>>> Note that we are in code freeze, so all patches, including bugfixes,
>>> need my ack.
>> Hi Ian,
>>
>> This patch [1] is a straightforward 2-line fix for vpci on ARM. There
>> is no risk for the release, as the source file affected only builds when
>> CONFIG_HAS_VPCI is enabled, and it is currently disabled on ARM.
>>
>> At the same time, as we know, vpci is not complete in 4.16 anyway, so the
>> counter argument is that we don't need to fix it.
>>
>> Given how trivial the fix is, and that it cannot break the build or
>> runtime, I would take it.
> Thank you,
> I can re-send the patch with the updated commit message (Julien),
> but I still have no R-b's for the patch, so I am not sure if it is worth it.

I can't speak for the others. In my case, I didn't give my reviewed-by because the commit message needs to be updated. If you resend, I will have another look.

Cheers,
On 02.11.21 11:32, Julien Grall wrote:
> Hi Oleksandr,
>
> On 02/11/2021 07:16, Oleksandr Andrushchenko wrote:
>>
>>
>> On 01.11.21 23:06, Stefano Stabellini wrote:
>>> On Mon, 1 Nov 2021, Ian Jackson wrote:
>>>> Julien Grall writes ("Re: [PATCH] xen/arm: fix SBDF calculation for vPCI MMIO handlers"):
>>>>> On 28/10/2021 16:54, Ian Jackson wrote:
>>>>>> There are a number of patches that I'm getting CC'd on related to ARM
>>>>>> and vpci (according to the Subject). Are these targeted for 4.16?
>>>>>> Most of them don't have 4.16 Subject tags.
>>>>> Oleksandr wants this patch to be included for 4.16 but forgot to tag it
>>>>> properly.
>>>> Oh yes. However,
>>>>
>>>> 1. I also wrote this:
>>>>
>>>>>> I am finding it difficult to see the wood for the trees.
>>>>>> It would be really helpful if these vpci fixes were collected
>>>>>> together into a series.
>>>> Can someone please confirm whether this is the only vpci-related patch
>>>> that ought to be on my radar for 4.16?
>>>>
>>>> 2. I have not had a reply to my question on Wednesday in
>>>> <24953.34635.645112.279110@mariner.uk.xensource.com>:
>>>>
>>>> Um, can you explain what the practical impact is of not taking this
>>>> patch for 4.16? As I understand it, vpci for ARM is non-functional in
>>>> 4.16 and this is not expected to change? So there would be no
>>>> benefit to users, and taking the patch would add a small but nonzero
>>>> risk?
>>>>
>>>> I need this information to decide whether a release-ack is
>>>> appropriate.
>>>>
>>>> Note that we are in code freeze, so all patches, including bugfixes,
>>>> need my ack.
>>> Hi Ian,
>>>
>>> This patch [1] is a straightforward 2-line fix for vpci on ARM. There
>>> is no risk for the release, as the source file affected only builds when
>>> CONFIG_HAS_VPCI is enabled, and it is currently disabled on ARM.
>>>
>>> At the same time, as we know, vpci is not complete in 4.16 anyway, so the
>>> counter argument is that we don't need to fix it.
>>>
>>> Given how trivial the fix is, and that it cannot break the build or
>>> runtime, I would take it.
>> Thank you,
>> I can re-send the patch with the updated commit message (Julien),
>> but I still have no R-b's for the patch, so I am not sure if it is worth it.
>
> I can't speak for the others. In my case, I didn't give my reviewed-by because the commit message needs to be updated. If you resend, I will have another look.
Sure
>
> Cheers,
>
Thanks,
Oleksandr
Stefano Stabellini writes ("Re: [PATCH] xen/arm: fix SBDF calculation for vPCI MMIO handlers [and 2 more messages]"):
> This patch [1] is a straightforward 2-line fix for vpci on ARM. There
> is no risk for the release, as the source file affected only builds when
> CONFIG_HAS_VPCI is enabled, and it is currently disabled on ARM.
>
> At the same time, as we know, vpci is not complete in 4.16 anyway, so the
> counter argument is that we don't need to fix it.
>
> Given how trivial the fix is, and that it cannot break the build or
> runtime, I would take it.

Thanks, that is helpful.

Release-Acked-by: Ian Jackson <iwj@xenproject.org>
diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
index 8f40a0dec6d2..23f45386f4b3 100644
--- a/xen/arch/arm/vpci.c
+++ b/xen/arch/arm/vpci.c
@@ -24,7 +24,7 @@ static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
     unsigned long data;
 
     /* We ignore segment part and always handle segment 0 */
-    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa);
+    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE);
 
     if ( vpci_ecam_read(sbdf, ECAM_REG_OFFSET(info->gpa),
                         1U << info->dabt.size, &data) )
@@ -44,7 +44,7 @@ static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
     pci_sbdf_t sbdf;
 
     /* We ignore segment part and always handle segment 0 */
-    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa);
+    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE);
 
     return vpci_ecam_write(sbdf, ECAM_REG_OFFSET(info->gpa),
                            1U << info->dabt.size, r);
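A small standalone demo of why the subtraction matters. The VPCI_ECAM_BDF definition below is reproduced from the expected ECAM layout (BDF in address bits [12,27]: bus [20,27], device [15,19], function [12,14]) and may differ in detail from the tree; the base address is purely hypothetical, chosen to have a bit inside [0,27] set:

/*
 * Standalone demo; VPCI_ECAM_BDF is assumed to mask bits [12,27] and
 * shift by 12. GUEST_VPCI_ECAM_BASE itself has bits [0,27] clear,
 * which is why the bug stays latent with the current value.
 */
#include <stdio.h>

#define VPCI_ECAM_BDF(addr) (((addr) & 0x0ffff000UL) >> 12)

int main(void)
{
    unsigned long base = 0x10800000UL; /* bit 23 set, i.e. inside [0,27] */
    unsigned long gpa  = base;         /* guest touches 00:00.0, register 0 */

    /* Without subtracting the base, bit 23 of the base leaks into the BDF. */
    printf("no subtraction:   BDF = %#lx (bus 8, wrong)\n",
           VPCI_ECAM_BDF(gpa));
    /* Subtracting the base yields the expected 00:00.0. */
    printf("with subtraction: BDF = %#lx (bus 0, correct)\n",
           VPCI_ECAM_BDF(gpa - base));

    return 0;
}

The register offset (ECAM_REG_OFFSET, address bits [0,11]) is unaffected as long as the base is at least 4K-aligned, which is presumably why the patch only adjusts the BDF extraction.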