Message ID | 20230913143550.14565-3-stewart.hildebrand@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | vPCI capabilities filtering | expand |
On Wed, Sep 13, 2023 at 10:35:47AM -0400, Stewart Hildebrand wrote: > Currently, Xen vPCI only supports virtualizing the MSI and MSI-X capabilities. > Hide all other PCI capabilities (including extended capabilities) from domUs for > now, even though there may be certain devices/drivers that depend on being able > to discover certain capabilities. > > We parse the physical PCI capabilities linked list and add vPCI register > handlers for the next elements, inserting our own next value, thus presenting a > modified linked list to the domU. > > Introduce helper functions vpci_hw_read8 and vpci_read_val. The vpci_read_val > helper function returns a fixed value, which may be used for RAZ registers, or > registers whose value doesn't change. > > Introduce pci_find_next_cap_ttl() helper while adapting the logic from > pci_find_next_cap() to suit our needs, and implement the existing > pci_find_next_cap() in terms of the new helper. > > Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> > Reviewed-by: Jan Beulich <jbeulich@suse.com> > --- > v6->v7: > * no change > > v5->v6: > * add register handlers before status register handler in init_bars() > * s/header->mask_cap_list/mask_cap_list/ > > v4->v5: > * use more appropriate types, continued > * get rid of unnecessary hook function > * add Jan's R-b > > v3->v4: > * move mask_cap_list setting to this patch > * leave pci_find_next_cap signature alone > * use more appropriate types > > v2->v3: > * get rid of > 0 in loop condition > * implement pci_find_next_cap in terms of new pci_find_next_cap_ttl function so > that hypothetical future callers wouldn't be required to pass &ttl. > * change NULL to (void *)0 for RAZ value passed to vpci_read_val > * change type of ttl to unsigned int > * remember to mask off the low 2 bits of next in the initial loop iteration > * change return type of pci_find_next_cap and pci_find_next_cap_ttl > * avoid wrapping the PCI_STATUS_CAP_LIST condition by using ! instead of == 0 > > v1->v2: > * change type of ttl to int > * use switch statement instead of if/else > * adapt existing pci_find_next_cap helper instead of rolling our own > * pass ttl as in/out > * "pass through" the lower 2 bits of the next pointer > * squash helper functions into this patch to avoid transient dead code situation > * extended capabilities RAZ/WI > --- > xen/drivers/pci/pci.c | 26 +++++++++----- > xen/drivers/vpci/header.c | 76 +++++++++++++++++++++++++++++++++++++++ > xen/drivers/vpci/vpci.c | 12 +++++++ > xen/include/xen/pci.h | 3 ++ > xen/include/xen/vpci.h | 5 +++ > 5 files changed, 113 insertions(+), 9 deletions(-) > > diff --git a/xen/drivers/pci/pci.c b/xen/drivers/pci/pci.c > index 3569ccb24e9e..8799d60c2156 100644 > --- a/xen/drivers/pci/pci.c > +++ b/xen/drivers/pci/pci.c > @@ -39,31 +39,39 @@ unsigned int pci_find_cap_offset(pci_sbdf_t sbdf, unsigned int cap) > return 0; > } > > -unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos, > - unsigned int cap) > +unsigned int pci_find_next_cap_ttl(pci_sbdf_t sbdf, unsigned int pos, > + bool (*is_match)(unsigned int), > + unsigned int cap, unsigned int *ttl) Maybe this has been discussed in previous patch versions, but why pass a match function instead of expanding the cap parameter to be an array of capabilities to search for? I find it kind of weird to be able to pass both a specific capability to match against and also a match function. What the expected behavior if the caller provides both a match function and a cap value? > { > - u8 id; > - int ttl = 48; > + unsigned int id; > > - while ( ttl-- ) > + while ( (*ttl)-- ) > { > pos = pci_conf_read8(sbdf, pos); > if ( pos < 0x40 ) > break; > > - pos &= ~3; > - id = pci_conf_read8(sbdf, pos + PCI_CAP_LIST_ID); > + id = pci_conf_read8(sbdf, (pos & ~3) + PCI_CAP_LIST_ID); > > if ( id == 0xff ) > break; > - if ( id == cap ) > + if ( (is_match && is_match(id)) || (!is_match && id == cap) ) > return pos; > > - pos += PCI_CAP_LIST_NEXT; > + pos = (pos & ~3) + PCI_CAP_LIST_NEXT; > } > + > return 0; > } > > +unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos, > + unsigned int cap) > +{ > + unsigned int ttl = 48; > + > + return pci_find_next_cap_ttl(sbdf, pos, NULL, cap, &ttl) & ~3; > +} > + > /** > * pci_find_ext_capability - Find an extended capability > * @sbdf: PCI device to query > diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c > index af267b75ac31..1e7dfe668ccf 100644 > --- a/xen/drivers/vpci/header.c > +++ b/xen/drivers/vpci/header.c > @@ -513,6 +513,18 @@ static void cf_check rom_write( > rom->addr = val & PCI_ROM_ADDRESS_MASK; > } > > +static bool cf_check vpci_cap_supported(unsigned int id) > +{ > + switch ( id ) > + { > + case PCI_CAP_ID_MSI: > + case PCI_CAP_ID_MSIX: > + return true; > + default: > + return false; > + } > +} > + > static int cf_check init_bars(struct pci_dev *pdev) > { > uint16_t cmd; > @@ -545,6 +557,70 @@ static int cf_check init_bars(struct pci_dev *pdev) We might have to rename this to init_header() now :). > if ( rc ) > return rc; > > + if ( !is_hardware_domain(pdev->domain) ) > + { > + if ( !(pci_conf_read16(pdev->sbdf, PCI_STATUS) & PCI_STATUS_CAP_LIST) ) > + { > + /* RAZ/WI */ That RAZ/WI acronym seems very Arm specific (TBH I had to search for it). FWIW, it's my understanding that if the status register doesn't report the capability list support, the register is unimplemented, and hence would be fine to return ~0 from reads of PCI_CAPABILITY_LIST? IOW: I'm not sure we need to add this handler for PCI_CAPABILITY_LIST if it's not supported. Thanks, Roger.
On 11/17/23 06:44, Roger Pau Monné wrote: > On Wed, Sep 13, 2023 at 10:35:47AM -0400, Stewart Hildebrand wrote: >> Currently, Xen vPCI only supports virtualizing the MSI and MSI-X capabilities. >> Hide all other PCI capabilities (including extended capabilities) from domUs for >> now, even though there may be certain devices/drivers that depend on being able >> to discover certain capabilities. >> >> We parse the physical PCI capabilities linked list and add vPCI register >> handlers for the next elements, inserting our own next value, thus presenting a >> modified linked list to the domU. >> >> Introduce helper functions vpci_hw_read8 and vpci_read_val. The vpci_read_val >> helper function returns a fixed value, which may be used for RAZ registers, or >> registers whose value doesn't change. >> >> Introduce pci_find_next_cap_ttl() helper while adapting the logic from >> pci_find_next_cap() to suit our needs, and implement the existing >> pci_find_next_cap() in terms of the new helper. >> >> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> >> Reviewed-by: Jan Beulich <jbeulich@suse.com> >> --- >> v6->v7: >> * no change >> >> v5->v6: >> * add register handlers before status register handler in init_bars() >> * s/header->mask_cap_list/mask_cap_list/ >> >> v4->v5: >> * use more appropriate types, continued >> * get rid of unnecessary hook function >> * add Jan's R-b >> >> v3->v4: >> * move mask_cap_list setting to this patch >> * leave pci_find_next_cap signature alone >> * use more appropriate types >> >> v2->v3: >> * get rid of > 0 in loop condition >> * implement pci_find_next_cap in terms of new pci_find_next_cap_ttl function so >> that hypothetical future callers wouldn't be required to pass &ttl. >> * change NULL to (void *)0 for RAZ value passed to vpci_read_val >> * change type of ttl to unsigned int >> * remember to mask off the low 2 bits of next in the initial loop iteration >> * change return type of pci_find_next_cap and pci_find_next_cap_ttl >> * avoid wrapping the PCI_STATUS_CAP_LIST condition by using ! instead of == 0 >> >> v1->v2: >> * change type of ttl to int >> * use switch statement instead of if/else >> * adapt existing pci_find_next_cap helper instead of rolling our own >> * pass ttl as in/out >> * "pass through" the lower 2 bits of the next pointer >> * squash helper functions into this patch to avoid transient dead code situation >> * extended capabilities RAZ/WI >> --- >> xen/drivers/pci/pci.c | 26 +++++++++----- >> xen/drivers/vpci/header.c | 76 +++++++++++++++++++++++++++++++++++++++ >> xen/drivers/vpci/vpci.c | 12 +++++++ >> xen/include/xen/pci.h | 3 ++ >> xen/include/xen/vpci.h | 5 +++ >> 5 files changed, 113 insertions(+), 9 deletions(-) >> >> diff --git a/xen/drivers/pci/pci.c b/xen/drivers/pci/pci.c >> index 3569ccb24e9e..8799d60c2156 100644 >> --- a/xen/drivers/pci/pci.c >> +++ b/xen/drivers/pci/pci.c >> @@ -39,31 +39,39 @@ unsigned int pci_find_cap_offset(pci_sbdf_t sbdf, unsigned int cap) >> return 0; >> } >> >> -unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos, >> - unsigned int cap) >> +unsigned int pci_find_next_cap_ttl(pci_sbdf_t sbdf, unsigned int pos, >> + bool (*is_match)(unsigned int), >> + unsigned int cap, unsigned int *ttl) > > Maybe this has been discussed in previous patch versions, but why > pass a match function instead of expanding the cap parameter to > be an array of capabilities to search for? Hm. I don't think we did discuss it previously. It's simple enough to change to an array, so I'll do this for v8. > > I find it kind of weird to be able to pass both a specific capability > to match against and also a match function. Having two ways to specify it was a way to retain compatibility with the existing function pci_find_next_cap() without having to introduce another match function. But I'll change it to an array now. > > What the expected behavior if the caller provides both a match > function and a cap value? > >> { >> - u8 id; >> - int ttl = 48; >> + unsigned int id; >> >> - while ( ttl-- ) >> + while ( (*ttl)-- ) >> { >> pos = pci_conf_read8(sbdf, pos); >> if ( pos < 0x40 ) >> break; >> >> - pos &= ~3; >> - id = pci_conf_read8(sbdf, pos + PCI_CAP_LIST_ID); >> + id = pci_conf_read8(sbdf, (pos & ~3) + PCI_CAP_LIST_ID); >> >> if ( id == 0xff ) >> break; >> - if ( id == cap ) >> + if ( (is_match && is_match(id)) || (!is_match && id == cap) ) >> return pos; >> >> - pos += PCI_CAP_LIST_NEXT; >> + pos = (pos & ~3) + PCI_CAP_LIST_NEXT; >> } >> + >> return 0; >> } >> >> +unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos, >> + unsigned int cap) >> +{ >> + unsigned int ttl = 48; >> + >> + return pci_find_next_cap_ttl(sbdf, pos, NULL, cap, &ttl) & ~3; >> +} >> + >> /** >> * pci_find_ext_capability - Find an extended capability >> * @sbdf: PCI device to query >> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c >> index af267b75ac31..1e7dfe668ccf 100644 >> --- a/xen/drivers/vpci/header.c >> +++ b/xen/drivers/vpci/header.c >> @@ -513,6 +513,18 @@ static void cf_check rom_write( >> rom->addr = val & PCI_ROM_ADDRESS_MASK; >> } >> >> +static bool cf_check vpci_cap_supported(unsigned int id) >> +{ >> + switch ( id ) >> + { >> + case PCI_CAP_ID_MSI: >> + case PCI_CAP_ID_MSIX: >> + return true; >> + default: >> + return false; >> + } >> +} >> + >> static int cf_check init_bars(struct pci_dev *pdev) >> { >> uint16_t cmd; >> @@ -545,6 +557,70 @@ static int cf_check init_bars(struct pci_dev *pdev) > > We might have to rename this to init_header() now :). > >> if ( rc ) >> return rc; >> >> + if ( !is_hardware_domain(pdev->domain) ) >> + { >> + if ( !(pci_conf_read16(pdev->sbdf, PCI_STATUS) & PCI_STATUS_CAP_LIST) ) >> + { >> + /* RAZ/WI */ > > That RAZ/WI acronym seems very Arm specific (TBH I had to search for > it). > > FWIW, it's my understanding that if the status register doesn't report > the capability list support, the register is unimplemented, and hence > would be fine to return ~0 from reads of PCI_CAPABILITY_LIST? > > IOW: I'm not sure we need to add this handler for PCI_CAPABILITY_LIST > if it's not supported. Agreed, if the hardware itself doesn't have the PCI_STATUS_CAP_LIST bit set, there little to no point in emulating the PCI_CAPABILITY_LIST register. I'll fix this up for v8. > > Thanks, Roger.
diff --git a/xen/drivers/pci/pci.c b/xen/drivers/pci/pci.c index 3569ccb24e9e..8799d60c2156 100644 --- a/xen/drivers/pci/pci.c +++ b/xen/drivers/pci/pci.c @@ -39,31 +39,39 @@ unsigned int pci_find_cap_offset(pci_sbdf_t sbdf, unsigned int cap) return 0; } -unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos, - unsigned int cap) +unsigned int pci_find_next_cap_ttl(pci_sbdf_t sbdf, unsigned int pos, + bool (*is_match)(unsigned int), + unsigned int cap, unsigned int *ttl) { - u8 id; - int ttl = 48; + unsigned int id; - while ( ttl-- ) + while ( (*ttl)-- ) { pos = pci_conf_read8(sbdf, pos); if ( pos < 0x40 ) break; - pos &= ~3; - id = pci_conf_read8(sbdf, pos + PCI_CAP_LIST_ID); + id = pci_conf_read8(sbdf, (pos & ~3) + PCI_CAP_LIST_ID); if ( id == 0xff ) break; - if ( id == cap ) + if ( (is_match && is_match(id)) || (!is_match && id == cap) ) return pos; - pos += PCI_CAP_LIST_NEXT; + pos = (pos & ~3) + PCI_CAP_LIST_NEXT; } + return 0; } +unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos, + unsigned int cap) +{ + unsigned int ttl = 48; + + return pci_find_next_cap_ttl(sbdf, pos, NULL, cap, &ttl) & ~3; +} + /** * pci_find_ext_capability - Find an extended capability * @sbdf: PCI device to query diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c index af267b75ac31..1e7dfe668ccf 100644 --- a/xen/drivers/vpci/header.c +++ b/xen/drivers/vpci/header.c @@ -513,6 +513,18 @@ static void cf_check rom_write( rom->addr = val & PCI_ROM_ADDRESS_MASK; } +static bool cf_check vpci_cap_supported(unsigned int id) +{ + switch ( id ) + { + case PCI_CAP_ID_MSI: + case PCI_CAP_ID_MSIX: + return true; + default: + return false; + } +} + static int cf_check init_bars(struct pci_dev *pdev) { uint16_t cmd; @@ -545,6 +557,70 @@ static int cf_check init_bars(struct pci_dev *pdev) if ( rc ) return rc; + if ( !is_hardware_domain(pdev->domain) ) + { + if ( !(pci_conf_read16(pdev->sbdf, PCI_STATUS) & PCI_STATUS_CAP_LIST) ) + { + /* RAZ/WI */ + rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL, + PCI_CAPABILITY_LIST, 1, (void *)0); + if ( rc ) + return rc; + } + else + { + /* Only expose capabilities to the guest that vPCI can handle. */ + unsigned int next, ttl = 48; + + next = pci_find_next_cap_ttl(pdev->sbdf, PCI_CAPABILITY_LIST, + vpci_cap_supported, 0, &ttl); + + rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL, + PCI_CAPABILITY_LIST, 1, + (void *)(uintptr_t)next); + if ( rc ) + return rc; + + next &= ~3; + + if ( !next ) + /* + * If we don't have any supported capabilities to expose to the + * guest, mask the PCI_STATUS_CAP_LIST bit in the status + * register. + */ + mask_cap_list = true; + + while ( next && ttl ) + { + unsigned int pos = next; + + next = pci_find_next_cap_ttl(pdev->sbdf, + pos + PCI_CAP_LIST_NEXT, + vpci_cap_supported, 0, &ttl); + + rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL, + pos + PCI_CAP_LIST_ID, 1, NULL); + if ( rc ) + return rc; + + rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL, + pos + PCI_CAP_LIST_NEXT, 1, + (void *)(uintptr_t)next); + if ( rc ) + return rc; + + next &= ~3; + } + } + + /* Extended capabilities RAZ/WI */ + rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL, 0x100, 4, + (void *)0); + if ( rc ) + return rc; + } + /* * Utilize rsvdz_mask to hide PCI_STATUS_CAP_LIST from the guest for now. If * support for rsvdp (reserved & preserved) is added in the future, the diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c index b4cde7db1b3f..e3b866aaf7a4 100644 --- a/xen/drivers/vpci/vpci.c +++ b/xen/drivers/vpci/vpci.c @@ -136,6 +136,18 @@ static void cf_check vpci_ignored_write( { } +uint32_t cf_check vpci_read_val( + const struct pci_dev *pdev, unsigned int reg, void *data) +{ + return (uintptr_t)data; +} + +uint32_t cf_check vpci_hw_read8( + const struct pci_dev *pdev, unsigned int reg, void *data) +{ + return pci_conf_read8(pdev->sbdf, reg); +} + uint32_t cf_check vpci_hw_read16( const struct pci_dev *pdev, unsigned int reg, void *data) { diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h index 251b8761a8e9..db927eeaeef2 100644 --- a/xen/include/xen/pci.h +++ b/xen/include/xen/pci.h @@ -194,6 +194,9 @@ int pci_mmcfg_read(unsigned int seg, unsigned int bus, int pci_mmcfg_write(unsigned int seg, unsigned int bus, unsigned int devfn, int reg, int len, u32 value); unsigned int pci_find_cap_offset(pci_sbdf_t sbdf, unsigned int cap); +unsigned int pci_find_next_cap_ttl(pci_sbdf_t sbdf, unsigned int pos, + bool (*is_match)(unsigned int), + unsigned int cap, unsigned int *ttl); unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos, unsigned int cap); unsigned int pci_find_ext_capability(pci_sbdf_t sbdf, unsigned int cap); diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h index 7a5cca29e54c..b79efc49bad6 100644 --- a/xen/include/xen/vpci.h +++ b/xen/include/xen/vpci.h @@ -51,7 +51,12 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size); void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size, uint32_t data); +uint32_t cf_check vpci_read_val( + const struct pci_dev *pdev, unsigned int reg, void *data); + /* Passthrough handlers. */ +uint32_t cf_check vpci_hw_read8( + const struct pci_dev *pdev, unsigned int reg, void *data); uint32_t cf_check vpci_hw_read16( const struct pci_dev *pdev, unsigned int reg, void *data); uint32_t cf_check vpci_hw_read32(