diff mbox series

[v6] vpci: Add resizable bar support

Message ID 20250123035003.3797022-1-Jiqian.Chen@amd.com (mailing list archive)
State New
Headers show
Series [v6] vpci: Add resizable bar support | expand

Commit Message

Jiqian Chen Jan. 23, 2025, 3:50 a.m. UTC
Some devices, like discrete GPU of amd, support resizable bar
capability, but vpci of Xen doesn't support this feature, so
they fail to resize bars and then cause probing failure.

According to PCIe spec, each bar that supports resizing has
two registers, PCI_REBAR_CAP and PCI_REBAR_CTRL. So, add
handlers for them to support resizing the size of BARs.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
Hi all,
v5->v6 changes:
* Changed "1UL" to "1ULL" in PCI_REBAR_CTRL_SIZE idefinition for 32 bit architecture.
* In rebar_ctrl_write used "bar - pdev->vpci->header.bars" to get index instead of reading
  from register.
* Added the index of BAR to error messages.
* Changed to "continue" instead of "return an error" when vpci_add_register failed.

Best regards,
Jiqian Chen.

v4->v5 changes:
* Called pci_size_mem_bar in rebar_ctrl_write to get addr and size of BAR instead of setting
  their values directly after writing new size to hardware.
* Changed from "return" to "continue" when index/type of BAR are not correct during initializing
  BAR.
* Corrected the value of PCI_REBAR_CTRL_BAR_SIZE from "0x00001F00" to "0x00003F00".
* Renamed PCI_REBAR_SIZE_BIAS to PCI_REBAR_CTRL_SIZE_BIAS.
* Re-defined "PCI_REBAR_CAP_SHIFT 4" to "PCI_REBAR_CAP_SIZES_MASK 0xFFFFFFF0U".

v3->v4 changes:
* Removed PCI_REBAR_CAP_SIZES since it was not needed, and added
  PCI_REBAR_CAP_SHIFT and PCI_REBAR_CTRL_SIZES.
* Added parameter resizable_sizes to struct vpci_bar to cache the support resizable sizes and
  added the logic in init_rebar().
* Changed PCI_REBAR_CAP to PCI_REBAR_CAP(n) (4+8*(n)), changed PCI_REBAR_CTRL to
  PCI_REBAR_CTRL(n) (8+8*(n)).
* Added domain info of pci_dev to printings of init_rebar().

v2->v3 changes:
* Used "bar->enabled" to replace "pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY",
  and added comments why it needs this check.
* Added "!is_hardware_domain(pdev->domain)" check in init_rebar() to return EOPNOTSUPP for domUs.
* Moved BAR type and index check into init_rebar(), then only need to check once.
* Added 'U' suffix for macro PCI_REBAR_CAP_SIZES.
* Added macro PCI_REBAR_SIZE_BIAS to represent 20.
TODO: need to hide ReBar capability from hardware domain when init_rebar() fails.

v1->v2 changes:
* In rebar_ctrl_write, to check if memory decoding is enabled, and added
  some checks for the type of Bar.
* Added vpci_hw_write32 to handle PCI_REBAR_CAP's write, since there is
  no write limitation of dom0.
* And has many other minor modifications as well.
---
 xen/drivers/vpci/Makefile  |   2 +-
 xen/drivers/vpci/rebar.c   | 137 +++++++++++++++++++++++++++++++++++++
 xen/drivers/vpci/vpci.c    |   6 ++
 xen/include/xen/pci_regs.h |  15 ++++
 xen/include/xen/vpci.h     |   3 +
 5 files changed, 162 insertions(+), 1 deletion(-)
 create mode 100644 xen/drivers/vpci/rebar.c

Comments

Jan Beulich Jan. 27, 2025, 2:20 p.m. UTC | #1
On 23.01.2025 04:50, Jiqian Chen wrote:
> v5->v6 changes:
> * Changed "1UL" to "1ULL" in PCI_REBAR_CTRL_SIZE idefinition for 32 bit architecture.
> * In rebar_ctrl_write used "bar - pdev->vpci->header.bars" to get index instead of reading
>   from register.
> * Added the index of BAR to error messages.
> * Changed to "continue" instead of "return an error" when vpci_add_register failed.

I'm not convinced this was a good change to make. While ...

> +static int cf_check init_rebar(struct pci_dev *pdev)
> +{
> +    uint32_t ctrl;
> +    unsigned int nbars;
> +    unsigned int rebar_offset = pci_find_ext_capability(pdev->sbdf,
> +                                                        PCI_EXT_CAP_ID_REBAR);
> +
> +    if ( !rebar_offset )
> +        return 0;
> +
> +    if ( !is_hardware_domain(pdev->domain) )
> +    {
> +        printk(XENLOG_ERR "%pp: resizable BARs unsupported for unpriv %pd\n",
> +               &pdev->sbdf, pdev->domain);
> +        return -EOPNOTSUPP;
> +    }
> +
> +    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(0));
> +    nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
> +    for ( unsigned int i = 0; i < nbars; i++ )
> +    {
> +        int rc;
> +        struct vpci_bar *bar;
> +        unsigned int index;
> +
> +        ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(i));
> +        index = ctrl & PCI_REBAR_CTRL_BAR_IDX;
> +        if ( index >= PCI_HEADER_NORMAL_NR_BARS )
> +        {
> +            printk(XENLOG_ERR "%pd %pp: too big BAR number %u in REBAR_CTRL\n",
> +                   pdev->domain, &pdev->sbdf, index);
> +            continue;
> +        }
> +
> +        bar = &pdev->vpci->header.bars[index];
> +        if ( bar->type != VPCI_BAR_MEM64_LO && bar->type != VPCI_BAR_MEM32 )
> +        {
> +            printk(XENLOG_ERR "%pd %pp: BAR%u is not in memory space\n",
> +                   pdev->domain, &pdev->sbdf, index);
> +            continue;
> +        }

... for these two cases we can permit Dom0 direct access because the BAR
isn't going to work anyway (as far as we can tell), ...

> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, vpci_hw_write32,
> +                               rebar_offset + PCI_REBAR_CAP(i), 4, NULL);
> +        if ( rc )
> +        {
> +            /*
> +             * TODO: for failed pathes, need to hide ReBar capability
> +             * from hardware domain instead of returning an error.
> +             */
> +            printk(XENLOG_ERR "%pd %pp: BAR%u fail to add reg of REBAR_CAP rc=%d\n",
> +                   pdev->domain, &pdev->sbdf, index, rc);
> +            continue;
> +        }
> +
> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rebar_ctrl_write,
> +                               rebar_offset + PCI_REBAR_CTRL(i), 4, bar);
> +        if ( rc )
> +        {
> +            printk(XENLOG_ERR "%pd %pp: BAR%u fail to add reg of REBAR_CTRL rc=%d\n",
> +                   pdev->domain, &pdev->sbdf, index, rc);
> +            continue;
> +        }

... in these two cases we had an issue internally, and would hence wrongly
allow Dom0 direct access (and in case it's the 2nd one that failed, in fact
only partially direct access, with who knows what resulting inconsistencies).

Only with this particular change undone:
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Otherwise you and Roger (who needs to at least ack the change anyway) will
need to sort that out, with me merely watching.

Jan
Roger Pau Monné Jan. 27, 2025, 2:41 p.m. UTC | #2
On Mon, Jan 27, 2025 at 03:20:40PM +0100, Jan Beulich wrote:
> On 23.01.2025 04:50, Jiqian Chen wrote:
> > v5->v6 changes:
> > * Changed "1UL" to "1ULL" in PCI_REBAR_CTRL_SIZE idefinition for 32 bit architecture.
> > * In rebar_ctrl_write used "bar - pdev->vpci->header.bars" to get index instead of reading
> >   from register.
> > * Added the index of BAR to error messages.
> > * Changed to "continue" instead of "return an error" when vpci_add_register failed.
> 
> I'm not convinced this was a good change to make. While ...
> 
> > +static int cf_check init_rebar(struct pci_dev *pdev)
> > +{
> > +    uint32_t ctrl;
> > +    unsigned int nbars;
> > +    unsigned int rebar_offset = pci_find_ext_capability(pdev->sbdf,
> > +                                                        PCI_EXT_CAP_ID_REBAR);
> > +
> > +    if ( !rebar_offset )
> > +        return 0;
> > +
> > +    if ( !is_hardware_domain(pdev->domain) )
> > +    {
> > +        printk(XENLOG_ERR "%pp: resizable BARs unsupported for unpriv %pd\n",
> > +               &pdev->sbdf, pdev->domain);
> > +        return -EOPNOTSUPP;
> > +    }
> > +
> > +    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(0));
> > +    nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
> > +    for ( unsigned int i = 0; i < nbars; i++ )
> > +    {
> > +        int rc;
> > +        struct vpci_bar *bar;
> > +        unsigned int index;
> > +
> > +        ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(i));
> > +        index = ctrl & PCI_REBAR_CTRL_BAR_IDX;
> > +        if ( index >= PCI_HEADER_NORMAL_NR_BARS )
> > +        {
> > +            printk(XENLOG_ERR "%pd %pp: too big BAR number %u in REBAR_CTRL\n",
> > +                   pdev->domain, &pdev->sbdf, index);
> > +            continue;
> > +        }
> > +
> > +        bar = &pdev->vpci->header.bars[index];
> > +        if ( bar->type != VPCI_BAR_MEM64_LO && bar->type != VPCI_BAR_MEM32 )
> > +        {
> > +            printk(XENLOG_ERR "%pd %pp: BAR%u is not in memory space\n",
> > +                   pdev->domain, &pdev->sbdf, index);
> > +            continue;
> > +        }
> 
> ... for these two cases we can permit Dom0 direct access because the BAR
> isn't going to work anyway (as far as we can tell), ...
> 
> > +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, vpci_hw_write32,
> > +                               rebar_offset + PCI_REBAR_CAP(i), 4, NULL);
> > +        if ( rc )
> > +        {
> > +            /*
> > +             * TODO: for failed pathes, need to hide ReBar capability
> > +             * from hardware domain instead of returning an error.
> > +             */
> > +            printk(XENLOG_ERR "%pd %pp: BAR%u fail to add reg of REBAR_CAP rc=%d\n",
> > +                   pdev->domain, &pdev->sbdf, index, rc);
> > +            continue;
> > +        }
> > +
> > +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rebar_ctrl_write,
> > +                               rebar_offset + PCI_REBAR_CTRL(i), 4, bar);
> > +        if ( rc )
> > +        {
> > +            printk(XENLOG_ERR "%pd %pp: BAR%u fail to add reg of REBAR_CTRL rc=%d\n",
> > +                   pdev->domain, &pdev->sbdf, index, rc);
> > +            continue;
> > +        }
> 
> ... in these two cases we had an issue internally, and would hence wrongly
> allow Dom0 direct access (and in case it's the 2nd one that failed, in fact
> only partially direct access, with who knows what resulting inconsistencies).
> 
> Only with this particular change undone:
R> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
> Otherwise you and Roger (who needs to at least ack the change anyway) will
> need to sort that out, with me merely watching.

Ideally errors here should be dealt with by masking the capability.
However Xen doesn't yet have that support.  The usage of continue is
to merely attempt to keep any possible setup hooks working (header,
MSI, MSI-X). Returning failure from init_rebar() will cause all
vPCI hooks to be removed, and thus the hardware domain to have
unmediated access to the device, which is likely worse than just
continuing here.

This already happens in other capability init paths, that are much less
careful about returning errors, so Jan might be right that if nothing
else for consistency we return an error.  With the hope that
initialization error of capabilities in vPCI will eventually lead to
such capabilities being hidden instead of removing all vPCI handlers
from the device.

Thanks, Roger.
Jan Beulich Jan. 27, 2025, 2:52 p.m. UTC | #3
On 27.01.2025 15:41, Roger Pau Monné wrote:
> On Mon, Jan 27, 2025 at 03:20:40PM +0100, Jan Beulich wrote:
>> On 23.01.2025 04:50, Jiqian Chen wrote:
>>> v5->v6 changes:
>>> * Changed "1UL" to "1ULL" in PCI_REBAR_CTRL_SIZE idefinition for 32 bit architecture.
>>> * In rebar_ctrl_write used "bar - pdev->vpci->header.bars" to get index instead of reading
>>>   from register.
>>> * Added the index of BAR to error messages.
>>> * Changed to "continue" instead of "return an error" when vpci_add_register failed.
>>
>> I'm not convinced this was a good change to make. While ...
>>
>>> +static int cf_check init_rebar(struct pci_dev *pdev)
>>> +{
>>> +    uint32_t ctrl;
>>> +    unsigned int nbars;
>>> +    unsigned int rebar_offset = pci_find_ext_capability(pdev->sbdf,
>>> +                                                        PCI_EXT_CAP_ID_REBAR);
>>> +
>>> +    if ( !rebar_offset )
>>> +        return 0;
>>> +
>>> +    if ( !is_hardware_domain(pdev->domain) )
>>> +    {
>>> +        printk(XENLOG_ERR "%pp: resizable BARs unsupported for unpriv %pd\n",
>>> +               &pdev->sbdf, pdev->domain);
>>> +        return -EOPNOTSUPP;
>>> +    }
>>> +
>>> +    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(0));
>>> +    nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
>>> +    for ( unsigned int i = 0; i < nbars; i++ )
>>> +    {
>>> +        int rc;
>>> +        struct vpci_bar *bar;
>>> +        unsigned int index;
>>> +
>>> +        ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(i));
>>> +        index = ctrl & PCI_REBAR_CTRL_BAR_IDX;
>>> +        if ( index >= PCI_HEADER_NORMAL_NR_BARS )
>>> +        {
>>> +            printk(XENLOG_ERR "%pd %pp: too big BAR number %u in REBAR_CTRL\n",
>>> +                   pdev->domain, &pdev->sbdf, index);
>>> +            continue;
>>> +        }
>>> +
>>> +        bar = &pdev->vpci->header.bars[index];
>>> +        if ( bar->type != VPCI_BAR_MEM64_LO && bar->type != VPCI_BAR_MEM32 )
>>> +        {
>>> +            printk(XENLOG_ERR "%pd %pp: BAR%u is not in memory space\n",
>>> +                   pdev->domain, &pdev->sbdf, index);
>>> +            continue;
>>> +        }
>>
>> ... for these two cases we can permit Dom0 direct access because the BAR
>> isn't going to work anyway (as far as we can tell), ...
>>
>>> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, vpci_hw_write32,
>>> +                               rebar_offset + PCI_REBAR_CAP(i), 4, NULL);
>>> +        if ( rc )
>>> +        {
>>> +            /*
>>> +             * TODO: for failed pathes, need to hide ReBar capability
>>> +             * from hardware domain instead of returning an error.
>>> +             */
>>> +            printk(XENLOG_ERR "%pd %pp: BAR%u fail to add reg of REBAR_CAP rc=%d\n",
>>> +                   pdev->domain, &pdev->sbdf, index, rc);
>>> +            continue;
>>> +        }
>>> +
>>> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rebar_ctrl_write,
>>> +                               rebar_offset + PCI_REBAR_CTRL(i), 4, bar);
>>> +        if ( rc )
>>> +        {
>>> +            printk(XENLOG_ERR "%pd %pp: BAR%u fail to add reg of REBAR_CTRL rc=%d\n",
>>> +                   pdev->domain, &pdev->sbdf, index, rc);
>>> +            continue;
>>> +        }
>>
>> ... in these two cases we had an issue internally, and would hence wrongly
>> allow Dom0 direct access (and in case it's the 2nd one that failed, in fact
>> only partially direct access, with who knows what resulting inconsistencies).
>>
>> Only with this particular change undone:
> R> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>>
>> Otherwise you and Roger (who needs to at least ack the change anyway) will
>> need to sort that out, with me merely watching.
> 
> Ideally errors here should be dealt with by masking the capability.
> However Xen doesn't yet have that support.  The usage of continue is
> to merely attempt to keep any possible setup hooks working (header,
> MSI, MSI-X). Returning failure from init_rebar() will cause all
> vPCI hooks to be removed, and thus the hardware domain to have
> unmediated access to the device, which is likely worse than just
> continuing here.

Hmm, true. Maybe with the exception of the case where the first reg
registration works, but the 2nd fails. Since CTRL is writable but
CAP is r/o (and data there is simply being handed through) I wonder
whether we need to intercept CAP at all, and if we do, whether we
wouldn't better try to register CTRL first.

Jan

> This already happens in other capability init paths, that are much less
> careful about returning errors, so Jan might be right that if nothing
> else for consistency we return an error.  With the hope that
> initialization error of capabilities in vPCI will eventually lead to
> such capabilities being hidden instead of removing all vPCI handlers
> from the device.
> 
> Thanks, Roger.
Roger Pau Monné Jan. 27, 2025, 3:08 p.m. UTC | #4
On Mon, Jan 27, 2025 at 03:52:31PM +0100, Jan Beulich wrote:
> On 27.01.2025 15:41, Roger Pau Monné wrote:
> > On Mon, Jan 27, 2025 at 03:20:40PM +0100, Jan Beulich wrote:
> >> On 23.01.2025 04:50, Jiqian Chen wrote:
> >>> v5->v6 changes:
> >>> * Changed "1UL" to "1ULL" in PCI_REBAR_CTRL_SIZE idefinition for 32 bit architecture.
> >>> * In rebar_ctrl_write used "bar - pdev->vpci->header.bars" to get index instead of reading
> >>>   from register.
> >>> * Added the index of BAR to error messages.
> >>> * Changed to "continue" instead of "return an error" when vpci_add_register failed.
> >>
> >> I'm not convinced this was a good change to make. While ...
> >>
> >>> +static int cf_check init_rebar(struct pci_dev *pdev)
> >>> +{
> >>> +    uint32_t ctrl;
> >>> +    unsigned int nbars;
> >>> +    unsigned int rebar_offset = pci_find_ext_capability(pdev->sbdf,
> >>> +                                                        PCI_EXT_CAP_ID_REBAR);
> >>> +
> >>> +    if ( !rebar_offset )
> >>> +        return 0;
> >>> +
> >>> +    if ( !is_hardware_domain(pdev->domain) )
> >>> +    {
> >>> +        printk(XENLOG_ERR "%pp: resizable BARs unsupported for unpriv %pd\n",
> >>> +               &pdev->sbdf, pdev->domain);
> >>> +        return -EOPNOTSUPP;
> >>> +    }
> >>> +
> >>> +    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(0));
> >>> +    nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
> >>> +    for ( unsigned int i = 0; i < nbars; i++ )
> >>> +    {
> >>> +        int rc;
> >>> +        struct vpci_bar *bar;
> >>> +        unsigned int index;
> >>> +
> >>> +        ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(i));
> >>> +        index = ctrl & PCI_REBAR_CTRL_BAR_IDX;
> >>> +        if ( index >= PCI_HEADER_NORMAL_NR_BARS )
> >>> +        {
> >>> +            printk(XENLOG_ERR "%pd %pp: too big BAR number %u in REBAR_CTRL\n",
> >>> +                   pdev->domain, &pdev->sbdf, index);
> >>> +            continue;
> >>> +        }
> >>> +
> >>> +        bar = &pdev->vpci->header.bars[index];
> >>> +        if ( bar->type != VPCI_BAR_MEM64_LO && bar->type != VPCI_BAR_MEM32 )
> >>> +        {
> >>> +            printk(XENLOG_ERR "%pd %pp: BAR%u is not in memory space\n",
> >>> +                   pdev->domain, &pdev->sbdf, index);
> >>> +            continue;
> >>> +        }
> >>
> >> ... for these two cases we can permit Dom0 direct access because the BAR
> >> isn't going to work anyway (as far as we can tell), ...
> >>
> >>> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32vpci_hw_read32, vpci_hw_write32,
> >>> +                               rebar_offset + PCI_REBAR_CAP(i), 4, NULL);
> >>> +        if ( rc )
> >>> +        {
> >>> +            /*
> >>> +             * TODO: for failed pathes, need to hide ReBar capability
> >>> +             * from hardware domain instead of returning an error.
> >>> +             */
> >>> +            printk(XENLOG_ERR "%pd %pp: BAR%u fail to add reg of REBAR_CAP rc=%d\n",
> >>> +                   pdev->domain, &pdev->sbdf, index, rc);
> >>> +            continue;
> >>> +        }
> >>> +
> >>> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rebar_ctrl_write,
> >>> +                               rebar_offset + PCI_REBAR_CTRL(i), 4, bar);
> >>> +        if ( rc )
> >>> +        {
> >>> +            printk(XENLOG_ERR "%pd %pp: BAR%u fail to add reg of REBAR_CTRL rc=%d\n",
> >>> +                   pdev->domain, &pdev->sbdf, index, rc);
> >>> +            continue;
> >>> +        }
> >>
> >> ... in these two cases we had an issue internally, and would hence wrongly
> >> allow Dom0 direct access (and in case it's the 2nd one that failed, in fact
> >> only partially direct access, with who knows what resulting inconsistencies).
> >>
> >> Only with this particular change undone:
> > R> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> >>
> >> Otherwise you and Roger (who needs to at least ack the change anyway) will
> >> need to sort that out, with me merely watching.
> > 
> > Ideally errors here should be dealt with by masking the capability.
> > However Xen doesn't yet have that support.  The usage of continue is
> > to merely attempt to keep any possible setup hooks working (header,
> > MSI, MSI-X). Returning failure from init_rebar() will cause all
> > vPCI hooks to be removed, and thus the hardware domain to have
> > unmediated access to the device, which is likely worse than just
> > continuing here.
> 
> Hmm, true. Maybe with the exception of the case where the first reg
> registration works, but the 2nd fails. Since CTRL is writable but
> CAP is r/o (and data there is simply being handed through) I wonder
> whether we need to intercept CAP at all, and if we do, whether we
> wouldn't better try to register CTRL first.

Indeed, Jiqian is that a leftover from a previous version when writes
to CAP where ignored for being a read-only register?

The current adding of a handler with vpci_hw_{read,write}32() result
in the exact same behavior for a hardware domain, which is the only
domain where ReBAR will be exposed.

Thanks, Roger.
diff mbox series

Patch

diff --git a/xen/drivers/vpci/Makefile b/xen/drivers/vpci/Makefile
index 1a1413b93e76..a7c8a30a8956 100644
--- a/xen/drivers/vpci/Makefile
+++ b/xen/drivers/vpci/Makefile
@@ -1,2 +1,2 @@ 
-obj-y += vpci.o header.o
+obj-y += vpci.o header.o rebar.o
 obj-$(CONFIG_HAS_PCI_MSI) += msi.o msix.o
diff --git a/xen/drivers/vpci/rebar.c b/xen/drivers/vpci/rebar.c
new file mode 100644
index 000000000000..8c611cf0cd91
--- /dev/null
+++ b/xen/drivers/vpci/rebar.c
@@ -0,0 +1,137 @@ 
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2025 Advanced Micro Devices, Inc. All Rights Reserved.
+ *
+ * Author: Jiqian Chen <Jiqian.Chen@amd.com>
+ */
+
+#include <xen/sched.h>
+#include <xen/vpci.h>
+
+static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
+                                      unsigned int reg,
+                                      uint32_t val,
+                                      void *data)
+{
+    struct vpci_bar *bar = data;
+    const unsigned int index = bar - pdev->vpci->header.bars;
+    uint64_t size = PCI_REBAR_CTRL_SIZE(val);
+
+    if ( bar->enabled )
+    {
+        /*
+         * Refuse to resize a BAR while memory decoding is enabled, as
+         * otherwise the size of the mapped region in the p2m would become
+         * stale with the newly set BAR size, and the position of the BAR
+         * would be reset to undefined.  Note the PCIe specification also
+         * forbids resizing a BAR with memory decoding enabled.
+         */
+        if ( size != bar->size )
+            gprintk(XENLOG_ERR,
+                    "%pp: refuse to resize BAR#%u with memory decoding enabled\n",
+                    &pdev->sbdf, index);
+        return;
+    }
+
+    if ( !((size >> PCI_REBAR_CTRL_SIZE_BIAS) & bar->resizable_sizes) )
+        gprintk(XENLOG_WARNING,
+                "%pp: new BAR#%u size %#lx is not supported by hardware\n",
+                &pdev->sbdf, index, size);
+
+    pci_conf_write32(pdev->sbdf, reg, val);
+
+    pci_size_mem_bar(pdev->sbdf,
+                     PCI_BASE_ADDRESS_0 + index * 4,
+                     &bar->addr,
+                     &bar->size,
+                     (index == PCI_HEADER_NORMAL_NR_BARS - 1) ?
+                      PCI_BAR_LAST : 0);
+    bar->guest_addr = bar->addr;
+}
+
+static int cf_check init_rebar(struct pci_dev *pdev)
+{
+    uint32_t ctrl;
+    unsigned int nbars;
+    unsigned int rebar_offset = pci_find_ext_capability(pdev->sbdf,
+                                                        PCI_EXT_CAP_ID_REBAR);
+
+    if ( !rebar_offset )
+        return 0;
+
+    if ( !is_hardware_domain(pdev->domain) )
+    {
+        printk(XENLOG_ERR "%pp: resizable BARs unsupported for unpriv %pd\n",
+               &pdev->sbdf, pdev->domain);
+        return -EOPNOTSUPP;
+    }
+
+    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(0));
+    nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
+    for ( unsigned int i = 0; i < nbars; i++ )
+    {
+        int rc;
+        struct vpci_bar *bar;
+        unsigned int index;
+
+        ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(i));
+        index = ctrl & PCI_REBAR_CTRL_BAR_IDX;
+        if ( index >= PCI_HEADER_NORMAL_NR_BARS )
+        {
+            printk(XENLOG_ERR "%pd %pp: too big BAR number %u in REBAR_CTRL\n",
+                   pdev->domain, &pdev->sbdf, index);
+            continue;
+        }
+
+        bar = &pdev->vpci->header.bars[index];
+        if ( bar->type != VPCI_BAR_MEM64_LO && bar->type != VPCI_BAR_MEM32 )
+        {
+            printk(XENLOG_ERR "%pd %pp: BAR%u is not in memory space\n",
+                   pdev->domain, &pdev->sbdf, index);
+            continue;
+        }
+
+        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, vpci_hw_write32,
+                               rebar_offset + PCI_REBAR_CAP(i), 4, NULL);
+        if ( rc )
+        {
+            /*
+             * TODO: for failed pathes, need to hide ReBar capability
+             * from hardware domain instead of returning an error.
+             */
+            printk(XENLOG_ERR "%pd %pp: BAR%u fail to add reg of REBAR_CAP rc=%d\n",
+                   pdev->domain, &pdev->sbdf, index, rc);
+            continue;
+        }
+
+        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rebar_ctrl_write,
+                               rebar_offset + PCI_REBAR_CTRL(i), 4, bar);
+        if ( rc )
+        {
+            printk(XENLOG_ERR "%pd %pp: BAR%u fail to add reg of REBAR_CTRL rc=%d\n",
+                   pdev->domain, &pdev->sbdf, index, rc);
+            continue;
+        }
+
+        bar->resizable_sizes =
+            MASK_EXTR(pci_conf_read32(pdev->sbdf,
+                                      rebar_offset + PCI_REBAR_CAP(i)),
+                      PCI_REBAR_CAP_SIZES_MASK);
+        bar->resizable_sizes |=
+            (((uint64_t)MASK_EXTR(ctrl, PCI_REBAR_CTRL_SIZES_MASK) << 32) /
+             ISOLATE_LSB(PCI_REBAR_CAP_SIZES_MASK));
+    }
+
+    return 0;
+}
+REGISTER_VPCI_INIT(init_rebar, VPCI_PRIORITY_LOW);
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 1e6aa5d799b9..3349b98389b8 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -232,6 +232,12 @@  void cf_check vpci_hw_write16(
     pci_conf_write16(pdev->sbdf, reg, val);
 }
 
+void cf_check vpci_hw_write32(
+    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
+{
+    pci_conf_write32(pdev->sbdf, reg, val);
+}
+
 int vpci_add_register_mask(struct vpci *vpci, vpci_read_t *read_handler,
                            vpci_write_t *write_handler, unsigned int offset,
                            unsigned int size, void *data, uint32_t ro_mask,
diff --git a/xen/include/xen/pci_regs.h b/xen/include/xen/pci_regs.h
index 250ba106dbd3..2f1d0d63e962 100644
--- a/xen/include/xen/pci_regs.h
+++ b/xen/include/xen/pci_regs.h
@@ -459,6 +459,7 @@ 
 #define PCI_EXT_CAP_ID_ARI	14
 #define PCI_EXT_CAP_ID_ATS	15
 #define PCI_EXT_CAP_ID_SRIOV	16
+#define PCI_EXT_CAP_ID_REBAR	21	/* Resizable BAR */
 
 /* Advanced Error Reporting */
 #define PCI_ERR_UNCOR_STATUS	4	/* Uncorrectable Error Status */
@@ -541,6 +542,20 @@ 
 #define  PCI_VNDR_HEADER_REV(x)	(((x) >> 16) & 0xf)
 #define  PCI_VNDR_HEADER_LEN(x)	(((x) >> 20) & 0xfff)
 
+/* Resizable BARs */
+#define PCI_REBAR_CAP(n)	(4 + 8 * (n))	/* capability register */
+#define  PCI_REBAR_CAP_SIZES_MASK	0xFFFFFFF0U	/* supported BAR sizes in CAP */
+#define PCI_REBAR_CTRL(n)	(8 + 8 * (n))	/* control register */
+#define  PCI_REBAR_CTRL_BAR_IDX		0x00000007	/* BAR index */
+#define  PCI_REBAR_CTRL_NBAR_MASK	0x000000E0	/* # of resizable BARs */
+#define  PCI_REBAR_CTRL_BAR_SIZE	0x00003F00	/* BAR size */
+#define  PCI_REBAR_CTRL_SIZES_MASK	0xFFFF0000U	/* supported BAR sizes in CTRL */
+
+#define PCI_REBAR_CTRL_SIZE_BIAS	20
+#define PCI_REBAR_CTRL_SIZE(v) \
+            (1ULL << (MASK_EXTR(v, PCI_REBAR_CTRL_BAR_SIZE) \
+                      + PCI_REBAR_CTRL_SIZE_BIAS))
+
 /*
  * Hypertransport sub capability types
  *
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 41e7c3bc2791..9d47b8c1a50e 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -78,6 +78,8 @@  uint32_t cf_check vpci_hw_read32(
     const struct pci_dev *pdev, unsigned int reg, void *data);
 void cf_check vpci_hw_write16(
     const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data);
+void cf_check vpci_hw_write32(
+    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data);
 
 /*
  * Check for pending vPCI operations on this vcpu. Returns true if the vcpu
@@ -100,6 +102,7 @@  struct vpci {
             /* Guest address. */
             uint64_t guest_addr;
             uint64_t size;
+            uint64_t resizable_sizes;
             struct rangeset *mem;
             enum {
                 VPCI_BAR_EMPTY,