diff mbox series

[v9,15/16] xen/arm: vpci: check guest range

Message ID 20230829231912.4091958-16-volodymyr_babchuk@epam.com (mailing list archive)
State Superseded
Headers show
Series PCI devices passthrough on Arm, part 3 | expand

Commit Message

Volodymyr Babchuk Aug. 29, 2023, 11:19 p.m. UTC
From: Stewart Hildebrand <stewart.hildebrand@amd.com>

Skip mapping the BAR if it is not in a valid range.

Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
---
 xen/drivers/vpci/header.c | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

Roger Pau Monné Sept. 22, 2023, 8:44 a.m. UTC | #1
On Tue, Aug 29, 2023 at 11:19:47PM +0000, Volodymyr Babchuk wrote:
> From: Stewart Hildebrand <stewart.hildebrand@amd.com>
> 
> Skip mapping the BAR if it is not in a valid range.
> 
> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> ---
>  xen/drivers/vpci/header.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index 1d243eeaf9..dbabdcbed2 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -345,6 +345,15 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
>               bar->enabled == !!(cmd & PCI_COMMAND_MEMORY) )
>              continue;
>  
> +#ifdef CONFIG_ARM
> +        if ( !is_hardware_domain(pdev->domain) )
> +        {
> +            if ( (start_guest < PFN_DOWN(GUEST_VPCI_MEM_ADDR)) ||
> +                 (end_guest >= PFN_DOWN(GUEST_VPCI_MEM_ADDR + GUEST_VPCI_MEM_SIZE)) )
> +                continue;
> +        }
> +#endif

Hm, I think this should be in a hook similar to pci_check_bar() that
can be implemented per-arch.

IIRC at least on x86 we allow the guest to place the BARs whenever it
wants, would such placement cause issues to the hypervisor on Arm?

Thanks, Roger.
Stewart Hildebrand Sept. 25, 2023, 9:49 p.m. UTC | #2
On 9/22/23 04:44, Roger Pau Monné wrote:
> On Tue, Aug 29, 2023 at 11:19:47PM +0000, Volodymyr Babchuk wrote:
>> From: Stewart Hildebrand <stewart.hildebrand@amd.com>
>>
>> Skip mapping the BAR if it is not in a valid range.
>>
>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>> ---
>>  xen/drivers/vpci/header.c | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
>> index 1d243eeaf9..dbabdcbed2 100644
>> --- a/xen/drivers/vpci/header.c
>> +++ b/xen/drivers/vpci/header.c
>> @@ -345,6 +345,15 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
>>               bar->enabled == !!(cmd & PCI_COMMAND_MEMORY) )
>>              continue;
>>
>> +#ifdef CONFIG_ARM
>> +        if ( !is_hardware_domain(pdev->domain) )
>> +        {
>> +            if ( (start_guest < PFN_DOWN(GUEST_VPCI_MEM_ADDR)) ||
>> +                 (end_guest >= PFN_DOWN(GUEST_VPCI_MEM_ADDR + GUEST_VPCI_MEM_SIZE)) )
>> +                continue;
>> +        }
>> +#endif
> 
> Hm, I think this should be in a hook similar to pci_check_bar() that
> can be implemented per-arch.
> 
> IIRC at least on x86 we allow the guest to place the BARs whenever it
> wants, would such placement cause issues to the hypervisor on Arm?

Hm. I wrote this patch in a hurry to make v9 of this series work on ARM. In my haste I also forgot about the prefetchable range starting at GUEST_VPCI_PREFETCH_MEM_ADDR, but that won't matter as we can probably throw this patch out.

Now that I've had some more time to investigate, I believe the check in this patch is more or less redundant to the existing check in map_range() added in baa6ea700386 ("vpci: add permission checks to map_range()").

The issue is that during initialization bar->guest_addr is zeroed, and this initial value of bar->guest_addr will fail the permissions check in map_range() and crash the domain. When the guest writes a new valid BAR, the old invalid address remains in the rangeset to be mapped. If we simply remove the old invalid BAR from the rangeset, that seems to fix the issue. So something like this:

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index d4629a14f26b..732be26f0d2d 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -638,6 +638,16 @@ static void cf_check guest_bar_write(const struct pci_dev *pdev,
         return;
     }

+    if ( (val != 0xfffffff0U) &&
+         (bar->guest_addr != (0xfffffff0ULL & ~(bar->size - 1))) &&
+         (bar->guest_addr != (0xfffffffffffffff0ULL & ~(bar->size - 1))) )
+    {
+        if ( rangeset_remove_range(bar->mem, PFN_DOWN(bar->guest_addr),
+                                   PFN_DOWN(bar->guest_addr + bar->size) - 1) )
+            gprintk(XENLOG_WARNING, "%pp failed to remove old BAR range\n",
+                    &pdev->sbdf);
+    }
+
     bar->guest_addr = guest_addr;
 }
Roger Pau Monné Sept. 26, 2023, 8:07 a.m. UTC | #3
On Mon, Sep 25, 2023 at 05:49:00PM -0400, Stewart Hildebrand wrote:
> On 9/22/23 04:44, Roger Pau Monné wrote:
> > On Tue, Aug 29, 2023 at 11:19:47PM +0000, Volodymyr Babchuk wrote:
> >> From: Stewart Hildebrand <stewart.hildebrand@amd.com>
> >>
> >> Skip mapping the BAR if it is not in a valid range.
> >>
> >> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> >> ---
> >>  xen/drivers/vpci/header.c | 9 +++++++++
> >>  1 file changed, 9 insertions(+)
> >>
> >> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> >> index 1d243eeaf9..dbabdcbed2 100644
> >> --- a/xen/drivers/vpci/header.c
> >> +++ b/xen/drivers/vpci/header.c
> >> @@ -345,6 +345,15 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
> >>               bar->enabled == !!(cmd & PCI_COMMAND_MEMORY) )
> >>              continue;
> >>
> >> +#ifdef CONFIG_ARM
> >> +        if ( !is_hardware_domain(pdev->domain) )
> >> +        {
> >> +            if ( (start_guest < PFN_DOWN(GUEST_VPCI_MEM_ADDR)) ||
> >> +                 (end_guest >= PFN_DOWN(GUEST_VPCI_MEM_ADDR + GUEST_VPCI_MEM_SIZE)) )
> >> +                continue;
> >> +        }
> >> +#endif
> > 
> > Hm, I think this should be in a hook similar to pci_check_bar() that
> > can be implemented per-arch.
> > 
> > IIRC at least on x86 we allow the guest to place the BARs whenever it
> > wants, would such placement cause issues to the hypervisor on Arm?
> 
> Hm. I wrote this patch in a hurry to make v9 of this series work on ARM. In my haste I also forgot about the prefetchable range starting at GUEST_VPCI_PREFETCH_MEM_ADDR, but that won't matter as we can probably throw this patch out.
> 
> Now that I've had some more time to investigate, I believe the check in this patch is more or less redundant to the existing check in map_range() added in baa6ea700386 ("vpci: add permission checks to map_range()").
> 
> The issue is that during initialization bar->guest_addr is zeroed, and this initial value of bar->guest_addr will fail the permissions check in map_range() and crash the domain. When the guest writes a new valid BAR, the old invalid address remains in the rangeset to be mapped. If we simply remove the old invalid BAR from the rangeset, that seems to fix the issue. So something like this:

It does seem to me we are missing a proper cleanup of the rangeset
contents in some paths then.  In the above paragraph you mention "the
old invalid address remains in the rangeset to be mapped", how does it
get in there in the first place, and why is the rangeset not emptied
if the mapping failed?

Thanks, Roger.
Stewart Hildebrand Sept. 26, 2023, 3:27 p.m. UTC | #4
On 9/26/23 04:07, Roger Pau Monné wrote:
> On Mon, Sep 25, 2023 at 05:49:00PM -0400, Stewart Hildebrand wrote:
>> On 9/22/23 04:44, Roger Pau Monné wrote:
>>> On Tue, Aug 29, 2023 at 11:19:47PM +0000, Volodymyr Babchuk wrote:
>>>> From: Stewart Hildebrand <stewart.hildebrand@amd.com>
>>>>
>>>> Skip mapping the BAR if it is not in a valid range.
>>>>
>>>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>>>> ---
>>>>  xen/drivers/vpci/header.c | 9 +++++++++
>>>>  1 file changed, 9 insertions(+)
>>>>
>>>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
>>>> index 1d243eeaf9..dbabdcbed2 100644
>>>> --- a/xen/drivers/vpci/header.c
>>>> +++ b/xen/drivers/vpci/header.c
>>>> @@ -345,6 +345,15 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
>>>>               bar->enabled == !!(cmd & PCI_COMMAND_MEMORY) )
>>>>              continue;
>>>>
>>>> +#ifdef CONFIG_ARM
>>>> +        if ( !is_hardware_domain(pdev->domain) )
>>>> +        {
>>>> +            if ( (start_guest < PFN_DOWN(GUEST_VPCI_MEM_ADDR)) ||
>>>> +                 (end_guest >= PFN_DOWN(GUEST_VPCI_MEM_ADDR + GUEST_VPCI_MEM_SIZE)) )
>>>> +                continue;
>>>> +        }
>>>> +#endif
>>>
>>> Hm, I think this should be in a hook similar to pci_check_bar() that
>>> can be implemented per-arch.
>>>
>>> IIRC at least on x86 we allow the guest to place the BARs whenever it
>>> wants, would such placement cause issues to the hypervisor on Arm?
>>
>> Hm. I wrote this patch in a hurry to make v9 of this series work on ARM. In my haste I also forgot about the prefetchable range starting at GUEST_VPCI_PREFETCH_MEM_ADDR, but that won't matter as we can probably throw this patch out.
>>
>> Now that I've had some more time to investigate, I believe the check in this patch is more or less redundant to the existing check in map_range() added in baa6ea700386 ("vpci: add permission checks to map_range()").
>>
>> The issue is that during initialization bar->guest_addr is zeroed, and this initial value of bar->guest_addr will fail the permissions check in map_range() and crash the domain. When the guest writes a new valid BAR, the old invalid address remains in the rangeset to be mapped. If we simply remove the old invalid BAR from the rangeset, that seems to fix the issue. So something like this:
> 
> It does seem to me we are missing a proper cleanup of the rangeset
> contents in some paths then.  In the above paragraph you mention "the
> old invalid address remains in the rangeset to be mapped", how does it
> get in there in the first place, and why is the rangeset not emptied
> if the mapping failed?

Back in ("vpci/header: handle p2m range sets per BAR") I added a v->domain == pdev->domain check near the top of vpci_process_pending() as you appropriately suggested.

+    if ( v->domain != pdev->domain )
+    {
+        read_unlock(&v->domain->pci_lock);
+        return false;
+    }

I have also reverted this patch ("xen/arm: vpci: check guest range").

The sequence of events leading to the old value remaining in the rangeset are:

# xl pci-assignable-add 01:00.0
drivers/vpci/vpci.c:vpci_deassign_device()
    deassign 0000:01:00.0 from d0
# grep pci domu.cfg
pci = [ "01:00.0" ]
# xl create domu.cfg
drivers/vpci/vpci.c:vpci_deassign_device()
    deassign 0000:01:00.0 from d[IO]
drivers/vpci/vpci.c:vpci_assign_device()
    assign 0000:01:00.0 to d1
    bar->guest_addr is initialized to zero because of the line: pdev->vpci = xzalloc(struct vpci);
drivers/vpci/header.c:init_bars()
drivers/vpci/header.c:modify_bars()
    BAR0: start 0xe0000, end 0xe000f, start_guest 0x0, end_guest 0xf
    The range { 0-f } is added to the BAR0 rangeset for d1
drivers/vpci/header.c:defer_map()
    raise_softirq(SCHEDULE_SOFTIRQ);
drivers/vpci/header.c:vpci_process_pending()
    vpci_process_pending() returns because v->domain != pdev->domain (i.e. d0 != d1)
    BAR0 rangeset still contains { 0-f }
xl create finishes

Then during domU boot, guest initializes BAR0:

drivers/vpci/header.c:guest_bar_write()
    bar->guest_addr = 0x23000000
drivers/vpci/header.c:modify_bars()
    BAR0: start 0xe0000, end 0xe000f, start_guest 0x23000, end_guest 0x2300f
    The d1 BAR0 rangeset now contains both { 0-f } and { 23000-2300f }
drivers/vpci/header.c:defer_map()
    raise_softirq(SCHEDULE_SOFTIRQ);
drivers/vpci/header.c:vpci_process_pending()
    rangeset_consume_ranges(bar->mem, map_range, &data);
drivers/vpci/header.c:map_range()
    The range { 0-f } fails the permissions check and we crash the domU (back in vpci_process_pending)
Roger Pau Monné Sept. 26, 2023, 3:48 p.m. UTC | #5
On Tue, Sep 26, 2023 at 11:27:48AM -0400, Stewart Hildebrand wrote:
> On 9/26/23 04:07, Roger Pau Monné wrote:
> > On Mon, Sep 25, 2023 at 05:49:00PM -0400, Stewart Hildebrand wrote:
> >> On 9/22/23 04:44, Roger Pau Monné wrote:
> >>> On Tue, Aug 29, 2023 at 11:19:47PM +0000, Volodymyr Babchuk wrote:
> >>>> From: Stewart Hildebrand <stewart.hildebrand@amd.com>
> >>>>
> >>>> Skip mapping the BAR if it is not in a valid range.
> >>>>
> >>>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> >>>> ---
> >>>>  xen/drivers/vpci/header.c | 9 +++++++++
> >>>>  1 file changed, 9 insertions(+)
> >>>>
> >>>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> >>>> index 1d243eeaf9..dbabdcbed2 100644
> >>>> --- a/xen/drivers/vpci/header.c
> >>>> +++ b/xen/drivers/vpci/header.c
> >>>> @@ -345,6 +345,15 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
> >>>>               bar->enabled == !!(cmd & PCI_COMMAND_MEMORY) )
> >>>>              continue;
> >>>>
> >>>> +#ifdef CONFIG_ARM
> >>>> +        if ( !is_hardware_domain(pdev->domain) )
> >>>> +        {
> >>>> +            if ( (start_guest < PFN_DOWN(GUEST_VPCI_MEM_ADDR)) ||
> >>>> +                 (end_guest >= PFN_DOWN(GUEST_VPCI_MEM_ADDR + GUEST_VPCI_MEM_SIZE)) )
> >>>> +                continue;
> >>>> +        }
> >>>> +#endif
> >>>
> >>> Hm, I think this should be in a hook similar to pci_check_bar() that
> >>> can be implemented per-arch.
> >>>
> >>> IIRC at least on x86 we allow the guest to place the BARs whenever it
> >>> wants, would such placement cause issues to the hypervisor on Arm?
> >>
> >> Hm. I wrote this patch in a hurry to make v9 of this series work on ARM. In my haste I also forgot about the prefetchable range starting at GUEST_VPCI_PREFETCH_MEM_ADDR, but that won't matter as we can probably throw this patch out.
> >>
> >> Now that I've had some more time to investigate, I believe the check in this patch is more or less redundant to the existing check in map_range() added in baa6ea700386 ("vpci: add permission checks to map_range()").
> >>
> >> The issue is that during initialization bar->guest_addr is zeroed, and this initial value of bar->guest_addr will fail the permissions check in map_range() and crash the domain. When the guest writes a new valid BAR, the old invalid address remains in the rangeset to be mapped. If we simply remove the old invalid BAR from the rangeset, that seems to fix the issue. So something like this:
> > 
> > It does seem to me we are missing a proper cleanup of the rangeset
> > contents in some paths then.  In the above paragraph you mention "the
> > old invalid address remains in the rangeset to be mapped", how does it
> > get in there in the first place, and why is the rangeset not emptied
> > if the mapping failed?
> 
> Back in ("vpci/header: handle p2m range sets per BAR") I added a v->domain == pdev->domain check near the top of vpci_process_pending() as you appropriately suggested.
> 
> +    if ( v->domain != pdev->domain )
> +    {
> +        read_unlock(&v->domain->pci_lock);
> +        return false;
> +    }
> 
> I have also reverted this patch ("xen/arm: vpci: check guest range").
> 
> The sequence of events leading to the old value remaining in the rangeset are:
> 
> # xl pci-assignable-add 01:00.0
> drivers/vpci/vpci.c:vpci_deassign_device()
>     deassign 0000:01:00.0 from d0
> # grep pci domu.cfg
> pci = [ "01:00.0" ]
> # xl create domu.cfg
> drivers/vpci/vpci.c:vpci_deassign_device()
>     deassign 0000:01:00.0 from d[IO]
> drivers/vpci/vpci.c:vpci_assign_device()
>     assign 0000:01:00.0 to d1
>     bar->guest_addr is initialized to zero because of the line: pdev->vpci = xzalloc(struct vpci);
> drivers/vpci/header.c:init_bars()
> drivers/vpci/header.c:modify_bars()

I think I've commented this on another patch, but why is the device
added with memory decoding enabled?  I would expect the FLR performed
before assigning would leave the device with memory decoding disabled?

Otherwise we might have to force init_bars() to assume memory decoding
to be disabled, IOW: memory decoding would be set as disabled in the
guest cmd view, and leave the physical device cmd as-is.  We might
also consider switching memory decoding off unconditionally for domUs
on the physical device.

>     BAR0: start 0xe0000, end 0xe000f, start_guest 0x0, end_guest 0xf
>     The range { 0-f } is added to the BAR0 rangeset for d1
> drivers/vpci/header.c:defer_map()
>     raise_softirq(SCHEDULE_SOFTIRQ);
> drivers/vpci/header.c:vpci_process_pending()
>     vpci_process_pending() returns because v->domain != pdev->domain (i.e. d0 != d1)

I don't think we can easily handle BAR mappings during device
assignment with vPCI, because that would require adding some kind of
continuation support which we don't have ATM.  Might be better to just
switch memory decoding unconditionally off at init_bars() for domUs as
that's the easier solution right now that would allow us to move
forward.

Thanks, Roger.
Stewart Hildebrand Sept. 27, 2023, 6:03 p.m. UTC | #6
On 9/26/23 11:48, Roger Pau Monné wrote:
> On Tue, Sep 26, 2023 at 11:27:48AM -0400, Stewart Hildebrand wrote:
>> On 9/26/23 04:07, Roger Pau Monné wrote:
>>> On Mon, Sep 25, 2023 at 05:49:00PM -0400, Stewart Hildebrand wrote:
>>>> On 9/22/23 04:44, Roger Pau Monné wrote:
>>>>> On Tue, Aug 29, 2023 at 11:19:47PM +0000, Volodymyr Babchuk wrote:
>>>>>> From: Stewart Hildebrand <stewart.hildebrand@amd.com>
>>>>>>
>>>>>> Skip mapping the BAR if it is not in a valid range.
>>>>>>
>>>>>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>>>>>> ---
>>>>>>  xen/drivers/vpci/header.c | 9 +++++++++
>>>>>>  1 file changed, 9 insertions(+)
>>>>>>
>>>>>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
>>>>>> index 1d243eeaf9..dbabdcbed2 100644
>>>>>> --- a/xen/drivers/vpci/header.c
>>>>>> +++ b/xen/drivers/vpci/header.c
>>>>>> @@ -345,6 +345,15 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
>>>>>>               bar->enabled == !!(cmd & PCI_COMMAND_MEMORY) )
>>>>>>              continue;
>>>>>>
>>>>>> +#ifdef CONFIG_ARM
>>>>>> +        if ( !is_hardware_domain(pdev->domain) )
>>>>>> +        {
>>>>>> +            if ( (start_guest < PFN_DOWN(GUEST_VPCI_MEM_ADDR)) ||
>>>>>> +                 (end_guest >= PFN_DOWN(GUEST_VPCI_MEM_ADDR + GUEST_VPCI_MEM_SIZE)) )
>>>>>> +                continue;
>>>>>> +        }
>>>>>> +#endif
>>>>>
>>>>> Hm, I think this should be in a hook similar to pci_check_bar() that
>>>>> can be implemented per-arch.
>>>>>
>>>>> IIRC at least on x86 we allow the guest to place the BARs whenever it
>>>>> wants, would such placement cause issues to the hypervisor on Arm?
>>>>
>>>> Hm. I wrote this patch in a hurry to make v9 of this series work on ARM. In my haste I also forgot about the prefetchable range starting at GUEST_VPCI_PREFETCH_MEM_ADDR, but that won't matter as we can probably throw this patch out.
>>>>
>>>> Now that I've had some more time to investigate, I believe the check in this patch is more or less redundant to the existing check in map_range() added in baa6ea700386 ("vpci: add permission checks to map_range()").
>>>>
>>>> The issue is that during initialization bar->guest_addr is zeroed, and this initial value of bar->guest_addr will fail the permissions check in map_range() and crash the domain. When the guest writes a new valid BAR, the old invalid address remains in the rangeset to be mapped. If we simply remove the old invalid BAR from the rangeset, that seems to fix the issue. So something like this:
>>>
>>> It does seem to me we are missing a proper cleanup of the rangeset
>>> contents in some paths then.  In the above paragraph you mention "the
>>> old invalid address remains in the rangeset to be mapped", how does it
>>> get in there in the first place, and why is the rangeset not emptied
>>> if the mapping failed?
>>
>> Back in ("vpci/header: handle p2m range sets per BAR") I added a v->domain == pdev->domain check near the top of vpci_process_pending() as you appropriately suggested.
>>
>> +    if ( v->domain != pdev->domain )
>> +    {
>> +        read_unlock(&v->domain->pci_lock);
>> +        return false;
>> +    }
>>
>> I have also reverted this patch ("xen/arm: vpci: check guest range").
>>
>> The sequence of events leading to the old value remaining in the rangeset are:
>>
>> # xl pci-assignable-add 01:00.0
>> drivers/vpci/vpci.c:vpci_deassign_device()
>>     deassign 0000:01:00.0 from d0
>> # grep pci domu.cfg
>> pci = [ "01:00.0" ]
>> # xl create domu.cfg
>> drivers/vpci/vpci.c:vpci_deassign_device()
>>     deassign 0000:01:00.0 from d[IO]
>> drivers/vpci/vpci.c:vpci_assign_device()
>>     assign 0000:01:00.0 to d1
>>     bar->guest_addr is initialized to zero because of the line: pdev->vpci = xzalloc(struct vpci);
>> drivers/vpci/header.c:init_bars()
>> drivers/vpci/header.c:modify_bars()
> 
> I think I've commented this on another patch, but why is the device
> added with memory decoding enabled?  I would expect the FLR performed
> before assigning would leave the device with memory decoding disabled?

It seems the device is indeed being assigned to the domU with memory decoding enabled, but I'm not entirely sure why. The device I'm testing with doesn't support FLR, but it does support pm bus reset:
# cat /sys/bus/pci/devices/0000\:01\:00.0/reset_method
pm bus

As I understand it, libxl__device_pci_reset() should still be able to issue a reset in this case.

> Otherwise we might have to force init_bars() to assume memory decoding
> to be disabled, IOW: memory decoding would be set as disabled in the
> guest cmd view, and leave the physical device cmd as-is.  We might
> also consider switching memory decoding off unconditionally for domUs
> on the physical device.

I did a quick test and it works as expected with my apparently quirky test case:

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index de29e5322d34..0ad0ad947759 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -790,7 +790,12 @@ static int cf_check init_bars(struct pci_dev *pdev)

     /* Disable memory decoding before sizing. */
     if ( cmd & PCI_COMMAND_MEMORY )
+    {
+        if ( !is_hwdom )
+            cmd &= ~PCI_COMMAND_MEMORY;
+
         pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd & ~PCI_COMMAND_MEMORY);
+    }

     header->guest_cmd = cmd & ~PCI_COMMAND_MEMORY;

> 
>>     BAR0: start 0xe0000, end 0xe000f, start_guest 0x0, end_guest 0xf
>>     The range { 0-f } is added to the BAR0 rangeset for d1
>> drivers/vpci/header.c:defer_map()
>>     raise_softirq(SCHEDULE_SOFTIRQ);
>> drivers/vpci/header.c:vpci_process_pending()
>>     vpci_process_pending() returns because v->domain != pdev->domain (i.e. d0 != d1)
> 
> I don't think we can easily handle BAR mappings during device
> assignment with vPCI, because that would require adding some kind of
> continuation support which we don't have ATM.  Might be better to just
> switch memory decoding unconditionally off at init_bars() for domUs as
> that's the easier solution right now that would allow us to move
> forward.

+1
Roger Pau Monné Sept. 28, 2023, 8:28 a.m. UTC | #7
On Wed, Sep 27, 2023 at 02:03:30PM -0400, Stewart Hildebrand wrote:
> On 9/26/23 11:48, Roger Pau Monné wrote:
> > On Tue, Sep 26, 2023 at 11:27:48AM -0400, Stewart Hildebrand wrote:
> >> On 9/26/23 04:07, Roger Pau Monné wrote:
> >>> On Mon, Sep 25, 2023 at 05:49:00PM -0400, Stewart Hildebrand wrote:
> >>>> On 9/22/23 04:44, Roger Pau Monné wrote:
> >>>>> On Tue, Aug 29, 2023 at 11:19:47PM +0000, Volodymyr Babchuk wrote:
> >>>>>> From: Stewart Hildebrand <stewart.hildebrand@amd.com>
> >>>>>>
> >>>>>> Skip mapping the BAR if it is not in a valid range.
> >>>>>>
> >>>>>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> >>>>>> ---
> >>>>>>  xen/drivers/vpci/header.c | 9 +++++++++
> >>>>>>  1 file changed, 9 insertions(+)
> >>>>>>
> >>>>>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> >>>>>> index 1d243eeaf9..dbabdcbed2 100644
> >>>>>> --- a/xen/drivers/vpci/header.c
> >>>>>> +++ b/xen/drivers/vpci/header.c
> >>>>>> @@ -345,6 +345,15 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
> >>>>>>               bar->enabled == !!(cmd & PCI_COMMAND_MEMORY) )
> >>>>>>              continue;
> >>>>>>
> >>>>>> +#ifdef CONFIG_ARM
> >>>>>> +        if ( !is_hardware_domain(pdev->domain) )
> >>>>>> +        {
> >>>>>> +            if ( (start_guest < PFN_DOWN(GUEST_VPCI_MEM_ADDR)) ||
> >>>>>> +                 (end_guest >= PFN_DOWN(GUEST_VPCI_MEM_ADDR + GUEST_VPCI_MEM_SIZE)) )
> >>>>>> +                continue;
> >>>>>> +        }
> >>>>>> +#endif
> >>>>>
> >>>>> Hm, I think this should be in a hook similar to pci_check_bar() that
> >>>>> can be implemented per-arch.
> >>>>>
> >>>>> IIRC at least on x86 we allow the guest to place the BARs whenever it
> >>>>> wants, would such placement cause issues to the hypervisor on Arm?
> >>>>
> >>>> Hm. I wrote this patch in a hurry to make v9 of this series work on ARM. In my haste I also forgot about the prefetchable range starting at GUEST_VPCI_PREFETCH_MEM_ADDR, but that won't matter as we can probably throw this patch out.
> >>>>
> >>>> Now that I've had some more time to investigate, I believe the check in this patch is more or less redundant to the existing check in map_range() added in baa6ea700386 ("vpci: add permission checks to map_range()").
> >>>>
> >>>> The issue is that during initialization bar->guest_addr is zeroed, and this initial value of bar->guest_addr will fail the permissions check in map_range() and crash the domain. When the guest writes a new valid BAR, the old invalid address remains in the rangeset to be mapped. If we simply remove the old invalid BAR from the rangeset, that seems to fix the issue. So something like this:
> >>>
> >>> It does seem to me we are missing a proper cleanup of the rangeset
> >>> contents in some paths then.  In the above paragraph you mention "the
> >>> old invalid address remains in the rangeset to be mapped", how does it
> >>> get in there in the first place, and why is the rangeset not emptied
> >>> if the mapping failed?
> >>
> >> Back in ("vpci/header: handle p2m range sets per BAR") I added a v->domain == pdev->domain check near the top of vpci_process_pending() as you appropriately suggested.
> >>
> >> +    if ( v->domain != pdev->domain )
> >> +    {
> >> +        read_unlock(&v->domain->pci_lock);
> >> +        return false;
> >> +    }
> >>
> >> I have also reverted this patch ("xen/arm: vpci: check guest range").
> >>
> >> The sequence of events leading to the old value remaining in the rangeset are:
> >>
> >> # xl pci-assignable-add 01:00.0
> >> drivers/vpci/vpci.c:vpci_deassign_device()
> >>     deassign 0000:01:00.0 from d0
> >> # grep pci domu.cfg
> >> pci = [ "01:00.0" ]
> >> # xl create domu.cfg
> >> drivers/vpci/vpci.c:vpci_deassign_device()
> >>     deassign 0000:01:00.0 from d[IO]
> >> drivers/vpci/vpci.c:vpci_assign_device()
> >>     assign 0000:01:00.0 to d1
> >>     bar->guest_addr is initialized to zero because of the line: pdev->vpci = xzalloc(struct vpci);
> >> drivers/vpci/header.c:init_bars()
> >> drivers/vpci/header.c:modify_bars()
> > 
> > I think I've commented this on another patch, but why is the device
> > added with memory decoding enabled?  I would expect the FLR performed
> > before assigning would leave the device with memory decoding disabled?
> 
> It seems the device is indeed being assigned to the domU with memory decoding enabled, but I'm not entirely sure why. The device I'm testing with doesn't support FLR, but it does support pm bus reset:
> # cat /sys/bus/pci/devices/0000\:01\:00.0/reset_method
> pm bus
> 
> As I understand it, libxl__device_pci_reset() should still be able to issue a reset in this case.

Maybe pciback is somehow restoring part of the previous state?  I
have no insight in what state we expect the device to be handled by
pciback, but this needs investigation in order to know what to expect.

Can you paste the full contents of the command register for this
device?

Thanks, Roger.
Stewart Hildebrand Sept. 28, 2023, 6:28 p.m. UTC | #8
On 9/28/23 04:28, Roger Pau Monné wrote:
> Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
> 
> 
> On Wed, Sep 27, 2023 at 02:03:30PM -0400, Stewart Hildebrand wrote:
>> On 9/26/23 11:48, Roger Pau Monné wrote:
>>> On Tue, Sep 26, 2023 at 11:27:48AM -0400, Stewart Hildebrand wrote:
>>>> On 9/26/23 04:07, Roger Pau Monné wrote:
>>>>> On Mon, Sep 25, 2023 at 05:49:00PM -0400, Stewart Hildebrand wrote:
>>>>>> On 9/22/23 04:44, Roger Pau Monné wrote:
>>>>>>> On Tue, Aug 29, 2023 at 11:19:47PM +0000, Volodymyr Babchuk wrote:
>>>>>>>> From: Stewart Hildebrand <stewart.hildebrand@amd.com>
>>>>>>>>
>>>>>>>> Skip mapping the BAR if it is not in a valid range.
>>>>>>>>
>>>>>>>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>>>>>>>> ---
>>>>>>>>  xen/drivers/vpci/header.c | 9 +++++++++
>>>>>>>>  1 file changed, 9 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
>>>>>>>> index 1d243eeaf9..dbabdcbed2 100644
>>>>>>>> --- a/xen/drivers/vpci/header.c
>>>>>>>> +++ b/xen/drivers/vpci/header.c
>>>>>>>> @@ -345,6 +345,15 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
>>>>>>>>               bar->enabled == !!(cmd & PCI_COMMAND_MEMORY) )
>>>>>>>>              continue;
>>>>>>>>
>>>>>>>> +#ifdef CONFIG_ARM
>>>>>>>> +        if ( !is_hardware_domain(pdev->domain) )
>>>>>>>> +        {
>>>>>>>> +            if ( (start_guest < PFN_DOWN(GUEST_VPCI_MEM_ADDR)) ||
>>>>>>>> +                 (end_guest >= PFN_DOWN(GUEST_VPCI_MEM_ADDR + GUEST_VPCI_MEM_SIZE)) )
>>>>>>>> +                continue;
>>>>>>>> +        }
>>>>>>>> +#endif
>>>>>>>
>>>>>>> Hm, I think this should be in a hook similar to pci_check_bar() that
>>>>>>> can be implemented per-arch.
>>>>>>>
>>>>>>> IIRC at least on x86 we allow the guest to place the BARs whenever it
>>>>>>> wants, would such placement cause issues to the hypervisor on Arm?
>>>>>>
>>>>>> Hm. I wrote this patch in a hurry to make v9 of this series work on ARM. In my haste I also forgot about the prefetchable range starting at GUEST_VPCI_PREFETCH_MEM_ADDR, but that won't matter as we can probably throw this patch out.
>>>>>>
>>>>>> Now that I've had some more time to investigate, I believe the check in this patch is more or less redundant to the existing check in map_range() added in baa6ea700386 ("vpci: add permission checks to map_range()").
>>>>>>
>>>>>> The issue is that during initialization bar->guest_addr is zeroed, and this initial value of bar->guest_addr will fail the permissions check in map_range() and crash the domain. When the guest writes a new valid BAR, the old invalid address remains in the rangeset to be mapped. If we simply remove the old invalid BAR from the rangeset, that seems to fix the issue. So something like this:
>>>>>
>>>>> It does seem to me we are missing a proper cleanup of the rangeset
>>>>> contents in some paths then.  In the above paragraph you mention "the
>>>>> old invalid address remains in the rangeset to be mapped", how does it
>>>>> get in there in the first place, and why is the rangeset not emptied
>>>>> if the mapping failed?
>>>>
>>>> Back in ("vpci/header: handle p2m range sets per BAR") I added a v->domain == pdev->domain check near the top of vpci_process_pending() as you appropriately suggested.
>>>>
>>>> +    if ( v->domain != pdev->domain )
>>>> +    {
>>>> +        read_unlock(&v->domain->pci_lock);
>>>> +        return false;
>>>> +    }
>>>>
>>>> I have also reverted this patch ("xen/arm: vpci: check guest range").
>>>>
>>>> The sequence of events leading to the old value remaining in the rangeset are:
>>>>
>>>> # xl pci-assignable-add 01:00.0
>>>> drivers/vpci/vpci.c:vpci_deassign_device()
>>>>     deassign 0000:01:00.0 from d0
>>>> # grep pci domu.cfg
>>>> pci = [ "01:00.0" ]
>>>> # xl create domu.cfg
>>>> drivers/vpci/vpci.c:vpci_deassign_device()
>>>>     deassign 0000:01:00.0 from d[IO]
>>>> drivers/vpci/vpci.c:vpci_assign_device()
>>>>     assign 0000:01:00.0 to d1
>>>>     bar->guest_addr is initialized to zero because of the line: pdev->vpci = xzalloc(struct vpci);
>>>> drivers/vpci/header.c:init_bars()
>>>> drivers/vpci/header.c:modify_bars()
>>>
>>> I think I've commented this on another patch, but why is the device
>>> added with memory decoding enabled?  I would expect the FLR performed
>>> before assigning would leave the device with memory decoding disabled?
>>
>> It seems the device is indeed being assigned to the domU with memory decoding enabled, but I'm not entirely sure why. The device I'm testing with doesn't support FLR, but it does support pm bus reset:
>> # cat /sys/bus/pci/devices/0000\:01\:00.0/reset_method
>> pm bus
>>
>> As I understand it, libxl__device_pci_reset() should still be able to issue a reset in this case.
> 
> Maybe pciback is somehow restoring part of the previous state?  I
> have no insight in what state we expect the device to be handled by
> pciback, but this needs investigation in order to know what to expect.

Yep, during "xl pci-assignable-add ..." pciback resets the device and restores the state, including whether memory decoding is enabled.

drivers/xen/xen-pciback/pci_stub.c:pcistub_init_device():

	/* We need the device active to save the state. */
	dev_dbg(&dev->dev, "save state of device\n");
	pci_save_state(dev);
	dev_data->pci_saved_state = pci_store_saved_state(dev);
	if (!dev_data->pci_saved_state)
		dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
	else {
		dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
		__pci_reset_function_locked(dev);
		pci_restore_state(dev);
	}
	/* Now disable the device (this also ensures some private device
	 * data is setup before we export)
	 */
	dev_dbg(&dev->dev, "reset device\n");
	xen_pcibk_reset_device(dev);

That last function, xen_pcibk_reset_device(), clears the bus master enable bit in the command register for devices with PCI_HEADER_TYPE_NORMAL (not a reset contrary to the function name).

xl create should reset the device again, but, similarly, this also seems to restore the state.

> Can you paste the full contents of the command register for this
> device?
Start of day (PCIe controller and bridge initialized, no device BARs or anything have been programmed yet): 0x0000
After dom0 boot, device is in use: 0x0006
After pci-assignable-add: 0x0002
After echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/reset: 0x0002
After xl create, domU booted: 0x0006

Should mapping bars should be conditional on PCI_COMMAND_MASTER, not PCI_COMMAND_MEMORY? E.g.:

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 9cf701b3c464..9ce1793d64b8 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -1162,7 +1162,7 @@ static int cf_check init_bars(struct pci_dev *pdev)
             goto fail;
     }

-    return (cmd & PCI_COMMAND_MEMORY) ? modify_bars(pdev, cmd, false) : 0;
+    return (cmd & PCI_COMMAND_MASTER) ? modify_bars(pdev, cmd, false) : 0;

  fail:
     pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd);
Roger Pau Monné Oct. 2, 2023, 11:49 a.m. UTC | #9
On Thu, Sep 28, 2023 at 02:28:11PM -0400, Stewart Hildebrand wrote:
> 
> 
> On 9/28/23 04:28, Roger Pau Monné wrote:
> > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
> > 
> > 
> > On Wed, Sep 27, 2023 at 02:03:30PM -0400, Stewart Hildebrand wrote:
> >> On 9/26/23 11:48, Roger Pau Monné wrote:
> >>> On Tue, Sep 26, 2023 at 11:27:48AM -0400, Stewart Hildebrand wrote:
> >>>> On 9/26/23 04:07, Roger Pau Monné wrote:
> >>>>> On Mon, Sep 25, 2023 at 05:49:00PM -0400, Stewart Hildebrand wrote:
> >>>>>> On 9/22/23 04:44, Roger Pau Monné wrote:
> >>>>>>> On Tue, Aug 29, 2023 at 11:19:47PM +0000, Volodymyr Babchuk wrote:
> >>>>>>>> From: Stewart Hildebrand <stewart.hildebrand@amd.com>
> >>>>>>>>
> >>>>>>>> Skip mapping the BAR if it is not in a valid range.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> >>>>>>>> ---
> >>>>>>>>  xen/drivers/vpci/header.c | 9 +++++++++
> >>>>>>>>  1 file changed, 9 insertions(+)
> >>>>>>>>
> >>>>>>>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> >>>>>>>> index 1d243eeaf9..dbabdcbed2 100644
> >>>>>>>> --- a/xen/drivers/vpci/header.c
> >>>>>>>> +++ b/xen/drivers/vpci/header.c
> >>>>>>>> @@ -345,6 +345,15 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
> >>>>>>>>               bar->enabled == !!(cmd & PCI_COMMAND_MEMORY) )
> >>>>>>>>              continue;
> >>>>>>>>
> >>>>>>>> +#ifdef CONFIG_ARM
> >>>>>>>> +        if ( !is_hardware_domain(pdev->domain) )
> >>>>>>>> +        {
> >>>>>>>> +            if ( (start_guest < PFN_DOWN(GUEST_VPCI_MEM_ADDR)) ||
> >>>>>>>> +                 (end_guest >= PFN_DOWN(GUEST_VPCI_MEM_ADDR + GUEST_VPCI_MEM_SIZE)) )
> >>>>>>>> +                continue;
> >>>>>>>> +        }
> >>>>>>>> +#endif
> >>>>>>>
> >>>>>>> Hm, I think this should be in a hook similar to pci_check_bar() that
> >>>>>>> can be implemented per-arch.
> >>>>>>>
> >>>>>>> IIRC at least on x86 we allow the guest to place the BARs whenever it
> >>>>>>> wants, would such placement cause issues to the hypervisor on Arm?
> >>>>>>
> >>>>>> Hm. I wrote this patch in a hurry to make v9 of this series work on ARM. In my haste I also forgot about the prefetchable range starting at GUEST_VPCI_PREFETCH_MEM_ADDR, but that won't matter as we can probably throw this patch out.
> >>>>>>
> >>>>>> Now that I've had some more time to investigate, I believe the check in this patch is more or less redundant to the existing check in map_range() added in baa6ea700386 ("vpci: add permission checks to map_range()").
> >>>>>>
> >>>>>> The issue is that during initialization bar->guest_addr is zeroed, and this initial value of bar->guest_addr will fail the permissions check in map_range() and crash the domain. When the guest writes a new valid BAR, the old invalid address remains in the rangeset to be mapped. If we simply remove the old invalid BAR from the rangeset, that seems to fix the issue. So something like this:
> >>>>>
> >>>>> It does seem to me we are missing a proper cleanup of the rangeset
> >>>>> contents in some paths then.  In the above paragraph you mention "the
> >>>>> old invalid address remains in the rangeset to be mapped", how does it
> >>>>> get in there in the first place, and why is the rangeset not emptied
> >>>>> if the mapping failed?
> >>>>
> >>>> Back in ("vpci/header: handle p2m range sets per BAR") I added a v->domain == pdev->domain check near the top of vpci_process_pending() as you appropriately suggested.
> >>>>
> >>>> +    if ( v->domain != pdev->domain )
> >>>> +    {
> >>>> +        read_unlock(&v->domain->pci_lock);
> >>>> +        return false;
> >>>> +    }
> >>>>
> >>>> I have also reverted this patch ("xen/arm: vpci: check guest range").
> >>>>
> >>>> The sequence of events leading to the old value remaining in the rangeset are:
> >>>>
> >>>> # xl pci-assignable-add 01:00.0
> >>>> drivers/vpci/vpci.c:vpci_deassign_device()
> >>>>     deassign 0000:01:00.0 from d0
> >>>> # grep pci domu.cfg
> >>>> pci = [ "01:00.0" ]
> >>>> # xl create domu.cfg
> >>>> drivers/vpci/vpci.c:vpci_deassign_device()
> >>>>     deassign 0000:01:00.0 from d[IO]
> >>>> drivers/vpci/vpci.c:vpci_assign_device()
> >>>>     assign 0000:01:00.0 to d1
> >>>>     bar->guest_addr is initialized to zero because of the line: pdev->vpci = xzalloc(struct vpci);
> >>>> drivers/vpci/header.c:init_bars()
> >>>> drivers/vpci/header.c:modify_bars()
> >>>
> >>> I think I've commented this on another patch, but why is the device
> >>> added with memory decoding enabled?  I would expect the FLR performed
> >>> before assigning would leave the device with memory decoding disabled?
> >>
> >> It seems the device is indeed being assigned to the domU with memory decoding enabled, but I'm not entirely sure why. The device I'm testing with doesn't support FLR, but it does support pm bus reset:
> >> # cat /sys/bus/pci/devices/0000\:01\:00.0/reset_method
> >> pm bus
> >>
> >> As I understand it, libxl__device_pci_reset() should still be able to issue a reset in this case.
> > 
> > Maybe pciback is somehow restoring part of the previous state?  I
> > have no insight in what state we expect the device to be handled by
> > pciback, but this needs investigation in order to know what to expect.
> 
> Yep, during "xl pci-assignable-add ..." pciback resets the device and restores the state, including whether memory decoding is enabled.
> 
> drivers/xen/xen-pciback/pci_stub.c:pcistub_init_device():
> 
> 	/* We need the device active to save the state. */
> 	dev_dbg(&dev->dev, "save state of device\n");
> 	pci_save_state(dev);
> 	dev_data->pci_saved_state = pci_store_saved_state(dev);
> 	if (!dev_data->pci_saved_state)
> 		dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
> 	else {
> 		dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
> 		__pci_reset_function_locked(dev);
> 		pci_restore_state(dev);
> 	}
> 	/* Now disable the device (this also ensures some private device
> 	 * data is setup before we export)
> 	 */
> 	dev_dbg(&dev->dev, "reset device\n");
> 	xen_pcibk_reset_device(dev);
> 
> That last function, xen_pcibk_reset_device(), clears the bus master enable bit in the command register for devices with PCI_HEADER_TYPE_NORMAL (not a reset contrary to the function name).
> 
> xl create should reset the device again, but, similarly, this also seems to restore the state.
> 
> > Can you paste the full contents of the command register for this
> > device?
> Start of day (PCIe controller and bridge initialized, no device BARs or anything have been programmed yet): 0x0000
> After dom0 boot, device is in use: 0x0006
> After pci-assignable-add: 0x0002
> After echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/reset: 0x0002
> After xl create, domU booted: 0x0006
> 
> Should mapping bars should be conditional on PCI_COMMAND_MASTER, not PCI_COMMAND_MEMORY? E.g.:

NO, I don't think so, as then Xen state would get out of sync with the
hardware state.  I think just disabling memory and IO decoding at
init_bars() for devices assigned to domUs should be fine for the time
being.

Thanks, Roger.
diff mbox series

Patch

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 1d243eeaf9..dbabdcbed2 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -345,6 +345,15 @@  static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
              bar->enabled == !!(cmd & PCI_COMMAND_MEMORY) )
             continue;
 
+#ifdef CONFIG_ARM
+        if ( !is_hardware_domain(pdev->domain) )
+        {
+            if ( (start_guest < PFN_DOWN(GUEST_VPCI_MEM_ADDR)) ||
+                 (end_guest >= PFN_DOWN(GUEST_VPCI_MEM_ADDR + GUEST_VPCI_MEM_SIZE)) )
+                continue;
+        }
+#endif
+
         if ( !pci_check_bar(pdev, _mfn(start), _mfn(end)) )
         {
             printk(XENLOG_G_WARNING