diff mbox series

[XEN,v3] hvmloader: Enable MMIO and I/O decode, after all resource allocation

Message ID 758e3427f147ed82774edcbbe80b0b29c812e6e4.1586862721.git.havanur@amazon.com (mailing list archive)
State Superseded
Headers show
Series [XEN,v3] hvmloader: Enable MMIO and I/O decode, after all resource allocation | expand

Commit Message

Shamsundara Havanur, Harsha April 14, 2020, 11:12 a.m. UTC
It was observed that PCI MMIO and/or IO BARs were programmed with
memory and I/O decodes (bits 0 and 1 of PCI COMMAND register) enabled,
during PCI setup phase. This resulted in incorrect memory mapping as
soon as the lower half of the 64 bit bar is programmed.
This displaced any RAM mappings under 4G. After the
upper half is programmed PCI memory mapping is restored to its
intended high mem location, but the RAM displaced is not restored.
The OS then continues to boot and function until it tries to access
the displaced RAM at which point it suffers a page fault and crashes.

This patch address the issue by deferring enablement of memory and
I/O decode in command register until all the resources, like interrupts
I/O and/or MMIO BARs for all the PCI device functions are programmed,
in the descending order of memory requested.

Signed-off-by: Harsha Shamsundara Havanur <havanur@amazon.com>
Reviewed-by: Paul Durrant <pdurrant@amazon.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 tools/firmware/hvmloader/pci.c | 37 +++++++++++++++++++++++++++----------
 1 file changed, 27 insertions(+), 10 deletions(-)

Comments

Jan Beulich April 14, 2020, 2:12 p.m. UTC | #1
On 14.04.2020 13:12, Harsha Shamsundara Havanur wrote:
> It was observed that PCI MMIO and/or IO BARs were programmed with
> memory and I/O decodes (bits 0 and 1 of PCI COMMAND register) enabled,
> during PCI setup phase. This resulted in incorrect memory mapping as
> soon as the lower half of the 64 bit bar is programmed.
> This displaced any RAM mappings under 4G. After the
> upper half is programmed PCI memory mapping is restored to its
> intended high mem location, but the RAM displaced is not restored.
> The OS then continues to boot and function until it tries to access
> the displaced RAM at which point it suffers a page fault and crashes.
> 
> This patch address the issue by deferring enablement of memory and
> I/O decode in command register until all the resources, like interrupts
> I/O and/or MMIO BARs for all the PCI device functions are programmed,
> in the descending order of memory requested.
> 
> Signed-off-by: Harsha Shamsundara Havanur <havanur@amazon.com>
> Reviewed-by: Paul Durrant <pdurrant@amazon.com>
> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

You've fixed bugs, implying you need to drop tags covering the code
which was fixed. As said on v2, imo you should have dropped the tags
then already.

> --- a/tools/firmware/hvmloader/pci.c
> +++ b/tools/firmware/hvmloader/pci.c
> @@ -84,6 +84,7 @@ void pci_setup(void)
>      uint32_t vga_devfn = 256;
>      uint16_t class, vendor_id, device_id;
>      unsigned int bar, pin, link, isa_irq;
> +    uint8_t pci_devfn_decode_type[256] = {};

I'm still waiting for a reply on my v1 comment on the stack
consumption of this, suggesting two 256-bit bitmaps instead.

> @@ -120,6 +121,9 @@ void pci_setup(void)
>       */
>      bool allow_memory_relocate = 1;
>  
> +    BUILD_BUG_ON((typeof(*pci_devfn_decode_type))PCI_COMMAND_MEMORY != PCI_COMMAND_MEMORY);
> +    BUILD_BUG_ON((typeof(*pci_devfn_decode_type))PCI_COMMAND_IO != PCI_COMMAND_IO);

These lines are too long.

> @@ -288,10 +292,21 @@ void pci_setup(void)
>              printf("pci dev %02x:%x INT%c->IRQ%u\n",
>                     devfn>>3, devfn&7, 'A'+pin-1, isa_irq);
>          }
> -
>          /* Enable bus mastering. */

Please don't drop a blank line like this.

>          cmd = pci_readw(devfn, PCI_COMMAND);
>          cmd |= PCI_COMMAND_MASTER;
> +
> +        /*
> +         * Disable memory and I/O decode,
> +         * It is recommended that BAR programming be done whilst
> +         * decode bits are cleared to avoid incorrect mappings being created,
> +         * when 64-bit memory BAR is programmed first by writing the lower half
> +         * and then the upper half, which first maps to an address under 4G
> +         * replacing any RAM mapped in that address, which is not restored
> +         * back after the upper half is written and PCI memory is correctly
> +         * mapped to its intended high mem address.
> +         */
> +        cmd &= ~(PCI_COMMAND_MEMORY | PCI_COMMAND_IO);
>          pci_writew(devfn, PCI_COMMAND, cmd);
>      }

I'd like to note that the disabling of IO and MEMORY you do here comes too
late: It should happen before any of the BARs gets written with ~0. In
particular for 64-bit BARs these writes can again cause undue effects on
the intermediately resulting effective addresses.

> @@ -526,10 +537,16 @@ void pci_setup(void)
>           * has IO enabled, even if there is no I/O BAR on that
>           * particular device.
>           */
> -        cmd = pci_readw(vga_devfn, PCI_COMMAND);
> -        cmd |= PCI_COMMAND_IO;
> -        pci_writew(vga_devfn, PCI_COMMAND, cmd);
> +        pci_devfn_decode_type[vga_devfn] |= PCI_COMMAND_IO;
>      }
> +
> +    /* Enable memory and I/O decode. */
> +    for ( devfn = 0; devfn < 256; devfn++ )
> +        if ( pci_devfn_decode_type[devfn] ) {

Style: The brace belongs on its own line.

To save one set of writes to the command registers I would, btw,
be okay with you moving the MASTER enabling here.

Jan
Shamsundara Havanur, Harsha April 14, 2020, 3:11 p.m. UTC | #2
On Tue, 2020-04-14 at 16:12 +0200, Jan Beulich wrote:
> CAUTION: This email originated from outside of the organization. Do
> not click links or open attachments unless you can confirm the sender
> and know the content is safe.
> 
> 
> 
> On 14.04.2020 13:12, Harsha Shamsundara Havanur wrote:
> > It was observed that PCI MMIO and/or IO BARs were programmed with
> > memory and I/O decodes (bits 0 and 1 of PCI COMMAND register)
> > enabled,
> > during PCI setup phase. This resulted in incorrect memory mapping
> > as
> > soon as the lower half of the 64 bit bar is programmed.
> > This displaced any RAM mappings under 4G. After the
> > upper half is programmed PCI memory mapping is restored to its
> > intended high mem location, but the RAM displaced is not restored.
> > The OS then continues to boot and function until it tries to access
> > the displaced RAM at which point it suffers a page fault and
> > crashes.
> > 
> > This patch address the issue by deferring enablement of memory and
> > I/O decode in command register until all the resources, like
> > interrupts
> > I/O and/or MMIO BARs for all the PCI device functions are
> > programmed,
> > in the descending order of memory requested.
> > 
> > Signed-off-by: Harsha Shamsundara Havanur <havanur@amazon.com>
> > Reviewed-by: Paul Durrant <pdurrant@amazon.com>
> > Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
> 
> You've fixed bugs, implying you need to drop tags covering the code
> which was fixed. As said on v2, imo you should have dropped the tags
> then already.
> 
> > --- a/tools/firmware/hvmloader/pci.c
> > +++ b/tools/firmware/hvmloader/pci.c
> > @@ -84,6 +84,7 @@ void pci_setup(void)
> >      uint32_t vga_devfn = 256;
> >      uint16_t class, vendor_id, device_id;
> >      unsigned int bar, pin, link, isa_irq;
> > +    uint8_t pci_devfn_decode_type[256] = {};
> 
> I'm still waiting for a reply on my v1 comment on the stack
> consumption of this, suggesting two 256-bit bitmaps instead.
> 

I chose uint8 array over bitmaps as this reduces complexity of code
to get and set individual bits. Sorry for missing this out in the v1
conversation.

> > @@ -120,6 +121,9 @@ void pci_setup(void)
> >       */
> >      bool allow_memory_relocate = 1;
> > 
> > +    BUILD_BUG_ON((typeof(*pci_devfn_decode_type))PCI_COMMAND_MEMOR
> > Y != PCI_COMMAND_MEMORY);
> > +    BUILD_BUG_ON((typeof(*pci_devfn_decode_type))PCI_COMMAND_IO !=
> > PCI_COMMAND_IO);
> 
> These lines are too long.
> 
> > @@ -288,10 +292,21 @@ void pci_setup(void)
> >              printf("pci dev %02x:%x INT%c->IRQ%u\n",
> >                     devfn>>3, devfn&7, 'A'+pin-1, isa_irq);
> >          }
> > -
> >          /* Enable bus mastering. */
> 
> Please don't drop a blank line like this.
> 
> >          cmd = pci_readw(devfn, PCI_COMMAND);
> >          cmd |= PCI_COMMAND_MASTER;
> > +
> > +        /*
> > +         * Disable memory and I/O decode,
> > +         * It is recommended that BAR programming be done whilst
> > +         * decode bits are cleared to avoid incorrect mappings
> > being created,
> > +         * when 64-bit memory BAR is programmed first by writing
> > the lower half
> > +         * and then the upper half, which first maps to an address
> > under 4G
> > +         * replacing any RAM mapped in that address, which is not
> > restored
> > +         * back after the upper half is written and PCI memory is
> > correctly
> > +         * mapped to its intended high mem address.
> > +         */
> > +        cmd &= ~(PCI_COMMAND_MEMORY | PCI_COMMAND_IO);
> >          pci_writew(devfn, PCI_COMMAND, cmd);
> >      }
> 
> I'd like to note that the disabling of IO and MEMORY you do here
> comes too
> late: It should happen before any of the BARs gets written with ~0.
> In
> particular for 64-bit BARs these writes can again cause undue effects
> on
> the intermediately resulting effective addresses.
> 
I agree, this needs to be done before writing BARS with ~0

> > @@ -526,10 +537,16 @@ void pci_setup(void)
> >           * has IO enabled, even if there is no I/O BAR on that
> >           * particular device.
> >           */
> > -        cmd = pci_readw(vga_devfn, PCI_COMMAND);
> > -        cmd |= PCI_COMMAND_IO;
> > -        pci_writew(vga_devfn, PCI_COMMAND, cmd);
> > +        pci_devfn_decode_type[vga_devfn] |= PCI_COMMAND_IO;
> >      }
> > +
> > +    /* Enable memory and I/O decode. */
> > +    for ( devfn = 0; devfn < 256; devfn++ )
> > +        if ( pci_devfn_decode_type[devfn] ) {
> 
> Style: The brace belongs on its own line.
> 
> To save one set of writes to the command registers I would, btw,
> be okay with you moving the MASTER enabling here.
> 
Good point. I will make these changes in v4.

> Jan
Andrew Cooper April 14, 2020, 4:26 p.m. UTC | #3
On 14/04/2020 15:12, Jan Beulich wrote:
> On 14.04.2020 13:12, Harsha Shamsundara Havanur wrote:
>> It was observed that PCI MMIO and/or IO BARs were programmed with
>> memory and I/O decodes (bits 0 and 1 of PCI COMMAND register) enabled,
>> during PCI setup phase. This resulted in incorrect memory mapping as
>> soon as the lower half of the 64 bit bar is programmed.
>> This displaced any RAM mappings under 4G. After the
>> upper half is programmed PCI memory mapping is restored to its
>> intended high mem location, but the RAM displaced is not restored.
>> The OS then continues to boot and function until it tries to access
>> the displaced RAM at which point it suffers a page fault and crashes.
>>
>> This patch address the issue by deferring enablement of memory and
>> I/O decode in command register until all the resources, like interrupts
>> I/O and/or MMIO BARs for all the PCI device functions are programmed,
>> in the descending order of memory requested.
>>
>> Signed-off-by: Harsha Shamsundara Havanur <havanur@amazon.com>
>> Reviewed-by: Paul Durrant <pdurrant@amazon.com>
>> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

I have not acked this patch.  My comment on v1 was correctly your
mis-spelling of "Ack-by" for Paul's tag.  I apologise if that wasn't
terribly clear.

> You've fixed bugs, implying you need to drop tags covering the code
> which was fixed. As said on v2, imo you should have dropped the tags
> then already.
>
>> --- a/tools/firmware/hvmloader/pci.c
>> +++ b/tools/firmware/hvmloader/pci.c
>> @@ -84,6 +84,7 @@ void pci_setup(void)
>>      uint32_t vga_devfn = 256;
>>      uint16_t class, vendor_id, device_id;
>>      unsigned int bar, pin, link, isa_irq;
>> +    uint8_t pci_devfn_decode_type[256] = {};
> I'm still waiting for a reply on my v1 comment on the stack
> consumption of this, suggesting two 256-bit bitmaps instead.

256 bytes of stack space isn't something to worry about.  Definitely not
for the complexity of using bitmaps instead.

~Andrew
Jan Beulich April 15, 2020, 6:57 a.m. UTC | #4
On 14.04.2020 18:26, Andrew Cooper wrote:
> On 14/04/2020 15:12, Jan Beulich wrote:
>> On 14.04.2020 13:12, Harsha Shamsundara Havanur wrote:
>>> --- a/tools/firmware/hvmloader/pci.c
>>> +++ b/tools/firmware/hvmloader/pci.c
>>> @@ -84,6 +84,7 @@ void pci_setup(void)
>>>      uint32_t vga_devfn = 256;
>>>      uint16_t class, vendor_id, device_id;
>>>      unsigned int bar, pin, link, isa_irq;
>>> +    uint8_t pci_devfn_decode_type[256] = {};
>> I'm still waiting for a reply on my v1 comment on the stack
>> consumption of this, suggesting two 256-bit bitmaps instead.
> 
> 256 bytes of stack space isn't something to worry about.  Definitely not
> for the complexity of using bitmaps instead.

Complexity? hvmloader already has an odd partial set of bitmap
manipulation functions. Completing this doesn't seem overly
difficult. And while I agree that 256 bytes of stack space
_alone_ aren't much reason to worry, it is - as almost
always - the accumulation of local variables (over an entire
call tree, which isn't overly deep here) which counts. (Note
how the use of bitmaps would have avoided the truncation bug
that had been introduced into v2(?).)

Jan
diff mbox series

Patch

diff --git a/tools/firmware/hvmloader/pci.c b/tools/firmware/hvmloader/pci.c
index 0b708bf578..b8a0df3286 100644
--- a/tools/firmware/hvmloader/pci.c
+++ b/tools/firmware/hvmloader/pci.c
@@ -84,6 +84,7 @@  void pci_setup(void)
     uint32_t vga_devfn = 256;
     uint16_t class, vendor_id, device_id;
     unsigned int bar, pin, link, isa_irq;
+    uint8_t pci_devfn_decode_type[256] = {};
 
     /* Resources assignable to PCI devices via BARs. */
     struct resource {
@@ -120,6 +121,9 @@  void pci_setup(void)
      */
     bool allow_memory_relocate = 1;
 
+    BUILD_BUG_ON((typeof(*pci_devfn_decode_type))PCI_COMMAND_MEMORY != PCI_COMMAND_MEMORY);
+    BUILD_BUG_ON((typeof(*pci_devfn_decode_type))PCI_COMMAND_IO != PCI_COMMAND_IO);
+
     s = xenstore_read(HVM_XS_ALLOW_MEMORY_RELOCATE, NULL);
     if ( s )
         allow_memory_relocate = strtoll(s, NULL, 0);
@@ -288,10 +292,21 @@  void pci_setup(void)
             printf("pci dev %02x:%x INT%c->IRQ%u\n",
                    devfn>>3, devfn&7, 'A'+pin-1, isa_irq);
         }
-
         /* Enable bus mastering. */
         cmd = pci_readw(devfn, PCI_COMMAND);
         cmd |= PCI_COMMAND_MASTER;
+
+        /*
+         * Disable memory and I/O decode,
+         * It is recommended that BAR programming be done whilst
+         * decode bits are cleared to avoid incorrect mappings being created,
+         * when 64-bit memory BAR is programmed first by writing the lower half
+         * and then the upper half, which first maps to an address under 4G
+         * replacing any RAM mapped in that address, which is not restored
+         * back after the upper half is written and PCI memory is correctly
+         * mapped to its intended high mem address.
+         */
+        cmd &= ~(PCI_COMMAND_MEMORY | PCI_COMMAND_IO);
         pci_writew(devfn, PCI_COMMAND, cmd);
     }
 
@@ -497,16 +512,12 @@  void pci_setup(void)
                PRIllx_arg(bar_sz),
                bar_data_upper, bar_data);
 			
-
-        /* Now enable the memory or I/O mapping. */
-        cmd = pci_readw(devfn, PCI_COMMAND);
         if ( (bar_reg == PCI_ROM_ADDRESS) ||
              ((bar_data & PCI_BASE_ADDRESS_SPACE) ==
               PCI_BASE_ADDRESS_SPACE_MEMORY) )
-            cmd |= PCI_COMMAND_MEMORY;
+            pci_devfn_decode_type[devfn] |= PCI_COMMAND_MEMORY;
         else
-            cmd |= PCI_COMMAND_IO;
-        pci_writew(devfn, PCI_COMMAND, cmd);
+            pci_devfn_decode_type[devfn] |= PCI_COMMAND_IO;
     }
 
     if ( pci_hi_mem_start )
@@ -526,10 +537,16 @@  void pci_setup(void)
          * has IO enabled, even if there is no I/O BAR on that
          * particular device.
          */
-        cmd = pci_readw(vga_devfn, PCI_COMMAND);
-        cmd |= PCI_COMMAND_IO;
-        pci_writew(vga_devfn, PCI_COMMAND, cmd);
+        pci_devfn_decode_type[vga_devfn] |= PCI_COMMAND_IO;
     }
+
+    /* Enable memory and I/O decode. */
+    for ( devfn = 0; devfn < 256; devfn++ )
+        if ( pci_devfn_decode_type[devfn] ) {
+            cmd = pci_readw(devfn, PCI_COMMAND);
+            cmd |= pci_devfn_decode_type[devfn];
+            pci_writew(devfn, PCI_COMMAND, cmd);
+        }
 }
 
 /*