diff mbox series

[XEN,v4] hvmloader: Enable MMIO and I/O decode, after all resource allocation

Message ID 4b6c017245698c18b063d156be645b8b633c0a99.1586884502.git.havanur@amazon.com (mailing list archive)
State Superseded
Headers show
Series [XEN,v4] hvmloader: Enable MMIO and I/O decode, after all resource allocation | expand

Commit Message

Shamsundara Havanur, Harsha April 14, 2020, 5:15 p.m. UTC
It was observed that PCI MMIO and/or IO BARs were programmed with
memory and I/O decodes (bits 0 and 1 of PCI COMMAND register) enabled,
during PCI setup phase. This resulted in incorrect memory mapping as
soon as the lower half of the 64 bit bar is programmed.
This displaced any RAM mappings under 4G. After the
upper half is programmed PCI memory mapping is restored to its
intended high mem location, but the RAM displaced is not restored.
The OS then continues to boot and function until it tries to access
the displaced RAM at which point it suffers a page fault and crashes.

This patch address the issue by deferring enablement of memory and
I/O decode in command register until all the resources, like interrupts
I/O and/or MMIO BARs for all the PCI device functions are programmed,
in the descending order of memory requested.

Signed-off-by: Harsha Shamsundara Havanur <havanur@amazon.com>

---
Changes in v4:
  Addressed review comments from Jan Beulich <jbeulich@suse.com>
  - Disable decodes before writing ~0 to BARs
  - Enable BUS MASTER at the end along with decode bits

Changes in v3:
  - Retained enabling of BUS master for all device functions as
    was in original commit

Changes in v2:
  - BUS MASTER state was captured and restored to the same state
    while enabling decode bits.
---
---
 tools/firmware/hvmloader/pci.c | 45 ++++++++++++++++++++++++++++++------------
 1 file changed, 32 insertions(+), 13 deletions(-)

Comments

Jan Beulich April 15, 2020, 7:13 a.m. UTC | #1
On 14.04.2020 19:15, Harsha Shamsundara Havanur wrote:
> @@ -120,6 +121,11 @@ void pci_setup(void)
>       */
>      bool allow_memory_relocate = 1;
>  
> +    BUILD_BUG_ON((typeof(*pci_devfn_decode_type))PCI_COMMAND_MEMORY
> +            != PCI_COMMAND_MEMORY);
> +    BUILD_BUG_ON((typeof(*pci_devfn_decode_type))PCI_COMMAND_IO
> +            != PCI_COMMAND_IO);

This still isn't in line with our default style, you will want eg:

    BUILD_BUG_ON((typeof(*pci_devfn_decode_type))PCI_COMMAND_MEMORY !=
                 PCI_COMMAND_MEMORY);
    BUILD_BUG_ON((typeof(*pci_devfn_decode_type))PCI_COMMAND_IO !=
                 PCI_COMMAND_IO);

> @@ -208,6 +214,20 @@ void pci_setup(void)
>              break;
>          }
>  
> +        /*
> +         * Disable memory and I/O decode,
> +         * It is recommended that BAR programming be done whilst
> +         * decode bits are cleared to avoid incorrect mappings being created,
> +         * when 64-bit memory BAR is programmed first by writing the lower half
> +         * and then the upper half, which first maps to an address under 4G
> +         * replacing any RAM mapped in that address, which is not restored
> +         * back after the upper half is written and PCI memory is correctly
> +         * mapped to its intended high mem address.
> +         */

Please can you bring this comment into shape? The comma on the first
line doesn't fit with the capital letter the second line starts with.
Plus, if you really mean what is now on the second line to start on a
new one, then please also insert a line consisting of just * at the
properly indented position between the two. Additionally there's at
least one line exceeding the 80-chars-per-line limit.

> @@ -289,10 +309,6 @@ void pci_setup(void)
>                     devfn>>3, devfn&7, 'A'+pin-1, isa_irq);
>          }
>  
> -        /* Enable bus mastering. */
> -        cmd = pci_readw(devfn, PCI_COMMAND);
> -        cmd |= PCI_COMMAND_MASTER;
> -        pci_writew(devfn, PCI_COMMAND, cmd);

The movement of this wants mentioning in the description.

> @@ -526,10 +538,17 @@ void pci_setup(void)
>           * has IO enabled, even if there is no I/O BAR on that
>           * particular device.
>           */
> -        cmd = pci_readw(vga_devfn, PCI_COMMAND);
> -        cmd |= PCI_COMMAND_IO;
> -        pci_writew(vga_devfn, PCI_COMMAND, cmd);
> +        pci_devfn_decode_type[vga_devfn] |= PCI_COMMAND_IO;
>      }
> +
> +    /* Enable bus master, memory and I/O decode. */
> +    for ( devfn = 0; devfn < 256; devfn++ )
> +        if ( pci_devfn_decode_type[devfn] )
> +        {
> +            cmd = pci_readw(devfn, PCI_COMMAND);
> +            cmd |= (PCI_COMMAND_MASTER | pci_devfn_decode_type[devfn]);
> +            pci_writew(devfn, PCI_COMMAND, cmd);
> +        }

This still regresses the setting of MASTER afaict: You only set
that bit now if either IO or MEMORY would also get set. But be
sure to honor the original code not doing the write when vendor/
device IDs are all ones.

Jan
Shamsundara Havanur, Harsha April 15, 2020, 9:32 a.m. UTC | #2
On Wed, 2020-04-15 at 09:13 +0200, Jan Beulich wrote:
> CAUTION: This email originated from outside of the organization. Do
> not click links or open attachments unless you can confirm the sender
> and know the content is safe.
> 
> 
> 
> On 14.04.2020 19:15, Harsha Shamsundara Havanur wrote:
> > @@ -120,6 +121,11 @@ void pci_setup(void)
> >       */
> >      bool allow_memory_relocate = 1;
> > 
> > +    BUILD_BUG_ON((typeof(*pci_devfn_decode_type))PCI_COMMAND_MEMOR
> > Y
> > +            != PCI_COMMAND_MEMORY);
> > +    BUILD_BUG_ON((typeof(*pci_devfn_decode_type))PCI_COMMAND_IO
> > +            != PCI_COMMAND_IO);
> 
> This still isn't in line with our default style, you will want eg:
> 
>     BUILD_BUG_ON((typeof(*pci_devfn_decode_type))PCI_COMMAND_MEMORY
> !=
>                  PCI_COMMAND_MEMORY);
>     BUILD_BUG_ON((typeof(*pci_devfn_decode_type))PCI_COMMAND_IO !=
>                  PCI_COMMAND_IO);
> 
> > @@ -208,6 +214,20 @@ void pci_setup(void)
> >              break;
> >          }
> > 
> > +        /*
> > +         * Disable memory and I/O decode,
> > +         * It is recommended that BAR programming be done whilst
> > +         * decode bits are cleared to avoid incorrect mappings
> > being created,
> > +         * when 64-bit memory BAR is programmed first by writing
> > the lower half
> > +         * and then the upper half, which first maps to an address
> > under 4G
> > +         * replacing any RAM mapped in that address, which is not
> > restored
> > +         * back after the upper half is written and PCI memory is
> > correctly
> > +         * mapped to its intended high mem address.
> > +         */
> 
> Please can you bring this comment into shape? The comma on the first
> line doesn't fit with the capital letter the second line starts with.
> Plus, if you really mean what is now on the second line to start on a
> new one, then please also insert a line consisting of just * at the
> properly indented position between the two. Additionally there's at
> least one line exceeding the 80-chars-per-line limit.
> 
> > @@ -289,10 +309,6 @@ void pci_setup(void)
> >                     devfn>>3, devfn&7, 'A'+pin-1, isa_irq);
> >          }
> > 
> > -        /* Enable bus mastering. */
> > -        cmd = pci_readw(devfn, PCI_COMMAND);
> > -        cmd |= PCI_COMMAND_MASTER;
> > -        pci_writew(devfn, PCI_COMMAND, cmd);
> 
> The movement of this wants mentioning in the description.
> 
> > @@ -526,10 +538,17 @@ void pci_setup(void)
> >           * has IO enabled, even if there is no I/O BAR on that
> >           * particular device.
> >           */
> > -        cmd = pci_readw(vga_devfn, PCI_COMMAND);
> > -        cmd |= PCI_COMMAND_IO;
> > -        pci_writew(vga_devfn, PCI_COMMAND, cmd);
> > +        pci_devfn_decode_type[vga_devfn] |= PCI_COMMAND_IO;
> >      }
> > +
> > +    /* Enable bus master, memory and I/O decode. */
> > +    for ( devfn = 0; devfn < 256; devfn++ )
> > +        if ( pci_devfn_decode_type[devfn] )
> > +        {
> > +            cmd = pci_readw(devfn, PCI_COMMAND);
> > +            cmd |= (PCI_COMMAND_MASTER |
> > pci_devfn_decode_type[devfn]);
> > +            pci_writew(devfn, PCI_COMMAND, cmd);
> > +        }
> 
> This still regresses the setting of MASTER afaict: You only set
> that bit now if either IO or MEMORY would also get set. But be
> sure to honor the original code not doing the write when vendor/
> device IDs are all ones.
> 
If condition ensures that for devices with vendor/device IDs all ones
are skipped as it would evaluate to false. But this would also skip
enabling Bus master for devices whose vendor/device IDs are not all
ones but IO or memory BARs are not programmed for them. Is there a
possibility of this happening?
> Jan
Jan Beulich April 15, 2020, 10:38 a.m. UTC | #3
On 15.04.2020 11:32, Shamsundara Havanur, Harsha wrote:
> On Wed, 2020-04-15 at 09:13 +0200, Jan Beulich wrote:
>> On 14.04.2020 19:15, Harsha Shamsundara Havanur wrote:
>>> @@ -526,10 +538,17 @@ void pci_setup(void)
>>>           * has IO enabled, even if there is no I/O BAR on that
>>>           * particular device.
>>>           */
>>> -        cmd = pci_readw(vga_devfn, PCI_COMMAND);
>>> -        cmd |= PCI_COMMAND_IO;
>>> -        pci_writew(vga_devfn, PCI_COMMAND, cmd);
>>> +        pci_devfn_decode_type[vga_devfn] |= PCI_COMMAND_IO;
>>>      }
>>> +
>>> +    /* Enable bus master, memory and I/O decode. */
>>> +    for ( devfn = 0; devfn < 256; devfn++ )
>>> +        if ( pci_devfn_decode_type[devfn] )
>>> +        {
>>> +            cmd = pci_readw(devfn, PCI_COMMAND);
>>> +            cmd |= (PCI_COMMAND_MASTER |
>>> pci_devfn_decode_type[devfn]);
>>> +            pci_writew(devfn, PCI_COMMAND, cmd);
>>> +        }
>>
>> This still regresses the setting of MASTER afaict: You only set
>> that bit now if either IO or MEMORY would also get set. But be
>> sure to honor the original code not doing the write when vendor/
>> device IDs are all ones.
>>
> If condition ensures that for devices with vendor/device IDs all ones
> are skipped as it would evaluate to false. But this would also skip
> enabling Bus master for devices whose vendor/device IDs are not all
> ones but IO or memory BARs are not programmed for them. Is there a
> possibility of this happening?

I think so, yes - programming of DMA requests can in principle also
be done via custom config space fields.

Jan
diff mbox series

Patch

diff --git a/tools/firmware/hvmloader/pci.c b/tools/firmware/hvmloader/pci.c
index 0b708bf578..69c30f0548 100644
--- a/tools/firmware/hvmloader/pci.c
+++ b/tools/firmware/hvmloader/pci.c
@@ -84,6 +84,7 @@  void pci_setup(void)
     uint32_t vga_devfn = 256;
     uint16_t class, vendor_id, device_id;
     unsigned int bar, pin, link, isa_irq;
+    uint8_t pci_devfn_decode_type[256] = {};
 
     /* Resources assignable to PCI devices via BARs. */
     struct resource {
@@ -120,6 +121,11 @@  void pci_setup(void)
      */
     bool allow_memory_relocate = 1;
 
+    BUILD_BUG_ON((typeof(*pci_devfn_decode_type))PCI_COMMAND_MEMORY
+            != PCI_COMMAND_MEMORY);
+    BUILD_BUG_ON((typeof(*pci_devfn_decode_type))PCI_COMMAND_IO
+            != PCI_COMMAND_IO);
+
     s = xenstore_read(HVM_XS_ALLOW_MEMORY_RELOCATE, NULL);
     if ( s )
         allow_memory_relocate = strtoll(s, NULL, 0);
@@ -208,6 +214,20 @@  void pci_setup(void)
             break;
         }
 
+        /*
+         * Disable memory and I/O decode,
+         * It is recommended that BAR programming be done whilst
+         * decode bits are cleared to avoid incorrect mappings being created,
+         * when 64-bit memory BAR is programmed first by writing the lower half
+         * and then the upper half, which first maps to an address under 4G
+         * replacing any RAM mapped in that address, which is not restored
+         * back after the upper half is written and PCI memory is correctly
+         * mapped to its intended high mem address.
+         */
+        cmd = pci_readw(devfn, PCI_COMMAND);
+        cmd &= ~(PCI_COMMAND_MEMORY | PCI_COMMAND_IO);
+        pci_writew(devfn, PCI_COMMAND, cmd);
+
         /* Map the I/O memory and port resources. */
         for ( bar = 0; bar < 7; bar++ )
         {
@@ -289,10 +309,6 @@  void pci_setup(void)
                    devfn>>3, devfn&7, 'A'+pin-1, isa_irq);
         }
 
-        /* Enable bus mastering. */
-        cmd = pci_readw(devfn, PCI_COMMAND);
-        cmd |= PCI_COMMAND_MASTER;
-        pci_writew(devfn, PCI_COMMAND, cmd);
     }
 
     if ( mmio_hole_size )
@@ -497,16 +513,12 @@  void pci_setup(void)
                PRIllx_arg(bar_sz),
                bar_data_upper, bar_data);
 			
-
-        /* Now enable the memory or I/O mapping. */
-        cmd = pci_readw(devfn, PCI_COMMAND);
         if ( (bar_reg == PCI_ROM_ADDRESS) ||
              ((bar_data & PCI_BASE_ADDRESS_SPACE) ==
               PCI_BASE_ADDRESS_SPACE_MEMORY) )
-            cmd |= PCI_COMMAND_MEMORY;
+            pci_devfn_decode_type[devfn] |= PCI_COMMAND_MEMORY;
         else
-            cmd |= PCI_COMMAND_IO;
-        pci_writew(devfn, PCI_COMMAND, cmd);
+            pci_devfn_decode_type[devfn] |= PCI_COMMAND_IO;
     }
 
     if ( pci_hi_mem_start )
@@ -526,10 +538,17 @@  void pci_setup(void)
          * has IO enabled, even if there is no I/O BAR on that
          * particular device.
          */
-        cmd = pci_readw(vga_devfn, PCI_COMMAND);
-        cmd |= PCI_COMMAND_IO;
-        pci_writew(vga_devfn, PCI_COMMAND, cmd);
+        pci_devfn_decode_type[vga_devfn] |= PCI_COMMAND_IO;
     }
+
+    /* Enable bus master, memory and I/O decode. */
+    for ( devfn = 0; devfn < 256; devfn++ )
+        if ( pci_devfn_decode_type[devfn] )
+        {
+            cmd = pci_readw(devfn, PCI_COMMAND);
+            cmd |= (PCI_COMMAND_MASTER | pci_devfn_decode_type[devfn]);
+            pci_writew(devfn, PCI_COMMAND, cmd);
+        }
 }
 
 /*