PCI/MSI: skip masking MSI on Xen PV

Message ID 20210826134337.134767-1-marmarek@invisiblethingslab.com (mailing list archive)
State Accepted
Commit 1a519dc7a73c977547d8b5108d98c6e769c89f4b
Series PCI/MSI: skip masking MSI on Xen PV

Commit Message

Marek Marczykowski-Górecki Aug. 26, 2021, 1:43 p.m. UTC
When running as a Xen PV guest, masking MSI is a responsibility of the
hypervisor. The guest has no write access to the relevant BAR at all; when
it tries to write to it, the result is a crash like this:

    BUG: unable to handle page fault for address: ffffc9004069100c
    #PF: supervisor write access in kernel mode
    #PF: error_code(0x0003) - permissions violation
    PGD 18f1c067 P4D 18f1c067 PUD 4dbd067 PMD 4fba067 PTE 80100000febd4075
    Oops: 0003 [#1] SMP NOPTI
    CPU: 0 PID: 234 Comm: kworker/0:2 Tainted: G        W         5.14.0-rc7-1.fc32.qubes.x86_64 #15
    Workqueue: events work_for_cpu_fn
    RIP: e030:__pci_enable_msix_range.part.0+0x26b/0x5f0
    Code: 2f 96 ff 48 89 44 24 28 48 89 c7 48 85 c0 0f 84 f6 01 00 00 45 0f b7 f6 48 8d 40 0c ba 01 00 00 00 49 c1 e6 04 4a 8d 4c 37 1c <89> 10 48 83 c0 10 48 39 c1 75 f5 41 0f b6 44 24 6a 84 c0 0f 84 48
    RSP: e02b:ffffc9004018bd50 EFLAGS: 00010212
    RAX: ffffc9004069100c RBX: ffff88800ed412f8 RCX: ffffc9004069105c
    RDX: 0000000000000001 RSI: 00000000000febd4 RDI: ffffc90040691000
    RBP: 0000000000000003 R08: 0000000000000000 R09: 00000000febd404f
    R10: 0000000000007ff0 R11: ffff88800ee8ae40 R12: ffff88800ed41000
    R13: 0000000000000000 R14: 0000000000000040 R15: 00000000feba0000
    FS:  0000000000000000(0000) GS:ffff888018400000(0000) knlGS:0000000000000000
    CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffff8000007f5ea0 CR3: 0000000012f6a000 CR4: 0000000000000660
    Call Trace:
     e1000e_set_interrupt_capability+0xbf/0xd0 [e1000e]
     e1000_probe+0x41f/0xdb0 [e1000e]
     local_pci_probe+0x42/0x80
    (...)

There is a pci_msi_ignore_mask variable for bypassing MSI masking on Xen
PV, but msix_mask_all() did not check it. Add the check there too.

Fixes: 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries")
Cc: stable@vger.kernel.org
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
---
Cc: xen-devel@lists.xenproject.org
---
 drivers/pci/msi.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Bjorn Helgaas Aug. 26, 2021, 2:55 p.m. UTC | #1
If/when you repost this, please run "git log --oneline
drivers/pci/msi.c" and follow the convention of capitalizing the
subject line.

Also, I think this patch refers specifically to MSI-X, not MSI, so
please update the subject line and the "masking MSI" below to reflect
that.

On Thu, Aug 26, 2021 at 03:43:37PM +0200, Marek Marczykowski-Górecki wrote:
> When running as Xen PV guest, masking MSI is a responsibility of the
> hypervisor. Guest has no write access to relevant BAR at all - when it
> tries to, it results in a crash like this:
> 
>     BUG: unable to handle page fault for address: ffffc9004069100c
>     #PF: supervisor write access in kernel mode
>     #PF: error_code(0x0003) - permissions violation
>     PGD 18f1c067 P4D 18f1c067 PUD 4dbd067 PMD 4fba067 PTE 80100000febd4075
>     Oops: 0003 [#1] SMP NOPTI
>     CPU: 0 PID: 234 Comm: kworker/0:2 Tainted: G        W         5.14.0-rc7-1.fc32.qubes.x86_64 #15
>     Workqueue: events work_for_cpu_fn
>     RIP: e030:__pci_enable_msix_range.part.0+0x26b/0x5f0
>     Code: 2f 96 ff 48 89 44 24 28 48 89 c7 48 85 c0 0f 84 f6 01 00 00 45 0f b7 f6 48 8d 40 0c ba 01 00 00 00 49 c1 e6 04 4a 8d 4c 37 1c <89> 10 48 83 c0 10 48 39 c1 75 f5 41 0f b6 44 24 6a 84 c0 0f 84 48
>     RSP: e02b:ffffc9004018bd50 EFLAGS: 00010212
>     RAX: ffffc9004069100c RBX: ffff88800ed412f8 RCX: ffffc9004069105c
>     RDX: 0000000000000001 RSI: 00000000000febd4 RDI: ffffc90040691000
>     RBP: 0000000000000003 R08: 0000000000000000 R09: 00000000febd404f
>     R10: 0000000000007ff0 R11: ffff88800ee8ae40 R12: ffff88800ed41000
>     R13: 0000000000000000 R14: 0000000000000040 R15: 00000000feba0000
>     FS:  0000000000000000(0000) GS:ffff888018400000(0000) knlGS:0000000000000000
>     CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>     CR2: ffff8000007f5ea0 CR3: 0000000012f6a000 CR4: 0000000000000660
>     Call Trace:
>      e1000e_set_interrupt_capability+0xbf/0xd0 [e1000e]
>      e1000_probe+0x41f/0xdb0 [e1000e]
>      local_pci_probe+0x42/0x80
>     (...)
> 
> There is pci_msi_ignore_mask variable for bypassing MSI masking on Xen
> PV, but msix_mask_all() missed checking it. Add the check there too.
> 
> Fixes: 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries")
> Cc: stable@vger.kernel.org

7d5ec3d36123 appeared in v5.14-rc6, so if this fix is merged before
v5.14, the stable tag will be unnecessary.  But we are running out of
time there.

> Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> ---
> Cc: xen-devel@lists.xenproject.org
> ---
>  drivers/pci/msi.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index e5e75331b415..3a9f4f8ad8f9 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -776,6 +776,9 @@ static void msix_mask_all(void __iomem *base, int tsize)
>  	u32 ctrl = PCI_MSIX_ENTRY_CTRL_MASKBIT;
>  	int i;
>  
> +	if (pci_msi_ignore_mask)
> +		return;
> +
>  	for (i = 0; i < tsize; i++, base += PCI_MSIX_ENTRY_SIZE)
>  		writel(ctrl, base + PCI_MSIX_ENTRY_VECTOR_CTRL);
>  }
> -- 
> 2.31.1
>
Marek Marczykowski-Górecki Aug. 26, 2021, 4:36 p.m. UTC | #2
On Thu, Aug 26, 2021 at 09:55:32AM -0500, Bjorn Helgaas wrote:
> If/when you repost this, please run "git log --oneline
> drivers/pci/msi.c" and follow the convention of capitalizing the
> subject line.
> 
> Also, I think this patch refers specifically to MSI-X, not MSI, so
> please update the subject line and the "masking MSI" below to reflect
> that.

Sure, thanks for pointing this out. Is the patch otherwise ok? Should I
post v2 with just updated commit message?

> On Thu, Aug 26, 2021 at 03:43:37PM +0200, Marek Marczykowski-Górecki wrote:
> > When running as Xen PV guest, masking MSI is a responsibility of the
> > hypervisor. Guest has no write access to relevant BAR at all - when it
> > tries to, it results in a crash like this:
> > 
> >     BUG: unable to handle page fault for address: ffffc9004069100c
> >     #PF: supervisor write access in kernel mode
> >     #PF: error_code(0x0003) - permissions violation
> >     PGD 18f1c067 P4D 18f1c067 PUD 4dbd067 PMD 4fba067 PTE 80100000febd4075
> >     Oops: 0003 [#1] SMP NOPTI
> >     CPU: 0 PID: 234 Comm: kworker/0:2 Tainted: G        W         5.14.0-rc7-1.fc32.qubes.x86_64 #15
> >     Workqueue: events work_for_cpu_fn
> >     RIP: e030:__pci_enable_msix_range.part.0+0x26b/0x5f0
> >     Code: 2f 96 ff 48 89 44 24 28 48 89 c7 48 85 c0 0f 84 f6 01 00 00 45 0f b7 f6 48 8d 40 0c ba 01 00 00 00 49 c1 e6 04 4a 8d 4c 37 1c <89> 10 48 83 c0 10 48 39 c1 75 f5 41 0f b6 44 24 6a 84 c0 0f 84 48
> >     RSP: e02b:ffffc9004018bd50 EFLAGS: 00010212
> >     RAX: ffffc9004069100c RBX: ffff88800ed412f8 RCX: ffffc9004069105c
> >     RDX: 0000000000000001 RSI: 00000000000febd4 RDI: ffffc90040691000
> >     RBP: 0000000000000003 R08: 0000000000000000 R09: 00000000febd404f
> >     R10: 0000000000007ff0 R11: ffff88800ee8ae40 R12: ffff88800ed41000
> >     R13: 0000000000000000 R14: 0000000000000040 R15: 00000000feba0000
> >     FS:  0000000000000000(0000) GS:ffff888018400000(0000) knlGS:0000000000000000
> >     CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
> >     CR2: ffff8000007f5ea0 CR3: 0000000012f6a000 CR4: 0000000000000660
> >     Call Trace:
> >      e1000e_set_interrupt_capability+0xbf/0xd0 [e1000e]
> >      e1000_probe+0x41f/0xdb0 [e1000e]
> >      local_pci_probe+0x42/0x80
> >     (...)
> > 
> > There is pci_msi_ignore_mask variable for bypassing MSI masking on Xen
> > PV, but msix_mask_all() missed checking it. Add the check there too.
> > 
> > Fixes: 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries")
> > Cc: stable@vger.kernel.org
> 
> 7d5ec3d36123 appeared in v5.14-rc6, so if this fix is merged before
> v5.14, the stable tag will be unnecessary.  But we are running out of
> time there.

7d5ec3d36123 was already backported to stable branches (at least 5.10
and 5.4), and in fact this is how I discovered the issue...

> > Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> > ---
> > Cc: xen-devel@lists.xenproject.org
> > ---
> >  drivers/pci/msi.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> > index e5e75331b415..3a9f4f8ad8f9 100644
> > --- a/drivers/pci/msi.c
> > +++ b/drivers/pci/msi.c
> > @@ -776,6 +776,9 @@ static void msix_mask_all(void __iomem *base, int tsize)
> >  	u32 ctrl = PCI_MSIX_ENTRY_CTRL_MASKBIT;
> >  	int i;
> >  
> > +	if (pci_msi_ignore_mask)
> > +		return;
> > +
> >  	for (i = 0; i < tsize; i++, base += PCI_MSIX_ENTRY_SIZE)
> >  		writel(ctrl, base + PCI_MSIX_ENTRY_VECTOR_CTRL);
> >  }
> > -- 
> > 2.31.1
> >
Bjorn Helgaas Aug. 26, 2021, 4:56 p.m. UTC | #3
On Thu, Aug 26, 2021 at 06:36:49PM +0200, Marek Marczykowski-Górecki wrote:
> On Thu, Aug 26, 2021 at 09:55:32AM -0500, Bjorn Helgaas wrote:
> > If/when you repost this, please run "git log --oneline
> > drivers/pci/msi.c" and follow the convention of capitalizing the
> > subject line.
> > 
> > Also, I think this patch refers specifically to MSI-X, not MSI, so
> > please update the subject line and the "masking MSI" below to reflect
> > that.
> 
> Sure, thanks for pointing this out. Is the patch otherwise ok? Should I
> post v2 with just updated commit message?

Wouldn't hurt to post a v2.  I don't have any objections to the patch,
but ultimately up to Thomas.

> > On Thu, Aug 26, 2021 at 03:43:37PM +0200, Marek Marczykowski-Górecki wrote:
> > > When running as Xen PV guest, masking MSI is a responsibility of the
> > > hypervisor. Guest has no write access to relevant BAR at all - when it
> > > tries to, it results in a crash like this:
> > > 
> > >     BUG: unable to handle page fault for address: ffffc9004069100c
> > >     #PF: supervisor write access in kernel mode
> > >     #PF: error_code(0x0003) - permissions violation
> > >     PGD 18f1c067 P4D 18f1c067 PUD 4dbd067 PMD 4fba067 PTE 80100000febd4075
> > >     Oops: 0003 [#1] SMP NOPTI
> > >     CPU: 0 PID: 234 Comm: kworker/0:2 Tainted: G        W         5.14.0-rc7-1.fc32.qubes.x86_64 #15
> > >     Workqueue: events work_for_cpu_fn
> > >     RIP: e030:__pci_enable_msix_range.part.0+0x26b/0x5f0
> > >     Code: 2f 96 ff 48 89 44 24 28 48 89 c7 48 85 c0 0f 84 f6 01 00 00 45 0f b7 f6 48 8d 40 0c ba 01 00 00 00 49 c1 e6 04 4a 8d 4c 37 1c <89> 10 48 83 c0 10 48 39 c1 75 f5 41 0f b6 44 24 6a 84 c0 0f 84 48
> > >     RSP: e02b:ffffc9004018bd50 EFLAGS: 00010212
> > >     RAX: ffffc9004069100c RBX: ffff88800ed412f8 RCX: ffffc9004069105c
> > >     RDX: 0000000000000001 RSI: 00000000000febd4 RDI: ffffc90040691000
> > >     RBP: 0000000000000003 R08: 0000000000000000 R09: 00000000febd404f
> > >     R10: 0000000000007ff0 R11: ffff88800ee8ae40 R12: ffff88800ed41000
> > >     R13: 0000000000000000 R14: 0000000000000040 R15: 00000000feba0000
> > >     FS:  0000000000000000(0000) GS:ffff888018400000(0000) knlGS:0000000000000000
> > >     CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
> > >     CR2: ffff8000007f5ea0 CR3: 0000000012f6a000 CR4: 0000000000000660
> > >     Call Trace:
> > >      e1000e_set_interrupt_capability+0xbf/0xd0 [e1000e]
> > >      e1000_probe+0x41f/0xdb0 [e1000e]
> > >      local_pci_probe+0x42/0x80
> > >     (...)
> > > 
> > > There is pci_msi_ignore_mask variable for bypassing MSI masking on Xen
> > > PV, but msix_mask_all() missed checking it. Add the check there too.
> > > 
> > > Fixes: 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries")
> > > Cc: stable@vger.kernel.org
> > 
> > 7d5ec3d36123 appeared in v5.14-rc6, so if this fix is merged before
> > v5.14, the stable tag will be unnecessary.  But we are running out of
> > time there.
> 
> 7d5ec3d36123 was already backported to stable branches (at least 5.10
> and 5.4), and in fact this is how I discovered the issue...

Oh, right, of course.  Sorry :)

> > > Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> > > ---
> > > Cc: xen-devel@lists.xenproject.org
> > > ---
> > >  drivers/pci/msi.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> > > index e5e75331b415..3a9f4f8ad8f9 100644
> > > --- a/drivers/pci/msi.c
> > > +++ b/drivers/pci/msi.c
> > > @@ -776,6 +776,9 @@ static void msix_mask_all(void __iomem *base, int tsize)
> > >  	u32 ctrl = PCI_MSIX_ENTRY_CTRL_MASKBIT;
> > >  	int i;
> > >  
> > > +	if (pci_msi_ignore_mask)
> > > +		return;
> > > +
> > >  	for (i = 0; i < tsize; i++, base += PCI_MSIX_ENTRY_SIZE)
> > >  		writel(ctrl, base + PCI_MSIX_ENTRY_VECTOR_CTRL);
> > >  }
> > > -- 
> > > 2.31.1
> > > 
> 
> -- 
> Best Regards,
> Marek Marczykowski-Górecki
> Invisible Things Lab

Patch

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index e5e75331b415..3a9f4f8ad8f9 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -776,6 +776,9 @@  static void msix_mask_all(void __iomem *base, int tsize)
 	u32 ctrl = PCI_MSIX_ENTRY_CTRL_MASKBIT;
 	int i;
 
+	if (pci_msi_ignore_mask)
+		return;
+
 	for (i = 0; i < tsize; i++, base += PCI_MSIX_ENTRY_SIZE)
 		writel(ctrl, base + PCI_MSIX_ENTRY_VECTOR_CTRL);
 }