diff mbox series

[v2] PCI: dwc: endpoint: Fix dw_pcie_ep_raise_msix_irq() alignment support

Message ID 20231128132231.2221614-1-nks@flawful.org (mailing list archive)
State Accepted
Delegated to: Krzysztof Wilczyński
Headers show
Series [v2] PCI: dwc: endpoint: Fix dw_pcie_ep_raise_msix_irq() alignment support | expand

Commit Message

Niklas Cassel Nov. 28, 2023, 1:22 p.m. UTC
From: Niklas Cassel <niklas.cassel@wdc.com>

Commit 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get
correct MSI-X table address") modified dw_pcie_ep_raise_msix_irq() to
support iATUs which require a specific alignment.

However, this support cannot have been properly tested.

The whole point is for the iATU to map an address that is aligned,
using dw_pcie_ep_map_addr(), and then let the writel() write to
ep->msi_mem + aligned_offset.

Thus, modify the address that is mapped such that it is aligned.
With this change, dw_pcie_ep_raise_msix_irq() matches the logic in
dw_pcie_ep_raise_msi_irq().

Cc: Kishon Vijay Abraham I <kishon@kernel.org>
Fixes: 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get correct MSI-X table address")
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
---
Changes since v1:
-Clarified commit message.
-Add a working email for Kishon to CC.

 drivers/pci/controller/dwc/pcie-designware-ep.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Manivannan Sadhasivam Nov. 28, 2023, 4:21 p.m. UTC | #1
On Tue, Nov 28, 2023 at 02:22:30PM +0100, Niklas Cassel wrote:
> From: Niklas Cassel <niklas.cassel@wdc.com>
> 
> Commit 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get
> correct MSI-X table address") modified dw_pcie_ep_raise_msix_irq() to
> support iATUs which require a specific alignment.
> 
> However, this support cannot have been properly tested.
> 
> The whole point is for the iATU to map an address that is aligned,
> using dw_pcie_ep_map_addr(), and then let the writel() write to
> ep->msi_mem + aligned_offset.
> 
> Thus, modify the address that is mapped such that it is aligned.
> With this change, dw_pcie_ep_raise_msix_irq() matches the logic in
> dw_pcie_ep_raise_msi_irq().
> 
> Cc: Kishon Vijay Abraham I <kishon@kernel.org>
> Fixes: 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get correct MSI-X table address")
> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>

Cc: stable@vger.kernel.org # 5.7

Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>

- Mani

> ---
> Changes since v1:
> -Clarified commit message.
> -Add a working email for Kishon to CC.
> 
>  drivers/pci/controller/dwc/pcie-designware-ep.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> index f6207989fc6a..bc94d7f39535 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> @@ -615,6 +615,7 @@ int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
>  	}
>  
>  	aligned_offset = msg_addr & (epc->mem->window.page_size - 1);
> +	msg_addr &= ~aligned_offset;
>  	ret = dw_pcie_ep_map_addr(epc, func_no, 0, ep->msi_mem_phys, msg_addr,
>  				  epc->mem->window.page_size);
>  	if (ret)
> -- 
> 2.43.0
>
Niklas Cassel Dec. 14, 2023, 9:59 a.m. UTC | #2
On Tue, Nov 28, 2023 at 02:22:30PM +0100, Niklas Cassel wrote:
> From: Niklas Cassel <niklas.cassel@wdc.com>
> 
> Commit 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get
> correct MSI-X table address") modified dw_pcie_ep_raise_msix_irq() to
> support iATUs which require a specific alignment.
> 
> However, this support cannot have been properly tested.
> 
> The whole point is for the iATU to map an address that is aligned,
> using dw_pcie_ep_map_addr(), and then let the writel() write to
> ep->msi_mem + aligned_offset.
> 
> Thus, modify the address that is mapped such that it is aligned.
> With this change, dw_pcie_ep_raise_msix_irq() matches the logic in
> dw_pcie_ep_raise_msi_irq().
> 
> Cc: Kishon Vijay Abraham I <kishon@kernel.org>
> Fixes: 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get correct MSI-X table address")
> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> ---
> Changes since v1:
> -Clarified commit message.
> -Add a working email for Kishon to CC.
> 
>  drivers/pci/controller/dwc/pcie-designware-ep.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> index f6207989fc6a..bc94d7f39535 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> @@ -615,6 +615,7 @@ int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
>  	}
>  
>  	aligned_offset = msg_addr & (epc->mem->window.page_size - 1);
> +	msg_addr &= ~aligned_offset;
>  	ret = dw_pcie_ep_map_addr(epc, func_no, 0, ep->msi_mem_phys, msg_addr,
>  				  epc->mem->window.page_size);
>  	if (ret)

Gentle ping...


Kind regards,
Niklas
Krzysztof Wilczyński Dec. 18, 2023, 1:12 a.m. UTC | #3
Hello,

> Commit 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get
> correct MSI-X table address") modified dw_pcie_ep_raise_msix_irq() to
> support iATUs which require a specific alignment.
> 
> However, this support cannot have been properly tested.
> 
> The whole point is for the iATU to map an address that is aligned,
> using dw_pcie_ep_map_addr(), and then let the writel() write to
> ep->msi_mem + aligned_offset.
> 
> Thus, modify the address that is mapped such that it is aligned.
> With this change, dw_pcie_ep_raise_msix_irq() matches the logic in
> dw_pcie_ep_raise_msi_irq().


Applied to controller/dwc, thank you!

[1/1] PCI: dwc: endpoint: Fix dw_pcie_ep_raise_msix_irq() alignment support
      https://git.kernel.org/pci/pci/c/2217fffcd63f

	Krzysztof
Krzysztof Wilczyński Dec. 18, 2023, 1:17 a.m. UTC | #4
Hello,

> > Commit 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get
> > correct MSI-X table address") modified dw_pcie_ep_raise_msix_irq() to
> > support iATUs which require a specific alignment.
> > 
> > However, this support cannot have been properly tested.
> > 
> > The whole point is for the iATU to map an address that is aligned,
> > using dw_pcie_ep_map_addr(), and then let the writel() write to
> > ep->msi_mem + aligned_offset.
> > 
> > Thus, modify the address that is mapped such that it is aligned.
> > With this change, dw_pcie_ep_raise_msix_irq() matches the logic in
> > dw_pcie_ep_raise_msi_irq().
[...]
> 
> Gentle ping...

Applied, so it should make it to 6.8.  Apologies for the delay.

	Krzysztof
Bjorn Helgaas Dec. 26, 2023, 10:17 p.m. UTC | #5
On Tue, Nov 28, 2023 at 02:22:30PM +0100, Niklas Cassel wrote:
> From: Niklas Cassel <niklas.cassel@wdc.com>
> 
> Commit 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get
> correct MSI-X table address") modified dw_pcie_ep_raise_msix_irq() to
> support iATUs which require a specific alignment.
> 
> However, this support cannot have been properly tested.
> 
> The whole point is for the iATU to map an address that is aligned,
> using dw_pcie_ep_map_addr(), and then let the writel() write to
> ep->msi_mem + aligned_offset.
> 
> Thus, modify the address that is mapped such that it is aligned.
> With this change, dw_pcie_ep_raise_msix_irq() matches the logic in
> dw_pcie_ep_raise_msi_irq().

Was there a problem report for this?  Since 6f5e193bfb55 appeared in
v5.7 (May 31 2020), and this should affect imx6, keystone am654,
dw-pcie (platform), and keembay, it seems a little weird that this bug
persisted so long.  Maybe nobody really uses endpoint support yet?

But I assume you observed a failure and tested this fix somewhere.

And the failure is that we send the wrong MSI-X vector or something
and get an unexpected IRQ and a driver hang or something?  In other
words, how does the bug manifest to users?

> Cc: Kishon Vijay Abraham I <kishon@kernel.org>
> Fixes: 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get correct MSI-X table address")
> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
> ---
> Changes since v1:
> -Clarified commit message.
> -Add a working email for Kishon to CC.
> 
>  drivers/pci/controller/dwc/pcie-designware-ep.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> index f6207989fc6a..bc94d7f39535 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> @@ -615,6 +615,7 @@ int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
>  	}
>  
>  	aligned_offset = msg_addr & (epc->mem->window.page_size - 1);
> +	msg_addr &= ~aligned_offset;
>  	ret = dw_pcie_ep_map_addr(epc, func_no, 0, ep->msi_mem_phys, msg_addr,
>  				  epc->mem->window.page_size);

Total tangent and I don't know enough to suggest any changes, but it's
a little strange as a reader that we want to write to ep->msi_mem, and
the ATU setup with dw_pcie_ep_map_addr() doesn't involve ep->msi_mem
at all.

I see that ep->msi_mem is allocated and ioremapped far away in
dw_pcie_ep_init().  It's just a little weird that there's no
connection *here* with ep->msi_mem.

I assume dw_pcie_ep_map_addr(), writel(), dw_pcie_ep_unmap_addr() have
to happen atomically so nobody else uses that piece of the ATU while
we're doing this?  There's no mutex here, so I guess we must know this
is atomic already because of something else?

Bjorn
Niklas Cassel Dec. 27, 2023, 11:57 a.m. UTC | #6
Hello Bjorn,

On Tue, Dec 26, 2023 at 04:17:14PM -0600, Bjorn Helgaas wrote:
> On Tue, Nov 28, 2023 at 02:22:30PM +0100, Niklas Cassel wrote:
> > From: Niklas Cassel <niklas.cassel@wdc.com>
> > 
> > Commit 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get
> > correct MSI-X table address") modified dw_pcie_ep_raise_msix_irq() to
> > support iATUs which require a specific alignment.
> > 
> > However, this support cannot have been properly tested.
> > 
> > The whole point is for the iATU to map an address that is aligned,
> > using dw_pcie_ep_map_addr(), and then let the writel() write to
> > ep->msi_mem + aligned_offset.
> > 
> > Thus, modify the address that is mapped such that it is aligned.
> > With this change, dw_pcie_ep_raise_msix_irq() matches the logic in
> > dw_pcie_ep_raise_msi_irq().

For the record, this patch is already queued up:
https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/log/?h=controller/dwc


> 
> Was there a problem report for this?  Since 6f5e193bfb55 appeared in
> v5.7 (May 31 2020), and this should affect imx6, keystone am654,
> dw-pcie (platform), and keembay, it seems a little weird that this bug
> persisted so long.  Maybe nobody really uses endpoint support yet?
> 
> But I assume you observed a failure and tested this fix somewhere.

Yes, I verified it on rockchip rk3588.

I'm working on upstreaming rk3588 EP support:
https://github.com/floatious/linux/commits/rockchip-pcie-ep

Right now rk3588 only has support for RC in mainline.


The fix is only needed for platforms which:
1) supports MSI-X
2) has an iATU alignment requirement,
so where epc->mem->window.page_size != 0.

pci_epc_mem_init() calls pci_epc_multi_mem_init() which
initializes epc->mem->window.page_size with ep->page_size.

$ git grep page_size drivers/pci/controller/dwc/

So it will not affect pcie-designware-plat.c, nor pcie-keembay.c,
since they don't set any ep->page_size.

It will not affect pcie-tegra194.c, since it doesn't use
dw_pcie_ep_raise_msix_irq().

Looking at pci-imx6.c, imx6_pcie_ep_raise_irq() calls
dw_pcie_ep_raise_msix_irq(), but:

static const struct pci_epc_features imx8m_pcie_epc_features = {
        .msix_capable = false,
}

so while pci-imx6.c will call dw_pcie_ep_raise_msix_irq(),
I assume that it will return early, in this if-statement:
https://github.com/torvalds/linux/blob/v6.7-rc7/drivers/pci/controller/dwc/pcie-designware-ep.c#L596-L598

That leaves just pci-keystone.c (am654 compatible only).

I don't know why no one has reported this bug before,
I can only assume insufficient testing.

I guess you might be lucky and happen to get an address that is
aligned to the iATU alignment requirement, but that is unlikely
to happen when rebooting and running pcitest.sh multiple times.


> 
> And the failure is that we send the wrong MSI-X vector or something
> and get an unexpected IRQ and a driver hang or something?  In other
> words, how does the bug manifest to users?

pcitest.sh fails the MSI-X tests.
With this fix the MSI-X tests in pcitest.sh passes.


> 
> > Cc: Kishon Vijay Abraham I <kishon@kernel.org>
> > Fixes: 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get correct MSI-X table address")
> > Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
> > ---
> > Changes since v1:
> > -Clarified commit message.
> > -Add a working email for Kishon to CC.
> > 
> >  drivers/pci/controller/dwc/pcie-designware-ep.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > index f6207989fc6a..bc94d7f39535 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> > +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > @@ -615,6 +615,7 @@ int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
> >  	}
> >  
> >  	aligned_offset = msg_addr & (epc->mem->window.page_size - 1);
> > +	msg_addr &= ~aligned_offset;
> >  	ret = dw_pcie_ep_map_addr(epc, func_no, 0, ep->msi_mem_phys, msg_addr,
> >  				  epc->mem->window.page_size);
> 
> Total tangent and I don't know enough to suggest any changes, but it's
> a little strange as a reader that we want to write to ep->msi_mem, and
> the ATU setup with dw_pcie_ep_map_addr() doesn't involve ep->msi_mem
> at all.
> 
> I see that ep->msi_mem is allocated and ioremapped far away in
> dw_pcie_ep_init().  It's just a little weird that there's no
> connection *here* with ep->msi_mem.

There is a connection. dw_pcie_ep_raise_msix_irq() uses ep->msi_mem_phys,
which is the physical address of ep->msi_mem:

ret = dw_pcie_ep_map_addr(epc, func_no, 0, ep->msi_mem_phys, msg_addr,
                                  epc->mem->window.page_size);  


> 
> I assume dw_pcie_ep_map_addr(), writel(), dw_pcie_ep_unmap_addr() have
> to happen atomically so nobody else uses that piece of the ATU while
> we're doing this?  There's no mutex here, so I guess we must know this
> is atomic already because of something else?

Most devices have multiple iATUs (so multiple iATU indexes).

pcie-designware-ep.c:dw_pcie_ep_outbound_atu()
uses find_first_zero_bit() to make sure that a specific iATU (index)
is not reused for something else:
https://github.com/torvalds/linux/blob/v6.7-rc7/drivers/pci/controller/dwc/pcie-designware-ep.c#L208

A specific iATU (index) is then freed by dw_pcie_ep_unmap_addr(),
which does a clear_bit() for that iATU (index).

It is a bit scary that there is no mutex or anything, since
find_first_zero_bit() is _not_ atomic, so if we have concurrent calls
to dw_pcie_ep_map_addr(), things might break, but that is a separate
issue.

I assume that this patch series will improve the concurrency issue,
if it gets accepted:
https://lore.kernel.org/linux-pci/20231212022749.625238-1-yury.norov@gmail.com/


Kind regards,
Niklas
Bjorn Helgaas Dec. 27, 2023, 1:03 p.m. UTC | #7
On Wed, Dec 27, 2023 at 12:57:31PM +0100, Niklas Cassel wrote:
> On Tue, Dec 26, 2023 at 04:17:14PM -0600, Bjorn Helgaas wrote:
> > On Tue, Nov 28, 2023 at 02:22:30PM +0100, Niklas Cassel wrote:
> > > From: Niklas Cassel <niklas.cassel@wdc.com>
> > > 
> > > Commit 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get
> > > correct MSI-X table address") modified dw_pcie_ep_raise_msix_irq() to
> > > support iATUs which require a specific alignment.
> > > 
> > > However, this support cannot have been properly tested.
> > > 
> > > The whole point is for the iATU to map an address that is aligned,
> > > using dw_pcie_ep_map_addr(), and then let the writel() write to
> > > ep->msi_mem + aligned_offset.
> > > 
> > > Thus, modify the address that is mapped such that it is aligned.
> > > With this change, dw_pcie_ep_raise_msix_irq() matches the logic in
> > > dw_pcie_ep_raise_msi_irq().
> 
> For the record, this patch is already queued up:
> https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/log/?h=controller/dwc

Yes, of course.  I was writing the merge commit log for merging that
branch into the PCI "next" branch.

> ...
> > > @@ -615,6 +615,7 @@ int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
> > >  	}
> > >  
> > >  	aligned_offset = msg_addr & (epc->mem->window.page_size - 1);
> > > +	msg_addr &= ~aligned_offset;
> > >  	ret = dw_pcie_ep_map_addr(epc, func_no, 0, ep->msi_mem_phys, msg_addr,
> > >  				  epc->mem->window.page_size);
> > 
> > Total tangent and I don't know enough to suggest any changes, but
> > it's a little strange as a reader that we want to write to
> > ep->msi_mem, and the ATU setup with dw_pcie_ep_map_addr() doesn't
> > involve ep->msi_mem at all.
> > 
> > I see that ep->msi_mem is allocated and ioremapped far away in
> > dw_pcie_ep_init().  It's just a little weird that there's no
> > connection *here* with ep->msi_mem.
> 
> There is a connection. dw_pcie_ep_raise_msix_irq() uses
> ep->msi_mem_phys, which is the physical address of ep->msi_mem:
> 
> ret = dw_pcie_ep_map_addr(epc, func_no, 0, ep->msi_mem_phys, msg_addr,
>                                   epc->mem->window.page_size);  

Right, that's the connection I mentioned as "far away in
dw_pcie_ep_init()".  It's not the usual pattern of "map X, write X".
Here we have "map X, write Y", and it's up to the reader to do the
research to figure out whether and how X and Y are related.

> > I assume dw_pcie_ep_map_addr(), writel(), dw_pcie_ep_unmap_addr()
> > have to happen atomically so nobody else uses that piece of the
> > ATU while we're doing this?  There's no mutex here, so I guess we
> > must know this is atomic already because of something else?
> 
> Most devices have multiple iATUs (so multiple iATU indexes).
> 
> pcie-designware-ep.c:dw_pcie_ep_outbound_atu() uses
> find_first_zero_bit() to make sure that a specific iATU (index) is
> not reused for something else:
> https://github.com/torvalds/linux/blob/v6.7-rc7/drivers/pci/controller/dwc/pcie-designware-ep.c#L208
> 
> A specific iATU (index) is then freed by dw_pcie_ep_unmap_addr(),
> which does a clear_bit() for that iATU (index).
> 
> It is a bit scary that there is no mutex or anything, since
> find_first_zero_bit() is _not_ atomic, so if we have concurrent
> calls to dw_pcie_ep_map_addr(), things might break, but that is a
> separate issue.
> 
> I assume that this patch series will improve the concurrency issue,
> if it gets accepted:
> https://lore.kernel.org/linux-pci/20231212022749.625238-1-yury.norov@gmail.com/

This totally seems non-obvious and scary, regardless of Yury's patch.
If we're relying on the mem->bitmap to permanently assign an iATU
index for ep->msi_mem, it's not obvious why we need to use
dw_pcie_ep_map_addr() to enable that address each time we need it.

But all this is completely unrelated to your patch, which is fine and
now in -next (well, it *will* be the next time there is a linux-next
release, which looks like Jan 2 or so).

Bjorn
Kishon Vijay Abraham I Jan. 2, 2024, 10:25 a.m. UTC | #8
Hi Niklas, Bjorn,

On 12/27/2023 5:27 PM, Niklas Cassel wrote:
> Hello Bjorn,
> 
> On Tue, Dec 26, 2023 at 04:17:14PM -0600, Bjorn Helgaas wrote:
>> On Tue, Nov 28, 2023 at 02:22:30PM +0100, Niklas Cassel wrote:
>>> From: Niklas Cassel <niklas.cassel@wdc.com>
>>>
>>> Commit 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get
>>> correct MSI-X table address") modified dw_pcie_ep_raise_msix_irq() to
>>> support iATUs which require a specific alignment.
>>>
>>> However, this support cannot have been properly tested.
>>>
>>> The whole point is for the iATU to map an address that is aligned,
>>> using dw_pcie_ep_map_addr(), and then let the writel() write to
>>> ep->msi_mem + aligned_offset.
>>>
>>> Thus, modify the address that is mapped such that it is aligned.
>>> With this change, dw_pcie_ep_raise_msix_irq() matches the logic in
>>> dw_pcie_ep_raise_msi_irq().
> 
> For the record, this patch is already queued up:
> https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/log/?h=controller/dwc
> 
> 
>>
>> Was there a problem report for this?  Since 6f5e193bfb55 appeared in
>> v5.7 (May 31 2020), and this should affect imx6, keystone am654,
>> dw-pcie (platform), and keembay, it seems a little weird that this bug
>> persisted so long.  Maybe nobody really uses endpoint support yet?
>>
>> But I assume you observed a failure and tested this fix somewhere.
> 
> Yes, I verified it on rockchip rk3588.
> 
> I'm working on upstreaming rk3588 EP support:
> https://github.com/floatious/linux/commits/rockchip-pcie-ep
> 
> Right now rk3588 only has support for RC in mainline.
> 
> 
> The fix is only needed for platforms which:
> 1) supports MSI-X
> 2) has an iATU alignment requirement,
> so where epc->mem->window.page_size != 0.
> 
> pci_epc_mem_init() calls pci_epc_multi_mem_init() which
> initializes epc->mem->window.page_size with ep->page_size.
> 
> $ git grep page_size drivers/pci/controller/dwc/
> 
> So it will not affect pcie-designware-plat.c, nor pcie-keembay.c,
> since they don't set any ep->page_size.
> 
> It will not affect pcie-tegra194.c, since it doesn't use
> dw_pcie_ep_raise_msix_irq().
> 
> Looking at pci-imx6.c, imx6_pcie_ep_raise_irq() calls
> dw_pcie_ep_raise_msix_irq(), but:
> 
> static const struct pci_epc_features imx8m_pcie_epc_features = {
>          .msix_capable = false,
> }
> 
> so while pci-imx6.c will call dw_pcie_ep_raise_msix_irq(),
> I assume that it will return early, in this if-statement:
> https://github.com/torvalds/linux/blob/v6.7-rc7/drivers/pci/controller/dwc/pcie-designware-ep.c#L596-L598
> 
> That leaves just pci-keystone.c (am654 compatible only).
> 
> I don't know why no one has reported this bug before,
> I can only assume insufficient testing.

The HW enforces the alignment so there was no issues observed before.

> 
> I guess you might be lucky and happen to get an address that is
> aligned to the iATU alignment requirement, but that is unlikely
> to happen when rebooting and running pcitest.sh multiple times.

In AM654, the HW keeps the lower bits of the target address as '0' in 
the ATU, so the address in the ATU is always aligned.

"Table 12-2815. PCIE_EP_IATU_LWR_TARGET_ADDR_OFF_OUTBOUND_0 Register 
Field Descriptions" in https://www.ti.com/lit/ug/spruid7e/spruid7e.pdf 
describes the below

<quote>
- LWR_TARGET_RW[31:n] forms MSB's of the Lower
Target part of the new address of the translated region
- LWR_TARGET_RW[n-1:0] are not used [The start address must be
aligned to a CX_ATU_MIN_REGION_SIZE kB boundary, so the
lower bits of the start address of the new address of the translated
region [bits n-1:0] are always '0'] - n is
log2[CX_ATU_MIN_REGION_SIZE]
</quote>

Thanks,
Kishon
Manivannan Sadhasivam Jan. 2, 2024, 3:54 p.m. UTC | #9
On Tue, Nov 28, 2023 at 02:22:30PM +0100, Niklas Cassel wrote:
> From: Niklas Cassel <niklas.cassel@wdc.com>
> 
> Commit 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get
> correct MSI-X table address") modified dw_pcie_ep_raise_msix_irq() to
> support iATUs which require a specific alignment.
> 
> However, this support cannot have been properly tested.
> 
> The whole point is for the iATU to map an address that is aligned,
> using dw_pcie_ep_map_addr(), and then let the writel() write to
> ep->msi_mem + aligned_offset.
> 
> Thus, modify the address that is mapped such that it is aligned.
> With this change, dw_pcie_ep_raise_msix_irq() matches the logic in
> dw_pcie_ep_raise_msi_irq().
> 
> Cc: Kishon Vijay Abraham I <kishon@kernel.org>
> Fixes: 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get correct MSI-X table address")
> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>

I apparently missed this patch... LGTM!

Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>

Btw, since you are sending this patch from a different address, relevant s-o-b
should also be present.

- Mani

> ---
> Changes since v1:
> -Clarified commit message.
> -Add a working email for Kishon to CC.
> 
>  drivers/pci/controller/dwc/pcie-designware-ep.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> index f6207989fc6a..bc94d7f39535 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> @@ -615,6 +615,7 @@ int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
>  	}
>  
>  	aligned_offset = msg_addr & (epc->mem->window.page_size - 1);
> +	msg_addr &= ~aligned_offset;
>  	ret = dw_pcie_ep_map_addr(epc, func_no, 0, ep->msi_mem_phys, msg_addr,
>  				  epc->mem->window.page_size);
>  	if (ret)
> -- 
> 2.43.0
>
Manivannan Sadhasivam Jan. 2, 2024, 4:48 p.m. UTC | #10
On Wed, Dec 27, 2023 at 07:03:41AM -0600, Bjorn Helgaas wrote:
> On Wed, Dec 27, 2023 at 12:57:31PM +0100, Niklas Cassel wrote:
> > On Tue, Dec 26, 2023 at 04:17:14PM -0600, Bjorn Helgaas wrote:
> > > On Tue, Nov 28, 2023 at 02:22:30PM +0100, Niklas Cassel wrote:
> > > > From: Niklas Cassel <niklas.cassel@wdc.com>
> > > > 
> > > > Commit 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get
> > > > correct MSI-X table address") modified dw_pcie_ep_raise_msix_irq() to
> > > > support iATUs which require a specific alignment.
> > > > 
> > > > However, this support cannot have been properly tested.
> > > > 
> > > > The whole point is for the iATU to map an address that is aligned,
> > > > using dw_pcie_ep_map_addr(), and then let the writel() write to
> > > > ep->msi_mem + aligned_offset.
> > > > 
> > > > Thus, modify the address that is mapped such that it is aligned.
> > > > With this change, dw_pcie_ep_raise_msix_irq() matches the logic in
> > > > dw_pcie_ep_raise_msi_irq().
> > 
> > For the record, this patch is already queued up:
> > https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/log/?h=controller/dwc
> 
> Yes, of course.  I was writing the merge commit log for merging that
> branch into the PCI "next" branch.
> 
> > ...
> > > > @@ -615,6 +615,7 @@ int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
> > > >  	}
> > > >  
> > > >  	aligned_offset = msg_addr & (epc->mem->window.page_size - 1);
> > > > +	msg_addr &= ~aligned_offset;
> > > >  	ret = dw_pcie_ep_map_addr(epc, func_no, 0, ep->msi_mem_phys, msg_addr,
> > > >  				  epc->mem->window.page_size);
> > > 
> > > Total tangent and I don't know enough to suggest any changes, but
> > > it's a little strange as a reader that we want to write to
> > > ep->msi_mem, and the ATU setup with dw_pcie_ep_map_addr() doesn't
> > > involve ep->msi_mem at all.
> > > 
> > > I see that ep->msi_mem is allocated and ioremapped far away in
> > > dw_pcie_ep_init().  It's just a little weird that there's no
> > > connection *here* with ep->msi_mem.
> > 
> > There is a connection. dw_pcie_ep_raise_msix_irq() uses
> > ep->msi_mem_phys, which is the physical address of ep->msi_mem:
> > 
> > ret = dw_pcie_ep_map_addr(epc, func_no, 0, ep->msi_mem_phys, msg_addr,
> >                                   epc->mem->window.page_size);  
> 
> Right, that's the connection I mentioned as "far away in
> dw_pcie_ep_init()".  It's not the usual pattern of "map X, write X".
> Here we have "map X, write Y", and it's up to the reader to do the
> research to figure out whether and how X and Y are related.
> 
> > > I assume dw_pcie_ep_map_addr(), writel(), dw_pcie_ep_unmap_addr()
> > > have to happen atomically so nobody else uses that piece of the
> > > ATU while we're doing this?  There's no mutex here, so I guess we
> > > must know this is atomic already because of something else?
> > 
> > Most devices have multiple iATUs (so multiple iATU indexes).
> > 
> > pcie-designware-ep.c:dw_pcie_ep_outbound_atu() uses
> > find_first_zero_bit() to make sure that a specific iATU (index) is
> > not reused for something else:
> > https://github.com/torvalds/linux/blob/v6.7-rc7/drivers/pci/controller/dwc/pcie-designware-ep.c#L208
> > 
> > A specific iATU (index) is then freed by dw_pcie_ep_unmap_addr(),
> > which does a clear_bit() for that iATU (index).
> > 
> > It is a bit scary that there is no mutex or anything, since
> > find_first_zero_bit() is _not_ atomic, so if we have concurrent
> > calls to dw_pcie_ep_map_addr(), things might break, but that is a
> > separate issue.
> > 
> > I assume that this patch series will improve the concurrency issue,
> > if it gets accepted:
> > https://lore.kernel.org/linux-pci/20231212022749.625238-1-yury.norov@gmail.com/
> 
> This totally seems non-obvious and scary, regardless of Yury's patch.
> If we're relying on the mem->bitmap to permanently assign an iATU
> index for ep->msi_mem, it's not obvious why we need to use
> dw_pcie_ep_map_addr() to enable that address each time we need it.
> 

Agree. I'm planning to switch to genpoll/genalloc instead of using a custom
memory allocation scheme in pci-epc-mem.c. Will try to address this issue also.\
Thanks for spotting!

- Mani

> But all this is completely unrelated to your patch, which is fine and
> now in -next (well, it *will* be the next time there is a linux-next
> release, which looks like Jan 2 or so).
> 
> Bjorn
Kishon Vijay Abraham I Jan. 3, 2024, 6:03 a.m. UTC | #11
Hi Niklas,

On 12/27/2023 5:27 PM, Niklas Cassel wrote:
> Hello Bjorn,
> 
> On Tue, Dec 26, 2023 at 04:17:14PM -0600, Bjorn Helgaas wrote:
>> On Tue, Nov 28, 2023 at 02:22:30PM +0100, Niklas Cassel wrote:
>>> From: Niklas Cassel <niklas.cassel@wdc.com>
>>>
>>> Commit 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get
>>> correct MSI-X table address") modified dw_pcie_ep_raise_msix_irq() to
>>> support iATUs which require a specific alignment.
>>>
>>> However, this support cannot have been properly tested.
>>>
>>> The whole point is for the iATU to map an address that is aligned,
>>> using dw_pcie_ep_map_addr(), and then let the writel() write to
>>> ep->msi_mem + aligned_offset.
>>>
>>> Thus, modify the address that is mapped such that it is aligned.
>>> With this change, dw_pcie_ep_raise_msix_irq() matches the logic in
>>> dw_pcie_ep_raise_msi_irq().
> 
> For the record, this patch is already queued up:
> https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/log/?h=controller/dwc
> 
> 
>>
>> Was there a problem report for this?  Since 6f5e193bfb55 appeared in
>> v5.7 (May 31 2020), and this should affect imx6, keystone am654,
>> dw-pcie (platform), and keembay, it seems a little weird that this bug
>> persisted so long.  Maybe nobody really uses endpoint support yet?
>>
>> But I assume you observed a failure and tested this fix somewhere.
> 
> Yes, I verified it on rockchip rk3588.
> 
> I'm working on upstreaming rk3588 EP support:
> https://github.com/floatious/linux/commits/rockchip-pcie-ep
> 
> Right now rk3588 only has support for RC in mainline.
> 
> 
> The fix is only needed for platforms which:
> 1) supports MSI-X
> 2) has an iATU alignment requirement,
> so where epc->mem->window.page_size != 0.
> 
> pci_epc_mem_init() calls pci_epc_multi_mem_init() which
> initializes epc->mem->window.page_size with ep->page_size.
> 
> $ git grep page_size drivers/pci/controller/dwc/
> 
> So it will not affect pcie-designware-plat.c, nor pcie-keembay.c,
> since they don't set any ep->page_size.
> 
> It will not affect pcie-tegra194.c, since it doesn't use
> dw_pcie_ep_raise_msix_irq().
> 
> Looking at pci-imx6.c, imx6_pcie_ep_raise_irq() calls
> dw_pcie_ep_raise_msix_irq(), but:
> 
> static const struct pci_epc_features imx8m_pcie_epc_features = {
>          .msix_capable = false,
> }
> 
> so while pci-imx6.c will call dw_pcie_ep_raise_msix_irq(),
> I assume that it will return early, in this if-statement:
> https://github.com/torvalds/linux/blob/v6.7-rc7/drivers/pci/controller/dwc/pcie-designware-ep.c#L596-L598
> 
> That leaves just pci-keystone.c (am654 compatible only).
> 
> I don't know why no one has reported this bug before,
> I can only assume insufficient testing.
> 
> I guess you might be lucky and happen to get an address that is
> aligned to the iATU alignment requirement, but that is unlikely
> to happen when rebooting and running pcitest.sh multiple times.
> 
> 
>>
>> And the failure is that we send the wrong MSI-X vector or something
>> and get an unexpected IRQ and a driver hang or something?  In other
>> words, how does the bug manifest to users?
> 
> pcitest.sh fails the MSI-X tests.
> With this fix the MSI-X tests in pcitest.sh passes.
> 
> 
>>
>>> Cc: Kishon Vijay Abraham I <kishon@kernel.org>
>>> Fixes: 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get correct MSI-X table address")
>>> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
>>> ---
>>> Changes since v1:
>>> -Clarified commit message.
>>> -Add a working email for Kishon to CC.
>>>
>>>   drivers/pci/controller/dwc/pcie-designware-ep.c | 1 +
>>>   1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
>>> index f6207989fc6a..bc94d7f39535 100644
>>> --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
>>> +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
>>> @@ -615,6 +615,7 @@ int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
>>>   	}
>>>   
>>>   	aligned_offset = msg_addr & (epc->mem->window.page_size - 1);
>>> +	msg_addr &= ~aligned_offset;
>>>   	ret = dw_pcie_ep_map_addr(epc, func_no, 0, ep->msi_mem_phys, msg_addr,
>>>   				  epc->mem->window.page_size);
>>
>> Total tangent and I don't know enough to suggest any changes, but it's
>> a little strange as a reader that we want to write to ep->msi_mem, and
>> the ATU setup with dw_pcie_ep_map_addr() doesn't involve ep->msi_mem
>> at all.
>>
>> I see that ep->msi_mem is allocated and ioremapped far away in
>> dw_pcie_ep_init().  It's just a little weird that there's no
>> connection *here* with ep->msi_mem.
> 
> There is a connection. dw_pcie_ep_raise_msix_irq() uses ep->msi_mem_phys,
> which is the physical address of ep->msi_mem:
> 
> ret = dw_pcie_ep_map_addr(epc, func_no, 0, ep->msi_mem_phys, msg_addr,
>                                    epc->mem->window.page_size);
> 
> 
>>
>> I assume dw_pcie_ep_map_addr(), writel(), dw_pcie_ep_unmap_addr() have
>> to happen atomically so nobody else uses that piece of the ATU while
>> we're doing this?  There's no mutex here, so I guess we must know this
>> is atomic already because of something else?
> 
> Most devices have multiple iATUs (so multiple iATU indexes).
> 
> pcie-designware-ep.c:dw_pcie_ep_outbound_atu()
> uses find_first_zero_bit() to make sure that a specific iATU (index)
> is not reused for something else:
> https://github.com/torvalds/linux/blob/v6.7-rc7/drivers/pci/controller/dwc/pcie-designware-ep.c#L208
> 
> A specific iATU (index) is then freed by dw_pcie_ep_unmap_addr(),
> which does a clear_bit() for that iATU (index).
> 
> It is a bit scary that there is no mutex or anything, since
> find_first_zero_bit() is _not_ atomic, so if we have concurrent calls
> to dw_pcie_ep_map_addr(), things might break, but that is a separate
> issue.

There cannot be concurrent calls to dw_pcie_ep_map_addr() in the current 
code path as pci_epc_raise_irq(), pci_epc_map_addr() and 
pci_epc_unmap_addr() which invokes dw_pcie_ep_map_addr() takes EPC lock 
in pci-epc-core.

Thanks,
Kishon
Niklas Cassel Jan. 9, 2024, 12:52 p.m. UTC | #12
On Wed, Jan 03, 2024 at 11:33:35AM +0530, Kishon Vijay Abraham I wrote:
> Hi Niklas,

Hello Kishon!


> > > I assume dw_pcie_ep_map_addr(), writel(), dw_pcie_ep_unmap_addr() have
> > > to happen atomically so nobody else uses that piece of the ATU while
> > > we're doing this?  There's no mutex here, so I guess we must know this
> > > is atomic already because of something else?
> > 
> > Most devices have multiple iATUs (so multiple iATU indexes).
> > 
> > pcie-designware-ep.c:dw_pcie_ep_outbound_atu()
> > uses find_first_zero_bit() to make sure that a specific iATU (index)
> > is not reused for something else:
> > https://github.com/torvalds/linux/blob/v6.7-rc7/drivers/pci/controller/dwc/pcie-designware-ep.c#L208
> > 
> > A specific iATU (index) is then freed by dw_pcie_ep_unmap_addr(),
> > which does a clear_bit() for that iATU (index).
> > 
> > It is a bit scary that there is no mutex or anything, since
> > find_first_zero_bit() is _not_ atomic, so if we have concurrent calls
> > to dw_pcie_ep_map_addr(), things might break, but that is a separate
> > issue.
> 
> There cannot be concurrent calls to dw_pcie_ep_map_addr() in the current
> code path as pci_epc_raise_irq(), pci_epc_map_addr() and
> pci_epc_unmap_addr() which invokes dw_pcie_ep_map_addr() takes EPC lock in
> pci-epc-core.

I must have overlooked the mutex in pci-epc-core.
Thank you for clearing that up.


Kind regards,
Niklas
diff mbox series

Patch

diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
index f6207989fc6a..bc94d7f39535 100644
--- a/drivers/pci/controller/dwc/pcie-designware-ep.c
+++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
@@ -615,6 +615,7 @@  int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
 	}
 
 	aligned_offset = msg_addr & (epc->mem->window.page_size - 1);
+	msg_addr &= ~aligned_offset;
 	ret = dw_pcie_ep_map_addr(epc, func_no, 0, ep->msi_mem_phys, msg_addr,
 				  epc->mem->window.page_size);
 	if (ret)