diff mbox series

[v5,2/2] PCI: dwc: Add support for 64-bit MSI target address

Message ID 20220825185026.3816331-3-willmcvicker@google.com (mailing list archive)
State Superseded
Headers show
Series PCI: dwc: Add support for 64-bit MSI target addresses | expand

Commit Message

William McVicker Aug. 25, 2022, 6:50 p.m. UTC
Since not all devices require a 32-bit MSI address, add support to the
PCIe host driver to allow setting the DMA mask to 64-bits if the 32-bit
allocation fails. This allows kernels to disable ZONE_DMA32 and bounce
buffering (swiotlb) without risking not being able to get a 32-bit address
during DMA allocation.

Basically, in the slim chance that there are no 32-bit allocations
available, the current PCIe host driver will fail to allocate the msi_msg
page due to a DMA address overflow (seen in [1]). With this patch, the
PCIe host can retry the allocation with a 64-bit DMA mask if the current
PCIe device advertises 64-bit support via its MSI capabilities.

[1] https://lore.kernel.org/all/Yo0soniFborDl7+C@google.com/

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Will McVicker <willmcvicker@google.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Jingoo Han <jingoohan1@gmail.com>
---
 .../pci/controller/dwc/pcie-designware-host.c | 38 ++++++++++++++-----
 drivers/pci/controller/dwc/pcie-designware.c  |  8 ++++
 drivers/pci/controller/dwc/pcie-designware.h  |  1 +
 3 files changed, 38 insertions(+), 9 deletions(-)

Comments

Robin Murphy Aug. 25, 2022, 8:59 p.m. UTC | #1
On 2022-08-25 19:50, Will McVicker wrote:
> Since not all devices require a 32-bit MSI address, add support to the
> PCIe host driver to allow setting the DMA mask to 64-bits if the 32-bit
> allocation fails. This allows kernels to disable ZONE_DMA32 and bounce
> buffering (swiotlb) without risking not being able to get a 32-bit address
> during DMA allocation.
> 
> Basically, in the slim chance that there are no 32-bit allocations
> available, the current PCIe host driver will fail to allocate the msi_msg
> page due to a DMA address overflow (seen in [1]). With this patch, the
> PCIe host can retry the allocation with a 64-bit DMA mask if the current
> PCIe device advertises 64-bit support via its MSI capabilities.
> 
> [1] https://lore.kernel.org/all/Yo0soniFborDl7+C@google.com/
> 
> Reported-by: kernel test robot <lkp@intel.com>
> Signed-off-by: Will McVicker <willmcvicker@google.com>
> Reviewed-by: Rob Herring <robh@kernel.org>
> Acked-by: Jingoo Han <jingoohan1@gmail.com>
> ---
>   .../pci/controller/dwc/pcie-designware-host.c | 38 ++++++++++++++-----
>   drivers/pci/controller/dwc/pcie-designware.c  |  8 ++++
>   drivers/pci/controller/dwc/pcie-designware.h  |  1 +
>   3 files changed, 38 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> index 39f3b37d4033..8928a9a29d58 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -330,6 +330,9 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
>   	u64 *msi_vaddr;
>   	int ret;
>   	u32 ctrl, num_ctrls;
> +	bool msi_64bit = false;
> +	bool retry_64bit = false;
> +	u16 msi_capabilities;
>   
>   	for (ctrl = 0; ctrl < MAX_MSI_CTRLS; ctrl++)
>   		pp->irq_mask[ctrl] = ~0;
> @@ -367,16 +370,33 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
>   						    dw_chained_msi_isr, pp);
>   	}
>   
> -	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
> -	if (ret)
> -		dev_warn(dev, "Failed to set DMA mask to 32-bit. Devices with only 32-bit MSI support may not work properly\n");
> +	msi_capabilities = dw_pcie_msi_capabilities(pci);
> +	if (msi_capabilities & PCI_MSI_FLAGS_ENABLE)
> +		msi_64bit = msi_capabilities & PCI_MSI_FLAGS_64BIT;
>   
> -	msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
> -					GFP_KERNEL);
> -	if (!msi_vaddr) {
> -		dev_err(dev, "Failed to alloc and map MSI data\n");
> -		dw_pcie_free_msi(pp);
> -		return -ENOMEM;
> +	while (true) {
> +		dev_dbg(dev, "Setting MSI DMA mask to %s-bit.\n",
> +			retry_64bit ? "64" : "32");

If only we has some sort of "variable" that could could store a 
numerical value, think of the possibilities... :)

> +		ret = dma_set_mask_and_coherent(dev, retry_64bit ?
> +						DMA_BIT_MASK(64) :
> +						DMA_BIT_MASK(32));
> +		if (ret)
> +			dev_warn(dev, "Failed to set DMA mask to %s-bit.\n",
> +				 retry_64bit ? "64" : "32");

Setting a 64-bit mask should never fail, since it represents having no 
possible limitation whatsoever (I'm not sure if there are any platforms 
left where setting a 32-bit mask can actually fail in practice either, 
but I have no strong opinion on the fate of the existing warning).

> +
> +		msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
> +						GFP_KERNEL);
> +		if (!msi_vaddr) {
> +			dev_err(dev, "Failed to alloc and map MSI data\n");

Possibly a mattrer of personal taste, but I'd say try to avoid dev_err() 
for things that aren't actually fatal; if you're still able to continue 
on, at best it's a warning, not an error. Especially if your use-case 
*expects* the 32-bit allocation fail. There's nothing more offputting 
than booting a typical vendor kernel and watching it scream tons of 
errors that look EXTREMELY IMPORTANT yet are also apparently 
inconsequential.

> +			if (msi_64bit && !retry_64bit) {
> +				retry_64bit = true;
> +				continue;
> +			}
> +
> +			dw_pcie_free_msi(pp);
> +			return -ENOMEM;
> +		}
> +		break;

TBH the whole loop design is a bit baroque for me, I'd have gone for a 
more straightforward tweak to the existing flow, something like:

	msi_vaddr = NULL;
	ret = dma_set_mask(32);
	if (!ret)
		msi_vaddr = dma_alloc();
	if (!msi_vaddr && msi_64bit) {
		dev_warn();
		dma_set_mask(64);
		msi_vaddr = dma_alloc();
	}
	if (!msi_vaddr) {
		dev_err();
		return;
	}
		
However I'm happy that you've captured the important functional point, 
so I'll leave the style matters up to Lorenzo.

Thanks,
Robin.

>   	}
>   
>   	return 0;
> diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> index c6725c519a47..650a7f22f9d0 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.c
> +++ b/drivers/pci/controller/dwc/pcie-designware.c
> @@ -82,6 +82,14 @@ u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap)
>   }
>   EXPORT_SYMBOL_GPL(dw_pcie_find_capability);
>   
> +u16 dw_pcie_msi_capabilities(struct dw_pcie *pci)
> +{
> +	u8 offset;
> +
> +	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_MSI);
> +	return dw_pcie_readw_dbi(pci, offset + PCI_MSI_FLAGS);
> +}
> +
>   static u16 dw_pcie_find_next_ext_capability(struct dw_pcie *pci, u16 start,
>   					    u8 cap)
>   {
> diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
> index a871ae7eb59e..45fcdfc8c035 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.h
> +++ b/drivers/pci/controller/dwc/pcie-designware.h
> @@ -332,6 +332,7 @@ void dw_pcie_version_detect(struct dw_pcie *pci);
>   
>   u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap);
>   u16 dw_pcie_find_ext_capability(struct dw_pcie *pci, u8 cap);
> +u16 dw_pcie_msi_capabilities(struct dw_pcie *pci);
>   
>   int dw_pcie_read(void __iomem *addr, int size, u32 *val);
>   int dw_pcie_write(void __iomem *addr, int size, u32 val);
William McVicker Aug. 25, 2022, 9:22 p.m. UTC | #2
On 08/25/2022, Robin Murphy wrote:
> On 2022-08-25 19:50, Will McVicker wrote:
> > Since not all devices require a 32-bit MSI address, add support to the
> > PCIe host driver to allow setting the DMA mask to 64-bits if the 32-bit
> > allocation fails. This allows kernels to disable ZONE_DMA32 and bounce
> > buffering (swiotlb) without risking not being able to get a 32-bit address
> > during DMA allocation.
> > 
> > Basically, in the slim chance that there are no 32-bit allocations
> > available, the current PCIe host driver will fail to allocate the msi_msg
> > page due to a DMA address overflow (seen in [1]). With this patch, the
> > PCIe host can retry the allocation with a 64-bit DMA mask if the current
> > PCIe device advertises 64-bit support via its MSI capabilities.
> > 
> > [1] https://lore.kernel.org/all/Yo0soniFborDl7+C@google.com/
> > 
> > Reported-by: kernel test robot <lkp@intel.com>
> > Signed-off-by: Will McVicker <willmcvicker@google.com>
> > Reviewed-by: Rob Herring <robh@kernel.org>
> > Acked-by: Jingoo Han <jingoohan1@gmail.com>
> > ---
> >   .../pci/controller/dwc/pcie-designware-host.c | 38 ++++++++++++++-----
> >   drivers/pci/controller/dwc/pcie-designware.c  |  8 ++++
> >   drivers/pci/controller/dwc/pcie-designware.h  |  1 +
> >   3 files changed, 38 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> > index 39f3b37d4033..8928a9a29d58 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > @@ -330,6 +330,9 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> >   	u64 *msi_vaddr;
> >   	int ret;
> >   	u32 ctrl, num_ctrls;
> > +	bool msi_64bit = false;
> > +	bool retry_64bit = false;
> > +	u16 msi_capabilities;
> >   	for (ctrl = 0; ctrl < MAX_MSI_CTRLS; ctrl++)
> >   		pp->irq_mask[ctrl] = ~0;
> > @@ -367,16 +370,33 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> >   						    dw_chained_msi_isr, pp);
> >   	}
> > -	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
> > -	if (ret)
> > -		dev_warn(dev, "Failed to set DMA mask to 32-bit. Devices with only 32-bit MSI support may not work properly\n");
> > +	msi_capabilities = dw_pcie_msi_capabilities(pci);
> > +	if (msi_capabilities & PCI_MSI_FLAGS_ENABLE)
> > +		msi_64bit = msi_capabilities & PCI_MSI_FLAGS_64BIT;
> > -	msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
> > -					GFP_KERNEL);
> > -	if (!msi_vaddr) {
> > -		dev_err(dev, "Failed to alloc and map MSI data\n");
> > -		dw_pcie_free_msi(pp);
> > -		return -ENOMEM;
> > +	while (true) {
> > +		dev_dbg(dev, "Setting MSI DMA mask to %s-bit.\n",
> > +			retry_64bit ? "64" : "32");
> 
> If only we has some sort of "variable" that could could store a numerical
> value, think of the possibilities... :)

Sure, now that we're trying both 32- and 64-bit, I can do that. Thanks for the
suggestion :)

> 
> > +		ret = dma_set_mask_and_coherent(dev, retry_64bit ?
> > +						DMA_BIT_MASK(64) :
> > +						DMA_BIT_MASK(32));
> > +		if (ret)
> > +			dev_warn(dev, "Failed to set DMA mask to %s-bit.\n",
> > +				 retry_64bit ? "64" : "32");
> 
> Setting a 64-bit mask should never fail, since it represents having no
> possible limitation whatsoever (I'm not sure if there are any platforms left
> where setting a 32-bit mask can actually fail in practice either, but I have
> no strong opinion on the fate of the existing warning).

Yeah, I'm not sure how this could fail. So I just left the warning and edited
the message. It's probably cleaner to just leave the warning unconditionally
based on ret.

> 
> > +
> > +		msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
> > +						GFP_KERNEL);
> > +		if (!msi_vaddr) {
> > +			dev_err(dev, "Failed to alloc and map MSI data\n");
> 
> Possibly a mattrer of personal taste, but I'd say try to avoid dev_err() for
> things that aren't actually fatal; if you're still able to continue on, at
> best it's a warning, not an error. Especially if your use-case *expects* the
> 32-bit allocation fail. There's nothing more offputting than booting a
> typical vendor kernel and watching it scream tons of errors that look
> EXTREMELY IMPORTANT yet are also apparently inconsequential.

Failing a 32-bit allocation should be a rare case, but still possible. If it
fails for both 32-bit and 64-bit, then it's very likely the PCIe device calling
dw_pcie_host_init() will fail to probe. So I'll move this down to only report
that error.

> 
> > +			if (msi_64bit && !retry_64bit) {
> > +				retry_64bit = true;
> > +				continue;
> > +			}
> > +
> > +			dw_pcie_free_msi(pp);
> > +			return -ENOMEM;
> > +		}
> > +		break;
> 
> TBH the whole loop design is a bit baroque for me, I'd have gone for a more
> straightforward tweak to the existing flow, something like:
> 
> 	msi_vaddr = NULL;
> 	ret = dma_set_mask(32);
> 	if (!ret)
> 		msi_vaddr = dma_alloc();
> 	if (!msi_vaddr && msi_64bit) {
> 		dev_warn();
> 		dma_set_mask(64);
> 		msi_vaddr = dma_alloc();
> 	}
> 	if (!msi_vaddr) {
> 		dev_err();
> 		return;
> 	}
> 		
> However I'm happy that you've captured the important functional point, so
> I'll leave the style matters up to Lorenzo.

I was trying to avoid duplicating the allocation code, but if that's preferred,
then I'm fine with it.

> 
> Thanks,
> Robin.
> 
> >   	}
> >   	return 0;
> > diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> > index c6725c519a47..650a7f22f9d0 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware.c
> > +++ b/drivers/pci/controller/dwc/pcie-designware.c
> > @@ -82,6 +82,14 @@ u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap)
> >   }
> >   EXPORT_SYMBOL_GPL(dw_pcie_find_capability);
> > +u16 dw_pcie_msi_capabilities(struct dw_pcie *pci)
> > +{
> > +	u8 offset;
> > +
> > +	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_MSI);
> > +	return dw_pcie_readw_dbi(pci, offset + PCI_MSI_FLAGS);
> > +}
> > +
> >   static u16 dw_pcie_find_next_ext_capability(struct dw_pcie *pci, u16 start,
> >   					    u8 cap)
> >   {
> > diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
> > index a871ae7eb59e..45fcdfc8c035 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware.h
> > +++ b/drivers/pci/controller/dwc/pcie-designware.h
> > @@ -332,6 +332,7 @@ void dw_pcie_version_detect(struct dw_pcie *pci);
> >   u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap);
> >   u16 dw_pcie_find_ext_capability(struct dw_pcie *pci, u8 cap);
> > +u16 dw_pcie_msi_capabilities(struct dw_pcie *pci);
> >   int dw_pcie_read(void __iomem *addr, int size, u32 *val);
> >   int dw_pcie_write(void __iomem *addr, int size, u32 val);

Thanks,
Will
Christoph Hellwig Sept. 9, 2022, 1:29 p.m. UTC | #3
On Thu, Aug 25, 2022 at 06:50:25PM +0000, Will McVicker wrote:
> Since not all devices require a 32-bit MSI address, add support to the
> PCIe host driver to allow setting the DMA mask to 64-bits if the 32-bit
> allocation fails. This allows kernels to disable ZONE_DMA32 and bounce
> buffering (swiotlb) without risking not being able to get a 32-bit address
> during DMA allocation.

Umm.  You can't just disable ZONE_DMA32.  Linux absolutely requires a
32-bit dma mask to work, it is in fact the implicit default.
Robin Murphy Sept. 9, 2022, 1:47 p.m. UTC | #4
On 2022-09-09 14:29, Christoph Hellwig wrote:
> On Thu, Aug 25, 2022 at 06:50:25PM +0000, Will McVicker wrote:
>> Since not all devices require a 32-bit MSI address, add support to the
>> PCIe host driver to allow setting the DMA mask to 64-bits if the 32-bit
>> allocation fails. This allows kernels to disable ZONE_DMA32 and bounce
>> buffering (swiotlb) without risking not being able to get a 32-bit address
>> during DMA allocation.
> 
> Umm.  You can't just disable ZONE_DMA32.  Linux absolutely requires a
> 32-bit dma mask to work, it is in fact the implicit default.

Eh, it's behind CONFIG_EXPERT, which makes it enough of a "I think I 
know what I'm doing and accept responsibility for picking up the pieces 
if it breaks" thing.

Robin.
Christoph Hellwig Sept. 9, 2022, 2:47 p.m. UTC | #5
On Fri, Sep 09, 2022 at 02:47:19PM +0100, Robin Murphy wrote:
> On 2022-09-09 14:29, Christoph Hellwig wrote:
> > On Thu, Aug 25, 2022 at 06:50:25PM +0000, Will McVicker wrote:
> > > Since not all devices require a 32-bit MSI address, add support to the
> > > PCIe host driver to allow setting the DMA mask to 64-bits if the 32-bit
> > > allocation fails. This allows kernels to disable ZONE_DMA32 and bounce
> > > buffering (swiotlb) without risking not being able to get a 32-bit address
> > > during DMA allocation.
> > 
> > Umm.  You can't just disable ZONE_DMA32.  Linux absolutely requires a
> > 32-bit dma mask to work, it is in fact the implicit default.
> 
> Eh, it's behind CONFIG_EXPERT, which makes it enough of a "I think I know
> what I'm doing and accept responsibility for picking up the pieces if it
> breaks" thing.

Seem like indeed on arm64 there is a way to disable it.  The x86 model
is to just select it unconditionally, which I think is the right way
if we don't want to get into completely random failures.
Robin Murphy Sept. 9, 2022, 3 p.m. UTC | #6
On 2022-09-09 15:47, Christoph Hellwig wrote:
> On Fri, Sep 09, 2022 at 02:47:19PM +0100, Robin Murphy wrote:
>> On 2022-09-09 14:29, Christoph Hellwig wrote:
>>> On Thu, Aug 25, 2022 at 06:50:25PM +0000, Will McVicker wrote:
>>>> Since not all devices require a 32-bit MSI address, add support to the
>>>> PCIe host driver to allow setting the DMA mask to 64-bits if the 32-bit
>>>> allocation fails. This allows kernels to disable ZONE_DMA32 and bounce
>>>> buffering (swiotlb) without risking not being able to get a 32-bit address
>>>> during DMA allocation.
>>>
>>> Umm.  You can't just disable ZONE_DMA32.  Linux absolutely requires a
>>> 32-bit dma mask to work, it is in fact the implicit default.
>>
>> Eh, it's behind CONFIG_EXPERT, which makes it enough of a "I think I know
>> what I'm doing and accept responsibility for picking up the pieces if it
>> breaks" thing.
> 
> Seem like indeed on arm64 there is a way to disable it.  The x86 model
> is to just select it unconditionally, which I think is the right way
> if we don't want to get into completely random failures.

IIRC there were reasons for wanting as much ZONE_NORMAL memory as 
possible; for the embedded folks who are already typically running with 
"swiotlb=noforce" to save memory because they know their hardware, we 
may as well let them have the footgun. Distros and other general-purpose 
configs should rightly not be going anywhere near this.

Cheers,
Robin.
Serge Semin Sept. 28, 2022, 12:05 p.m. UTC | #7
On Thu, Aug 25, 2022 at 06:50:25PM +0000, Will McVicker wrote:
> Since not all devices require a 32-bit MSI address, add support to the
> PCIe host driver to allow setting the DMA mask to 64-bits if the 32-bit
> allocation fails. This allows kernels to disable ZONE_DMA32 and bounce
> buffering (swiotlb) without risking not being able to get a 32-bit address
> during DMA allocation.

What is a problem in having the ZONE_DMA32 enabled anyway?

> 
> Basically, in the slim chance that there are no 32-bit allocations
> available, the current PCIe host driver will fail to allocate the msi_msg
> page due to a DMA address overflow (seen in [1]). With this patch, the
> PCIe host can retry the allocation with a 64-bit DMA mask if the current
> PCIe device advertises 64-bit support via its MSI capabilities.
> 
> [1] https://lore.kernel.org/all/Yo0soniFborDl7+C@google.com/

Note the reported error isn't caused by the allocation procedure, but
by the mapping procedure.

> 
> Reported-by: kernel test robot <lkp@intel.com>
> Signed-off-by: Will McVicker <willmcvicker@google.com>
> Reviewed-by: Rob Herring <robh@kernel.org>
> Acked-by: Jingoo Han <jingoohan1@gmail.com>
> ---
>  .../pci/controller/dwc/pcie-designware-host.c | 38 ++++++++++++++-----
>  drivers/pci/controller/dwc/pcie-designware.c  |  8 ++++
>  drivers/pci/controller/dwc/pcie-designware.h  |  1 +
>  3 files changed, 38 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> index 39f3b37d4033..8928a9a29d58 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -330,6 +330,9 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
>  	u64 *msi_vaddr;
>  	int ret;
>  	u32 ctrl, num_ctrls;
> +	bool msi_64bit = false;
> +	bool retry_64bit = false;
> +	u16 msi_capabilities;
>  
>  	for (ctrl = 0; ctrl < MAX_MSI_CTRLS; ctrl++)
>  		pp->irq_mask[ctrl] = ~0;
> @@ -367,16 +370,33 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
>  						    dw_chained_msi_isr, pp);
>  	}
>  
> -	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
> -	if (ret)
> -		dev_warn(dev, "Failed to set DMA mask to 32-bit. Devices with only 32-bit MSI support may not work properly\n");

> +	msi_capabilities = dw_pcie_msi_capabilities(pci);
> +	if (msi_capabilities & PCI_MSI_FLAGS_ENABLE)
> +		msi_64bit = msi_capabilities & PCI_MSI_FLAGS_64BIT;

Note this capability flag has nothing to do with the DW PCIe iMSI-RX
engine, which is used here to detect and report MSI TLPs. By design
iMSI-RX always support 64-bit addresses. If you imply having that flag
set by the DW PCIe platform drivers on the platform-specific probe
stage as an indication of MSI address range, then ok.

>  
> -	msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
> -					GFP_KERNEL);
> -	if (!msi_vaddr) {
> -		dev_err(dev, "Failed to alloc and map MSI data\n");
> -		dw_pcie_free_msi(pp);
> -		return -ENOMEM;
> +	while (true) {
> +		dev_dbg(dev, "Setting MSI DMA mask to %s-bit.\n",
> +			retry_64bit ? "64" : "32");

> +		ret = dma_set_mask_and_coherent(dev, retry_64bit ?
> +						DMA_BIT_MASK(64) :
> +						DMA_BIT_MASK(32));

I'd suggest to just drop this. No DMA actually performed on getting the
MSI TLPs. So modifying the device DMA-mask due to something which
doesn't cause DMA and based on the flag which doesn't indicates the
device DMA-capability is at least inappropriate.

> +		if (ret)
> +			dev_warn(dev, "Failed to set DMA mask to %s-bit.\n",
> +				 retry_64bit ? "64" : "32");
> +

> +		msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
> +						GFP_KERNEL);

As I noted earlier the DMA-coherent memory can be too expensive. So
it's a waste of one allocating with no intent of usage. Instead of this
just get back the alloc_page() method here and pass the flag GFP_DMA32
to that function if MSI-capability reported the PCI_MSI_FLAGS_64BIT
unset.

-Sergey

> +		if (!msi_vaddr) {
> +			dev_err(dev, "Failed to alloc and map MSI data\n");
> +			if (msi_64bit && !retry_64bit) {
> +				retry_64bit = true;
> +				continue;
> +			}
> +
> +			dw_pcie_free_msi(pp);
> +			return -ENOMEM;
> +		}
> +		break;
>  	}
>  
>  	return 0;
> diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> index c6725c519a47..650a7f22f9d0 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.c
> +++ b/drivers/pci/controller/dwc/pcie-designware.c
> @@ -82,6 +82,14 @@ u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap)
>  }
>  EXPORT_SYMBOL_GPL(dw_pcie_find_capability);
>  
> +u16 dw_pcie_msi_capabilities(struct dw_pcie *pci)
> +{
> +	u8 offset;
> +
> +	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_MSI);
> +	return dw_pcie_readw_dbi(pci, offset + PCI_MSI_FLAGS);
> +}
> +
>  static u16 dw_pcie_find_next_ext_capability(struct dw_pcie *pci, u16 start,
>  					    u8 cap)
>  {
> diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
> index a871ae7eb59e..45fcdfc8c035 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.h
> +++ b/drivers/pci/controller/dwc/pcie-designware.h
> @@ -332,6 +332,7 @@ void dw_pcie_version_detect(struct dw_pcie *pci);
>  
>  u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap);
>  u16 dw_pcie_find_ext_capability(struct dw_pcie *pci, u8 cap);
> +u16 dw_pcie_msi_capabilities(struct dw_pcie *pci);
>  
>  int dw_pcie_read(void __iomem *addr, int size, u32 *val);
>  int dw_pcie_write(void __iomem *addr, int size, u32 val);
> -- 
> 2.37.2.672.g94769d06f0-goog
> 
>
William McVicker Sept. 28, 2022, 5:52 p.m. UTC | #8
On 09/28/2022, Serge Semin wrote:
> On Thu, Aug 25, 2022 at 06:50:25PM +0000, Will McVicker wrote:
> > Since not all devices require a 32-bit MSI address, add support to the
> > PCIe host driver to allow setting the DMA mask to 64-bits if the 32-bit
> > allocation fails. This allows kernels to disable ZONE_DMA32 and bounce
> > buffering (swiotlb) without risking not being able to get a 32-bit address
> > during DMA allocation.
> 
> What is a problem in having the ZONE_DMA32 enabled anyway?

On Android most devices don't have a 32-bit limitation. Several Android OEMs
have reported significant enough performance improvements after disabling
ZONE_DMA32. These include reducing memory usage, improving the time spent by
kswapd, improving direct reclaim, and improving app launch time.

So this patch series was introduced to remove the dependency on ZONE_DMA32 for
the DW PCIe drivers.

> 
> > 
> > Basically, in the slim chance that there are no 32-bit allocations
> > available, the current PCIe host driver will fail to allocate the msi_msg
> > page due to a DMA address overflow (seen in [1]). With this patch, the
> > PCIe host can retry the allocation with a 64-bit DMA mask if the current
> > PCIe device advertises 64-bit support via its MSI capabilities.
> > 
> > [1] https://lore.kernel.org/all/Yo0soniFborDl7+C@google.com/
> 
> Note the reported error isn't caused by the allocation procedure, but
> by the mapping procedure.
> 
> > 
> > Reported-by: kernel test robot <lkp@intel.com>
> > Signed-off-by: Will McVicker <willmcvicker@google.com>
> > Reviewed-by: Rob Herring <robh@kernel.org>
> > Acked-by: Jingoo Han <jingoohan1@gmail.com>
> > ---
> >  .../pci/controller/dwc/pcie-designware-host.c | 38 ++++++++++++++-----
> >  drivers/pci/controller/dwc/pcie-designware.c  |  8 ++++
> >  drivers/pci/controller/dwc/pcie-designware.h  |  1 +
> >  3 files changed, 38 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> > index 39f3b37d4033..8928a9a29d58 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > @@ -330,6 +330,9 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> >  	u64 *msi_vaddr;
> >  	int ret;
> >  	u32 ctrl, num_ctrls;
> > +	bool msi_64bit = false;
> > +	bool retry_64bit = false;
> > +	u16 msi_capabilities;
> >  
> >  	for (ctrl = 0; ctrl < MAX_MSI_CTRLS; ctrl++)
> >  		pp->irq_mask[ctrl] = ~0;
> > @@ -367,16 +370,33 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> >  						    dw_chained_msi_isr, pp);
> >  	}
> >  
> > -	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
> > -	if (ret)
> > -		dev_warn(dev, "Failed to set DMA mask to 32-bit. Devices with only 32-bit MSI support may not work properly\n");
> 
> > +	msi_capabilities = dw_pcie_msi_capabilities(pci);
> > +	if (msi_capabilities & PCI_MSI_FLAGS_ENABLE)
> > +		msi_64bit = msi_capabilities & PCI_MSI_FLAGS_64BIT;
> 
> Note this capability flag has nothing to do with the DW PCIe iMSI-RX
> engine, which is used here to detect and report MSI TLPs. By design
> iMSI-RX always support 64-bit addresses. If you imply having that flag
> set by the DW PCIe platform drivers on the platform-specific probe
> stage as an indication of MSI address range, then ok.

Right. The DW PCIe device driver can set this flag during probe before calling
dw_pcie_host init() to ensure that we will always successfully allocate and map
the MSI target address (as required to return successfully from
dw_pcie_host_init()).

> 
> >  
> > -	msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
> > -					GFP_KERNEL);
> > -	if (!msi_vaddr) {
> > -		dev_err(dev, "Failed to alloc and map MSI data\n");
> > -		dw_pcie_free_msi(pp);
> > -		return -ENOMEM;
> > +	while (true) {
> > +		dev_dbg(dev, "Setting MSI DMA mask to %s-bit.\n",
> > +			retry_64bit ? "64" : "32");
> 
> > +		ret = dma_set_mask_and_coherent(dev, retry_64bit ?
> > +						DMA_BIT_MASK(64) :
> > +						DMA_BIT_MASK(32));
> 
> I'd suggest to just drop this. No DMA actually performed on getting the
> MSI TLPs. So modifying the device DMA-mask due to something which
> doesn't cause DMA and based on the flag which doesn't indicates the
> device DMA-capability is at least inappropriate.
> 
> > +		if (ret)
> > +			dev_warn(dev, "Failed to set DMA mask to %s-bit.\n",
> > +				 retry_64bit ? "64" : "32");
> > +
> 
> > +		msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
> > +						GFP_KERNEL);
> 
> As I noted earlier the DMA-coherent memory can be too expensive. So
> it's a waste of one allocating with no intent of usage. Instead of this
> just get back the alloc_page() method here and pass the flag GFP_DMA32
> to that function if MSI-capability reported the PCI_MSI_FLAGS_64BIT
> unset.

As mentioned above, we don't want to force this driver to require the kernel to
enable ZONE_DMA32. Since no I/O happens to this buffer, could we use
dma_alloc_attrs() with the DMA_ATTR_SKIP_CPU_SYNC and
DMA_ATTR_NO_KERNEL_MAPPING attribute? Would that address the "too expensive"
issues you're referring to?

With regards to the DMA mask, I'm okay with moving that out of the host
controller and into the DW PCIe device driver. That would address all of my
issues and we could just drop the logic for checking the PCI_MSI_FLAGS_64BIT.
However, I'm not the one you to convince to do that.

Regards,
Will

> 
> -Sergey
> 
> > +		if (!msi_vaddr) {
> > +			dev_err(dev, "Failed to alloc and map MSI data\n");
> > +			if (msi_64bit && !retry_64bit) {
> > +				retry_64bit = true;
> > +				continue;
> > +			}
> > +
> > +			dw_pcie_free_msi(pp);
> > +			return -ENOMEM;
> > +		}
> > +		break;
> >  	}
> >  
> >  	return 0;
> > diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> > index c6725c519a47..650a7f22f9d0 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware.c
> > +++ b/drivers/pci/controller/dwc/pcie-designware.c
> > @@ -82,6 +82,14 @@ u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap)
> >  }
> >  EXPORT_SYMBOL_GPL(dw_pcie_find_capability);
> >  
> > +u16 dw_pcie_msi_capabilities(struct dw_pcie *pci)
> > +{
> > +	u8 offset;
> > +
> > +	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_MSI);
> > +	return dw_pcie_readw_dbi(pci, offset + PCI_MSI_FLAGS);
> > +}
> > +
> >  static u16 dw_pcie_find_next_ext_capability(struct dw_pcie *pci, u16 start,
> >  					    u8 cap)
> >  {
> > diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
> > index a871ae7eb59e..45fcdfc8c035 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware.h
> > +++ b/drivers/pci/controller/dwc/pcie-designware.h
> > @@ -332,6 +332,7 @@ void dw_pcie_version_detect(struct dw_pcie *pci);
> >  
> >  u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap);
> >  u16 dw_pcie_find_ext_capability(struct dw_pcie *pci, u8 cap);
> > +u16 dw_pcie_msi_capabilities(struct dw_pcie *pci);
> >  
> >  int dw_pcie_read(void __iomem *addr, int size, u32 *val);
> >  int dw_pcie_write(void __iomem *addr, int size, u32 val);
> > -- 
> > 2.37.2.672.g94769d06f0-goog
> > 
> >
Lorenzo Pieralisi Sept. 29, 2022, 8:13 a.m. UTC | #9
On Wed, Sep 28, 2022 at 05:52:26PM +0000, William McVicker wrote:
> On 09/28/2022, Serge Semin wrote:
> > On Thu, Aug 25, 2022 at 06:50:25PM +0000, Will McVicker wrote:
> > > Since not all devices require a 32-bit MSI address, add support to the
> > > PCIe host driver to allow setting the DMA mask to 64-bits if the 32-bit
> > > allocation fails. This allows kernels to disable ZONE_DMA32 and bounce
> > > buffering (swiotlb) without risking not being able to get a 32-bit address
> > > during DMA allocation.
> > 
> > What is a problem in having the ZONE_DMA32 enabled anyway?
> 
> On Android most devices don't have a 32-bit limitation. Several Android OEMs
> have reported significant enough performance improvements after disabling
> ZONE_DMA32. These include reducing memory usage, improving the time spent by
> kswapd, improving direct reclaim, and improving app launch time.
> 
> So this patch series was introduced to remove the dependency on ZONE_DMA32 for
> the DW PCIe drivers.
> 
> > 
> > > 
> > > Basically, in the slim chance that there are no 32-bit allocations
> > > available, the current PCIe host driver will fail to allocate the msi_msg
> > > page due to a DMA address overflow (seen in [1]). With this patch, the
> > > PCIe host can retry the allocation with a 64-bit DMA mask if the current
> > > PCIe device advertises 64-bit support via its MSI capabilities.
> > > 
> > > [1] https://lore.kernel.org/all/Yo0soniFborDl7+C@google.com/
> > 
> > Note the reported error isn't caused by the allocation procedure, but
> > by the mapping procedure.
> > 
> > > 
> > > Reported-by: kernel test robot <lkp@intel.com>
> > > Signed-off-by: Will McVicker <willmcvicker@google.com>
> > > Reviewed-by: Rob Herring <robh@kernel.org>
> > > Acked-by: Jingoo Han <jingoohan1@gmail.com>
> > > ---
> > >  .../pci/controller/dwc/pcie-designware-host.c | 38 ++++++++++++++-----
> > >  drivers/pci/controller/dwc/pcie-designware.c  |  8 ++++
> > >  drivers/pci/controller/dwc/pcie-designware.h  |  1 +
> > >  3 files changed, 38 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> > > index 39f3b37d4033..8928a9a29d58 100644
> > > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > > @@ -330,6 +330,9 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> > >  	u64 *msi_vaddr;
> > >  	int ret;
> > >  	u32 ctrl, num_ctrls;
> > > +	bool msi_64bit = false;
> > > +	bool retry_64bit = false;
> > > +	u16 msi_capabilities;
> > >  
> > >  	for (ctrl = 0; ctrl < MAX_MSI_CTRLS; ctrl++)
> > >  		pp->irq_mask[ctrl] = ~0;
> > > @@ -367,16 +370,33 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> > >  						    dw_chained_msi_isr, pp);
> > >  	}
> > >  
> > > -	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
> > > -	if (ret)
> > > -		dev_warn(dev, "Failed to set DMA mask to 32-bit. Devices with only 32-bit MSI support may not work properly\n");
> > 
> > > +	msi_capabilities = dw_pcie_msi_capabilities(pci);
> > > +	if (msi_capabilities & PCI_MSI_FLAGS_ENABLE)
> > > +		msi_64bit = msi_capabilities & PCI_MSI_FLAGS_64BIT;
> > 
> > Note this capability flag has nothing to do with the DW PCIe iMSI-RX
> > engine, which is used here to detect and report MSI TLPs. By design
> > iMSI-RX always support 64-bit addresses. If you imply having that flag
> > set by the DW PCIe platform drivers on the platform-specific probe
> > stage as an indication of MSI address range, then ok.
> 
> Right. The DW PCIe device driver can set this flag during probe before calling
> dw_pcie_host init() to ensure that we will always successfully allocate and map
> the MSI target address (as required to return successfully from
> dw_pcie_host_init()).
> 
> > 
> > >  
> > > -	msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
> > > -					GFP_KERNEL);
> > > -	if (!msi_vaddr) {
> > > -		dev_err(dev, "Failed to alloc and map MSI data\n");
> > > -		dw_pcie_free_msi(pp);
> > > -		return -ENOMEM;
> > > +	while (true) {
> > > +		dev_dbg(dev, "Setting MSI DMA mask to %s-bit.\n",
> > > +			retry_64bit ? "64" : "32");
> > 
> > > +		ret = dma_set_mask_and_coherent(dev, retry_64bit ?
> > > +						DMA_BIT_MASK(64) :
> > > +						DMA_BIT_MASK(32));
> > 
> > I'd suggest to just drop this. No DMA actually performed on getting the
> > MSI TLPs. So modifying the device DMA-mask due to something which
> > doesn't cause DMA and based on the flag which doesn't indicates the
> > device DMA-capability is at least inappropriate.
> > 
> > > +		if (ret)
> > > +			dev_warn(dev, "Failed to set DMA mask to %s-bit.\n",
> > > +				 retry_64bit ? "64" : "32");
> > > +
> > 
> > > +		msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
> > > +						GFP_KERNEL);
> > 
> > As I noted earlier the DMA-coherent memory can be too expensive. So
> > it's a waste of one allocating with no intent of usage. Instead of this
> > just get back the alloc_page() method here and pass the flag GFP_DMA32
> > to that function if MSI-capability reported the PCI_MSI_FLAGS_64BIT
> > unset.
> 
> As mentioned above, we don't want to force this driver to require the kernel to
> enable ZONE_DMA32. Since no I/O happens to this buffer, could we use
> dma_alloc_attrs() with the DMA_ATTR_SKIP_CPU_SYNC and
> DMA_ATTR_NO_KERNEL_MAPPING attribute? Would that address the "too expensive"
> issues you're referring to?
> 
> With regards to the DMA mask, I'm okay with moving that out of the host
> controller and into the DW PCIe device driver. That would address all of my
> issues and we could just drop the logic for checking the PCI_MSI_FLAGS_64BIT.
> However, I'm not the one you to convince to do that.

We are late -rc7 and it does not look like we are converging on this
discussion - I will wait till tomorrow but then I will have to drop

https://lore.kernel.org/linux-pci/20220825235404.4132818-1-willmcvicker@google.com

from the PCI queue for v6.1 so that we can restart from a clean slate.

Lorenzo

> Regards,
> Will
> 
> > 
> > -Sergey
> > 
> > > +		if (!msi_vaddr) {
> > > +			dev_err(dev, "Failed to alloc and map MSI data\n");
> > > +			if (msi_64bit && !retry_64bit) {
> > > +				retry_64bit = true;
> > > +				continue;
> > > +			}
> > > +
> > > +			dw_pcie_free_msi(pp);
> > > +			return -ENOMEM;
> > > +		}
> > > +		break;
> > >  	}
> > >  
> > >  	return 0;
> > > diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> > > index c6725c519a47..650a7f22f9d0 100644
> > > --- a/drivers/pci/controller/dwc/pcie-designware.c
> > > +++ b/drivers/pci/controller/dwc/pcie-designware.c
> > > @@ -82,6 +82,14 @@ u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap)
> > >  }
> > >  EXPORT_SYMBOL_GPL(dw_pcie_find_capability);
> > >  
> > > +u16 dw_pcie_msi_capabilities(struct dw_pcie *pci)
> > > +{
> > > +	u8 offset;
> > > +
> > > +	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_MSI);
> > > +	return dw_pcie_readw_dbi(pci, offset + PCI_MSI_FLAGS);
> > > +}
> > > +
> > >  static u16 dw_pcie_find_next_ext_capability(struct dw_pcie *pci, u16 start,
> > >  					    u8 cap)
> > >  {
> > > diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
> > > index a871ae7eb59e..45fcdfc8c035 100644
> > > --- a/drivers/pci/controller/dwc/pcie-designware.h
> > > +++ b/drivers/pci/controller/dwc/pcie-designware.h
> > > @@ -332,6 +332,7 @@ void dw_pcie_version_detect(struct dw_pcie *pci);
> > >  
> > >  u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap);
> > >  u16 dw_pcie_find_ext_capability(struct dw_pcie *pci, u8 cap);
> > > +u16 dw_pcie_msi_capabilities(struct dw_pcie *pci);
> > >  
> > >  int dw_pcie_read(void __iomem *addr, int size, u32 *val);
> > >  int dw_pcie_write(void __iomem *addr, int size, u32 val);
> > > -- 
> > > 2.37.2.672.g94769d06f0-goog
> > > 
> > > 
>
William McVicker Sept. 29, 2022, 6:50 p.m. UTC | #10
On 09/29/2022, Lorenzo Pieralisi wrote:
> On Wed, Sep 28, 2022 at 05:52:26PM +0000, William McVicker wrote:
> > On 09/28/2022, Serge Semin wrote:
> > > On Thu, Aug 25, 2022 at 06:50:25PM +0000, Will McVicker wrote:
> > > > Since not all devices require a 32-bit MSI address, add support to the
> > > > PCIe host driver to allow setting the DMA mask to 64-bits if the 32-bit
> > > > allocation fails. This allows kernels to disable ZONE_DMA32 and bounce
> > > > buffering (swiotlb) without risking not being able to get a 32-bit address
> > > > during DMA allocation.
> > > 
> > > What is a problem in having the ZONE_DMA32 enabled anyway?
> > 
> > On Android most devices don't have a 32-bit limitation. Several Android OEMs
> > have reported significant enough performance improvements after disabling
> > ZONE_DMA32. These include reducing memory usage, improving the time spent by
> > kswapd, improving direct reclaim, and improving app launch time.
> > 
> > So this patch series was introduced to remove the dependency on ZONE_DMA32 for
> > the DW PCIe drivers.
> > 
> > > 
> > > > 
> > > > Basically, in the slim chance that there are no 32-bit allocations
> > > > available, the current PCIe host driver will fail to allocate the msi_msg
> > > > page due to a DMA address overflow (seen in [1]). With this patch, the
> > > > PCIe host can retry the allocation with a 64-bit DMA mask if the current
> > > > PCIe device advertises 64-bit support via its MSI capabilities.
> > > > 
> > > > [1] https://lore.kernel.org/all/Yo0soniFborDl7+C@google.com/
> > > 
> > > Note the reported error isn't caused by the allocation procedure, but
> > > by the mapping procedure.
> > > 
> > > > 
> > > > Reported-by: kernel test robot <lkp@intel.com>
> > > > Signed-off-by: Will McVicker <willmcvicker@google.com>
> > > > Reviewed-by: Rob Herring <robh@kernel.org>
> > > > Acked-by: Jingoo Han <jingoohan1@gmail.com>
> > > > ---
> > > >  .../pci/controller/dwc/pcie-designware-host.c | 38 ++++++++++++++-----
> > > >  drivers/pci/controller/dwc/pcie-designware.c  |  8 ++++
> > > >  drivers/pci/controller/dwc/pcie-designware.h  |  1 +
> > > >  3 files changed, 38 insertions(+), 9 deletions(-)
> > > > 
> > > > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> > > > index 39f3b37d4033..8928a9a29d58 100644
> > > > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > > > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > > > @@ -330,6 +330,9 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> > > >  	u64 *msi_vaddr;
> > > >  	int ret;
> > > >  	u32 ctrl, num_ctrls;
> > > > +	bool msi_64bit = false;
> > > > +	bool retry_64bit = false;
> > > > +	u16 msi_capabilities;
> > > >  
> > > >  	for (ctrl = 0; ctrl < MAX_MSI_CTRLS; ctrl++)
> > > >  		pp->irq_mask[ctrl] = ~0;
> > > > @@ -367,16 +370,33 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> > > >  						    dw_chained_msi_isr, pp);
> > > >  	}
> > > >  
> > > > -	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
> > > > -	if (ret)
> > > > -		dev_warn(dev, "Failed to set DMA mask to 32-bit. Devices with only 32-bit MSI support may not work properly\n");
> > > 
> > > > +	msi_capabilities = dw_pcie_msi_capabilities(pci);
> > > > +	if (msi_capabilities & PCI_MSI_FLAGS_ENABLE)
> > > > +		msi_64bit = msi_capabilities & PCI_MSI_FLAGS_64BIT;
> > > 
> > > Note this capability flag has nothing to do with the DW PCIe iMSI-RX
> > > engine, which is used here to detect and report MSI TLPs. By design
> > > iMSI-RX always support 64-bit addresses. If you imply having that flag
> > > set by the DW PCIe platform drivers on the platform-specific probe
> > > stage as an indication of MSI address range, then ok.
> > 
> > Right. The DW PCIe device driver can set this flag during probe before calling
> > dw_pcie_host init() to ensure that we will always successfully allocate and map
> > the MSI target address (as required to return successfully from
> > dw_pcie_host_init()).
> > 
> > > 
> > > >  
> > > > -	msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
> > > > -					GFP_KERNEL);
> > > > -	if (!msi_vaddr) {
> > > > -		dev_err(dev, "Failed to alloc and map MSI data\n");
> > > > -		dw_pcie_free_msi(pp);
> > > > -		return -ENOMEM;
> > > > +	while (true) {
> > > > +		dev_dbg(dev, "Setting MSI DMA mask to %s-bit.\n",
> > > > +			retry_64bit ? "64" : "32");
> > > 
> > > > +		ret = dma_set_mask_and_coherent(dev, retry_64bit ?
> > > > +						DMA_BIT_MASK(64) :
> > > > +						DMA_BIT_MASK(32));
> > > 
> > > I'd suggest to just drop this. No DMA actually performed on getting the
> > > MSI TLPs. So modifying the device DMA-mask due to something which
> > > doesn't cause DMA and based on the flag which doesn't indicates the
> > > device DMA-capability is at least inappropriate.
> > > 
> > > > +		if (ret)
> > > > +			dev_warn(dev, "Failed to set DMA mask to %s-bit.\n",
> > > > +				 retry_64bit ? "64" : "32");
> > > > +
> > > 
> > > > +		msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
> > > > +						GFP_KERNEL);
> > > 
> > > As I noted earlier the DMA-coherent memory can be too expensive. So
> > > it's a waste of one allocating with no intent of usage. Instead of this
> > > just get back the alloc_page() method here and pass the flag GFP_DMA32
> > > to that function if MSI-capability reported the PCI_MSI_FLAGS_64BIT
> > > unset.
> > 
> > As mentioned above, we don't want to force this driver to require the kernel to
> > enable ZONE_DMA32. Since no I/O happens to this buffer, could we use
> > dma_alloc_attrs() with the DMA_ATTR_SKIP_CPU_SYNC and
> > DMA_ATTR_NO_KERNEL_MAPPING attribute? Would that address the "too expensive"
> > issues you're referring to?
> > 
> > With regards to the DMA mask, I'm okay with moving that out of the host
> > controller and into the DW PCIe device driver. That would address all of my
> > issues and we could just drop the logic for checking the PCI_MSI_FLAGS_64BIT.
> > However, I'm not the one you to convince to do that.
> 
> We are late -rc7 and it does not look like we are converging on this
> discussion - I will wait till tomorrow but then I will have to drop
> 
> https://lore.kernel.org/linux-pci/20220825235404.4132818-1-willmcvicker@google.com
> 
> from the PCI queue for v6.1 so that we can restart from a clean slate.
> 
> Lorenzo
> 

Hi Lorenzo,

Based on Robin's response [1], I don't think we should change the
implementation based on MIPS32 until we have (1) someone showing MIPS32 is
using this driver and (2) that there's an actual perf regression when using
dmam_alloc_coherent(). My patch series addresses a real issue by removing the
dependency on ZONE_DMA32. Even if we did drop my patches, it won't solve
Serge's DMA mask issues since the DW PCIe host driver will continue to
unconditionally set the mask to 32-bits.

[1] https://lore.kernel.org/linux-pci/4dc31a63-00b1-f379-c5ac-7dc9425937f4@arm.com/

Thanks,
Will

> > Regards,
> > Will
> > 
> > > 
> > > -Sergey
> > > 
> > > > +		if (!msi_vaddr) {
> > > > +			dev_err(dev, "Failed to alloc and map MSI data\n");
> > > > +			if (msi_64bit && !retry_64bit) {
> > > > +				retry_64bit = true;
> > > > +				continue;
> > > > +			}
> > > > +
> > > > +			dw_pcie_free_msi(pp);
> > > > +			return -ENOMEM;
> > > > +		}
> > > > +		break;
> > > >  	}
> > > >  
> > > >  	return 0;
> > > > diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> > > > index c6725c519a47..650a7f22f9d0 100644
> > > > --- a/drivers/pci/controller/dwc/pcie-designware.c
> > > > +++ b/drivers/pci/controller/dwc/pcie-designware.c
> > > > @@ -82,6 +82,14 @@ u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap)
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(dw_pcie_find_capability);
> > > >  
> > > > +u16 dw_pcie_msi_capabilities(struct dw_pcie *pci)
> > > > +{
> > > > +	u8 offset;
> > > > +
> > > > +	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_MSI);
> > > > +	return dw_pcie_readw_dbi(pci, offset + PCI_MSI_FLAGS);
> > > > +}
> > > > +
> > > >  static u16 dw_pcie_find_next_ext_capability(struct dw_pcie *pci, u16 start,
> > > >  					    u8 cap)
> > > >  {
> > > > diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
> > > > index a871ae7eb59e..45fcdfc8c035 100644
> > > > --- a/drivers/pci/controller/dwc/pcie-designware.h
> > > > +++ b/drivers/pci/controller/dwc/pcie-designware.h
> > > > @@ -332,6 +332,7 @@ void dw_pcie_version_detect(struct dw_pcie *pci);
> > > >  
> > > >  u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap);
> > > >  u16 dw_pcie_find_ext_capability(struct dw_pcie *pci, u8 cap);
> > > > +u16 dw_pcie_msi_capabilities(struct dw_pcie *pci);
> > > >  
> > > >  int dw_pcie_read(void __iomem *addr, int size, u32 *val);
> > > >  int dw_pcie_write(void __iomem *addr, int size, u32 val);
> > > > -- 
> > > > 2.37.2.672.g94769d06f0-goog
> > > > 
> > > > 
> >
Serge Semin Sept. 29, 2022, 7 p.m. UTC | #11
On Thu, Sep 29, 2022 at 06:50:01PM +0000, William McVicker wrote:
> On 09/29/2022, Lorenzo Pieralisi wrote:
> > On Wed, Sep 28, 2022 at 05:52:26PM +0000, William McVicker wrote:
> > > On 09/28/2022, Serge Semin wrote:
> > > > On Thu, Aug 25, 2022 at 06:50:25PM +0000, Will McVicker wrote:
> > > > > Since not all devices require a 32-bit MSI address, add support to the
> > > > > PCIe host driver to allow setting the DMA mask to 64-bits if the 32-bit
> > > > > allocation fails. This allows kernels to disable ZONE_DMA32 and bounce
> > > > > buffering (swiotlb) without risking not being able to get a 32-bit address
> > > > > during DMA allocation.
> > > > 
> > > > What is a problem in having the ZONE_DMA32 enabled anyway?
> > > 
> > > On Android most devices don't have a 32-bit limitation. Several Android OEMs
> > > have reported significant enough performance improvements after disabling
> > > ZONE_DMA32. These include reducing memory usage, improving the time spent by
> > > kswapd, improving direct reclaim, and improving app launch time.
> > > 
> > > So this patch series was introduced to remove the dependency on ZONE_DMA32 for
> > > the DW PCIe drivers.
> > > 
> > > > 
> > > > > 
> > > > > Basically, in the slim chance that there are no 32-bit allocations
> > > > > available, the current PCIe host driver will fail to allocate the msi_msg
> > > > > page due to a DMA address overflow (seen in [1]). With this patch, the
> > > > > PCIe host can retry the allocation with a 64-bit DMA mask if the current
> > > > > PCIe device advertises 64-bit support via its MSI capabilities.
> > > > > 
> > > > > [1] https://lore.kernel.org/all/Yo0soniFborDl7+C@google.com/
> > > > 
> > > > Note the reported error isn't caused by the allocation procedure, but
> > > > by the mapping procedure.
> > > > 
> > > > > 
> > > > > Reported-by: kernel test robot <lkp@intel.com>
> > > > > Signed-off-by: Will McVicker <willmcvicker@google.com>
> > > > > Reviewed-by: Rob Herring <robh@kernel.org>
> > > > > Acked-by: Jingoo Han <jingoohan1@gmail.com>
> > > > > ---
> > > > >  .../pci/controller/dwc/pcie-designware-host.c | 38 ++++++++++++++-----
> > > > >  drivers/pci/controller/dwc/pcie-designware.c  |  8 ++++
> > > > >  drivers/pci/controller/dwc/pcie-designware.h  |  1 +
> > > > >  3 files changed, 38 insertions(+), 9 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> > > > > index 39f3b37d4033..8928a9a29d58 100644
> > > > > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > > > > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > > > > @@ -330,6 +330,9 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> > > > >  	u64 *msi_vaddr;
> > > > >  	int ret;
> > > > >  	u32 ctrl, num_ctrls;
> > > > > +	bool msi_64bit = false;
> > > > > +	bool retry_64bit = false;
> > > > > +	u16 msi_capabilities;
> > > > >  
> > > > >  	for (ctrl = 0; ctrl < MAX_MSI_CTRLS; ctrl++)
> > > > >  		pp->irq_mask[ctrl] = ~0;
> > > > > @@ -367,16 +370,33 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> > > > >  						    dw_chained_msi_isr, pp);
> > > > >  	}
> > > > >  
> > > > > -	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
> > > > > -	if (ret)
> > > > > -		dev_warn(dev, "Failed to set DMA mask to 32-bit. Devices with only 32-bit MSI support may not work properly\n");
> > > > 
> > > > > +	msi_capabilities = dw_pcie_msi_capabilities(pci);
> > > > > +	if (msi_capabilities & PCI_MSI_FLAGS_ENABLE)
> > > > > +		msi_64bit = msi_capabilities & PCI_MSI_FLAGS_64BIT;
> > > > 
> > > > Note this capability flag has nothing to do with the DW PCIe iMSI-RX
> > > > engine, which is used here to detect and report MSI TLPs. By design
> > > > iMSI-RX always support 64-bit addresses. If you imply having that flag
> > > > set by the DW PCIe platform drivers on the platform-specific probe
> > > > stage as an indication of MSI address range, then ok.
> > > 
> > > Right. The DW PCIe device driver can set this flag during probe before calling
> > > dw_pcie_host init() to ensure that we will always successfully allocate and map
> > > the MSI target address (as required to return successfully from
> > > dw_pcie_host_init()).
> > > 
> > > > 
> > > > >  
> > > > > -	msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
> > > > > -					GFP_KERNEL);
> > > > > -	if (!msi_vaddr) {
> > > > > -		dev_err(dev, "Failed to alloc and map MSI data\n");
> > > > > -		dw_pcie_free_msi(pp);
> > > > > -		return -ENOMEM;
> > > > > +	while (true) {
> > > > > +		dev_dbg(dev, "Setting MSI DMA mask to %s-bit.\n",
> > > > > +			retry_64bit ? "64" : "32");
> > > > 
> > > > > +		ret = dma_set_mask_and_coherent(dev, retry_64bit ?
> > > > > +						DMA_BIT_MASK(64) :
> > > > > +						DMA_BIT_MASK(32));
> > > > 
> > > > I'd suggest to just drop this. No DMA actually performed on getting the
> > > > MSI TLPs. So modifying the device DMA-mask due to something which
> > > > doesn't cause DMA and based on the flag which doesn't indicates the
> > > > device DMA-capability is at least inappropriate.
> > > > 
> > > > > +		if (ret)
> > > > > +			dev_warn(dev, "Failed to set DMA mask to %s-bit.\n",
> > > > > +				 retry_64bit ? "64" : "32");
> > > > > +
> > > > 
> > > > > +		msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
> > > > > +						GFP_KERNEL);
> > > > 
> > > > As I noted earlier the DMA-coherent memory can be too expensive. So
> > > > it's a waste of one allocating with no intent of usage. Instead of this
> > > > just get back the alloc_page() method here and pass the flag GFP_DMA32
> > > > to that function if MSI-capability reported the PCI_MSI_FLAGS_64BIT
> > > > unset.
> > > 
> > > As mentioned above, we don't want to force this driver to require the kernel to
> > > enable ZONE_DMA32. Since no I/O happens to this buffer, could we use
> > > dma_alloc_attrs() with the DMA_ATTR_SKIP_CPU_SYNC and
> > > DMA_ATTR_NO_KERNEL_MAPPING attribute? Would that address the "too expensive"
> > > issues you're referring to?
> > > 
> > > With regards to the DMA mask, I'm okay with moving that out of the host
> > > controller and into the DW PCIe device driver. That would address all of my
> > > issues and we could just drop the logic for checking the PCI_MSI_FLAGS_64BIT.
> > > However, I'm not the one you to convince to do that.
> > 
> > We are late -rc7 and it does not look like we are converging on this
> > discussion - I will wait till tomorrow but then I will have to drop
> > 
> > https://lore.kernel.org/linux-pci/20220825235404.4132818-1-willmcvicker@google.com
> > 
> > from the PCI queue for v6.1 so that we can restart from a clean slate.
> > 
> > Lorenzo
> > 
> 
> Hi Lorenzo,
> 
> Based on Robin's response [1], I don't think we should change the
> implementation based on MIPS32 until we have (1) someone showing MIPS32 is
> using this driver and 

This patch adds the DW PCIe controller implemented in the framework of
MIPS32 arch:
https://lore.kernel.org/linux-pci/20220822184701.25246-21-Sergey.Semin@baikalelectronics.ru/

> (2) that there's an actual perf regression when using
> dmam_alloc_coherent(). My patch series addresses a real issue by removing the
> dependency on ZONE_DMA32.

What about finding out what is a root cause of the performance
degradation instead of just dropping the whole standard zone support?

> Even if we did drop my patches, it won't solve
> Serge's DMA mask issues since the DW PCIe host driver will continue to
> unconditionally set the mask to 32-bits.

If you moved the DMA-mask setting to the platform driver that would
have solved my problems. I am pretty much sure the generic code
shouldn't be altering the DMA-mask if it isn't aware of the actual
device capability. In case of DW PCIe controller the AXI-bus address
width is a platform specific parameter and the generic DW PCIe code
doesn't know which width is valid.

-Sergey

> 
> [1] https://lore.kernel.org/linux-pci/4dc31a63-00b1-f379-c5ac-7dc9425937f4@arm.com/
> 
> Thanks,
> Will
> 
> > > Regards,
> > > Will
> > > 
> > > > 
> > > > -Sergey
> > > > 
> > > > > +		if (!msi_vaddr) {
> > > > > +			dev_err(dev, "Failed to alloc and map MSI data\n");
> > > > > +			if (msi_64bit && !retry_64bit) {
> > > > > +				retry_64bit = true;
> > > > > +				continue;
> > > > > +			}
> > > > > +
> > > > > +			dw_pcie_free_msi(pp);
> > > > > +			return -ENOMEM;
> > > > > +		}
> > > > > +		break;
> > > > >  	}
> > > > >  
> > > > >  	return 0;
> > > > > diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> > > > > index c6725c519a47..650a7f22f9d0 100644
> > > > > --- a/drivers/pci/controller/dwc/pcie-designware.c
> > > > > +++ b/drivers/pci/controller/dwc/pcie-designware.c
> > > > > @@ -82,6 +82,14 @@ u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap)
> > > > >  }
> > > > >  EXPORT_SYMBOL_GPL(dw_pcie_find_capability);
> > > > >  
> > > > > +u16 dw_pcie_msi_capabilities(struct dw_pcie *pci)
> > > > > +{
> > > > > +	u8 offset;
> > > > > +
> > > > > +	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_MSI);
> > > > > +	return dw_pcie_readw_dbi(pci, offset + PCI_MSI_FLAGS);
> > > > > +}
> > > > > +
> > > > >  static u16 dw_pcie_find_next_ext_capability(struct dw_pcie *pci, u16 start,
> > > > >  					    u8 cap)
> > > > >  {
> > > > > diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
> > > > > index a871ae7eb59e..45fcdfc8c035 100644
> > > > > --- a/drivers/pci/controller/dwc/pcie-designware.h
> > > > > +++ b/drivers/pci/controller/dwc/pcie-designware.h
> > > > > @@ -332,6 +332,7 @@ void dw_pcie_version_detect(struct dw_pcie *pci);
> > > > >  
> > > > >  u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap);
> > > > >  u16 dw_pcie_find_ext_capability(struct dw_pcie *pci, u8 cap);
> > > > > +u16 dw_pcie_msi_capabilities(struct dw_pcie *pci);
> > > > >  
> > > > >  int dw_pcie_read(void __iomem *addr, int size, u32 *val);
> > > > >  int dw_pcie_write(void __iomem *addr, int size, u32 val);
> > > > > -- 
> > > > > 2.37.2.672.g94769d06f0-goog
> > > > > 
> > > > > 
> > >
Lorenzo Pieralisi Sept. 30, 2022, 1:46 p.m. UTC | #12
On Wed, Sep 28, 2022 at 03:05:10PM +0300, Serge Semin wrote:
> On Thu, Aug 25, 2022 at 06:50:25PM +0000, Will McVicker wrote:

[...]

> > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> > index 39f3b37d4033..8928a9a29d58 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > @@ -330,6 +330,9 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> >  	u64 *msi_vaddr;
> >  	int ret;
> >  	u32 ctrl, num_ctrls;
> > +	bool msi_64bit = false;
> > +	bool retry_64bit = false;
> > +	u16 msi_capabilities;
> >  
> >  	for (ctrl = 0; ctrl < MAX_MSI_CTRLS; ctrl++)
> >  		pp->irq_mask[ctrl] = ~0;
> > @@ -367,16 +370,33 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> >  						    dw_chained_msi_isr, pp);
> >  	}
> >  
> > -	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
> > -	if (ret)
> > -		dev_warn(dev, "Failed to set DMA mask to 32-bit. Devices with only 32-bit MSI support may not work properly\n");
> 
> > +	msi_capabilities = dw_pcie_msi_capabilities(pci);
> > +	if (msi_capabilities & PCI_MSI_FLAGS_ENABLE)
> > +		msi_64bit = msi_capabilities & PCI_MSI_FLAGS_64BIT;
> 
> Note this capability flag has nothing to do with the DW PCIe iMSI-RX
> engine, which is used here to detect and report MSI TLPs. By design
> iMSI-RX always support 64-bit addresses. If you imply having that flag
> set by the DW PCIe platform drivers on the platform-specific probe
> stage as an indication of MSI address range, then ok.

The MSI cap shows that - AFAICS - the RP can be programmed with
a 64-bit MSI(DMA) address. It is fair to say that this is not
enough to guarantee that the DMA mask for the host bridge can be
inferred to be 64-bit though.

I am inclined to drop this patch (only) from the PCI queue because
it is unclear to me whether it really does the right thing.

Lorenzo

> > -	msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
> > -					GFP_KERNEL);
> > -	if (!msi_vaddr) {
> > -		dev_err(dev, "Failed to alloc and map MSI data\n");
> > -		dw_pcie_free_msi(pp);
> > -		return -ENOMEM;
> > +	while (true) {
> > +		dev_dbg(dev, "Setting MSI DMA mask to %s-bit.\n",
> > +			retry_64bit ? "64" : "32");
> 
> > +		ret = dma_set_mask_and_coherent(dev, retry_64bit ?
> > +						DMA_BIT_MASK(64) :
> > +						DMA_BIT_MASK(32));
> 
> I'd suggest to just drop this. No DMA actually performed on getting the
> MSI TLPs. So modifying the device DMA-mask due to something which
> doesn't cause DMA and based on the flag which doesn't indicates the
> device DMA-capability is at least inappropriate.
> 
> > +		if (ret)
> > +			dev_warn(dev, "Failed to set DMA mask to %s-bit.\n",
> > +				 retry_64bit ? "64" : "32");
> > +
> 
> > +		msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
> > +						GFP_KERNEL);
> 
> As I noted earlier the DMA-coherent memory can be too expensive. So
> it's a waste of one allocating with no intent of usage. Instead of this
> just get back the alloc_page() method here and pass the flag GFP_DMA32
> to that function if MSI-capability reported the PCI_MSI_FLAGS_64BIT
> unset.
> 
> -Sergey
> 
> > +		if (!msi_vaddr) {
> > +			dev_err(dev, "Failed to alloc and map MSI data\n");
> > +			if (msi_64bit && !retry_64bit) {
> > +				retry_64bit = true;
> > +				continue;
> > +			}
> > +
> > +			dw_pcie_free_msi(pp);
> > +			return -ENOMEM;
> > +		}
> > +		break;
> >  	}
> >  
> >  	return 0;
> > diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> > index c6725c519a47..650a7f22f9d0 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware.c
> > +++ b/drivers/pci/controller/dwc/pcie-designware.c
> > @@ -82,6 +82,14 @@ u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap)
> >  }
> >  EXPORT_SYMBOL_GPL(dw_pcie_find_capability);
> >  
> > +u16 dw_pcie_msi_capabilities(struct dw_pcie *pci)
> > +{
> > +	u8 offset;
> > +
> > +	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_MSI);
> > +	return dw_pcie_readw_dbi(pci, offset + PCI_MSI_FLAGS);
> > +}
> > +
> >  static u16 dw_pcie_find_next_ext_capability(struct dw_pcie *pci, u16 start,
> >  					    u8 cap)
> >  {
> > diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
> > index a871ae7eb59e..45fcdfc8c035 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware.h
> > +++ b/drivers/pci/controller/dwc/pcie-designware.h
> > @@ -332,6 +332,7 @@ void dw_pcie_version_detect(struct dw_pcie *pci);
> >  
> >  u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap);
> >  u16 dw_pcie_find_ext_capability(struct dw_pcie *pci, u8 cap);
> > +u16 dw_pcie_msi_capabilities(struct dw_pcie *pci);
> >  
> >  int dw_pcie_read(void __iomem *addr, int size, u32 *val);
> >  int dw_pcie_write(void __iomem *addr, int size, u32 val);
> > -- 
> > 2.37.2.672.g94769d06f0-goog
> > 
> >
Serge Semin Sept. 30, 2022, 2:14 p.m. UTC | #13
On Fri, Sep 30, 2022 at 03:46:56PM +0200, Lorenzo Pieralisi wrote:
> On Wed, Sep 28, 2022 at 03:05:10PM +0300, Serge Semin wrote:
> > On Thu, Aug 25, 2022 at 06:50:25PM +0000, Will McVicker wrote:
> 
> [...]
> 
> > > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> > > index 39f3b37d4033..8928a9a29d58 100644
> > > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > > @@ -330,6 +330,9 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> > >  	u64 *msi_vaddr;
> > >  	int ret;
> > >  	u32 ctrl, num_ctrls;
> > > +	bool msi_64bit = false;
> > > +	bool retry_64bit = false;
> > > +	u16 msi_capabilities;
> > >  
> > >  	for (ctrl = 0; ctrl < MAX_MSI_CTRLS; ctrl++)
> > >  		pp->irq_mask[ctrl] = ~0;
> > > @@ -367,16 +370,33 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> > >  						    dw_chained_msi_isr, pp);
> > >  	}
> > >  
> > > -	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
> > > -	if (ret)
> > > -		dev_warn(dev, "Failed to set DMA mask to 32-bit. Devices with only 32-bit MSI support may not work properly\n");
> > 
> > > +	msi_capabilities = dw_pcie_msi_capabilities(pci);
> > > +	if (msi_capabilities & PCI_MSI_FLAGS_ENABLE)
> > > +		msi_64bit = msi_capabilities & PCI_MSI_FLAGS_64BIT;
> > 
> > Note this capability flag has nothing to do with the DW PCIe iMSI-RX
> > engine, which is used here to detect and report MSI TLPs. By design
> > iMSI-RX always support 64-bit addresses. If you imply having that flag
> > set by the DW PCIe platform drivers on the platform-specific probe
> > stage as an indication of MSI address range, then ok.
> 

> The MSI cap shows that - AFAICS - the RP can be programmed with
> a 64-bit MSI(DMA) address. It is fair to say that this is not
> enough to guarantee that the DMA mask for the host bridge can be
> inferred to be 64-bit though.

iMSI-RX always supports 64-bit bus addresses by design. The
MSI-control CSRs are unconditionally permit having 64-bit address
setup. So you can't even synthesize the DW PCIe RP IP-core with only
32-bits MSI support. AFAICS what @William is introducing here is
MSI_FLAGS_64BIT usage as a flag, which could be manually set by the
platform drivers and would indicate that the platform driver permits
having 64-bit MSI TLPs. I guess the platforms are supposed to know
better which PCIe device are going to live on the bus and set that
flag accordingly. It isn't true of course without the bus
pre-scanning.

> 
> I am inclined to drop this patch (only) from the PCI queue because
> it is unclear to me whether it really does the right thing.

Let's wait for the Robin response for my last comment regarding the
patch 1 fate too.

-Sergey

> 
> Lorenzo
> 
> > > -	msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
> > > -					GFP_KERNEL);
> > > -	if (!msi_vaddr) {
> > > -		dev_err(dev, "Failed to alloc and map MSI data\n");
> > > -		dw_pcie_free_msi(pp);
> > > -		return -ENOMEM;
> > > +	while (true) {
> > > +		dev_dbg(dev, "Setting MSI DMA mask to %s-bit.\n",
> > > +			retry_64bit ? "64" : "32");
> > 
> > > +		ret = dma_set_mask_and_coherent(dev, retry_64bit ?
> > > +						DMA_BIT_MASK(64) :
> > > +						DMA_BIT_MASK(32));
> > 
> > I'd suggest to just drop this. No DMA actually performed on getting the
> > MSI TLPs. So modifying the device DMA-mask due to something which
> > doesn't cause DMA and based on the flag which doesn't indicates the
> > device DMA-capability is at least inappropriate.
> > 
> > > +		if (ret)
> > > +			dev_warn(dev, "Failed to set DMA mask to %s-bit.\n",
> > > +				 retry_64bit ? "64" : "32");
> > > +
> > 
> > > +		msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
> > > +						GFP_KERNEL);
> > 
> > As I noted earlier the DMA-coherent memory can be too expensive. So
> > it's a waste of one allocating with no intent of usage. Instead of this
> > just get back the alloc_page() method here and pass the flag GFP_DMA32
> > to that function if MSI-capability reported the PCI_MSI_FLAGS_64BIT
> > unset.
> > 
> > -Sergey
> > 
> > > +		if (!msi_vaddr) {
> > > +			dev_err(dev, "Failed to alloc and map MSI data\n");
> > > +			if (msi_64bit && !retry_64bit) {
> > > +				retry_64bit = true;
> > > +				continue;
> > > +			}
> > > +
> > > +			dw_pcie_free_msi(pp);
> > > +			return -ENOMEM;
> > > +		}
> > > +		break;
> > >  	}
> > >  
> > >  	return 0;
> > > diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> > > index c6725c519a47..650a7f22f9d0 100644
> > > --- a/drivers/pci/controller/dwc/pcie-designware.c
> > > +++ b/drivers/pci/controller/dwc/pcie-designware.c
> > > @@ -82,6 +82,14 @@ u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap)
> > >  }
> > >  EXPORT_SYMBOL_GPL(dw_pcie_find_capability);
> > >  
> > > +u16 dw_pcie_msi_capabilities(struct dw_pcie *pci)
> > > +{
> > > +	u8 offset;
> > > +
> > > +	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_MSI);
> > > +	return dw_pcie_readw_dbi(pci, offset + PCI_MSI_FLAGS);
> > > +}
> > > +
> > >  static u16 dw_pcie_find_next_ext_capability(struct dw_pcie *pci, u16 start,
> > >  					    u8 cap)
> > >  {
> > > diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
> > > index a871ae7eb59e..45fcdfc8c035 100644
> > > --- a/drivers/pci/controller/dwc/pcie-designware.h
> > > +++ b/drivers/pci/controller/dwc/pcie-designware.h
> > > @@ -332,6 +332,7 @@ void dw_pcie_version_detect(struct dw_pcie *pci);
> > >  
> > >  u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap);
> > >  u16 dw_pcie_find_ext_capability(struct dw_pcie *pci, u8 cap);
> > > +u16 dw_pcie_msi_capabilities(struct dw_pcie *pci);
> > >  
> > >  int dw_pcie_read(void __iomem *addr, int size, u32 *val);
> > >  int dw_pcie_write(void __iomem *addr, int size, u32 val);
> > > -- 
> > > 2.37.2.672.g94769d06f0-goog
> > > 
> > >
diff mbox series

Patch

diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index 39f3b37d4033..8928a9a29d58 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -330,6 +330,9 @@  static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
 	u64 *msi_vaddr;
 	int ret;
 	u32 ctrl, num_ctrls;
+	bool msi_64bit = false;
+	bool retry_64bit = false;
+	u16 msi_capabilities;
 
 	for (ctrl = 0; ctrl < MAX_MSI_CTRLS; ctrl++)
 		pp->irq_mask[ctrl] = ~0;
@@ -367,16 +370,33 @@  static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
 						    dw_chained_msi_isr, pp);
 	}
 
-	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
-	if (ret)
-		dev_warn(dev, "Failed to set DMA mask to 32-bit. Devices with only 32-bit MSI support may not work properly\n");
+	msi_capabilities = dw_pcie_msi_capabilities(pci);
+	if (msi_capabilities & PCI_MSI_FLAGS_ENABLE)
+		msi_64bit = msi_capabilities & PCI_MSI_FLAGS_64BIT;
 
-	msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
-					GFP_KERNEL);
-	if (!msi_vaddr) {
-		dev_err(dev, "Failed to alloc and map MSI data\n");
-		dw_pcie_free_msi(pp);
-		return -ENOMEM;
+	while (true) {
+		dev_dbg(dev, "Setting MSI DMA mask to %s-bit.\n",
+			retry_64bit ? "64" : "32");
+		ret = dma_set_mask_and_coherent(dev, retry_64bit ?
+						DMA_BIT_MASK(64) :
+						DMA_BIT_MASK(32));
+		if (ret)
+			dev_warn(dev, "Failed to set DMA mask to %s-bit.\n",
+				 retry_64bit ? "64" : "32");
+
+		msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
+						GFP_KERNEL);
+		if (!msi_vaddr) {
+			dev_err(dev, "Failed to alloc and map MSI data\n");
+			if (msi_64bit && !retry_64bit) {
+				retry_64bit = true;
+				continue;
+			}
+
+			dw_pcie_free_msi(pp);
+			return -ENOMEM;
+		}
+		break;
 	}
 
 	return 0;
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index c6725c519a47..650a7f22f9d0 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -82,6 +82,14 @@  u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap)
 }
 EXPORT_SYMBOL_GPL(dw_pcie_find_capability);
 
+u16 dw_pcie_msi_capabilities(struct dw_pcie *pci)
+{
+	u8 offset;
+
+	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_MSI);
+	return dw_pcie_readw_dbi(pci, offset + PCI_MSI_FLAGS);
+}
+
 static u16 dw_pcie_find_next_ext_capability(struct dw_pcie *pci, u16 start,
 					    u8 cap)
 {
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index a871ae7eb59e..45fcdfc8c035 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -332,6 +332,7 @@  void dw_pcie_version_detect(struct dw_pcie *pci);
 
 u8 dw_pcie_find_capability(struct dw_pcie *pci, u8 cap);
 u16 dw_pcie_find_ext_capability(struct dw_pcie *pci, u8 cap);
+u16 dw_pcie_msi_capabilities(struct dw_pcie *pci);
 
 int dw_pcie_read(void __iomem *addr, int size, u32 *val);
 int dw_pcie_write(void __iomem *addr, int size, u32 val);