diff mbox series

[v12,4/6] iommu/s390: Disable deferred flush for ISM devices

Message ID 20230825-dma_iommu-v12-4-4134455994a7@linux.ibm.com (mailing list archive)
State Not Applicable
Headers show
Series iommu/dma: s390 DMA API conversion and optimized IOTLB flushing | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Niklas Schnelle Aug. 25, 2023, 10:11 a.m. UTC
ISM devices are virtual PCI devices used for cross-LPAR communication.
Unlike real PCI devices ISM devices do not use the hardware IOMMU but
inspects IOMMU translation tables directly on IOTLB flush (s390 RPCIT
instruction).

ISM devices keep their DMA allocations static and only very rarely DMA
unmap at all. For each IOTLB flush that occurs after unmap the ISM
devices will however inspect the area of the IOVA space indicated by the
flush. This means that for the global IOTLB flushes used by the flush
queue mechanism the entire IOVA space would be inspected. In principle
this would be fine, albeit potentially unnecessarily slow, it turns out
however that ISM devices are sensitive to seeing IOVA addresses that are
currently in use in the IOVA range being flushed. Seeing such in-use
IOVA addresses will cause the ISM device to enter an error state and
become unusable.

Fix this by claiming IOMMU_CAP_DEFERRED_FLUSH only for non-ISM devices.
This makes sure IOTLB flushes only cover IOVAs that have been unmapped
and also restricts the range of the IOTLB flush potentially reducing
latency spikes.

Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
---
 drivers/iommu/s390-iommu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Matthew Rosato Aug. 25, 2023, 6:23 p.m. UTC | #1
On 8/25/23 6:11 AM, Niklas Schnelle wrote:
> ISM devices are virtual PCI devices used for cross-LPAR communication.
> Unlike real PCI devices ISM devices do not use the hardware IOMMU but
> inspects IOMMU translation tables directly on IOTLB flush (s390 RPCIT
> instruction).
> 
> ISM devices keep their DMA allocations static and only very rarely DMA
> unmap at all. For each IOTLB flush that occurs after unmap the ISM
> devices will however inspect the area of the IOVA space indicated by the
> flush. This means that for the global IOTLB flushes used by the flush
> queue mechanism the entire IOVA space would be inspected. In principle
> this would be fine, albeit potentially unnecessarily slow, it turns out
> however that ISM devices are sensitive to seeing IOVA addresses that are
> currently in use in the IOVA range being flushed. Seeing such in-use
> IOVA addresses will cause the ISM device to enter an error state and
> become unusable.
> 
> Fix this by claiming IOMMU_CAP_DEFERRED_FLUSH only for non-ISM devices.
> This makes sure IOTLB flushes only cover IOVAs that have been unmapped
> and also restricts the range of the IOTLB flush potentially reducing
> latency spikes.
> 
> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>

> ---
>  drivers/iommu/s390-iommu.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
> index f6d6c60e5634..8310180a102c 100644
> --- a/drivers/iommu/s390-iommu.c
> +++ b/drivers/iommu/s390-iommu.c
> @@ -315,11 +315,13 @@ static struct s390_domain *to_s390_domain(struct iommu_domain *dom)
>  
>  static bool s390_iommu_capable(struct device *dev, enum iommu_cap cap)
>  {
> +	struct zpci_dev *zdev = to_zpci_dev(dev);
> +
>  	switch (cap) {
>  	case IOMMU_CAP_CACHE_COHERENCY:
>  		return true;
>  	case IOMMU_CAP_DEFERRED_FLUSH:
> -		return true;
> +		return zdev->pft != PCI_FUNC_TYPE_ISM;
>  	default:
>  		return false;
>  	}
>
diff mbox series

Patch

diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index f6d6c60e5634..8310180a102c 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -315,11 +315,13 @@  static struct s390_domain *to_s390_domain(struct iommu_domain *dom)
 
 static bool s390_iommu_capable(struct device *dev, enum iommu_cap cap)
 {
+	struct zpci_dev *zdev = to_zpci_dev(dev);
+
 	switch (cap) {
 	case IOMMU_CAP_CACHE_COHERENCY:
 		return true;
 	case IOMMU_CAP_DEFERRED_FLUSH:
-		return true;
+		return zdev->pft != PCI_FUNC_TYPE_ISM;
 	default:
 		return false;
 	}