[v8,07/10] iommu/vt-d: Allow qi_submit_sync() to return the QI faults

Message ID 20231227161354.67701-8-yi.l.liu@intel.com (mailing list archive)
State New
Series Add iommufd nesting (part 2/2)

Commit Message

Yi Liu Dec. 27, 2023, 4:13 p.m. UTC
From: Lu Baolu <baolu.lu@linux.intel.com>

This allows qi_submit_sync() to return faults to its callers.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/iommu/intel/dmar.c          | 31 +++++++++++++++++++----------
 drivers/iommu/intel/iommu.h         |  2 +-
 drivers/iommu/intel/irq_remapping.c |  2 +-
 drivers/iommu/intel/pasid.c         |  2 +-
 drivers/iommu/intel/svm.c           |  6 +++---
 5 files changed, 26 insertions(+), 17 deletions(-)
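
For orientation, a minimal sketch of how a caller can consume the new
argument; the example_flush() wrapper below is hypothetical and not
part of this series:

	/*
	 * Hypothetical caller, for illustration only: submit one
	 * invalidation descriptor and inspect which QI fault bits
	 * (IQE/ITE/ICE) were latched while the request was handled,
	 * e.g. to report them to user space instead of retrying in
	 * the kernel.
	 */
	static int example_flush(struct intel_iommu *iommu,
				 struct qi_desc *desc)
	{
		u32 fault;
		int ret;

		ret = qi_submit_sync(iommu, desc, 1, 0, &fault);
		if (fault & (DMA_FSTS_IQE | DMA_FSTS_ITE | DMA_FSTS_ICE))
			pr_warn("QI fault 0x%x during invalidation\n",
				fault);

		return ret;
	}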

Comments

Tian, Kevin Dec. 28, 2023, 6:17 a.m. UTC | #1
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, December 28, 2023 12:14 AM
> 
> @@ -1376,6 +1383,8 @@ int qi_submit_sync(struct intel_iommu *iommu,
> struct qi_desc *desc,
> 
>  restart:
>  	rc = 0;
> +	if (fault)
> +		*fault = 0;

move it to right before the loop of qi_check_fault()

> 
>  	raw_spin_lock_irqsave(&qi->q_lock, flags);
>  	/*
> @@ -1430,7 +1439,7 @@ int qi_submit_sync(struct intel_iommu *iommu,
> struct qi_desc *desc,
>  		 * a deadlock where the interrupt context can wait
> indefinitely
>  		 * for free slots in the queue.
>  		 */
> -		rc = qi_check_fault(iommu, index, wait_index);
> +		rc = qi_check_fault(iommu, index, wait_index, fault);
>  		if (rc)
>  			break;

and as replied in another thread let's change qi_check_fault to return
-ETIMEDOUT to break the restart loop when fault pointer is valid.
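
For reference, the restart loop in question is driven by the -EAGAIN
return code at the bottom of qi_submit_sync(); any other error code,
such as -ETIMEDOUT, falls through and is returned to the caller:

	/* existing tail of qi_submit_sync(): only -EAGAIN restarts */
	if (rc == -EAGAIN)
		goto restart;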
Yi Liu Dec. 28, 2023, 8:33 a.m. UTC | #2
On 2023/12/28 14:17, Tian, Kevin wrote:
>> From: Liu, Yi L <yi.l.liu@intel.com>
>> Sent: Thursday, December 28, 2023 12:14 AM
>>
>> @@ -1376,6 +1383,8 @@ int qi_submit_sync(struct intel_iommu *iommu,
>> struct qi_desc *desc,
>>
>>   restart:
>>   	rc = 0;
>> +	if (fault)
>> +		*fault = 0;
> 
> move it to right before the loop of qi_check_fault()

ok.

>>
>>   	raw_spin_lock_irqsave(&qi->q_lock, flags);
>>   	/*
>> @@ -1430,7 +1439,7 @@ int qi_submit_sync(struct intel_iommu *iommu,
>> struct qi_desc *desc,
>>   		 * a deadlock where the interrupt context can wait
>> indefinitely
>>   		 * for free slots in the queue.
>>   		 */
>> -		rc = qi_check_fault(iommu, index, wait_index);
>> +		rc = qi_check_fault(iommu, index, wait_index, fault);
>>   		if (rc)
>>   			break;
> 
> and as replied in another thread let's change qi_check_fault to return
> -ETIMEDOUT to break the restart loop when fault pointer is valid.

sure.
Baolu Lu Jan. 1, 2024, 3:34 a.m. UTC | #3
On 12/28/23 2:17 PM, Tian, Kevin wrote:
>>   	raw_spin_lock_irqsave(&qi->q_lock, flags);
>>   	/*
>> @@ -1430,7 +1439,7 @@ int qi_submit_sync(struct intel_iommu *iommu,
>> struct qi_desc *desc,
>>   		 * a deadlock where the interrupt context can wait
>> indefinitely
>>   		 * for free slots in the queue.
>>   		 */
>> -		rc = qi_check_fault(iommu, index, wait_index);
>> +		rc = qi_check_fault(iommu, index, wait_index, fault);
>>   		if (rc)
>>   			break;
> and as replied in another thread let's change qi_check_fault to return
> -ETIMEDOUT to break the restart loop when fault pointer is valid.

It's fine to break the retry loop when a fault happens and the fault
pointer is valid. Please don't forget to add an explanatory comment
around the code. Something like:

/*
  * The caller is able to handle the fault by itself. The IOMMU driver
  * should not attempt to retry this request.
  */

Best regards,
baolu
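
Taken together with Kevin's suggestion, the ITE branch of
qi_check_fault() would then look roughly like this (a sketch only; the
existing recovery code in that branch is elided, and the exact
placement is up to the next revision):

	if (fault & DMA_FSTS_ITE) {
		/*
		 * ... existing recovery: clear the ITE and mark the
		 * affected descriptors for retry ...
		 */

		/*
		 * The caller is able to handle the fault by itself. The
		 * IOMMU driver should not attempt to retry this request.
		 */
		if (fsts)
			return -ETIMEDOUT;

		/*
		 * ... otherwise fall back to the existing -EAGAIN retry
		 * path ...
		 */
	}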
Ethan Zhao Jan. 11, 2024, 7:14 a.m. UTC | #4
On 1/1/2024 11:34 AM, Baolu Lu wrote:
> On 12/28/23 2:17 PM, Tian, Kevin wrote:
>>> raw_spin_lock_irqsave(&qi->q_lock, flags);
>>>       /*
>>> @@ -1430,7 +1439,7 @@ int qi_submit_sync(struct intel_iommu *iommu,
>>> struct qi_desc *desc,
>>>            * a deadlock where the interrupt context can wait
>>> indefinitely
>>>            * for free slots in the queue.
>>>            */
>>> -        rc = qi_check_fault(iommu, index, wait_index);
>>> +        rc = qi_check_fault(iommu, index, wait_index, fault);
>>>           if (rc)
>>>               break;
>> and as replied in another thread let's change qi_check_fault to return
>> -ETIMEDOUT to break the restart loop when fault pointer is valid.
>
> It's fine to break the retry loop when fault happens and the fault
> pointer is valid. Please don't forget to add an explanation comment
> around the code. Something like:
>
> /*
>  * The caller is able to handle the fault by itself. The IOMMU driver
>  * should not attempt to retry this request.
>  */

If the caller could pass a desc batch that mixes IOTLB and device-TLB
invalidation requests, error handling becomes problematic for both the
caller and qi_submit_sync(). Imagine a case like:

1. The caller invokes qi_submit_sync() with mixed IOTLB & device-TLB
   descriptors.
2. qi_submit_sync() detects that the target device is dead.
3. It has to break the loop, or it will block other invalidation
   submitters / hang.
4. It is hard for qi_submit_sync() to extract just the IOTLB
   invalidations for a retry.
5. It is also difficult for the caller to retry the IOTLB
   invalidations, or it has to leave the IOTLB out of sync (though
   there is no sync at all once the device is gone).

And if only an ITE fault is hit, but the target device is still there
and its configuration space reads okay, the ITE was probably left by a
previous request for another device, not triggered by this batch. The
question is: can we even identify whether the ITE device is the same as
the current target? If it is the same, breaking out is reasonable;
otherwise we just leave the problem to the caller: something in the
queue is bad, but that request was submitted earlier by someone else,
not by this caller.

Thanks,
Ethan


>
> Best regards,
> baolu
>

Patch

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 23cb80d62a9a..701705b3a8ea 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -1267,7 +1267,8 @@  static void qi_dump_fault(struct intel_iommu *iommu, u32 fault)
 	       (unsigned long long)desc->qw1);
 }
 
-static int qi_check_fault(struct intel_iommu *iommu, int index, int wait_index)
+static int qi_check_fault(struct intel_iommu *iommu, int index,
+			  int wait_index, u32 *fsts)
 {
 	u32 fault;
 	int head, tail;
@@ -1278,8 +1279,12 @@  static int qi_check_fault(struct intel_iommu *iommu, int index, int wait_index)
 		return -EAGAIN;
 
 	fault = readl(iommu->reg + DMAR_FSTS_REG);
-	if (fault & (DMA_FSTS_IQE | DMA_FSTS_ITE | DMA_FSTS_ICE))
+	fault &= DMA_FSTS_IQE | DMA_FSTS_ITE | DMA_FSTS_ICE;
+	if (fault) {
+		if (fsts)
+			*fsts |= fault;
 		qi_dump_fault(iommu, fault);
+	}
 
 	/*
 	 * If IQE happens, the head points to the descriptor associated
@@ -1342,9 +1347,11 @@  static int qi_check_fault(struct intel_iommu *iommu, int index, int wait_index)
  * time, a wait descriptor will be appended to each submission to ensure
  * hardware has completed the invalidation before return. Wait descriptors
  * can be part of the submission but it will not be polled for completion.
+ * If callers are interested in the QI faults that occur during the handling
+ * of requests, the QI faults are saved in @fault.
  */
 int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
-		   unsigned int count, unsigned long options)
+		   unsigned int count, unsigned long options, u32 *fault)
 {
 	struct q_inval *qi = iommu->qi;
 	s64 devtlb_start_ktime = 0;
@@ -1376,6 +1383,8 @@  int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
 
 restart:
 	rc = 0;
+	if (fault)
+		*fault = 0;
 
 	raw_spin_lock_irqsave(&qi->q_lock, flags);
 	/*
@@ -1430,7 +1439,7 @@  int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
 		 * a deadlock where the interrupt context can wait indefinitely
 		 * for free slots in the queue.
 		 */
-		rc = qi_check_fault(iommu, index, wait_index);
+		rc = qi_check_fault(iommu, index, wait_index, fault);
 		if (rc)
 			break;
 
@@ -1476,7 +1485,7 @@  void qi_global_iec(struct intel_iommu *iommu)
 	desc.qw3 = 0;
 
 	/* should never fail */
-	qi_submit_sync(iommu, &desc, 1, 0);
+	qi_submit_sync(iommu, &desc, 1, 0, NULL);
 }
 
 void qi_flush_context(struct intel_iommu *iommu, u16 did, u16 sid, u8 fm,
@@ -1490,7 +1499,7 @@  void qi_flush_context(struct intel_iommu *iommu, u16 did, u16 sid, u8 fm,
 	desc.qw2 = 0;
 	desc.qw3 = 0;
 
-	qi_submit_sync(iommu, &desc, 1, 0);
+	qi_submit_sync(iommu, &desc, 1, 0, NULL);
 }
 
 void qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
@@ -1514,7 +1523,7 @@  void qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
 	desc.qw2 = 0;
 	desc.qw3 = 0;
 
-	qi_submit_sync(iommu, &desc, 1, 0);
+	qi_submit_sync(iommu, &desc, 1, 0, NULL);
 }
 
 void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
@@ -1545,7 +1554,7 @@  void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
 	desc.qw2 = 0;
 	desc.qw3 = 0;
 
-	qi_submit_sync(iommu, &desc, 1, 0);
+	qi_submit_sync(iommu, &desc, 1, 0, NULL);
 }
 
 /* PASID-based IOTLB invalidation */
@@ -1586,7 +1595,7 @@  void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u32 pasid, u64 addr,
 				QI_EIOTLB_AM(mask);
 	}
 
-	qi_submit_sync(iommu, &desc, 1, 0);
+	qi_submit_sync(iommu, &desc, 1, 0, NULL);
 }
 
 /* PASID-based device IOTLB Invalidate */
@@ -1639,7 +1648,7 @@  void qi_flush_dev_iotlb_pasid(struct intel_iommu *iommu, u16 sid, u16 pfsid,
 		desc.qw1 |= QI_DEV_EIOTLB_SIZE;
 	}
 
-	qi_submit_sync(iommu, &desc, 1, 0);
+	qi_submit_sync(iommu, &desc, 1, 0, NULL);
 }
 
 void qi_flush_pasid_cache(struct intel_iommu *iommu, u16 did,
@@ -1649,7 +1658,7 @@  void qi_flush_pasid_cache(struct intel_iommu *iommu, u16 did,
 
 	desc.qw0 = QI_PC_PASID(pasid) | QI_PC_DID(did) |
 			QI_PC_GRAN(granu) | QI_PC_TYPE;
-	qi_submit_sync(iommu, &desc, 1, 0);
+	qi_submit_sync(iommu, &desc, 1, 0, NULL);
 }
 
 /*
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index ce030c5b5772..c6de958e4f54 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -881,7 +881,7 @@  void qi_flush_pasid_cache(struct intel_iommu *iommu, u16 did, u64 granu,
 			  u32 pasid);
 
 int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
-		   unsigned int count, unsigned long options);
+		   unsigned int count, unsigned long options, u32 *fault);
 /*
  * Options used in qi_submit_sync:
  * QI_OPT_WAIT_DRAIN - Wait for PRQ drain completion, spec 6.5.2.8.
diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c
index 29b9e55dcf26..f834afa3672d 100644
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -153,7 +153,7 @@  static int qi_flush_iec(struct intel_iommu *iommu, int index, int mask)
 	desc.qw2 = 0;
 	desc.qw3 = 0;
 
-	return qi_submit_sync(iommu, &desc, 1, 0);
+	return qi_submit_sync(iommu, &desc, 1, 0, NULL);
 }
 
 static int modify_irte(struct irq_2_iommu *irq_iommu,
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index 74e8e4c17e81..67f924760ba8 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -467,7 +467,7 @@  pasid_cache_invalidation_with_pasid(struct intel_iommu *iommu,
 	desc.qw2 = 0;
 	desc.qw3 = 0;
 
-	qi_submit_sync(iommu, &desc, 1, 0);
+	qi_submit_sync(iommu, &desc, 1, 0, NULL);
 }
 
 static void
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index ac12f76c1212..660d049ad5b6 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -543,7 +543,7 @@  void intel_drain_pasid_prq(struct device *dev, u32 pasid)
 			QI_DEV_IOTLB_PFSID(info->pfsid);
 qi_retry:
 	reinit_completion(&iommu->prq_complete);
-	qi_submit_sync(iommu, desc, 3, QI_OPT_WAIT_DRAIN);
+	qi_submit_sync(iommu, desc, 3, QI_OPT_WAIT_DRAIN, NULL);
 	if (readl(iommu->reg + DMAR_PRS_REG) & DMA_PRS_PRO) {
 		wait_for_completion(&iommu->prq_complete);
 		goto qi_retry;
@@ -646,7 +646,7 @@  static void handle_bad_prq_event(struct intel_iommu *iommu,
 		desc.qw3 = 0;
 	}
 
-	qi_submit_sync(iommu, &desc, 1, 0);
+	qi_submit_sync(iommu, &desc, 1, 0, NULL);
 }
 
 static irqreturn_t prq_event_thread(int irq, void *d)
@@ -811,7 +811,7 @@  int intel_svm_page_response(struct device *dev,
 				ktime_to_ns(ktime_get()) - prm->private_data[0]);
 		}
 
-		qi_submit_sync(iommu, &desc, 1, 0);
+		qi_submit_sync(iommu, &desc, 1, 0, NULL);
 	}
 out:
 	return ret;