Message ID: 20240130080835.58921-14-baolu.lu@linux.intel.com (mailing list archive)
State: New, archived
Series: iommu: Prepare to deliver page faults to user space
> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Tuesday, January 30, 2024 4:09 PM
>  *
> - * Caller makes sure that no more faults are reported for this device.
> + * Removing a device from an iopf_queue. It's recommended to follow these
> + * steps when removing a device:
>  *
> - * Return: 0 on success and <0 on error.
> + * - Disable new PRI reception: Turn off PRI generation in the IOMMU hardware
> + *   and flush any hardware page request queues. This should be done before
> + *   calling into this helper.

this 1st step is already not followed by intel-iommu driver. The Page
Request Enable (PRE) bit is set in the context entry when a device
is attached to the default domain and cleared only in
intel_iommu_release_device().

but iopf_queue_remove_device() is called when IOMMU_DEV_FEAT_IOPF
is disabled e.g. when idxd driver is unbound from the device.

so the order is already violated.

> + * - Acknowledge all outstanding PRQs to the device: Respond to all outstanding
> + *   page requests with IOMMU_PAGE_RESP_INVALID, indicating the device should
> + *   not retry. This helper function handles this.
> + * - Disable PRI on the device: After calling this helper, the caller could
> + *   then disable PRI on the device.

intel_iommu_disable_iopf() disables PRI cap before calling this helper.

> + * - Tear down the iopf infrastructure: Calling iopf_queue_remove_device()
> + *   essentially disassociates the device. The fault_param might still exist,
> + *   but iommu_page_response() will do nothing. The device fault parameter
> + *   reference count has been properly passed from iommu_report_device_fault()
> + *   to the fault handling work, and will eventually be released after
> + *   iommu_page_response().

it's unclear what 'tear down' means here.
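For illustration, a minimal sketch of the removal order the proposed kernel-doc
describes, from a driver's point of view. example_disable_iopf(),
hw_stop_pri_generation() and hw_flush_prq() are hypothetical placeholders for
driver-specific operations; only iopf_queue_remove_device() and
pci_disable_pri() are existing kernel interfaces, and the
IOMMU_PAGE_RESP_INVALID behavior is what the patch below documents for the
helper.

#include <linux/iommu.h>
#include <linux/pci.h>

/* Hypothetical stubs; a real driver would program its own PRI/PRQ hardware here. */
static void hw_stop_pri_generation(struct device *dev) { }
static void hw_flush_prq(struct device *dev) { }

static void example_disable_iopf(struct iopf_queue *queue, struct device *dev)
{
        /* Step 1: stop new page requests at the IOMMU and drain the hardware PRQ. */
        hw_stop_pri_generation(dev);
        hw_flush_prq(dev);

        /* Step 2: the core answers IOMMU_PAGE_RESP_INVALID to anything still queued. */
        iopf_queue_remove_device(queue, dev);

        /* Step 3: only now disable PRI on the endpoint itself. */
        pci_disable_pri(to_pci_dev(dev));
}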
On 2024/2/5 17:00, Tian, Kevin wrote:
>> From: Lu Baolu <baolu.lu@linux.intel.com>
>> Sent: Tuesday, January 30, 2024 4:09 PM
>>  *
>> - * Caller makes sure that no more faults are reported for this device.
>> + * Removing a device from an iopf_queue. It's recommended to follow these
>> + * steps when removing a device:
>>  *
>> - * Return: 0 on success and <0 on error.
>> + * - Disable new PRI reception: Turn off PRI generation in the IOMMU hardware
>> + *   and flush any hardware page request queues. This should be done before
>> + *   calling into this helper.
>
> this 1st step is already not followed by intel-iommu driver. The Page
> Request Enable (PRE) bit is set in the context entry when a device
> is attached to the default domain and cleared only in
> intel_iommu_release_device().
>
> but iopf_queue_remove_device() is called when IOMMU_DEV_FEAT_IOPF
> is disabled e.g. when idxd driver is unbound from the device.
>
> so the order is already violated.
>
>> + * - Acknowledge all outstanding PRQs to the device: Respond to all outstanding
>> + *   page requests with IOMMU_PAGE_RESP_INVALID, indicating the device should
>> + *   not retry. This helper function handles this.
>> + * - Disable PRI on the device: After calling this helper, the caller could
>> + *   then disable PRI on the device.
>
> intel_iommu_disable_iopf() disables PRI cap before calling this helper.

You are right. The individual drivers should be adjusted accordingly in
separated patches. Here we just define the expected behaviors of the
individual iommu driver from the core's perspective.

>
>> + * - Tear down the iopf infrastructure: Calling iopf_queue_remove_device()
>> + *   essentially disassociates the device. The fault_param might still exist,
>> + *   but iommu_page_response() will do nothing. The device fault parameter
>> + *   reference count has been properly passed from iommu_report_device_fault()
>> + *   to the fault handling work, and will eventually be released after
>> + *   iommu_page_response().
>
> it's unclear what 'tear down' means here.

It's the same as calling iopf_queue_remove_device(). Perhaps I could
remove the confusing "tear down the iopf infrastructure"?

Best regards,
baolu
On Mon, Feb 05, 2024 at 07:55:23PM +0800, Baolu Lu wrote:
> On 2024/2/5 17:00, Tian, Kevin wrote:
> > > From: Lu Baolu <baolu.lu@linux.intel.com>
> > > Sent: Tuesday, January 30, 2024 4:09 PM
> > >  *
> > > - * Caller makes sure that no more faults are reported for this device.
> > > + * Removing a device from an iopf_queue. It's recommended to follow these
> > > + * steps when removing a device:
> > >  *
> > > - * Return: 0 on success and <0 on error.
> > > + * - Disable new PRI reception: Turn off PRI generation in the IOMMU hardware
> > > + *   and flush any hardware page request queues. This should be done before
> > > + *   calling into this helper.
> >
> > this 1st step is already not followed by intel-iommu driver. The Page
> > Request Enable (PRE) bit is set in the context entry when a device
> > is attached to the default domain and cleared only in
> > intel_iommu_release_device().
> >
> > but iopf_queue_remove_device() is called when IOMMU_DEV_FEAT_IOPF
> > is disabled e.g. when idxd driver is unbound from the device.
> >
> > so the order is already violated.
> >
> > > + * - Acknowledge all outstanding PRQs to the device: Respond to all outstanding
> > > + *   page requests with IOMMU_PAGE_RESP_INVALID, indicating the device should
> > > + *   not retry. This helper function handles this.
> > > + * - Disable PRI on the device: After calling this helper, the caller could
> > > + *   then disable PRI on the device.
> >
> > intel_iommu_disable_iopf() disables PRI cap before calling this helper.
>
> You are right. The individual drivers should be adjusted accordingly in
> separated patches. Here we just define the expected behaviors of the
> individual iommu driver from the core's perspective.

Yeah, I don't think the driver really works properly before this
documentation was added either :\

We also need to check that the proposed AMD patches (SVA support part 4)
are working right before they are merged.

Jason
> From: Baolu Lu <baolu.lu@linux.intel.com>
> Sent: Monday, February 5, 2024 7:55 PM
>
> On 2024/2/5 17:00, Tian, Kevin wrote:
> >> From: Lu Baolu <baolu.lu@linux.intel.com>
> >> Sent: Tuesday, January 30, 2024 4:09 PM
> >>  *
> >> - * Caller makes sure that no more faults are reported for this device.
> >> + * Removing a device from an iopf_queue. It's recommended to follow these
> >> + * steps when removing a device:
> >>  *
> >> - * Return: 0 on success and <0 on error.
> >> + * - Disable new PRI reception: Turn off PRI generation in the IOMMU hardware
> >> + *   and flush any hardware page request queues. This should be done before
> >> + *   calling into this helper.
> >
> > this 1st step is already not followed by intel-iommu driver. The Page
> > Request Enable (PRE) bit is set in the context entry when a device
> > is attached to the default domain and cleared only in
> > intel_iommu_release_device().
> >
> > but iopf_queue_remove_device() is called when IOMMU_DEV_FEAT_IOPF
> > is disabled e.g. when idxd driver is unbound from the device.
> >
> > so the order is already violated.
> >
> >> + * - Acknowledge all outstanding PRQs to the device: Respond to all outstanding
> >> + *   page requests with IOMMU_PAGE_RESP_INVALID, indicating the device should
> >> + *   not retry. This helper function handles this.
> >> + * - Disable PRI on the device: After calling this helper, the caller could
> >> + *   then disable PRI on the device.
> >
> > intel_iommu_disable_iopf() disables PRI cap before calling this helper.
>
> You are right. The individual drivers should be adjusted accordingly in
> separated patches. Here we just define the expected behaviors of the
> individual iommu driver from the core's perspective.

can you add a note in commit msg about it?

>
> >
> >> + * - Tear down the iopf infrastructure: Calling iopf_queue_remove_device()
> >> + *   essentially disassociates the device. The fault_param might still exist,
> >> + *   but iommu_page_response() will do nothing. The device fault parameter
> >> + *   reference count has been properly passed from iommu_report_device_fault()
> >> + *   to the fault handling work, and will eventually be released after
> >> + *   iommu_page_response().
> >
> > it's unclear what 'tear down' means here.
>
> It's the same as calling iopf_queue_remove_device(). Perhaps I could
> remove the confusing "tear down the iopf infrastructure"?
>

I thought it is the last step then must have something real to do.

if not then removing it is clearer.
On 2024/2/6 16:09, Tian, Kevin wrote:
>> From: Baolu Lu <baolu.lu@linux.intel.com>
>> Sent: Monday, February 5, 2024 7:55 PM
>>
>> On 2024/2/5 17:00, Tian, Kevin wrote:
>>>> From: Lu Baolu <baolu.lu@linux.intel.com>
>>>> Sent: Tuesday, January 30, 2024 4:09 PM
>>>>  *
>>>> - * Caller makes sure that no more faults are reported for this device.
>>>> + * Removing a device from an iopf_queue. It's recommended to follow these
>>>> + * steps when removing a device:
>>>>  *
>>>> - * Return: 0 on success and <0 on error.
>>>> + * - Disable new PRI reception: Turn off PRI generation in the IOMMU hardware
>>>> + *   and flush any hardware page request queues. This should be done before
>>>> + *   calling into this helper.
>>>
>>> this 1st step is already not followed by intel-iommu driver. The Page
>>> Request Enable (PRE) bit is set in the context entry when a device
>>> is attached to the default domain and cleared only in
>>> intel_iommu_release_device().
>>>
>>> but iopf_queue_remove_device() is called when IOMMU_DEV_FEAT_IOPF
>>> is disabled e.g. when idxd driver is unbound from the device.
>>>
>>> so the order is already violated.
>>>
>>>> + * - Acknowledge all outstanding PRQs to the device: Respond to all outstanding
>>>> + *   page requests with IOMMU_PAGE_RESP_INVALID, indicating the device should
>>>> + *   not retry. This helper function handles this.
>>>> + * - Disable PRI on the device: After calling this helper, the caller could
>>>> + *   then disable PRI on the device.
>>>
>>> intel_iommu_disable_iopf() disables PRI cap before calling this helper.
>>
>> You are right. The individual drivers should be adjusted accordingly in
>> separated patches. Here we just define the expected behaviors of the
>> individual iommu driver from the core's perspective.
>
> can you add a note in commit msg about it?
>
>>
>>>
>>>> + * - Tear down the iopf infrastructure: Calling iopf_queue_remove_device()
>>>> + *   essentially disassociates the device. The fault_param might still exist,
>>>> + *   but iommu_page_response() will do nothing. The device fault parameter
>>>> + *   reference count has been properly passed from iommu_report_device_fault()
>>>> + *   to the fault handling work, and will eventually be released after
>>>> + *   iommu_page_response().
>>>
>>> it's unclear what 'tear down' means here.
>>
>> It's the same as calling iopf_queue_remove_device(). Perhaps I could
>> remove the confusing "tear down the iopf infrastructure"?
>>
>
> I thought it is the last step then must have something real to do.
>
> if not then removing it is clearer.

Both done. Thanks!

Best regards,
baolu
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 396d7b0d88b2..d9a99a978ffa 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -1542,7 +1542,7 @@ iommu_sva_domain_alloc(struct device *dev, struct mm_struct *mm)
 
 #ifdef CONFIG_IOMMU_IOPF
 int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev);
-int iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev);
+void iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev);
 int iopf_queue_flush_dev(struct device *dev);
 struct iopf_queue *iopf_queue_alloc(const char *name);
 void iopf_queue_free(struct iopf_queue *queue);
@@ -1558,10 +1558,9 @@ iopf_queue_add_device(struct iopf_queue *queue, struct device *dev)
         return -ENODEV;
 }
 
-static inline int
+static inline void
 iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev)
 {
-        return -ENODEV;
 }
 
 static inline int iopf_queue_flush_dev(struct device *dev)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 29a12f289e2e..a81a2be9b870 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4455,12 +4455,7 @@ static int intel_iommu_disable_iopf(struct device *dev)
          */
         pci_disable_pri(to_pci_dev(dev));
         info->pri_enabled = 0;
-
-        /*
-         * With PRI disabled and outstanding PRQs drained, removing device
-         * from iopf queue should never fail.
-         */
-        WARN_ON(iopf_queue_remove_device(iommu->iopf_queue, dev));
+        iopf_queue_remove_device(iommu->iopf_queue, dev);
 
         return 0;
 }
diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c
index ce7058892b59..26e100ca3221 100644
--- a/drivers/iommu/io-pgfault.c
+++ b/drivers/iommu/io-pgfault.c
@@ -448,50 +448,67 @@ EXPORT_SYMBOL_GPL(iopf_queue_add_device);
  * @queue: IOPF queue
  * @dev: device to remove
  *
- * Caller makes sure that no more faults are reported for this device.
+ * Removing a device from an iopf_queue. It's recommended to follow these
+ * steps when removing a device:
  *
- * Return: 0 on success and <0 on error.
+ * - Disable new PRI reception: Turn off PRI generation in the IOMMU hardware
+ *   and flush any hardware page request queues. This should be done before
+ *   calling into this helper.
+ * - Acknowledge all outstanding PRQs to the device: Respond to all outstanding
+ *   page requests with IOMMU_PAGE_RESP_INVALID, indicating the device should
+ *   not retry. This helper function handles this.
+ * - Disable PRI on the device: After calling this helper, the caller could
+ *   then disable PRI on the device.
+ * - Tear down the iopf infrastructure: Calling iopf_queue_remove_device()
+ *   essentially disassociates the device. The fault_param might still exist,
+ *   but iommu_page_response() will do nothing. The device fault parameter
+ *   reference count has been properly passed from iommu_report_device_fault()
+ *   to the fault handling work, and will eventually be released after
+ *   iommu_page_response().
  */
-int iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev)
+void iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev)
 {
-        int ret = 0;
         struct iopf_fault *iopf, *next;
+        struct iommu_page_response resp;
         struct dev_iommu *param = dev->iommu;
         struct iommu_fault_param *fault_param;
+        const struct iommu_ops *ops = dev_iommu_ops(dev);
 
         mutex_lock(&queue->lock);
         mutex_lock(&param->lock);
         fault_param = rcu_dereference_check(param->fault_param,
                                             lockdep_is_held(&param->lock));
-        if (!fault_param) {
-                ret = -ENODEV;
-                goto unlock;
-        }
-
-        if (fault_param->queue != queue) {
-                ret = -EINVAL;
-                goto unlock;
-        }
 
-        if (!list_empty(&fault_param->faults)) {
-                ret = -EBUSY;
+        if (WARN_ON(!fault_param || fault_param->queue != queue))
                 goto unlock;
-        }
-
-        list_del(&fault_param->queue_list);
 
-        /* Just in case some faults are still stuck */
+        mutex_lock(&fault_param->lock);
         list_for_each_entry_safe(iopf, next, &fault_param->partial, list)
                 kfree(iopf);
 
+        list_for_each_entry_safe(iopf, next, &fault_param->faults, list) {
+                memset(&resp, 0, sizeof(struct iommu_page_response));
+                resp.pasid = iopf->fault.prm.pasid;
+                resp.grpid = iopf->fault.prm.grpid;
+                resp.code = IOMMU_PAGE_RESP_INVALID;
+
+                if (iopf->fault.prm.flags & IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID)
+                        resp.flags = IOMMU_PAGE_RESP_PASID_VALID;
+
+                ops->page_response(dev, iopf, &resp);
+                list_del(&iopf->list);
+                kfree(iopf);
+        }
+        mutex_unlock(&fault_param->lock);
+
+        list_del(&fault_param->queue_list);
+
         /* dec the ref owned by iopf_queue_add_device() */
         rcu_assign_pointer(param->fault_param, NULL);
         iopf_put_dev_fault_param(fault_param);
 unlock:
         mutex_unlock(&param->lock);
         mutex_unlock(&queue->lock);
-
-        return ret;
 }
 EXPORT_SYMBOL_GPL(iopf_queue_remove_device);
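As a usage note, not part of this series: with the void return, an enable path
can pair iopf_queue_add_device() with an unconditional rollback, and disable
paths no longer need the WARN_ON() that the Intel hunk above removes. A
hypothetical sketch, in which example_enable_iopf() and the PRQ depth of 32 are
made up; iopf_queue_add_device(), iopf_queue_remove_device() and
pci_enable_pri() are the real interfaces.

#include <linux/iommu.h>
#include <linux/pci.h>

static int example_enable_iopf(struct iopf_queue *queue, struct device *dev)
{
        int ret;

        ret = iopf_queue_add_device(queue, dev);
        if (ret)
                return ret;

        /* 32 outstanding page requests: an arbitrary depth for illustration. */
        ret = pci_enable_pri(to_pci_dev(dev), 32);
        if (ret)
                /* Rollback needs no error handling now that the helper returns void. */
                iopf_queue_remove_device(queue, dev);

        return ret;
}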