Message ID | 20220106022053.2406748-6-baolu.lu@linux.intel.com (mailing list archive)
---|---
State | New, archived
Series | Scrap iommu_attach/detach_group() interfaces
On Thu, Jan 06, 2022 at 10:20:50AM +0800, Lu Baolu wrote:
> The individual device driver should use iommu_attach/detach_device()
> for domain attachment/detachment.
>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>  drivers/iommu/amd/iommu_v2.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
> index 58da08cc3d01..7d9d0fe89064 100644
> +++ b/drivers/iommu/amd/iommu_v2.c
> @@ -133,7 +133,7 @@ static void free_device_state(struct device_state *dev_state)
>  	if (WARN_ON(!group))
>  		return;
>
> -	iommu_detach_group(dev_state->domain, group);
> +	iommu_detach_device(dev_state->domain, &dev_state->pdev->dev);
>
>  	iommu_group_put(group);

This is the only user of the group in the function, so all the
group_get/put calls should be deleted too.

Joerg said in commit 55c99a4dc50f ("iommu/amd: Use
iommu_attach_group()") that the device API doesn't work here because
there are multi-device groups?

But I'm not sure how this can work with multi-device groups - this
seems to assign a domain set up for direct map, so perhaps this only
works if all devices are set up for direct map?

> @@ -791,7 +791,7 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
>  		goto out_free_domain;
>  	}
>
> -	ret = iommu_attach_group(dev_state->domain, group);
> +	ret = iommu_attach_device(dev_state->domain, &pdev->dev);
>  	if (ret != 0)
>  		goto out_drop_group;

Same comment here.

Jason
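For illustration only (not part of the patch or the thread): a minimal
sketch of what free_device_state() could look like with the
iommu_group_get()/iommu_group_put() pair dropped as suggested above,
assuming the existing domain-free and kfree steps stay as they are
today.

static void free_device_state(struct device_state *dev_state)
{
	/* Detach directly from the device; no iommu_group lookup needed */
	iommu_detach_device(dev_state->domain, &dev_state->pdev->dev);

	/* Everything is down now, free the IOMMUv2 domain */
	iommu_domain_free(dev_state->domain);

	/* Finally get rid of the device-state */
	kfree(dev_state);
}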
Hi Jason,

On 1/6/22 10:33 PM, Jason Gunthorpe wrote:
> On Thu, Jan 06, 2022 at 10:20:50AM +0800, Lu Baolu wrote:
>> The individual device driver should use iommu_attach/detach_device()
>> for domain attachment/detachment.
>>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>>  drivers/iommu/amd/iommu_v2.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
>> index 58da08cc3d01..7d9d0fe89064 100644
>> +++ b/drivers/iommu/amd/iommu_v2.c
>> @@ -133,7 +133,7 @@ static void free_device_state(struct device_state *dev_state)
>>  	if (WARN_ON(!group))
>>  		return;
>>
>> -	iommu_detach_group(dev_state->domain, group);
>> +	iommu_detach_device(dev_state->domain, &dev_state->pdev->dev);
>>
>>  	iommu_group_put(group);
>
> This is the only user of the group in the function, so all the
> group_get/put calls should be deleted too.
>
> Joerg said in commit 55c99a4dc50f ("iommu/amd: Use
> iommu_attach_group()") that the device API doesn't work here because
> there are multi-device groups?
>
> But I'm not sure how this can work with multi-device groups - this
> seems to assign a domain set up for direct map, so perhaps this only
> works if all devices are set up for direct map?

It's also difficult for me to understand how this can work with
multi-device groups. The iommu_attach_group() call returns -EBUSY if
_init_device() is called for the second device in the group. That's
the reason why I didn't set no_kernel_dma.

>
>> @@ -791,7 +791,7 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
>>  		goto out_free_domain;
>>  	}
>>
>> -	ret = iommu_attach_group(dev_state->domain, group);
>> +	ret = iommu_attach_device(dev_state->domain, &pdev->dev);
>>  	if (ret != 0)
>>  		goto out_drop_group;
>
> Same comment here.

Yes.

>
> Jason
>

Best regards,
baolu
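For reference, the -EBUSY mentioned above comes from the iommu core
refusing to attach a group that has already been moved off its default
domain. A rough sketch of that check (paraphrased from the iommu core
of this era, not a verbatim copy):

static int __iommu_attach_group(struct iommu_domain *domain,
				struct iommu_group *group)
{
	/*
	 * If the group was already attached to a non-default domain
	 * (e.g. by an earlier _init_device() call for another device
	 * in the same group), refuse the second attach.
	 */
	if (group->default_domain && group->domain != group->default_domain)
		return -EBUSY;

	/* ... attach each device in the group and record group->domain ... */
	return 0;
}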
On Thu, Jan 06, 2022 at 10:33:45AM -0400, Jason Gunthorpe wrote:
> But I'm not sure how this can work with multi-device groups - this
> seems to assign a domain set up for direct map, so perhaps this only
> works if all devices are set up for direct map?

Right, at boot all devices in this group are already set up for using a
direct map. There are usually two devices in those groups, the GPU
itself and the sound device for HDMI sound output.

Regards,

	Joerg
On Mon, Feb 14, 2022 at 12:27:05PM +0100, Joerg Roedel wrote:
> On Thu, Jan 06, 2022 at 10:33:45AM -0400, Jason Gunthorpe wrote:
> > But I'm not sure how this can work with multi-device groups - this
> > seems to assign a domain set up for direct map, so perhaps this only
> > works if all devices are set up for direct map?
>
> Right, at boot all devices in this group are already set up for using a
> direct map. There are usually two devices in those groups, the GPU
> itself and the sound device for HDMI sound output.

But how does the sound device know that this has been done to it?

eg how do we know the sound device hasn't been bound to VFIO or
something at this point?

Jason
On Mon, Feb 14, 2022 at 09:15:44AM -0400, Jason Gunthorpe wrote:
> But how does the sound device know that this has been done to it?
>
> eg how do we know the sound device hasn't been bound to VFIO or
> something at this point?

The iommu_attach_group() call will fail when the group (which includes
the GPU and the sound device) is not in its default domain. So if VFIO
attached the group to its own domain, there is a failure in this init
function.

Note that this function is intended to be called by the driver
currently controlling this device, so there should also be no race with
VFIO trying to grab the device in parallel.

Regards,

	Joerg
On Mon, Feb 14, 2022 at 02:40:56PM +0100, Joerg Roedel wrote:
> On Mon, Feb 14, 2022 at 09:15:44AM -0400, Jason Gunthorpe wrote:
> > But how does the sound device know that this has been done to it?
> >
> > eg how do we know the sound device hasn't been bound to VFIO or
> > something at this point?
>
> The iommu_attach_group() call will fail when the group (which includes
> the GPU and the sound device) is not in its default domain. So if VFIO
> attached the group to its own domain, there is a failure in this init
> function.

That works for VFIO, but it doesn't work for other in-kernel drivers.
Is there something ensuring the group is only the GPU and sound device?
Is the GPU never an add-in card?

I'd say the right way to code this after Lu's series is to have both the
GPU and sound drivers call iommu_attach_device() during their probe()
and specify the identity domain as the attaching domain.

That would be quite similar to how the Tegra drivers got arranged.

And then maybe someone could better guess what the "sound driver" is,
since it would be marked with an iommu_attach_device() call.

Jason
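A rough, hypothetical sketch of the probe-time shape described above.
The helper used to obtain the identity domain is purely illustrative -
the thread does not name a driver-facing API for this; only the
iommu_attach_device() call itself is an existing interface.

static int example_gpu_probe(struct pci_dev *pdev,
			     const struct pci_device_id *id)
{
	struct iommu_domain *identity_domain;
	int ret;

	/*
	 * Hypothetical helper: obtain an identity (direct-mapped) domain
	 * for this device. Illustration only.
	 */
	identity_domain = example_get_identity_domain(&pdev->dev);
	if (!identity_domain)
		return -ENODEV;

	/* Pin the device to identity translation for the driver's lifetime */
	ret = iommu_attach_device(identity_domain, &pdev->dev);
	if (ret)
		return ret;

	/* ... rest of the driver's probe ... */
	return 0;
}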
On Mon, Feb 14, 2022 at 10:02:36AM -0400, Jason Gunthorpe wrote:
> That works for VFIO, but it doesn't work for other in-kernel drivers.
> Is there something ensuring the group is only the GPU and sound device?
> Is the GPU never an add-in card?

GPUs supporting this functionality are always iGPUs, AFAIK.

> I'd say the right way to code this after Lu's series is to have both the
> GPU and sound drivers call iommu_attach_device() during their probe()
> and specify the identity domain as the attaching domain.
>
> That would be quite similar to how the Tegra drivers got arranged.
>
> And then maybe someone could better guess what the "sound driver" is,
> since it would be marked with an iommu_attach_device() call.

Device drivers calling into iommu_attach_device() is seldom a good
idea. In this case the sound device has a generic hardware interface,
so that an existing sound driver can be re-used. Making this driver
call iommu-specific functions for some devices is hard to justify.

With sub-groups on the other hand it would be a no-brainer, because the
sound device would be in a separate sub-group. Basically any device in
the same group as the GPU would be in a separate sub-group.

Regards,

	Joerg
On Mon, Feb 14, 2022 at 03:23:07PM +0100, Joerg Roedel wrote:

> Device drivers calling into iommu_attach_device() is seldom a good
> idea. In this case the sound device has a generic hardware interface,
> so that an existing sound driver can be re-used. Making this driver
> call iommu-specific functions for some devices is hard to justify.

Er, so this is transparent to the generic sound device? I guess
something fixed up the dma_api on that device to keep working?

But, then, the requirement is that nobody is using the dma API when we
make this change?

> With sub-groups on the other hand it would be a no-brainer, because the
> sound device would be in a separate sub-group. Basically any device in
> the same group as the GPU would be in a separate sub-group.

I don't think it matters how big/small the group is, only that when we
change the domain we know everything flowing through the domain is
still happy.

Jason
On Mon, Feb 14, 2022 at 11:00:59AM -0400, Jason Gunthorpe wrote:
> On Mon, Feb 14, 2022 at 03:23:07PM +0100, Joerg Roedel wrote:
>
> > Device drivers calling into iommu_attach_device() is seldom a good
> > idea. In this case the sound device has a generic hardware interface,
> > so that an existing sound driver can be re-used. Making this driver
> > call iommu-specific functions for some devices is hard to justify.
>
> Er, so this is transparent to the generic sound device? I guess
> something fixed up the dma_api on that device to keep working?

Right, this is completely transparent to the sound device. The IOMMU
code will not set dma_ops on the device because it uses a direct
mapping, and so the standard implementation will be used.

> But, then, the requirement is that nobody is using the dma API when we
> make this change?

That is the tricky part. The DMA-API keeps working after the change is
made, because the new domain is also direct mapped. The new domain just
has the ability to assign host page-tables to device PASIDs, so that
DMA requests with a PASID TLP will be remapped.

It was actually a requirement for this code that, when it jumps in, the
DMA-API mappings stay live. And the reason a direct mapping is used at
all is that the page-table walker of the IOMMU is a two-dimensional
walker, which will treat the addresses found in the host page-tables as
IO-virtual and translate them through the underlying page-table. So to
use host page-tables the underlying mapping must be direct mapped.

> I don't think it matters how big/small the group is, only that when we
> change the domain we know everything flowing through the domain is
> still happy.

Yes, that matters. The group size matters too for DMA-API performance.
If two devices compete for the same lock in the allocator and/or the
same cached magazines, things will slow down. That only matters for
high-throughput devices, but still...

Regards,

	Joerg
On 2022-02-15 09:11, Joerg Roedel wrote:
> On Mon, Feb 14, 2022 at 11:00:59AM -0400, Jason Gunthorpe wrote:
>> On Mon, Feb 14, 2022 at 03:23:07PM +0100, Joerg Roedel wrote:
>>
>>> Device drivers calling into iommu_attach_device() is seldom a good
>>> idea. In this case the sound device has a generic hardware interface,
>>> so that an existing sound driver can be re-used. Making this driver
>>> call iommu-specific functions for some devices is hard to justify.
>>
>> Er, so this is transparent to the generic sound device? I guess
>> something fixed up the dma_api on that device to keep working?
>
> Right, this is completely transparent to the sound device. The IOMMU
> code will not set dma_ops on the device because it uses a direct
> mapping, and so the standard implementation will be used.
>
>> But, then, the requirement is that nobody is using the dma API when we
>> make this change?
>
> That is the tricky part. The DMA-API keeps working after the change is
> made, because the new domain is also direct mapped. The new domain just
> has the ability to assign host page-tables to device PASIDs, so that
> DMA requests with a PASID TLP will be remapped.
>
> It was actually a requirement for this code that, when it jumps in, the
> DMA-API mappings stay live. And the reason a direct mapping is used at
> all is that the page-table walker of the IOMMU is a two-dimensional
> walker, which will treat the addresses found in the host page-tables as
> IO-virtual and translate them through the underlying page-table. So to
> use host page-tables the underlying mapping must be direct mapped.

Given how things have evolved since that code was originally written,
and that we seemingly now have the def_domain_type override kicking in
as soon as we first see an IOMMUv2-capable device, do we even need to
then subsequently switch to this special unmanaged domain with its
pagetable sucked out, or could we just install the PASID table in the
default domain itself?

Robin.

>> I don't think it matters how big/small the group is, only that when we
>> change the domain we know everything flowing through the domain is
>> still happy.
>
> Yes, that matters. The group size matters too for DMA-API performance.
> If two devices compete for the same lock in the allocator and/or the
> same cached magazines, things will slow down. That only matters for
> high-throughput devices, but still...
>
> Regards,
>
> 	Joerg
>
On Tue, Feb 15, 2022 at 10:11:01AM +0100, Joerg Roedel wrote:
> > But, then, the requirement is that nobody is using the dma API when we
> > make this change?
>
> That is the tricky part. The DMA-API keeps working after the change is
> made, because the new domain is also direct mapped. The new domain just
> has the ability to assign host page-tables to device PASIDs, so that
> DMA requests with a PASID TLP will be remapped.

From what you've described, this is also a good use case for the
replace_group idea.

Using Lu's series we'd:

 - Require the group to be in DMA-API mode. We know this is the case
   because of how we got here from probe()

 - Use the 'replace group's domain' API to switch the group from one
   identity translation domain to another identity translation domain

 - Rely on the DMA-API refcount to block VFIO/etc instead of checking
   for the default domain

 - Restore the default domain when all the DMA-API mode drivers unprobe

That is, if Robin's idea to just get the right domain in the first
place doesn't work.

Anyhow, given all this I think this patch is not OK as is.

Thanks,
Jason
diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
index 58da08cc3d01..7d9d0fe89064 100644
--- a/drivers/iommu/amd/iommu_v2.c
+++ b/drivers/iommu/amd/iommu_v2.c
@@ -133,7 +133,7 @@ static void free_device_state(struct device_state *dev_state)
 	if (WARN_ON(!group))
 		return;
 
-	iommu_detach_group(dev_state->domain, group);
+	iommu_detach_device(dev_state->domain, &dev_state->pdev->dev);
 
 	iommu_group_put(group);
 
@@ -791,7 +791,7 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
 		goto out_free_domain;
 	}
 
-	ret = iommu_attach_group(dev_state->domain, group);
+	ret = iommu_attach_device(dev_state->domain, &pdev->dev);
 	if (ret != 0)
 		goto out_drop_group;
The individual device driver should use iommu_attach/detach_device()
for domain attachment/detachment.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/amd/iommu_v2.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)