Message ID | 20210826121006.685257-3-michael@walle.cc (mailing list archive) |
---|---|
State | New, archived |
Series | drm/etnaviv: IOMMU related fixes |
On Thu, Aug 26, 2021 at 02:10:05PM +0200, Michael Walle wrote:
> +	pdev->dev.coherent_dma_mask = DMA_BIT_MASK(40);
> +	pdev->dev.dma_mask = &pdev->dev.coherent_dma_mask;

Please use dma_coerce_mask_and_coherent() here instead.
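For readers unfamiliar with the helper named above: it replaces the two open-coded assignments with one call that points the streaming mask at the coherent mask and then sets both. Kernel code cannot be run standalone, so the following is only a userspace model of that coercion. `struct model_device`, `MODEL_DMA_BIT_MASK()` and `model_coerce_mask_and_coherent()` are hypothetical stand-ins, and the model always succeeds, whereas the real helper returns an error if the mask is not supported.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two DMA mask fields of struct device. */
struct model_device {
	uint64_t *dma_mask;          /* streaming mask, held by pointer */
	uint64_t coherent_dma_mask;  /* coherent mask, held by value */
};

/* Same idea as the kernel's DMA_BIT_MASK(): the low n bits set. */
#define MODEL_DMA_BIT_MASK(n) \
	(((n) == 64) ? ~UINT64_C(0) : ((UINT64_C(1) << (n)) - 1))

/*
 * Model of the coercion: make the streaming mask pointer refer to the
 * coherent mask's storage, then store the requested mask. This model
 * always returns 0; it does not check platform support.
 */
static int model_coerce_mask_and_coherent(struct model_device *dev,
					  uint64_t mask)
{
	assert(dev != NULL);
	dev->dma_mask = &dev->coherent_dma_mask;
	*dev->dma_mask = mask;
	dev->coherent_dma_mask = mask;
	return 0;
}
```

In the patch under review, the equivalent change would presumably be a single `dma_coerce_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(40))` call with its return value checked.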
On 2021-08-26 13:10, Michael Walle wrote:
> The DMA configuration of the virtual device is inherited from the first
> actual etnaviv device. Unfortunately, this doesn't work with an IOMMU:
>
> [ 5.191008] Failed to set up IOMMU for device (null); retaining platform DMA ops
>
> This is because there is no associated iommu_group with the device. The
> group is set in iommu_group_add_device() which is eventually called by
> device_add() via the platform bus:
>   device_add()
>     blocking_notifier_call_chain()
>       iommu_bus_notifier()
>         iommu_probe_device()
>           __iommu_probe_device()
>             iommu_group_get_for_dev()
>               iommu_group_add_device()
>
> Move of_dma_configure() into the probe function, which is called after
> device_add(). Normally, the platform code will already call it itself
> if .of_node is set. Unfortunately, this isn't the case here.
>
> Also move the dma mask assignments to probe() to keep all DMA related
> settings together.

I assume the driver must already keep track of the real GPU platform
device in order to map registers, request interrupts, etc. correctly -
can't it also correctly use that device for DMA API calls and avoid the
need for these shenanigans altogether?

FYI, IOMMU configuration is really supposed to *only* run at
add_device() time as above - the fact that it's currently hooked in to
be retriggered by of_dma_configure() on DT platforms actually turns out
to lead to various issues within the IOMMU API, and the plan to change
that is slowly climbing up my to-do list.

Robin.
On Thursday, 26.08.2021 at 16:00 +0100, Robin Murphy wrote:
> On 2021-08-26 13:10, Michael Walle wrote:
> > The DMA configuration of the virtual device is inherited from the first
> > actual etnaviv device. Unfortunately, this doesn't work with an IOMMU:
> >
> > [ 5.191008] Failed to set up IOMMU for device (null); retaining platform DMA ops
> >
> > This is because there is no associated iommu_group with the device. The
> > group is set in iommu_group_add_device() which is eventually called by
> > device_add() via the platform bus:
> >   device_add()
> >     blocking_notifier_call_chain()
> >       iommu_bus_notifier()
> >         iommu_probe_device()
> >           __iommu_probe_device()
> >             iommu_group_get_for_dev()
> >               iommu_group_add_device()
> >
> > Move of_dma_configure() into the probe function, which is called after
> > device_add(). Normally, the platform code will already call it itself
> > if .of_node is set. Unfortunately, this isn't the case here.
> >
> > Also move the dma mask assignments to probe() to keep all DMA related
> > settings together.
>
> I assume the driver must already keep track of the real GPU platform
> device in order to map registers, request interrupts, etc. correctly -
> can't it also correctly use that device for DMA API calls and avoid the
> need for these shenanigans altogether?
>
Not without a bigger rework. There are still quite a few midlayer
issues in DRM, where dma-buf imports are dma-mapped and cached via the
virtual DRM device instead of the real GPU device. Also etnaviv is able
to coalesce multiple Vivante GPUs in a single system under one virtual
DRM device, which is used on i.MX6 where the 2D and 3D GPUs are
separate peripherals, but have the same DMA constraints.

Effectively we would need to handle N devices for the dma-mapping in a
lot of places instead of only dealing with the one virtual DRM device.
It would probably be the right thing to do anyway, but it's not something
that can be changed short-term. I'm also not yet sure about the
performance implications, as we might run into some cache maintenance
bottlenecks if we dma synchronize buffers to multiple real devices
instead of doing it a single time with the virtual DRM device. I know,
I know, this has a lot of assumptions baked in that could fall apart if
someone builds a SoC with multiple Vivante GPUs that have differing DMA
constraints, but up until now hardware designers have not been *that*
crazy, fortunately.

Regards,
Lucas

> FYI, IOMMU configuration is really supposed to *only* run at
> add_device() time as above - the fact that it's currently hooked in to
> be retriggered by of_dma_configure() on DT platforms actually turns out
> to lead to various issues within the IOMMU API, and the plan to change
> that is slowly climbing up my to-do list.
>
> Robin.
On 2021-08-26 16:17, Lucas Stach wrote:
> On Thursday, 26.08.2021 at 16:00 +0100, Robin Murphy wrote:
>> On 2021-08-26 13:10, Michael Walle wrote:
>>> The DMA configuration of the virtual device is inherited from the first
>>> actual etnaviv device. Unfortunately, this doesn't work with an IOMMU:
>>>
>>> [ 5.191008] Failed to set up IOMMU for device (null); retaining platform DMA ops
>>>
>>> This is because there is no associated iommu_group with the device. The
>>> group is set in iommu_group_add_device() which is eventually called by
>>> device_add() via the platform bus:
>>>   device_add()
>>>     blocking_notifier_call_chain()
>>>       iommu_bus_notifier()
>>>         iommu_probe_device()
>>>           __iommu_probe_device()
>>>             iommu_group_get_for_dev()
>>>               iommu_group_add_device()
>>>
>>> Move of_dma_configure() into the probe function, which is called after
>>> device_add(). Normally, the platform code will already call it itself
>>> if .of_node is set. Unfortunately, this isn't the case here.
>>>
>>> Also move the dma mask assignments to probe() to keep all DMA related
>>> settings together.
>>
>> I assume the driver must already keep track of the real GPU platform
>> device in order to map registers, request interrupts, etc. correctly -
>> can't it also correctly use that device for DMA API calls and avoid the
>> need for these shenanigans altogether?
>>
> Not without a bigger rework. There are still quite a few midlayer
> issues in DRM, where dma-buf imports are dma-mapped and cached via the
> virtual DRM device instead of the real GPU device. Also etnaviv is able
> to coalesce multiple Vivante GPUs in a single system under one virtual
> DRM device, which is used on i.MX6 where the 2D and 3D GPUs are
> separate peripherals, but have the same DMA constraints.

Sure, I wouldn't expect it to be trivial to fix properly, but I wanted
to point out that this is essentially a hack, relying on an implicit
side-effect of of_dma_configure() which is already slated for removal.
As such, I for one am not going to be too sympathetic if it stops
working in future.

Furthermore, even today it doesn't work in general - it might be OK for
LS1028A with a single GPU block behind an SMMU, but as soon as you have
multiple GPU blocks with distinct SMMU StreamIDs, or behind different
IOMMU instances, then you're stuffed again.

Although in fact I think it's also broken even for LS1028A, since AFAICS
there's no guarantee that the relevant SMMU instance will actually be
probed, or the SMMU driver even loaded, when etnaviv_pdev_probe() runs.

> Effectively we would need to handle N devices for the dma-mapping in a
> lot of places instead of only dealing with the one virtual DRM device.
> It would probably be the right thing to do anyway, but it's not something
> that can be changed short-term. I'm also not yet sure about the
> performance implications, as we might run into some cache maintenance
> bottlenecks if we dma synchronize buffers to multiple real devices
> instead of doing it a single time with the virtual DRM device. I know,
> I know, this has a lot of assumptions baked in that could fall apart if
> someone builds a SoC with multiple Vivante GPUs that have differing DMA
> constraints, but up until now hardware designers have not been *that*
> crazy, fortunately.

I'm not too familiar with the component stuff, but would it be viable to
just have etnaviv_gpu_platform_probe() set up the first GPU which comes
along as the master component and fundamental DRM device, then treat any
subsequent ones as subcomponents as before? That would at least stand to
be more robust in terms of obviating the of_dma_configure() hack (only
actual bus code should ever be calling that), even if it won't do
anything for the multiple IOMMU mapping or differing DMA constraints
problems.

Thanks,
Robin.
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
index 2509b3e85709..ff6425f6ebad 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
@@ -589,6 +589,7 @@ static int compare_str(struct device *dev, void *data)
 static int etnaviv_pdev_probe(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
+	struct device_node *first_node = NULL;
 	struct component_match *match = NULL;
 
 	if (!dev->platform_data) {
@@ -598,6 +599,9 @@ static int etnaviv_pdev_probe(struct platform_device *pdev)
 			if (!of_device_is_available(core_node))
 				continue;
 
+			if (!first_node)
+				first_node = core_node;
+
 			drm_of_component_match_add(&pdev->dev, &match,
 						   compare_of, core_node);
 		}
@@ -609,6 +613,17 @@ static int etnaviv_pdev_probe(struct platform_device *pdev)
 			component_match_add(dev, &match, compare_str, names[i]);
 	}
 
+	pdev->dev.coherent_dma_mask = DMA_BIT_MASK(40);
+	pdev->dev.dma_mask = &pdev->dev.coherent_dma_mask;
+
+	/*
+	 * Apply the same DMA configuration to the virtual etnaviv
+	 * device as the GPU we found. This assumes that all Vivante
+	 * GPUs in the system share the same DMA constraints.
+	 */
+	if (first_node)
+		of_dma_configure(&pdev->dev, first_node, true);
+
 	return component_master_add_with_match(dev, &etnaviv_master_ops, match);
 }
 
@@ -659,15 +674,6 @@ static int __init etnaviv_init(void)
 			of_node_put(np);
 			goto unregister_platform_driver;
 		}
-		pdev->dev.coherent_dma_mask = DMA_BIT_MASK(40);
-		pdev->dev.dma_mask = &pdev->dev.coherent_dma_mask;
-
-		/*
-		 * Apply the same DMA configuration to the virtual etnaviv
-		 * device as the GPU we found. This assumes that all Vivante
-		 * GPUs in the system share the same DMA constraints.
-		 */
-		of_dma_configure(&pdev->dev, np, true);
 
 		ret = platform_device_add(pdev);
 		if (ret) {
The DMA configuration of the virtual device is inherited from the first
actual etnaviv device. Unfortunately, this doesn't work with an IOMMU:

[ 5.191008] Failed to set up IOMMU for device (null); retaining platform DMA ops

This is because there is no associated iommu_group with the device. The
group is set in iommu_group_add_device() which is eventually called by
device_add() via the platform bus:
  device_add()
    blocking_notifier_call_chain()
      iommu_bus_notifier()
        iommu_probe_device()
          __iommu_probe_device()
            iommu_group_get_for_dev()
              iommu_group_add_device()

Move of_dma_configure() into the probe function, which is called after
device_add(). Normally, the platform code will already call it itself
if .of_node is set. Unfortunately, this isn't the case here.

Also move the dma mask assignments to probe() to keep all DMA related
settings together.

Signed-off-by: Michael Walle <michael@walle.cc>
---
 drivers/gpu/drm/etnaviv/etnaviv_drv.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)
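The ordering problem described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: `struct model_device` and all `model_*` functions are hypothetical, the whole notifier call chain is collapsed into a single flag, and DMA configuration is reduced to "use IOMMU ops only if a group is already attached".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified view of the virtual etnaviv device. */
struct model_device {
	bool has_iommu_group;  /* attached while the device is added */
	bool uses_iommu_ops;   /* outcome of DMA configuration */
};

/* Stands in for device_add(): the bus notifier attaches the group. */
static void model_device_add(struct model_device *dev)
{
	assert(dev != NULL);
	dev->has_iommu_group = true;
}

/* Stands in for of_dma_configure(): without a group, keep platform ops. */
static void model_dma_configure(struct model_device *dev)
{
	assert(dev != NULL);
	dev->uses_iommu_ops = dev->has_iommu_group;
}

/* Old order: configure in the init function, before the device is added. */
static bool model_configure_then_add(void)
{
	struct model_device dev = { false, false };

	model_dma_configure(&dev);
	model_device_add(&dev);
	return dev.uses_iommu_ops;
}

/* Patched order: add the device first, configure later from probe(). */
static bool model_add_then_configure(void)
{
	struct model_device dev = { false, false };

	model_device_add(&dev);
	model_dma_configure(&dev);
	return dev.uses_iommu_ops;
}
```

In this model only the second ordering ends up with IOMMU ops, which mirrors the behaviour the patch relies on. It says nothing about whether the IOMMU driver itself has probed by then, which is one of the objections raised in the review.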