diff mbox series

[2/3] drm/prime: set the dma_coherent flag for export

Message ID 20221020121316.3946-3-christian.koenig@amd.com (mailing list archive)
State New, archived
Headers show
Series [1/3] dma-buf: add dma_coherent flag | expand

Commit Message

Christian König Oct. 20, 2022, 12:13 p.m. UTC
When a device driver is snooping the CPU cache during access we assume
that all importers need to be able to snoop the CPU cache as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/drm_prime.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Rob Clark Oct. 20, 2022, 2:43 p.m. UTC | #1
On Thu, Oct 20, 2022 at 5:13 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> When a device driver is snooping the CPU cache during access we assume
> that all importers need to be able to snoop the CPU cache as well.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/drm_prime.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> index 20e109a802ae..d5c70b6fe8a4 100644
> --- a/drivers/gpu/drm/drm_prime.c
> +++ b/drivers/gpu/drm/drm_prime.c
> @@ -28,6 +28,7 @@
>
>  #include <linux/export.h>
>  #include <linux/dma-buf.h>
> +#include <linux/dma-map-ops.h>
>  #include <linux/rbtree.h>
>  #include <linux/module.h>
>
> @@ -889,6 +890,7 @@ struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj,
>                 .size = obj->size,
>                 .flags = flags,
>                 .priv = obj,
> +               .coherent = dev_is_dma_coherent(dev->dev),

To set the coherent flag correctly, I think I'd need a way to override
on a per buffer basis, since coherency is a property of the gpu
pgtables (which in the msm case is an immutable property of the gem
object).  We also have some awkwardness that drm->dev isn't actually
the GPU, thanks to the kernels device model seeing a collection of
other small devices shoehorned into a single drm device to fit
userspace's view of the world.  So relying on drm->dev isn't really
going to give sensible results.

I guess msm could just bury our heads in the sand and continue to do
things the way we have been (buffers that are mapped cached-coherent
are only self-shared) but would be nice to catch if userspace tried to
import one into (for ex) v4l2..

BR,
-R

>                 .resv = obj->resv,
>         };
>
> --
> 2.25.1
>
Christian König Oct. 20, 2022, 2:56 p.m. UTC | #2
Am 20.10.22 um 16:43 schrieb Rob Clark:
> On Thu, Oct 20, 2022 at 5:13 AM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> When a device driver is snooping the CPU cache during access we assume
>> that all importers need to be able to snoop the CPU cache as well.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/drm_prime.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
>> index 20e109a802ae..d5c70b6fe8a4 100644
>> --- a/drivers/gpu/drm/drm_prime.c
>> +++ b/drivers/gpu/drm/drm_prime.c
>> @@ -28,6 +28,7 @@
>>
>>   #include <linux/export.h>
>>   #include <linux/dma-buf.h>
>> +#include <linux/dma-map-ops.h>
>>   #include <linux/rbtree.h>
>>   #include <linux/module.h>
>>
>> @@ -889,6 +890,7 @@ struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj,
>>                  .size = obj->size,
>>                  .flags = flags,
>>                  .priv = obj,
>> +               .coherent = dev_is_dma_coherent(dev->dev),
> To set the coherent flag correctly, I think I'd need a way to override
> on a per buffer basis, since coherency is a property of the gpu
> pgtables (which in the msm case is an immutable property of the gem
> object).  We also have some awkwardness that drm->dev isn't actually
> the GPU, thanks to the kernels device model seeing a collection of
> other small devices shoehorned into a single drm device to fit
> userspace's view of the world.  So relying on drm->dev isn't really
> going to give sensible results.

Yeah, I've the same problem for amdgpu where some buffers are snooped 
while others aren't.

But this should be unproblematic since the flag can always be cleared by 
the driver later on (it just can't be set).

Additional to that I've just noted that armada, i915, omap and tegra use 
their own DMA-buf export function. MSM could do the same as well if the 
device itself is marked as not coherent while some buffers are mapped 
cache coherent.

Regards,
Christian.

> I guess msm could just bury our heads in the sand and continue to do
> things the way we have been (buffers that are mapped cached-coherent
> are only self-shared) but would be nice to catch if userspace tried to
> import one into (for ex) v4l2..
>
> BR,
> -R
>
>>                  .resv = obj->resv,
>>          };
>>
>> --
>> 2.25.1
>>
Rob Clark Oct. 28, 2022, 1:11 a.m. UTC | #3
On Thu, Oct 20, 2022 at 7:57 AM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 20.10.22 um 16:43 schrieb Rob Clark:
> > On Thu, Oct 20, 2022 at 5:13 AM Christian König
> > <ckoenig.leichtzumerken@gmail.com> wrote:
> >> When a device driver is snooping the CPU cache during access we assume
> >> that all importers need to be able to snoop the CPU cache as well.
> >>
> >> Signed-off-by: Christian König <christian.koenig@amd.com>
> >> ---
> >>   drivers/gpu/drm/drm_prime.c | 2 ++
> >>   1 file changed, 2 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> >> index 20e109a802ae..d5c70b6fe8a4 100644
> >> --- a/drivers/gpu/drm/drm_prime.c
> >> +++ b/drivers/gpu/drm/drm_prime.c
> >> @@ -28,6 +28,7 @@
> >>
> >>   #include <linux/export.h>
> >>   #include <linux/dma-buf.h>
> >> +#include <linux/dma-map-ops.h>
> >>   #include <linux/rbtree.h>
> >>   #include <linux/module.h>
> >>
> >> @@ -889,6 +890,7 @@ struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj,
> >>                  .size = obj->size,
> >>                  .flags = flags,
> >>                  .priv = obj,
> >> +               .coherent = dev_is_dma_coherent(dev->dev),
> > To set the coherent flag correctly, I think I'd need a way to override
> > on a per buffer basis, since coherency is a property of the gpu
> > pgtables (which in the msm case is an immutable property of the gem
> > object).  We also have some awkwardness that drm->dev isn't actually
> > the GPU, thanks to the kernels device model seeing a collection of
> > other small devices shoehorned into a single drm device to fit
> > userspace's view of the world.  So relying on drm->dev isn't really
> > going to give sensible results.
>
> Yeah, I've the same problem for amdgpu where some buffers are snooped
> while others aren't.
>
> But this should be unproblematic since the flag can always be cleared by
> the driver later on (it just can't be set).
>
> Additional to that I've just noted that armada, i915, omap and tegra use
> their own DMA-buf export function. MSM could do the same as well if the
> device itself is marked as not coherent while some buffers are mapped
> cache coherent.

yeah, I guess that would work.. it would be a bit unfortunate to need
to use our own export function, but I guess it is a small price to pay
and I like the overall idea, so a-b for the series

For the VMM case, it would be nice to expose this to userspace, but
I've sent a patch to do this in an msm specific way, and I guess at
least solving that problem for one driver and better than the current
state of "iff driver == "i915" { it's mmap'd cached } else { it's
writecombine }" in crosvm

Admittedly the VMM case is a rather special case compared to your
average userspace dealing with dmabuf's, but it would be nice to get
out of the current situation where it is having to make assumptions
which are quite possibly wrong, so giving the VMM some information
even if it is "the cachability isn't static, you should bail now if
your arch can't cope" would be an improvement.  (For background, this
case is also a bit specific for android/gralloc.. for driver allocated
buffers in a VM, with the native usermode driver (UMD) in guest, you
still have some control within the UMD)

BR,
-R


> Regards,
> Christian.
>
> > I guess msm could just bury our heads in the sand and continue to do
> > things the way we have been (buffers that are mapped cached-coherent
> > are only self-shared) but would be nice to catch if userspace tried to
> > import one into (for ex) v4l2..
> >
> > BR,
> > -R
> >
> >>                  .resv = obj->resv,
> >>          };
> >>
> >> --
> >> 2.25.1
> >>
>
diff mbox series

Patch

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 20e109a802ae..d5c70b6fe8a4 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -28,6 +28,7 @@ 
 
 #include <linux/export.h>
 #include <linux/dma-buf.h>
+#include <linux/dma-map-ops.h>
 #include <linux/rbtree.h>
 #include <linux/module.h>
 
@@ -889,6 +890,7 @@  struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj,
 		.size = obj->size,
 		.flags = flags,
 		.priv = obj,
+		.coherent = dev_is_dma_coherent(dev->dev),
 		.resv = obj->resv,
 	};