Message ID | 20221020121316.3946-3-christian.koenig@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/3] dma-buf: add dma_coherent flag | expand |
On Thu, Oct 20, 2022 at 5:13 AM Christian König <ckoenig.leichtzumerken@gmail.com> wrote: > > When a device driver is snooping the CPU cache during access we assume > that all importers need to be able to snoop the CPU cache as well. > > Signed-off-by: Christian König <christian.koenig@amd.com> > --- > drivers/gpu/drm/drm_prime.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c > index 20e109a802ae..d5c70b6fe8a4 100644 > --- a/drivers/gpu/drm/drm_prime.c > +++ b/drivers/gpu/drm/drm_prime.c > @@ -28,6 +28,7 @@ > > #include <linux/export.h> > #include <linux/dma-buf.h> > +#include <linux/dma-map-ops.h> > #include <linux/rbtree.h> > #include <linux/module.h> > > @@ -889,6 +890,7 @@ struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj, > .size = obj->size, > .flags = flags, > .priv = obj, > + .coherent = dev_is_dma_coherent(dev->dev), To set the coherent flag correctly, I think I'd need a way to override on a per buffer basis, since coherency is a property of the gpu pgtables (which in the msm case is an immutable property of the gem object). We also have some awkwardness that drm->dev isn't actually the GPU, thanks to the kernels device model seeing a collection of other small devices shoehorned into a single drm device to fit userspace's view of the world. So relying on drm->dev isn't really going to give sensible results. I guess msm could just bury our heads in the sand and continue to do things the way we have been (buffers that are mapped cached-coherent are only self-shared) but would be nice to catch if userspace tried to import one into (for ex) v4l2.. BR, -R > .resv = obj->resv, > }; > > -- > 2.25.1 >
Am 20.10.22 um 16:43 schrieb Rob Clark: > On Thu, Oct 20, 2022 at 5:13 AM Christian König > <ckoenig.leichtzumerken@gmail.com> wrote: >> When a device driver is snooping the CPU cache during access we assume >> that all importers need to be able to snoop the CPU cache as well. >> >> Signed-off-by: Christian König <christian.koenig@amd.com> >> --- >> drivers/gpu/drm/drm_prime.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c >> index 20e109a802ae..d5c70b6fe8a4 100644 >> --- a/drivers/gpu/drm/drm_prime.c >> +++ b/drivers/gpu/drm/drm_prime.c >> @@ -28,6 +28,7 @@ >> >> #include <linux/export.h> >> #include <linux/dma-buf.h> >> +#include <linux/dma-map-ops.h> >> #include <linux/rbtree.h> >> #include <linux/module.h> >> >> @@ -889,6 +890,7 @@ struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj, >> .size = obj->size, >> .flags = flags, >> .priv = obj, >> + .coherent = dev_is_dma_coherent(dev->dev), > To set the coherent flag correctly, I think I'd need a way to override > on a per buffer basis, since coherency is a property of the gpu > pgtables (which in the msm case is an immutable property of the gem > object). We also have some awkwardness that drm->dev isn't actually > the GPU, thanks to the kernels device model seeing a collection of > other small devices shoehorned into a single drm device to fit > userspace's view of the world. So relying on drm->dev isn't really > going to give sensible results. Yeah, I've the same problem for amdgpu where some buffers are snooped while others aren't. But this should be unproblematic since the flag can always be cleared by the driver later on (it just can't be set). Additional to that I've just noted that armada, i915, omap and tegra use their own DMA-buf export function. MSM could do the same as well if the device itself is marked as not coherent while some buffers are mapped cache coherent. Regards, Christian. > I guess msm could just bury our heads in the sand and continue to do > things the way we have been (buffers that are mapped cached-coherent > are only self-shared) but would be nice to catch if userspace tried to > import one into (for ex) v4l2.. > > BR, > -R > >> .resv = obj->resv, >> }; >> >> -- >> 2.25.1 >>
On Thu, Oct 20, 2022 at 7:57 AM Christian König <christian.koenig@amd.com> wrote: > > Am 20.10.22 um 16:43 schrieb Rob Clark: > > On Thu, Oct 20, 2022 at 5:13 AM Christian König > > <ckoenig.leichtzumerken@gmail.com> wrote: > >> When a device driver is snooping the CPU cache during access we assume > >> that all importers need to be able to snoop the CPU cache as well. > >> > >> Signed-off-by: Christian König <christian.koenig@amd.com> > >> --- > >> drivers/gpu/drm/drm_prime.c | 2 ++ > >> 1 file changed, 2 insertions(+) > >> > >> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c > >> index 20e109a802ae..d5c70b6fe8a4 100644 > >> --- a/drivers/gpu/drm/drm_prime.c > >> +++ b/drivers/gpu/drm/drm_prime.c > >> @@ -28,6 +28,7 @@ > >> > >> #include <linux/export.h> > >> #include <linux/dma-buf.h> > >> +#include <linux/dma-map-ops.h> > >> #include <linux/rbtree.h> > >> #include <linux/module.h> > >> > >> @@ -889,6 +890,7 @@ struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj, > >> .size = obj->size, > >> .flags = flags, > >> .priv = obj, > >> + .coherent = dev_is_dma_coherent(dev->dev), > > To set the coherent flag correctly, I think I'd need a way to override > > on a per buffer basis, since coherency is a property of the gpu > > pgtables (which in the msm case is an immutable property of the gem > > object). We also have some awkwardness that drm->dev isn't actually > > the GPU, thanks to the kernels device model seeing a collection of > > other small devices shoehorned into a single drm device to fit > > userspace's view of the world. So relying on drm->dev isn't really > > going to give sensible results. > > Yeah, I've the same problem for amdgpu where some buffers are snooped > while others aren't. > > But this should be unproblematic since the flag can always be cleared by > the driver later on (it just can't be set). > > Additional to that I've just noted that armada, i915, omap and tegra use > their own DMA-buf export function. MSM could do the same as well if the > device itself is marked as not coherent while some buffers are mapped > cache coherent. yeah, I guess that would work.. it would be a bit unfortunate to need to use our own export function, but I guess it is a small price to pay and I like the overall idea, so a-b for the series For the VMM case, it would be nice to expose this to userspace, but I've sent a patch to do this in an msm specific way, and I guess at least solving that problem for one driver and better than the current state of "iff driver == "i915" { it's mmap'd cached } else { it's writecombine }" in crosvm Admittedly the VMM case is a rather special case compared to your average userspace dealing with dmabuf's, but it would be nice to get out of the current situation where it is having to make assumptions which are quite possibly wrong, so giving the VMM some information even if it is "the cachability isn't static, you should bail now if your arch can't cope" would be an improvement. (For background, this case is also a bit specific for android/gralloc.. for driver allocated buffers in a VM, with the native usermode driver (UMD) in guest, you still have some control within the UMD) BR, -R > Regards, > Christian. > > > I guess msm could just bury our heads in the sand and continue to do > > things the way we have been (buffers that are mapped cached-coherent > > are only self-shared) but would be nice to catch if userspace tried to > > import one into (for ex) v4l2.. > > > > BR, > > -R > > > >> .resv = obj->resv, > >> }; > >> > >> -- > >> 2.25.1 > >> >
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c index 20e109a802ae..d5c70b6fe8a4 100644 --- a/drivers/gpu/drm/drm_prime.c +++ b/drivers/gpu/drm/drm_prime.c @@ -28,6 +28,7 @@ #include <linux/export.h> #include <linux/dma-buf.h> +#include <linux/dma-map-ops.h> #include <linux/rbtree.h> #include <linux/module.h> @@ -889,6 +890,7 @@ struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj, .size = obj->size, .flags = flags, .priv = obj, + .coherent = dev_is_dma_coherent(dev->dev), .resv = obj->resv, };
When a device driver is snooping the CPU cache during access we assume that all importers need to be able to snoop the CPU cache as well. Signed-off-by: Christian König <christian.koenig@amd.com> --- drivers/gpu/drm/drm_prime.c | 2 ++ 1 file changed, 2 insertions(+)