Message ID | 20210930190954.1525933-7-boris.brezillon@collabora.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/panfrost: drm/panfrost: Add a new submit ioctl | expand |
On 30/09/2021 20:09, Boris Brezillon wrote: > Sometimes, all the user wants to do is add a synchronization point. > Userspace can already do that by submitting a NULL job, but this implies > submitting something to the GPU when we could simply skip the job and > signal the done fence directly. > > v5: > * New patch > > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> I had thought we would be fine without kbase's "dependency only atom" because we don't have the fan-{in,out} problems that kbase's atoms produce. But if we're ending up with real hardware NULL jobs then this is clearly better. A couple of minor points below, but as far as I can tell this is functionally correct. Reviewed-by: Steven Price <steven.price@arm.com> > --- > drivers/gpu/drm/panfrost/panfrost_drv.c | 9 +++++++-- > drivers/gpu/drm/panfrost/panfrost_job.c | 6 ++++++ > include/uapi/drm/panfrost_drm.h | 7 +++++++ > 3 files changed, 20 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c > index 30dc158d56e6..89a0c484310c 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_drv.c > +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c > @@ -542,7 +542,9 @@ static const struct panfrost_submit_ioctl_version_info submit_versions[] = { > [1] = { 48, 8, 16 }, > }; > > -#define PANFROST_JD_ALLOWED_REQS PANFROST_JD_REQ_FS > +#define PANFROST_JD_ALLOWED_REQS \ > + (PANFROST_JD_REQ_FS | \ > + PANFROST_JD_REQ_DEP_ONLY) > > static int > panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv, > @@ -559,7 +561,10 @@ panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv, > if (args->requirements & ~PANFROST_JD_ALLOWED_REQS) > return -EINVAL; > > - if (!args->head) > + /* If this is a dependency-only job, the job chain head should be NULL, > + * otherwise it should be non-NULL. > + */ > + if ((args->head != 0) != !(args->requirements & PANFROST_JD_REQ_DEP_ONLY)) NIT: There's confusion over NULL vs 0 here - the code is correct (args->head is a u64 and not a pointer for a kernel) but the comment makes it seem like it should be a pointer. We could side-step the mismatch by rewriting as below: !args->head == !(args->requirements & PANFROST_JD_REQ_DEP_ONLY) Although I'm not convinced whether that's more readable or not! > return -EINVAL; > > bo_stride = submit_versions[version].bo_ref_stride; > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c > index 0367cee8f6df..6d8706d4a096 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_job.c > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c > @@ -192,6 +192,12 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js) > u64 jc_head = job->jc; > int ret; > > + if (job->requirements & PANFROST_JD_REQ_DEP_ONLY) { > + /* Nothing to execute, signal the fence directly. */ > + dma_fence_signal_locked(job->done_fence); > + return; > + } > + It took me a while to convince myself that the reference counting for the PM reference is correct. Before panfrost_job_hw_submit() always returned with an extra reference, but now we have a case which doesn't. AFAICT this is probably fine because we dereference on the path where the hardware has completed the job (which obviously won't happen here). But I'm still a bit uneasy whether the reference counts are always correct. Steve > panfrost_devfreq_record_busy(&pfdev->pfdevfreq); > > ret = pm_runtime_get_sync(pfdev->dev); > diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h > index 5e3f8a344f41..b9df066970f6 100644 > --- a/include/uapi/drm/panfrost_drm.h > +++ b/include/uapi/drm/panfrost_drm.h > @@ -46,6 +46,13 @@ extern "C" { > #define DRM_IOCTL_PANFROST_PERFCNT_DUMP DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_PERFCNT_DUMP, struct drm_panfrost_perfcnt_dump) > > #define PANFROST_JD_REQ_FS (1 << 0) > + > +/* > + * Dependency only job. The job chain head should be set to 0 when this flag > + * is set. > + */ > +#define PANFROST_JD_REQ_DEP_ONLY (1 << 1) > + > /** > * struct drm_panfrost_submit - ioctl argument for submitting commands to the 3D > * engine. >
On Mon, 4 Oct 2021 12:30:42 +0100 Steven Price <steven.price@arm.com> wrote: > On 30/09/2021 20:09, Boris Brezillon wrote: > > Sometimes, all the user wants to do is add a synchronization point. > > Userspace can already do that by submitting a NULL job, but this implies > > submitting something to the GPU when we could simply skip the job and > > signal the done fence directly. > > > > v5: > > * New patch > > > > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> > > I had thought we would be fine without kbase's "dependency only atom" > because we don't have the fan-{in,out} problems that kbase's atoms > produce. But if we're ending up with real hardware NULL jobs then this > is clearly better. > > A couple of minor points below, but as far as I can tell this is > functionally correct. > > Reviewed-by: Steven Price <steven.price@arm.com> > > > --- > > drivers/gpu/drm/panfrost/panfrost_drv.c | 9 +++++++-- > > drivers/gpu/drm/panfrost/panfrost_job.c | 6 ++++++ > > include/uapi/drm/panfrost_drm.h | 7 +++++++ > > 3 files changed, 20 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c > > index 30dc158d56e6..89a0c484310c 100644 > > --- a/drivers/gpu/drm/panfrost/panfrost_drv.c > > +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c > > @@ -542,7 +542,9 @@ static const struct panfrost_submit_ioctl_version_info submit_versions[] = { > > [1] = { 48, 8, 16 }, > > }; > > > > -#define PANFROST_JD_ALLOWED_REQS PANFROST_JD_REQ_FS > > +#define PANFROST_JD_ALLOWED_REQS \ > > + (PANFROST_JD_REQ_FS | \ > > + PANFROST_JD_REQ_DEP_ONLY) > > > > static int > > panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv, > > @@ -559,7 +561,10 @@ panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv, > > if (args->requirements & ~PANFROST_JD_ALLOWED_REQS) > > return -EINVAL; > > > > - if (!args->head) > > + /* If this is a dependency-only job, the job chain head should be NULL, > > + * otherwise it should be non-NULL. > > + */ > > + if ((args->head != 0) != !(args->requirements & PANFROST_JD_REQ_DEP_ONLY)) > > NIT: There's confusion over NULL vs 0 here - the code is correct > (args->head is a u64 and not a pointer for a kernel) but the comment > makes it seem like it should be a pointer. > > We could side-step the mismatch by rewriting as below: > > !args->head == !(args->requirements & PANFROST_JD_REQ_DEP_ONLY) > > Although I'm not convinced whether that's more readable or not! I'll replace 'NULL' by 'zero' in the comment. > > > return -EINVAL; > > > > bo_stride = submit_versions[version].bo_ref_stride; > > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c > > index 0367cee8f6df..6d8706d4a096 100644 > > --- a/drivers/gpu/drm/panfrost/panfrost_job.c > > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c > > @@ -192,6 +192,12 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js) > > u64 jc_head = job->jc; > > int ret; > > > > + if (job->requirements & PANFROST_JD_REQ_DEP_ONLY) { > > + /* Nothing to execute, signal the fence directly. */ > > + dma_fence_signal_locked(job->done_fence); > > + return; > > + } > > + > > It took me a while to convince myself that the reference counting for > the PM reference is correct. Before panfrost_job_hw_submit() always > returned with an extra reference, but now we have a case which doesn't. > AFAICT this is probably fine because we dereference on the path where > the hardware has completed the job (which obviously won't happen here). > But I'm still a bit uneasy whether the reference counts are always correct. I think it is. We only decrement the PM count in the interrupt handler path, and as you pointed out, we won't reach that path here. But if that helps, I can move this if to `panfrost_job_run()`: /* Nothing to execute, signal the fence directly. */ if (job->requirements & PANFROST_JD_REQ_DEP_ONLY) dma_fence_signal_locked(job->done_fence); else panfrost_job_hw_submit(job, slot); > > Steve > > > panfrost_devfreq_record_busy(&pfdev->pfdevfreq); > > > > ret = pm_runtime_get_sync(pfdev->dev); > > diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h > > index 5e3f8a344f41..b9df066970f6 100644 > > --- a/include/uapi/drm/panfrost_drm.h > > +++ b/include/uapi/drm/panfrost_drm.h > > @@ -46,6 +46,13 @@ extern "C" { > > #define DRM_IOCTL_PANFROST_PERFCNT_DUMP DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_PERFCNT_DUMP, struct drm_panfrost_perfcnt_dump) > > > > #define PANFROST_JD_REQ_FS (1 << 0) > > + > > +/* > > + * Dependency only job. The job chain head should be set to 0 when this flag > > + * is set. > > + */ > > +#define PANFROST_JD_REQ_DEP_ONLY (1 << 1) > > + > > /** > > * struct drm_panfrost_submit - ioctl argument for submitting commands to the 3D > > * engine. > > >
On 04/10/2021 13:24, Boris Brezillon wrote: > On Mon, 4 Oct 2021 12:30:42 +0100 > Steven Price <steven.price@arm.com> wrote: [...] >> >> It took me a while to convince myself that the reference counting for >> the PM reference is correct. Before panfrost_job_hw_submit() always >> returned with an extra reference, but now we have a case which doesn't. >> AFAICT this is probably fine because we dereference on the path where >> the hardware has completed the job (which obviously won't happen here). >> But I'm still a bit uneasy whether the reference counts are always correct. > > I think it is. We only decrement the PM count in the interrupt handler > path, and as you pointed out, we won't reach that path here. But if > that helps, I can move this if to `panfrost_job_run()`: > > /* Nothing to execute, signal the fence directly. */ > if (job->requirements & PANFROST_JD_REQ_DEP_ONLY) > dma_fence_signal_locked(job->done_fence); > else > panfrost_job_hw_submit(job, slot); > I think that would make it a bit more readable - really panfrost_job_hw_submit() needs a bit of TLC, I did post a patch ages ago[1] but it didn't get any feedback and then I forgot about it. Things have moved on so it would need a little bit of rework. Thanks, Steve [1] https://lore.kernel.org/dri-devel/20210512152419.30003-1-steven.price@arm.com/
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c index 30dc158d56e6..89a0c484310c 100644 --- a/drivers/gpu/drm/panfrost/panfrost_drv.c +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c @@ -542,7 +542,9 @@ static const struct panfrost_submit_ioctl_version_info submit_versions[] = { [1] = { 48, 8, 16 }, }; -#define PANFROST_JD_ALLOWED_REQS PANFROST_JD_REQ_FS +#define PANFROST_JD_ALLOWED_REQS \ + (PANFROST_JD_REQ_FS | \ + PANFROST_JD_REQ_DEP_ONLY) static int panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv, @@ -559,7 +561,10 @@ panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv, if (args->requirements & ~PANFROST_JD_ALLOWED_REQS) return -EINVAL; - if (!args->head) + /* If this is a dependency-only job, the job chain head should be NULL, + * otherwise it should be non-NULL. + */ + if ((args->head != 0) != !(args->requirements & PANFROST_JD_REQ_DEP_ONLY)) return -EINVAL; bo_stride = submit_versions[version].bo_ref_stride; diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c index 0367cee8f6df..6d8706d4a096 100644 --- a/drivers/gpu/drm/panfrost/panfrost_job.c +++ b/drivers/gpu/drm/panfrost/panfrost_job.c @@ -192,6 +192,12 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js) u64 jc_head = job->jc; int ret; + if (job->requirements & PANFROST_JD_REQ_DEP_ONLY) { + /* Nothing to execute, signal the fence directly. */ + dma_fence_signal_locked(job->done_fence); + return; + } + panfrost_devfreq_record_busy(&pfdev->pfdevfreq); ret = pm_runtime_get_sync(pfdev->dev); diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h index 5e3f8a344f41..b9df066970f6 100644 --- a/include/uapi/drm/panfrost_drm.h +++ b/include/uapi/drm/panfrost_drm.h @@ -46,6 +46,13 @@ extern "C" { #define DRM_IOCTL_PANFROST_PERFCNT_DUMP DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_PERFCNT_DUMP, struct drm_panfrost_perfcnt_dump) #define PANFROST_JD_REQ_FS (1 << 0) + +/* + * Dependency only job. The job chain head should be set to 0 when this flag + * is set. + */ +#define PANFROST_JD_REQ_DEP_ONLY (1 << 1) + /** * struct drm_panfrost_submit - ioctl argument for submitting commands to the 3D * engine.
Sometimes, all the user wants to do is add a synchronization point. Userspace can already do that by submitting a NULL job, but this implies submitting something to the GPU when we could simply skip the job and signal the done fence directly. v5: * New patch Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> --- drivers/gpu/drm/panfrost/panfrost_drv.c | 9 +++++++-- drivers/gpu/drm/panfrost/panfrost_job.c | 6 ++++++ include/uapi/drm/panfrost_drm.h | 7 +++++++ 3 files changed, 20 insertions(+), 2 deletions(-)