diff mbox series

[v5,6/8] drm/panfrost: Support synchronization jobs

Message ID 20210930190954.1525933-7-boris.brezillon@collabora.com (mailing list archive)
State New, archived
Headers show
Series drm/panfrost: drm/panfrost: Add a new submit ioctl | expand

Commit Message

Boris Brezillon Sept. 30, 2021, 7:09 p.m. UTC
Sometimes, all the user wants to do is add a synchronization point.
Userspace can already do that by submitting a NULL job, but this implies
submitting something to the GPU when we could simply skip the job and
signal the done fence directly.

v5:
* New patch

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 9 +++++++--
 drivers/gpu/drm/panfrost/panfrost_job.c | 6 ++++++
 include/uapi/drm/panfrost_drm.h         | 7 +++++++
 3 files changed, 20 insertions(+), 2 deletions(-)

Comments

Steven Price Oct. 4, 2021, 11:30 a.m. UTC | #1
On 30/09/2021 20:09, Boris Brezillon wrote:
> Sometimes, all the user wants to do is add a synchronization point.
> Userspace can already do that by submitting a NULL job, but this implies
> submitting something to the GPU when we could simply skip the job and
> signal the done fence directly.
> 
> v5:
> * New patch
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>

I had thought we would be fine without kbase's "dependency only atom"
because we don't have the fan-{in,out} problems that kbase's atoms
produce. But if we're ending up with real hardware NULL jobs then this
is clearly better.

A couple of minor points below, but as far as I can tell this is
functionally correct.

Reviewed-by: Steven Price <steven.price@arm.com>

> ---
>  drivers/gpu/drm/panfrost/panfrost_drv.c | 9 +++++++--
>  drivers/gpu/drm/panfrost/panfrost_job.c | 6 ++++++
>  include/uapi/drm/panfrost_drm.h         | 7 +++++++
>  3 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index 30dc158d56e6..89a0c484310c 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -542,7 +542,9 @@ static const struct panfrost_submit_ioctl_version_info submit_versions[] = {
>  	[1] = { 48, 8, 16 },
>  };
>  
> -#define PANFROST_JD_ALLOWED_REQS PANFROST_JD_REQ_FS
> +#define PANFROST_JD_ALLOWED_REQS \
> +	(PANFROST_JD_REQ_FS | \
> +	 PANFROST_JD_REQ_DEP_ONLY)
>  
>  static int
>  panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv,
> @@ -559,7 +561,10 @@ panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv,
>  	if (args->requirements & ~PANFROST_JD_ALLOWED_REQS)
>  		return -EINVAL;
>  
> -	if (!args->head)
> +	/* If this is a dependency-only job, the job chain head should be NULL,
> +	 * otherwise it should be non-NULL.
> +	 */
> +	if ((args->head != 0) != !(args->requirements & PANFROST_JD_REQ_DEP_ONLY))

NIT: There's confusion over NULL vs 0 here - the code is correct
(args->head is a u64 and not a pointer for a kernel) but the comment
makes it seem like it should be a pointer.

We could side-step the mismatch by rewriting as below:

  !args->head == !(args->requirements & PANFROST_JD_REQ_DEP_ONLY)

Although I'm not convinced whether that's more readable or not!

>  		return -EINVAL;
>  
>  	bo_stride = submit_versions[version].bo_ref_stride;
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 0367cee8f6df..6d8706d4a096 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -192,6 +192,12 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
>  	u64 jc_head = job->jc;
>  	int ret;
>  
> +	if (job->requirements & PANFROST_JD_REQ_DEP_ONLY) {
> +		/* Nothing to execute, signal the fence directly. */
> +		dma_fence_signal_locked(job->done_fence);
> +		return;
> +	}
> +

It took me a while to convince myself that the reference counting for
the PM reference is correct. Before panfrost_job_hw_submit() always
returned with an extra reference, but now we have a case which doesn't.
AFAICT this is probably fine because we dereference on the path where
the hardware has completed the job (which obviously won't happen here).
But I'm still a bit uneasy whether the reference counts are always correct.

Steve

>  	panfrost_devfreq_record_busy(&pfdev->pfdevfreq);
>  
>  	ret = pm_runtime_get_sync(pfdev->dev);
> diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
> index 5e3f8a344f41..b9df066970f6 100644
> --- a/include/uapi/drm/panfrost_drm.h
> +++ b/include/uapi/drm/panfrost_drm.h
> @@ -46,6 +46,13 @@ extern "C" {
>  #define DRM_IOCTL_PANFROST_PERFCNT_DUMP		DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_PERFCNT_DUMP, struct drm_panfrost_perfcnt_dump)
>  
>  #define PANFROST_JD_REQ_FS (1 << 0)
> +
> +/*
> + * Dependency only job. The job chain head should be set to 0 when this flag
> + * is set.
> + */
> +#define PANFROST_JD_REQ_DEP_ONLY (1 << 1)
> +
>  /**
>   * struct drm_panfrost_submit - ioctl argument for submitting commands to the 3D
>   * engine.
>
Boris Brezillon Oct. 4, 2021, 12:24 p.m. UTC | #2
On Mon, 4 Oct 2021 12:30:42 +0100
Steven Price <steven.price@arm.com> wrote:

> On 30/09/2021 20:09, Boris Brezillon wrote:
> > Sometimes, all the user wants to do is add a synchronization point.
> > Userspace can already do that by submitting a NULL job, but this implies
> > submitting something to the GPU when we could simply skip the job and
> > signal the done fence directly.
> > 
> > v5:
> > * New patch
> > 
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>  
> 
> I had thought we would be fine without kbase's "dependency only atom"
> because we don't have the fan-{in,out} problems that kbase's atoms
> produce. But if we're ending up with real hardware NULL jobs then this
> is clearly better.
> 
> A couple of minor points below, but as far as I can tell this is
> functionally correct.
> 
> Reviewed-by: Steven Price <steven.price@arm.com>
> 
> > ---
> >  drivers/gpu/drm/panfrost/panfrost_drv.c | 9 +++++++--
> >  drivers/gpu/drm/panfrost/panfrost_job.c | 6 ++++++
> >  include/uapi/drm/panfrost_drm.h         | 7 +++++++
> >  3 files changed, 20 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
> > index 30dc158d56e6..89a0c484310c 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> > @@ -542,7 +542,9 @@ static const struct panfrost_submit_ioctl_version_info submit_versions[] = {
> >  	[1] = { 48, 8, 16 },
> >  };
> >  
> > -#define PANFROST_JD_ALLOWED_REQS PANFROST_JD_REQ_FS
> > +#define PANFROST_JD_ALLOWED_REQS \
> > +	(PANFROST_JD_REQ_FS | \
> > +	 PANFROST_JD_REQ_DEP_ONLY)
> >  
> >  static int
> >  panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv,
> > @@ -559,7 +561,10 @@ panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv,
> >  	if (args->requirements & ~PANFROST_JD_ALLOWED_REQS)
> >  		return -EINVAL;
> >  
> > -	if (!args->head)
> > +	/* If this is a dependency-only job, the job chain head should be NULL,
> > +	 * otherwise it should be non-NULL.
> > +	 */
> > +	if ((args->head != 0) != !(args->requirements & PANFROST_JD_REQ_DEP_ONLY))  
> 
> NIT: There's confusion over NULL vs 0 here - the code is correct
> (args->head is a u64 and not a pointer for a kernel) but the comment
> makes it seem like it should be a pointer.
> 
> We could side-step the mismatch by rewriting as below:
> 
>   !args->head == !(args->requirements & PANFROST_JD_REQ_DEP_ONLY)
> 
> Although I'm not convinced whether that's more readable or not!

I'll replace 'NULL' by 'zero' in the comment.

> 
> >  		return -EINVAL;
> >  
> >  	bo_stride = submit_versions[version].bo_ref_stride;
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> > index 0367cee8f6df..6d8706d4a096 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> > @@ -192,6 +192,12 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
> >  	u64 jc_head = job->jc;
> >  	int ret;
> >  
> > +	if (job->requirements & PANFROST_JD_REQ_DEP_ONLY) {
> > +		/* Nothing to execute, signal the fence directly. */
> > +		dma_fence_signal_locked(job->done_fence);
> > +		return;
> > +	}
> > +  
> 
> It took me a while to convince myself that the reference counting for
> the PM reference is correct. Before panfrost_job_hw_submit() always
> returned with an extra reference, but now we have a case which doesn't.
> AFAICT this is probably fine because we dereference on the path where
> the hardware has completed the job (which obviously won't happen here).
> But I'm still a bit uneasy whether the reference counts are always correct.

I think it is. We only decrement the PM count in the interrupt handler
path, and as you pointed out, we won't reach that path here. But if
that helps, I can move this if to `panfrost_job_run()`:

	/* Nothing to execute, signal the fence directly. */
	if (job->requirements & PANFROST_JD_REQ_DEP_ONLY)
		dma_fence_signal_locked(job->done_fence);
	else
		panfrost_job_hw_submit(job, slot);

> 
> Steve
> 
> >  	panfrost_devfreq_record_busy(&pfdev->pfdevfreq);
> >  
> >  	ret = pm_runtime_get_sync(pfdev->dev);
> > diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
> > index 5e3f8a344f41..b9df066970f6 100644
> > --- a/include/uapi/drm/panfrost_drm.h
> > +++ b/include/uapi/drm/panfrost_drm.h
> > @@ -46,6 +46,13 @@ extern "C" {
> >  #define DRM_IOCTL_PANFROST_PERFCNT_DUMP		DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_PERFCNT_DUMP, struct drm_panfrost_perfcnt_dump)
> >  
> >  #define PANFROST_JD_REQ_FS (1 << 0)
> > +
> > +/*
> > + * Dependency only job. The job chain head should be set to 0 when this flag
> > + * is set.
> > + */
> > +#define PANFROST_JD_REQ_DEP_ONLY (1 << 1)
> > +
> >  /**
> >   * struct drm_panfrost_submit - ioctl argument for submitting commands to the 3D
> >   * engine.
> >   
>
Steven Price Oct. 4, 2021, 1:05 p.m. UTC | #3
On 04/10/2021 13:24, Boris Brezillon wrote:
> On Mon, 4 Oct 2021 12:30:42 +0100
> Steven Price <steven.price@arm.com> wrote:
[...]
>>
>> It took me a while to convince myself that the reference counting for
>> the PM reference is correct. Before panfrost_job_hw_submit() always
>> returned with an extra reference, but now we have a case which doesn't.
>> AFAICT this is probably fine because we dereference on the path where
>> the hardware has completed the job (which obviously won't happen here).
>> But I'm still a bit uneasy whether the reference counts are always correct.
> 
> I think it is. We only decrement the PM count in the interrupt handler
> path, and as you pointed out, we won't reach that path here. But if
> that helps, I can move this if to `panfrost_job_run()`:
> 
> 	/* Nothing to execute, signal the fence directly. */
> 	if (job->requirements & PANFROST_JD_REQ_DEP_ONLY)
> 		dma_fence_signal_locked(job->done_fence);
> 	else
> 		panfrost_job_hw_submit(job, slot);
> 

I think that would make it a bit more readable - really
panfrost_job_hw_submit() needs a bit of TLC, I did post a patch ages
ago[1] but it didn't get any feedback and then I forgot about it. Things
have moved on so it would need a little bit of rework.

Thanks,

Steve

[1]
https://lore.kernel.org/dri-devel/20210512152419.30003-1-steven.price@arm.com/
diff mbox series

Patch

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 30dc158d56e6..89a0c484310c 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -542,7 +542,9 @@  static const struct panfrost_submit_ioctl_version_info submit_versions[] = {
 	[1] = { 48, 8, 16 },
 };
 
-#define PANFROST_JD_ALLOWED_REQS PANFROST_JD_REQ_FS
+#define PANFROST_JD_ALLOWED_REQS \
+	(PANFROST_JD_REQ_FS | \
+	 PANFROST_JD_REQ_DEP_ONLY)
 
 static int
 panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv,
@@ -559,7 +561,10 @@  panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv,
 	if (args->requirements & ~PANFROST_JD_ALLOWED_REQS)
 		return -EINVAL;
 
-	if (!args->head)
+	/* If this is a dependency-only job, the job chain head should be NULL,
+	 * otherwise it should be non-NULL.
+	 */
+	if ((args->head != 0) != !(args->requirements & PANFROST_JD_REQ_DEP_ONLY))
 		return -EINVAL;
 
 	bo_stride = submit_versions[version].bo_ref_stride;
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 0367cee8f6df..6d8706d4a096 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -192,6 +192,12 @@  static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
 	u64 jc_head = job->jc;
 	int ret;
 
+	if (job->requirements & PANFROST_JD_REQ_DEP_ONLY) {
+		/* Nothing to execute, signal the fence directly. */
+		dma_fence_signal_locked(job->done_fence);
+		return;
+	}
+
 	panfrost_devfreq_record_busy(&pfdev->pfdevfreq);
 
 	ret = pm_runtime_get_sync(pfdev->dev);
diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
index 5e3f8a344f41..b9df066970f6 100644
--- a/include/uapi/drm/panfrost_drm.h
+++ b/include/uapi/drm/panfrost_drm.h
@@ -46,6 +46,13 @@  extern "C" {
 #define DRM_IOCTL_PANFROST_PERFCNT_DUMP		DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_PERFCNT_DUMP, struct drm_panfrost_perfcnt_dump)
 
 #define PANFROST_JD_REQ_FS (1 << 0)
+
+/*
+ * Dependency only job. The job chain head should be set to 0 when this flag
+ * is set.
+ */
+#define PANFROST_JD_REQ_DEP_ONLY (1 << 1)
+
 /**
  * struct drm_panfrost_submit - ioctl argument for submitting commands to the 3D
  * engine.