Message ID | 20230331113603.2802515-2-stanislaw.gruszka@linux.intel.com (mailing list archive)
---|---
State | New, archived
Series | accel/ivpu: Fixes for linux-6.3-rc6
On 3/31/2023 5:36 AM, Stanislaw Gruszka wrote:
> From: Karol Wachowski <karol.wachowski@linux.intel.com>
>
> Currently job->done_fence is added to every BO handle within a job. If a job
> handle (command buffer) is shared between multiple submits, the KMD will add
> the fence to each of them. A bo_wait_ioctl() executed on the command buffer
> will then exit only when all jobs containing that handle are done.
>
> This creates a deadlock scenario for the user mode driver when a job handle
> is added as a dependency of another job, because bo_wait_ioctl() on the first
> job will wait until the second job finishes, and the second job cannot finish
> before the first one.
>
> Adding fences only to the job buffer handle allows user space to execute
> bo_wait_ioctl() on the job even if its handle is submitted with another job.
>
> Fixes: cd7272215c44 ("accel/ivpu: Add command buffer submission logic")
> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>

Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
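To make the failure mode concrete, here is a hypothetical userspace sequence that hits it. The struct and ioctl names follow the ivpu uapi header, but the handles, fields shown, and lack of error handling are illustrative only, not taken from the patch:

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/ivpu_accel.h>

/* Illustrative only: cb_a and cb_b are assumed to be valid BO handles. */
static void deadlock_sketch(int fd, uint32_t cb_a, uint32_t cb_b)
{
	/* Job 1: cb_a is the command buffer, so done_fence1 is added to it. */
	uint32_t bufs1[] = { cb_a };
	struct drm_ivpu_submit job1 = {
		.buffers_ptr = (uintptr_t)bufs1,
		.buffer_count = 1,
	};
	ioctl(fd, DRM_IOCTL_IVPU_SUBMIT, &job1);

	/* Job 2: cb_b is the command buffer and cb_a is referenced as a
	 * dependency; before this patch, done_fence2 is added to cb_a too. */
	uint32_t bufs2[] = { cb_b, cb_a };
	struct drm_ivpu_submit job2 = {
		.buffers_ptr = (uintptr_t)bufs2,
		.buffer_count = 2,
	};
	ioctl(fd, DRM_IOCTL_IVPU_SUBMIT, &job2);

	/* The wait on job 1's command buffer now also waits on job 2's
	 * fence, so it returns only after both jobs have completed. */
	struct drm_ivpu_bo_wait wait = {
		.handle = cb_a,
		.timeout_ns = -1,
	};
	ioctl(fd, DRM_IOCTL_IVPU_BO_WAIT, &wait);
}
```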
On Fri, Mar 31, 2023 at 01:36:02PM +0200, Stanislaw Gruszka wrote:
> From: Karol Wachowski <karol.wachowski@linux.intel.com>
>
> Currently job->done_fence is added to every BO handle within a job. If a job
> handle (command buffer) is shared between multiple submits, the KMD will add
> the fence to each of them. A bo_wait_ioctl() executed on the command buffer
> will then exit only when all jobs containing that handle are done.
>
> This creates a deadlock scenario for the user mode driver when a job handle
> is added as a dependency of another job, because bo_wait_ioctl() on the first
> job will wait until the second job finishes, and the second job cannot finish
> before the first one.
>
> Adding fences only to the job buffer handle allows user space to execute
> bo_wait_ioctl() on the job even if its handle is submitted with another job.
>
> Fixes: cd7272215c44 ("accel/ivpu: Add command buffer submission logic")
> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>

Uh, this is confused on a lot of levels ...

First, having a new driver use dma_resv/bo implicit fencing for UMD
synchronization is not great at all. The modern way of doing it is:
- in/out dependencies are handled with drm_syncobj
- userspace waits on the drm_syncobj, not with a driver-private wait ioctl
  on a specific bo

The other issue is that the below starts to fall over once you do dynamic
memory management; for that case you _always_ have to install a fence.

Now fear not, there's a solution here:
- you always install a fence (i.e. revert this patch), but by default it is a
  DMA_RESV_USAGE_BOOKKEEP fence. See the kerneldoc for enum dma_resv_usage
  for what that means.
- only for the special job bo you set the DMA_RESV_USAGE_READ flag. You want
  read because really that's what the gpu is doing for the job bo.
- the bo_wait ioctl then waits for write access internally

This should result in the same uapi as your patch here, but without abusing
dma_resv in a way that'll go boom.

Note that userspace can get at the dma_resv READ/WRITE fences through ioctls
on a dma-buf, so which one you pick here is uabi relevant. bo-as-job-fence is
USAGE_READ.

Please take care of this in -next.
Cheers, Daniel

> ---
>  drivers/accel/ivpu/ivpu_job.c | 18 +++++++-----------
>  1 file changed, 7 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
> index 910301fae435..3c6f1e16cf2f 100644
> --- a/drivers/accel/ivpu/ivpu_job.c
> +++ b/drivers/accel/ivpu/ivpu_job.c
> @@ -461,26 +461,22 @@ ivpu_job_prepare_bos_for_submit(struct drm_file *file, struct ivpu_job *job, u32
>
>  	job->cmd_buf_vpu_addr = bo->vpu_addr + commands_offset;
>
> -	ret = drm_gem_lock_reservations((struct drm_gem_object **)job->bos, buf_count,
> -					&acquire_ctx);
> +	ret = drm_gem_lock_reservations((struct drm_gem_object **)job->bos, 1, &acquire_ctx);
>  	if (ret) {
>  		ivpu_warn(vdev, "Failed to lock reservations: %d\n", ret);
>  		return ret;
>  	}
>
> -	for (i = 0; i < buf_count; i++) {
> -		ret = dma_resv_reserve_fences(job->bos[i]->base.resv, 1);
> -		if (ret) {
> -			ivpu_warn(vdev, "Failed to reserve fences: %d\n", ret);
> -			goto unlock_reservations;
> -		}
> +	ret = dma_resv_reserve_fences(bo->base.resv, 1);
> +	if (ret) {
> +		ivpu_warn(vdev, "Failed to reserve fences: %d\n", ret);
> +		goto unlock_reservations;
>  	}
>
> -	for (i = 0; i < buf_count; i++)
> -		dma_resv_add_fence(job->bos[i]->base.resv, job->done_fence, DMA_RESV_USAGE_WRITE);
> +	dma_resv_add_fence(bo->base.resv, job->done_fence, DMA_RESV_USAGE_WRITE);
>
>  unlock_reservations:
> -	drm_gem_unlock_reservations((struct drm_gem_object **)job->bos, buf_count, &acquire_ctx);
> +	drm_gem_unlock_reservations((struct drm_gem_object **)job->bos, 1, &acquire_ctx);
>
>  	wmb(); /* Flush write combining buffers */
>
> --
> 2.25.1
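For reference, a minimal kernel-side sketch of the BOOKKEEP/READ scheme Daniel outlines, reusing the variable names from the quoted diff. This is the suggested design, not current ivpu code, and it assumes all buf_count reservations are locked again, as in the pre-patch code:

```c
/* Sketch of the suggested fencing: every BO gets the job fence, but only
 * the command buffer's fence is visible to waiters. */
for (i = 0; i < buf_count; i++) {
	ret = dma_resv_reserve_fences(job->bos[i]->base.resv, 1);
	if (ret)
		goto unlock_reservations;
}

for (i = 0; i < buf_count; i++) {
	/* BOOKKEEP fences keep dynamic memory management working but are
	 * ignored by waiters; the job bo alone gets a READ fence that
	 * bo_wait and dma-buf poll can observe. */
	enum dma_resv_usage usage = job->bos[i] == bo ?
				    DMA_RESV_USAGE_READ : DMA_RESV_USAGE_BOOKKEEP;

	dma_resv_add_fence(job->bos[i]->base.resv, job->done_fence, usage);
}
```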
On Thu, Apr 06, 2023 at 06:57:34PM +0200, Daniel Vetter wrote:
> On Fri, Mar 31, 2023 at 01:36:02PM +0200, Stanislaw Gruszka wrote:
> > From: Karol Wachowski <karol.wachowski@linux.intel.com>
> >
> > [...]
>
> Uh, this is confused on a lot of levels ...
>
> First, having a new driver use dma_resv/bo implicit fencing for UMD
> synchronization is not great at all. The modern way of doing it is:
> - in/out dependencies are handled with drm_syncobj
> - userspace waits on the drm_syncobj, not with a driver-private wait ioctl
>   on a specific bo

We have syncobj on our TODO list, will work on it.

> The other issue is that the below starts to fall over once you do dynamic
> memory management; for that case you _always_ have to install a fence.
>
> Now fear not, there's a solution here:
> - you always install a fence (i.e. revert this patch), but by default it is a
>   DMA_RESV_USAGE_BOOKKEEP fence. See the kerneldoc for enum dma_resv_usage
>   for what that means.
> - only for the special job bo you set the DMA_RESV_USAGE_READ flag. You want
>   read because really that's what the gpu is doing for the job bo.
> - the bo_wait ioctl then waits for write access internally

In our case the VPU can write to the command buffer (there is a special
context save area), so I think keeping USAGE_WRITE is appropriate.

> This should result in the same uapi as your patch here, but without abusing
> dma_resv in a way that'll go boom.
>
> Note that userspace can get at the dma_resv READ/WRITE fences through ioctls
> on a dma-buf, so which one you pick here is uabi relevant. bo-as-job-fence is
> USAGE_READ.
>
> Please take care of this in -next.

Sure.

Regards
Stanislaw
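For context, a rough sketch of what the drm_syncobj flow on that TODO list could look like in the submit path. The drm_syncobj helpers are the real DRM core API, but the args->in_syncobj/args->out_syncobj fields and their placement are hypothetical:

```c
/* Hypothetical in/out dependency handling via drm_syncobj; the uapi
 * fields used here are made up for illustration. */
struct dma_fence *in_fence;
struct drm_syncobj *out_syncobj;
int ret;

/* In-dependency: resolve the syncobj to a fence the job must wait on,
 * instead of attaching fences to a shared BO. */
ret = drm_syncobj_find_fence(file, args->in_syncobj, 0, 0, &in_fence);
if (ret)
	return ret;
/* ... queue the job so it runs only after in_fence signals ... */

/* Out-dependency: publish the job's completion fence via a syncobj,
 * so userspace waits on the syncobj rather than a bo_wait ioctl. */
out_syncobj = drm_syncobj_find(file, args->out_syncobj);
if (!out_syncobj) {
	dma_fence_put(in_fence);
	return -EINVAL;
}
drm_syncobj_replace_fence(out_syncobj, job->done_fence);
drm_syncobj_put(out_syncobj);
```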
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index 910301fae435..3c6f1e16cf2f 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -461,26 +461,22 @@ ivpu_job_prepare_bos_for_submit(struct drm_file *file, struct ivpu_job *job, u32

 	job->cmd_buf_vpu_addr = bo->vpu_addr + commands_offset;

-	ret = drm_gem_lock_reservations((struct drm_gem_object **)job->bos, buf_count,
-					&acquire_ctx);
+	ret = drm_gem_lock_reservations((struct drm_gem_object **)job->bos, 1, &acquire_ctx);
 	if (ret) {
 		ivpu_warn(vdev, "Failed to lock reservations: %d\n", ret);
 		return ret;
 	}

-	for (i = 0; i < buf_count; i++) {
-		ret = dma_resv_reserve_fences(job->bos[i]->base.resv, 1);
-		if (ret) {
-			ivpu_warn(vdev, "Failed to reserve fences: %d\n", ret);
-			goto unlock_reservations;
-		}
+	ret = dma_resv_reserve_fences(bo->base.resv, 1);
+	if (ret) {
+		ivpu_warn(vdev, "Failed to reserve fences: %d\n", ret);
+		goto unlock_reservations;
 	}

-	for (i = 0; i < buf_count; i++)
-		dma_resv_add_fence(job->bos[i]->base.resv, job->done_fence, DMA_RESV_USAGE_WRITE);
+	dma_resv_add_fence(bo->base.resv, job->done_fence, DMA_RESV_USAGE_WRITE);

 unlock_reservations:
-	drm_gem_unlock_reservations((struct drm_gem_object **)job->bos, buf_count, &acquire_ctx);
+	drm_gem_unlock_reservations((struct drm_gem_object **)job->bos, 1, &acquire_ctx);

 	wmb(); /* Flush write combining buffers */