Message ID | 20190808090422.8102-2-lionel.g.landwerlin@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/syncobj: add syncobj sideband payload for threaded submission | expand |
Am 08.08.19 um 11:04 schrieb Lionel Landwerlin: > The Vulkan timeline semaphores allow signaling to happen on the point > of the timeline without all of the its dependencies to be created. > > The current 2 implementations (AMD/Intel) of the Vulkan spec on top of > the Linux kernel are using a thread to wait on the dependencies of a > given point to materialize and delay actual submission to the kernel > driver until the wait completes. > > If a binary semaphore is submitted for signaling along the side of a > timeline semaphore waiting for completion that means that the drm > syncobj associated with that binary semaphore will not have a DMA > fence associated with it by the time vkQueueSubmit() returns. This and > the fact that a binary semaphore can be signaled and unsignaled as > before its DMA fences materialize mean that we cannot just rely on the > fence within the syncobj but we also need a sideband payload verifying > that the fence in the syncobj matches the last submission from the > Vulkan API point of view. > > This change adds a sideband payload that is incremented with signaled > syncobj when vkQueueSubmit() is called. The next vkQueueSubmit() > waiting on a the syncobj will read the sideband payload and wait for a > fence chain element with a seqno superior or equal to the sideband > payload value to be added into the fence chain and use that fence to > trigger the submission on the kernel driver. > > v2: Use a separate ioctl to get/set the sideband value (Christian) > > Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> > Cc: Christian Koenig <Christian.Koenig@amd.com> > Cc: Jason Ekstrand <jason@jlekstrand.net> > Cc: David(ChunMing) Zhou <David1.Zhou@amd.com> > --- > drivers/gpu/drm/drm_internal.h | 2 ++ > drivers/gpu/drm/drm_ioctl.c | 3 ++ > drivers/gpu/drm/drm_syncobj.c | 63 +++++++++++++++++++++++++++++++--- > include/drm/drm_syncobj.h | 9 +++++ > include/uapi/drm/drm.h | 5 ++- > 5 files changed, 77 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h > index 51a2055c8f18..de5267711a7c 100644 > --- a/drivers/gpu/drm/drm_internal.h > +++ b/drivers/gpu/drm/drm_internal.h > @@ -208,6 +208,8 @@ int drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data, > struct drm_file *file_private); > int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, > struct drm_file *file_private); > +int drm_syncobj_sideband_ioctl(struct drm_device *dev, void *data, > + struct drm_file *file_private); > > /* drm_framebuffer.c */ > void drm_framebuffer_print_info(struct drm_printer *p, unsigned int indent, > diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c > index f675a3bb2c88..7fd99533f78f 100644 > --- a/drivers/gpu/drm/drm_ioctl.c > +++ b/drivers/gpu/drm/drm_ioctl.c > @@ -703,6 +703,9 @@ static const struct drm_ioctl_desc drm_ioctls[] = { > DRM_RENDER_ALLOW), > DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl, > DRM_RENDER_ALLOW), > + DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIDEBAND, drm_syncobj_sideband_ioctl, > + DRM_RENDER_ALLOW), > + > DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, 0), > DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SEQUENCE, drm_crtc_queue_sequence_ioctl, 0), > DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_LEASE, drm_mode_create_lease_ioctl, DRM_MASTER), > diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c > index b927e482e554..66f4f91baa6d 100644 > --- a/drivers/gpu/drm/drm_syncobj.c > +++ b/drivers/gpu/drm/drm_syncobj.c > @@ -1150,8 +1150,10 @@ drm_syncobj_reset_ioctl(struct drm_device *dev, void *data, > if (ret < 0) > return ret; > > - for (i = 0; i < args->count_handles; i++) > + for (i = 0; i < args->count_handles; i++) { > drm_syncobj_replace_fence(syncobjs[i], NULL); > + syncobjs[i]->sideband_payload = 0; > + } > > drm_syncobj_array_free(syncobjs, args->count_handles); > > @@ -1205,7 +1207,7 @@ drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data, > if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) > return -EOPNOTSUPP; > > - if (args->pad != 0) > + if (args->flags != 0) > return -EINVAL; > > if (args->count_handles == 0) > @@ -1276,7 +1278,7 @@ int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, > if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) > return -EOPNOTSUPP; > > - if (args->pad != 0) > + if (args->flags != 0) > return -EINVAL; > > if (args->count_handles == 0) > @@ -1312,15 +1314,68 @@ int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, > point = dma_fence_is_signaled(last_signaled) ? > last_signaled->seqno : > to_dma_fence_chain(last_signaled)->prev_seqno; > - dma_fence_put(last_signaled); > + dma_fence_put(last_signaled); > } else { > point = 0; > } > ret = copy_to_user(&points[i], &point, sizeof(uint64_t)); > ret = ret ? -EFAULT : 0; > if (ret) > + goto error; > + break; > + } > + > +error: > + drm_syncobj_array_free(syncobjs, args->count_handles); > + > + return ret; > +} Those changes above to drm_syncobj_query now look completely unrelated to me. > + > +int drm_syncobj_sideband_ioctl(struct drm_device *dev, void *data, > + struct drm_file *file_private) > +{ > + struct drm_syncobj_timeline_array *args = data; > + struct drm_syncobj **syncobjs; > + uint64_t __user *points = u64_to_user_ptr(args->points); > + uint32_t i; > + int ret; > + > + if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) > + return -EOPNOTSUPP; > + > + if (args->flags != DRM_SYNCOBJ_TIMELINE_ARRAY_GET_SIDEBAND_PAYLOAD && > + args->flags != DRM_SYNCOBJ_TIMELINE_ARRAY_SET_SIDEBAND_PAYLOAD) > + return -EINVAL; > + > + if (args->count_handles == 0) > + return -EINVAL; > + > + ret = drm_syncobj_array_find(file_private, > + u64_to_user_ptr(args->handles), > + args->count_handles, > + &syncobjs); > + if (ret < 0) > + return ret; > + > + for (i = 0; i < args->count_handles; i++) { > + switch (args->flags) { > + case DRM_SYNCOBJ_TIMELINE_ARRAY_GET_SIDEBAND_PAYLOAD: > + copy_to_user(&points[i], &syncobjs[i]->sideband_payload, sizeof(uint64_t)); > + ret = ret ? -EFAULT : 0; > + if (ret) > + goto error; > + break; > + > + case DRM_SYNCOBJ_TIMELINE_ARRAY_SET_SIDEBAND_PAYLOAD: > + copy_from_user(&syncobjs[i]->sideband_payload, &points[i], sizeof(uint64_t)); > + ret = ret ? -EFAULT : 0; > + if (ret) > + goto error; > break; > + } > } > + > +error: > drm_syncobj_array_free(syncobjs, args->count_handles); > > return ret; > diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h > index 6cf7243a1dc5..b587b8e07547 100644 > --- a/include/drm/drm_syncobj.h > +++ b/include/drm/drm_syncobj.h > @@ -61,6 +61,15 @@ struct drm_syncobj { > * @file: A file backing for this syncobj. > */ > struct file *file; > + /** > + * @sideband_payload: A 64bit side band payload. > + * > + * We use the sideband payload value to wait on binary syncobj fences > + * to materialize. It is a reservation mechanism for the signaler to > + * express that at some point in the future a dma fence with the same > + * seqno will be put into the syncobj. > + */ > + u64 sideband_payload; > }; > > void drm_syncobj_free(struct kref *kref); > diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h > index 8a5b2f8f8eb9..dfcbaf603aff 100644 > --- a/include/uapi/drm/drm.h > +++ b/include/uapi/drm/drm.h > @@ -782,7 +782,9 @@ struct drm_syncobj_timeline_array { > __u64 handles; > __u64 points; > __u32 count_handles; > - __u32 pad; > + __u32 flags; > +#define DRM_SYNCOBJ_TIMELINE_ARRAY_GET_SIDEBAND_PAYLOAD (0) > +#define DRM_SYNCOBJ_TIMELINE_ARRAY_SET_SIDEBAND_PAYLOAD (1) > }; > > > @@ -946,6 +948,7 @@ extern "C" { > #define DRM_IOCTL_SYNCOBJ_QUERY DRM_IOWR(0xCB, struct drm_syncobj_timeline_array) > #define DRM_IOCTL_SYNCOBJ_TRANSFER DRM_IOWR(0xCC, struct drm_syncobj_transfer) > #define DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL DRM_IOWR(0xCD, struct drm_syncobj_timeline_array) > +#define DRM_IOCTL_SYNCOBJ_SIDEBAND DRM_IOWR(0xCE, struct drm_syncobj_timeline_array) Maybe better add separate get/set IOCTLs, but just a suggestion. Regards, Christian. > > /** > * Device specific ioctls should only be in their respective headers
On 08/08/2019 12:32, Koenig, Christian wrote: > Am 08.08.19 um 11:04 schrieb Lionel Landwerlin: >> The Vulkan timeline semaphores allow signaling to happen on the point >> of the timeline without all of the its dependencies to be created. >> >> The current 2 implementations (AMD/Intel) of the Vulkan spec on top of >> the Linux kernel are using a thread to wait on the dependencies of a >> given point to materialize and delay actual submission to the kernel >> driver until the wait completes. >> >> If a binary semaphore is submitted for signaling along the side of a >> timeline semaphore waiting for completion that means that the drm >> syncobj associated with that binary semaphore will not have a DMA >> fence associated with it by the time vkQueueSubmit() returns. This and >> the fact that a binary semaphore can be signaled and unsignaled as >> before its DMA fences materialize mean that we cannot just rely on the >> fence within the syncobj but we also need a sideband payload verifying >> that the fence in the syncobj matches the last submission from the >> Vulkan API point of view. >> >> This change adds a sideband payload that is incremented with signaled >> syncobj when vkQueueSubmit() is called. The next vkQueueSubmit() >> waiting on a the syncobj will read the sideband payload and wait for a >> fence chain element with a seqno superior or equal to the sideband >> payload value to be added into the fence chain and use that fence to >> trigger the submission on the kernel driver. >> >> v2: Use a separate ioctl to get/set the sideband value (Christian) >> >> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> >> Cc: Christian Koenig <Christian.Koenig@amd.com> >> Cc: Jason Ekstrand <jason@jlekstrand.net> >> Cc: David(ChunMing) Zhou <David1.Zhou@amd.com> >> --- >> drivers/gpu/drm/drm_internal.h | 2 ++ >> drivers/gpu/drm/drm_ioctl.c | 3 ++ >> drivers/gpu/drm/drm_syncobj.c | 63 +++++++++++++++++++++++++++++++--- >> include/drm/drm_syncobj.h | 9 +++++ >> include/uapi/drm/drm.h | 5 ++- >> 5 files changed, 77 insertions(+), 5 deletions(-) >> >> diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h >> index 51a2055c8f18..de5267711a7c 100644 >> --- a/drivers/gpu/drm/drm_internal.h >> +++ b/drivers/gpu/drm/drm_internal.h >> @@ -208,6 +208,8 @@ int drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data, >> struct drm_file *file_private); >> int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, >> struct drm_file *file_private); >> +int drm_syncobj_sideband_ioctl(struct drm_device *dev, void *data, >> + struct drm_file *file_private); >> >> /* drm_framebuffer.c */ >> void drm_framebuffer_print_info(struct drm_printer *p, unsigned int indent, >> diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c >> index f675a3bb2c88..7fd99533f78f 100644 >> --- a/drivers/gpu/drm/drm_ioctl.c >> +++ b/drivers/gpu/drm/drm_ioctl.c >> @@ -703,6 +703,9 @@ static const struct drm_ioctl_desc drm_ioctls[] = { >> DRM_RENDER_ALLOW), >> DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl, >> DRM_RENDER_ALLOW), >> + DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIDEBAND, drm_syncobj_sideband_ioctl, >> + DRM_RENDER_ALLOW), >> + >> DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, 0), >> DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SEQUENCE, drm_crtc_queue_sequence_ioctl, 0), >> DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_LEASE, drm_mode_create_lease_ioctl, DRM_MASTER), >> diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c >> index b927e482e554..66f4f91baa6d 100644 >> --- a/drivers/gpu/drm/drm_syncobj.c >> +++ b/drivers/gpu/drm/drm_syncobj.c >> @@ -1150,8 +1150,10 @@ drm_syncobj_reset_ioctl(struct drm_device *dev, void *data, >> if (ret < 0) >> return ret; >> >> - for (i = 0; i < args->count_handles; i++) >> + for (i = 0; i < args->count_handles; i++) { >> drm_syncobj_replace_fence(syncobjs[i], NULL); >> + syncobjs[i]->sideband_payload = 0; >> + } >> >> drm_syncobj_array_free(syncobjs, args->count_handles); >> >> @@ -1205,7 +1207,7 @@ drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data, >> if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) >> return -EOPNOTSUPP; >> >> - if (args->pad != 0) >> + if (args->flags != 0) >> return -EINVAL; >> >> if (args->count_handles == 0) >> @@ -1276,7 +1278,7 @@ int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, >> if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) >> return -EOPNOTSUPP; >> >> - if (args->pad != 0) >> + if (args->flags != 0) >> return -EINVAL; >> >> if (args->count_handles == 0) >> @@ -1312,15 +1314,68 @@ int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, >> point = dma_fence_is_signaled(last_signaled) ? >> last_signaled->seqno : >> to_dma_fence_chain(last_signaled)->prev_seqno; >> - dma_fence_put(last_signaled); >> + dma_fence_put(last_signaled); Whoops, I need to drop this. >> } else { >> point = 0; >> } >> ret = copy_to_user(&points[i], &point, sizeof(uint64_t)); >> ret = ret ? -EFAULT : 0; >> if (ret) >> + goto error; >> + break; >> + } >> + >> +error: >> + drm_syncobj_array_free(syncobjs, args->count_handles); >> + >> + return ret; >> +} > Those changes above to drm_syncobj_query now look completely unrelated > to me. The diff is really weird indeed but apart from the mistake above it's right. > >> + >> +int drm_syncobj_sideband_ioctl(struct drm_device *dev, void *data, >> + struct drm_file *file_private) >> +{ >> + struct drm_syncobj_timeline_array *args = data; >> + struct drm_syncobj **syncobjs; >> + uint64_t __user *points = u64_to_user_ptr(args->points); >> + uint32_t i; >> + int ret; >> + >> + if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) >> + return -EOPNOTSUPP; >> + >> + if (args->flags != DRM_SYNCOBJ_TIMELINE_ARRAY_GET_SIDEBAND_PAYLOAD && >> + args->flags != DRM_SYNCOBJ_TIMELINE_ARRAY_SET_SIDEBAND_PAYLOAD) >> + return -EINVAL; >> + >> + if (args->count_handles == 0) >> + return -EINVAL; >> + >> + ret = drm_syncobj_array_find(file_private, >> + u64_to_user_ptr(args->handles), >> + args->count_handles, >> + &syncobjs); >> + if (ret < 0) >> + return ret; >> + >> + for (i = 0; i < args->count_handles; i++) { >> + switch (args->flags) { >> + case DRM_SYNCOBJ_TIMELINE_ARRAY_GET_SIDEBAND_PAYLOAD: >> + copy_to_user(&points[i], &syncobjs[i]->sideband_payload, sizeof(uint64_t)); >> + ret = ret ? -EFAULT : 0; >> + if (ret) >> + goto error; >> + break; >> + >> + case DRM_SYNCOBJ_TIMELINE_ARRAY_SET_SIDEBAND_PAYLOAD: >> + copy_from_user(&syncobjs[i]->sideband_payload, &points[i], sizeof(uint64_t)); >> + ret = ret ? -EFAULT : 0; >> + if (ret) >> + goto error; >> break; >> + } >> } >> + >> +error: >> drm_syncobj_array_free(syncobjs, args->count_handles); >> >> return ret; >> diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h >> index 6cf7243a1dc5..b587b8e07547 100644 >> --- a/include/drm/drm_syncobj.h >> +++ b/include/drm/drm_syncobj.h >> @@ -61,6 +61,15 @@ struct drm_syncobj { >> * @file: A file backing for this syncobj. >> */ >> struct file *file; >> + /** >> + * @sideband_payload: A 64bit side band payload. >> + * >> + * We use the sideband payload value to wait on binary syncobj fences >> + * to materialize. It is a reservation mechanism for the signaler to >> + * express that at some point in the future a dma fence with the same >> + * seqno will be put into the syncobj. >> + */ >> + u64 sideband_payload; >> }; >> >> void drm_syncobj_free(struct kref *kref); >> diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h >> index 8a5b2f8f8eb9..dfcbaf603aff 100644 >> --- a/include/uapi/drm/drm.h >> +++ b/include/uapi/drm/drm.h >> @@ -782,7 +782,9 @@ struct drm_syncobj_timeline_array { >> __u64 handles; >> __u64 points; >> __u32 count_handles; >> - __u32 pad; >> + __u32 flags; >> +#define DRM_SYNCOBJ_TIMELINE_ARRAY_GET_SIDEBAND_PAYLOAD (0) >> +#define DRM_SYNCOBJ_TIMELINE_ARRAY_SET_SIDEBAND_PAYLOAD (1) >> }; >> >> >> @@ -946,6 +948,7 @@ extern "C" { >> #define DRM_IOCTL_SYNCOBJ_QUERY DRM_IOWR(0xCB, struct drm_syncobj_timeline_array) >> #define DRM_IOCTL_SYNCOBJ_TRANSFER DRM_IOWR(0xCC, struct drm_syncobj_transfer) >> #define DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL DRM_IOWR(0xCD, struct drm_syncobj_timeline_array) >> +#define DRM_IOCTL_SYNCOBJ_SIDEBAND DRM_IOWR(0xCE, struct drm_syncobj_timeline_array) > Maybe better add separate get/set IOCTLs, but just a suggestion. > > Regards, > Christian. Alright, I was worried about polluting the ioctl IDs. Did the new tests on the CTS helped reproduce on your side the issue this is solving? Thanks, -Lionel > >> >> /** >> * Device specific ioctls should only be in their respective headers
On 08/08/2019 12:42, Lionel Landwerlin wrote: > On 08/08/2019 12:32, Koenig, Christian wrote: >> Am 08.08.19 um 11:04 schrieb Lionel Landwerlin: >>> The Vulkan timeline semaphores allow signaling to happen on the point >>> of the timeline without all of the its dependencies to be created. >>> >>> The current 2 implementations (AMD/Intel) of the Vulkan spec on top of >>> the Linux kernel are using a thread to wait on the dependencies of a >>> given point to materialize and delay actual submission to the kernel >>> driver until the wait completes. >>> >>> If a binary semaphore is submitted for signaling along the side of a >>> timeline semaphore waiting for completion that means that the drm >>> syncobj associated with that binary semaphore will not have a DMA >>> fence associated with it by the time vkQueueSubmit() returns. This and >>> the fact that a binary semaphore can be signaled and unsignaled as >>> before its DMA fences materialize mean that we cannot just rely on the >>> fence within the syncobj but we also need a sideband payload verifying >>> that the fence in the syncobj matches the last submission from the >>> Vulkan API point of view. >>> >>> This change adds a sideband payload that is incremented with signaled >>> syncobj when vkQueueSubmit() is called. The next vkQueueSubmit() >>> waiting on a the syncobj will read the sideband payload and wait for a >>> fence chain element with a seqno superior or equal to the sideband >>> payload value to be added into the fence chain and use that fence to >>> trigger the submission on the kernel driver. >>> >>> v2: Use a separate ioctl to get/set the sideband value (Christian) >>> >>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> >>> Cc: Christian Koenig <Christian.Koenig@amd.com> >>> Cc: Jason Ekstrand <jason@jlekstrand.net> >>> Cc: David(ChunMing) Zhou <David1.Zhou@amd.com> >>> --- >>> drivers/gpu/drm/drm_internal.h | 2 ++ >>> drivers/gpu/drm/drm_ioctl.c | 3 ++ >>> drivers/gpu/drm/drm_syncobj.c | 63 >>> +++++++++++++++++++++++++++++++--- >>> include/drm/drm_syncobj.h | 9 +++++ >>> include/uapi/drm/drm.h | 5 ++- >>> 5 files changed, 77 insertions(+), 5 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/drm_internal.h >>> b/drivers/gpu/drm/drm_internal.h >>> index 51a2055c8f18..de5267711a7c 100644 >>> --- a/drivers/gpu/drm/drm_internal.h >>> +++ b/drivers/gpu/drm/drm_internal.h >>> @@ -208,6 +208,8 @@ int drm_syncobj_timeline_signal_ioctl(struct >>> drm_device *dev, void *data, >>> struct drm_file *file_private); >>> int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, >>> struct drm_file *file_private); >>> +int drm_syncobj_sideband_ioctl(struct drm_device *dev, void *data, >>> + struct drm_file *file_private); >>> /* drm_framebuffer.c */ >>> void drm_framebuffer_print_info(struct drm_printer *p, unsigned >>> int indent, >>> diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c >>> index f675a3bb2c88..7fd99533f78f 100644 >>> --- a/drivers/gpu/drm/drm_ioctl.c >>> +++ b/drivers/gpu/drm/drm_ioctl.c >>> @@ -703,6 +703,9 @@ static const struct drm_ioctl_desc drm_ioctls[] = { >>> DRM_RENDER_ALLOW), >>> DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl, >>> DRM_RENDER_ALLOW), >>> + DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIDEBAND, >>> drm_syncobj_sideband_ioctl, >>> + DRM_RENDER_ALLOW), >>> + >>> DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, >>> drm_crtc_get_sequence_ioctl, 0), >>> DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SEQUENCE, >>> drm_crtc_queue_sequence_ioctl, 0), >>> DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_LEASE, >>> drm_mode_create_lease_ioctl, DRM_MASTER), >>> diff --git a/drivers/gpu/drm/drm_syncobj.c >>> b/drivers/gpu/drm/drm_syncobj.c >>> index b927e482e554..66f4f91baa6d 100644 >>> --- a/drivers/gpu/drm/drm_syncobj.c >>> +++ b/drivers/gpu/drm/drm_syncobj.c >>> @@ -1150,8 +1150,10 @@ drm_syncobj_reset_ioctl(struct drm_device >>> *dev, void *data, >>> if (ret < 0) >>> return ret; >>> - for (i = 0; i < args->count_handles; i++) >>> + for (i = 0; i < args->count_handles; i++) { >>> drm_syncobj_replace_fence(syncobjs[i], NULL); >>> + syncobjs[i]->sideband_payload = 0; >>> + } >>> drm_syncobj_array_free(syncobjs, args->count_handles); >>> @@ -1205,7 +1207,7 @@ drm_syncobj_timeline_signal_ioctl(struct >>> drm_device *dev, void *data, >>> if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) >>> return -EOPNOTSUPP; >>> - if (args->pad != 0) >>> + if (args->flags != 0) >>> return -EINVAL; >>> if (args->count_handles == 0) >>> @@ -1276,7 +1278,7 @@ int drm_syncobj_query_ioctl(struct drm_device >>> *dev, void *data, >>> if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) >>> return -EOPNOTSUPP; >>> - if (args->pad != 0) >>> + if (args->flags != 0) >>> return -EINVAL; >>> if (args->count_handles == 0) >>> @@ -1312,15 +1314,68 @@ int drm_syncobj_query_ioctl(struct >>> drm_device *dev, void *data, >>> point = dma_fence_is_signaled(last_signaled) ? >>> last_signaled->seqno : >>> to_dma_fence_chain(last_signaled)->prev_seqno; >>> - dma_fence_put(last_signaled); >>> + dma_fence_put(last_signaled); > > > Whoops, I need to drop this. > > >>> } else { >>> point = 0; >>> } >>> ret = copy_to_user(&points[i], &point, sizeof(uint64_t)); >>> ret = ret ? -EFAULT : 0; >>> if (ret) >>> + goto error; >>> + break; >>> + } >>> + >>> +error: >>> + drm_syncobj_array_free(syncobjs, args->count_handles); >>> + >>> + return ret; >>> +} >> Those changes above to drm_syncobj_query now look completely unrelated >> to me. > > > The diff is really weird indeed but apart from the mistake above it's > right. Arg no, it was wrong :( > > >> >>> + >>> +int drm_syncobj_sideband_ioctl(struct drm_device *dev, void *data, >>> + struct drm_file *file_private) >>> +{ >>> + struct drm_syncobj_timeline_array *args = data; >>> + struct drm_syncobj **syncobjs; >>> + uint64_t __user *points = u64_to_user_ptr(args->points); >>> + uint32_t i; >>> + int ret; >>> + >>> + if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) >>> + return -EOPNOTSUPP; >>> + >>> + if (args->flags != >>> DRM_SYNCOBJ_TIMELINE_ARRAY_GET_SIDEBAND_PAYLOAD && >>> + args->flags != >>> DRM_SYNCOBJ_TIMELINE_ARRAY_SET_SIDEBAND_PAYLOAD) >>> + return -EINVAL; >>> + >>> + if (args->count_handles == 0) >>> + return -EINVAL; >>> + >>> + ret = drm_syncobj_array_find(file_private, >>> + u64_to_user_ptr(args->handles), >>> + args->count_handles, >>> + &syncobjs); >>> + if (ret < 0) >>> + return ret; >>> + >>> + for (i = 0; i < args->count_handles; i++) { >>> + switch (args->flags) { >>> + case DRM_SYNCOBJ_TIMELINE_ARRAY_GET_SIDEBAND_PAYLOAD: >>> + copy_to_user(&points[i], >>> &syncobjs[i]->sideband_payload, sizeof(uint64_t)); >>> + ret = ret ? -EFAULT : 0; >>> + if (ret) >>> + goto error; >>> + break; >>> + >>> + case DRM_SYNCOBJ_TIMELINE_ARRAY_SET_SIDEBAND_PAYLOAD: >>> + copy_from_user(&syncobjs[i]->sideband_payload, &points[i], >>> sizeof(uint64_t)); >>> + ret = ret ? -EFAULT : 0; >>> + if (ret) >>> + goto error; >>> break; >>> + } >>> } >>> + >>> +error: >>> drm_syncobj_array_free(syncobjs, args->count_handles); >>> return ret; >>> diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h >>> index 6cf7243a1dc5..b587b8e07547 100644 >>> --- a/include/drm/drm_syncobj.h >>> +++ b/include/drm/drm_syncobj.h >>> @@ -61,6 +61,15 @@ struct drm_syncobj { >>> * @file: A file backing for this syncobj. >>> */ >>> struct file *file; >>> + /** >>> + * @sideband_payload: A 64bit side band payload. >>> + * >>> + * We use the sideband payload value to wait on binary syncobj >>> fences >>> + * to materialize. It is a reservation mechanism for the >>> signaler to >>> + * express that at some point in the future a dma fence with >>> the same >>> + * seqno will be put into the syncobj. >>> + */ >>> + u64 sideband_payload; >>> }; >>> void drm_syncobj_free(struct kref *kref); >>> diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h >>> index 8a5b2f8f8eb9..dfcbaf603aff 100644 >>> --- a/include/uapi/drm/drm.h >>> +++ b/include/uapi/drm/drm.h >>> @@ -782,7 +782,9 @@ struct drm_syncobj_timeline_array { >>> __u64 handles; >>> __u64 points; >>> __u32 count_handles; >>> - __u32 pad; >>> + __u32 flags; >>> +#define DRM_SYNCOBJ_TIMELINE_ARRAY_GET_SIDEBAND_PAYLOAD (0) >>> +#define DRM_SYNCOBJ_TIMELINE_ARRAY_SET_SIDEBAND_PAYLOAD (1) >>> }; >>> @@ -946,6 +948,7 @@ extern "C" { >>> #define DRM_IOCTL_SYNCOBJ_QUERY DRM_IOWR(0xCB, struct >>> drm_syncobj_timeline_array) >>> #define DRM_IOCTL_SYNCOBJ_TRANSFER DRM_IOWR(0xCC, struct >>> drm_syncobj_transfer) >>> #define DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL DRM_IOWR(0xCD, >>> struct drm_syncobj_timeline_array) >>> +#define DRM_IOCTL_SYNCOBJ_SIDEBAND DRM_IOWR(0xCE, struct >>> drm_syncobj_timeline_array) >> Maybe better add separate get/set IOCTLs, but just a suggestion. >> >> Regards, >> Christian. > > > Alright, I was worried about polluting the ioctl IDs. > > > Did the new tests on the CTS helped reproduce on your side the issue > this is solving? > > > Thanks, > > > -Lionel > > >> >>> /** >>> * Device specific ioctls should only be in their respective headers > > > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel
diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h index 51a2055c8f18..de5267711a7c 100644 --- a/drivers/gpu/drm/drm_internal.h +++ b/drivers/gpu/drm/drm_internal.h @@ -208,6 +208,8 @@ int drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); +int drm_syncobj_sideband_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private); /* drm_framebuffer.c */ void drm_framebuffer_print_info(struct drm_printer *p, unsigned int indent, diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index f675a3bb2c88..7fd99533f78f 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -703,6 +703,9 @@ static const struct drm_ioctl_desc drm_ioctls[] = { DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl, DRM_RENDER_ALLOW), + DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIDEBAND, drm_syncobj_sideband_ioctl, + DRM_RENDER_ALLOW), + DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, 0), DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SEQUENCE, drm_crtc_queue_sequence_ioctl, 0), DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_LEASE, drm_mode_create_lease_ioctl, DRM_MASTER), diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index b927e482e554..66f4f91baa6d 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -1150,8 +1150,10 @@ drm_syncobj_reset_ioctl(struct drm_device *dev, void *data, if (ret < 0) return ret; - for (i = 0; i < args->count_handles; i++) + for (i = 0; i < args->count_handles; i++) { drm_syncobj_replace_fence(syncobjs[i], NULL); + syncobjs[i]->sideband_payload = 0; + } drm_syncobj_array_free(syncobjs, args->count_handles); @@ -1205,7 +1207,7 @@ drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data, if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) return -EOPNOTSUPP; - if (args->pad != 0) + if (args->flags != 0) return -EINVAL; if (args->count_handles == 0) @@ -1276,7 +1278,7 @@ int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) return -EOPNOTSUPP; - if (args->pad != 0) + if (args->flags != 0) return -EINVAL; if (args->count_handles == 0) @@ -1312,15 +1314,68 @@ int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, point = dma_fence_is_signaled(last_signaled) ? last_signaled->seqno : to_dma_fence_chain(last_signaled)->prev_seqno; - dma_fence_put(last_signaled); + dma_fence_put(last_signaled); } else { point = 0; } ret = copy_to_user(&points[i], &point, sizeof(uint64_t)); ret = ret ? -EFAULT : 0; if (ret) + goto error; + break; + } + +error: + drm_syncobj_array_free(syncobjs, args->count_handles); + + return ret; +} + +int drm_syncobj_sideband_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private) +{ + struct drm_syncobj_timeline_array *args = data; + struct drm_syncobj **syncobjs; + uint64_t __user *points = u64_to_user_ptr(args->points); + uint32_t i; + int ret; + + if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) + return -EOPNOTSUPP; + + if (args->flags != DRM_SYNCOBJ_TIMELINE_ARRAY_GET_SIDEBAND_PAYLOAD && + args->flags != DRM_SYNCOBJ_TIMELINE_ARRAY_SET_SIDEBAND_PAYLOAD) + return -EINVAL; + + if (args->count_handles == 0) + return -EINVAL; + + ret = drm_syncobj_array_find(file_private, + u64_to_user_ptr(args->handles), + args->count_handles, + &syncobjs); + if (ret < 0) + return ret; + + for (i = 0; i < args->count_handles; i++) { + switch (args->flags) { + case DRM_SYNCOBJ_TIMELINE_ARRAY_GET_SIDEBAND_PAYLOAD: + copy_to_user(&points[i], &syncobjs[i]->sideband_payload, sizeof(uint64_t)); + ret = ret ? -EFAULT : 0; + if (ret) + goto error; + break; + + case DRM_SYNCOBJ_TIMELINE_ARRAY_SET_SIDEBAND_PAYLOAD: + copy_from_user(&syncobjs[i]->sideband_payload, &points[i], sizeof(uint64_t)); + ret = ret ? -EFAULT : 0; + if (ret) + goto error; break; + } } + +error: drm_syncobj_array_free(syncobjs, args->count_handles); return ret; diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h index 6cf7243a1dc5..b587b8e07547 100644 --- a/include/drm/drm_syncobj.h +++ b/include/drm/drm_syncobj.h @@ -61,6 +61,15 @@ struct drm_syncobj { * @file: A file backing for this syncobj. */ struct file *file; + /** + * @sideband_payload: A 64bit side band payload. + * + * We use the sideband payload value to wait on binary syncobj fences + * to materialize. It is a reservation mechanism for the signaler to + * express that at some point in the future a dma fence with the same + * seqno will be put into the syncobj. + */ + u64 sideband_payload; }; void drm_syncobj_free(struct kref *kref); diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h index 8a5b2f8f8eb9..dfcbaf603aff 100644 --- a/include/uapi/drm/drm.h +++ b/include/uapi/drm/drm.h @@ -782,7 +782,9 @@ struct drm_syncobj_timeline_array { __u64 handles; __u64 points; __u32 count_handles; - __u32 pad; + __u32 flags; +#define DRM_SYNCOBJ_TIMELINE_ARRAY_GET_SIDEBAND_PAYLOAD (0) +#define DRM_SYNCOBJ_TIMELINE_ARRAY_SET_SIDEBAND_PAYLOAD (1) }; @@ -946,6 +948,7 @@ extern "C" { #define DRM_IOCTL_SYNCOBJ_QUERY DRM_IOWR(0xCB, struct drm_syncobj_timeline_array) #define DRM_IOCTL_SYNCOBJ_TRANSFER DRM_IOWR(0xCC, struct drm_syncobj_transfer) #define DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL DRM_IOWR(0xCD, struct drm_syncobj_timeline_array) +#define DRM_IOCTL_SYNCOBJ_SIDEBAND DRM_IOWR(0xCE, struct drm_syncobj_timeline_array) /** * Device specific ioctls should only be in their respective headers
The Vulkan timeline semaphores allow signaling to happen on the point of the timeline without all of the its dependencies to be created. The current 2 implementations (AMD/Intel) of the Vulkan spec on top of the Linux kernel are using a thread to wait on the dependencies of a given point to materialize and delay actual submission to the kernel driver until the wait completes. If a binary semaphore is submitted for signaling along the side of a timeline semaphore waiting for completion that means that the drm syncobj associated with that binary semaphore will not have a DMA fence associated with it by the time vkQueueSubmit() returns. This and the fact that a binary semaphore can be signaled and unsignaled as before its DMA fences materialize mean that we cannot just rely on the fence within the syncobj but we also need a sideband payload verifying that the fence in the syncobj matches the last submission from the Vulkan API point of view. This change adds a sideband payload that is incremented with signaled syncobj when vkQueueSubmit() is called. The next vkQueueSubmit() waiting on a the syncobj will read the sideband payload and wait for a fence chain element with a seqno superior or equal to the sideband payload value to be added into the fence chain and use that fence to trigger the submission on the kernel driver. v2: Use a separate ioctl to get/set the sideband value (Christian) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Christian Koenig <Christian.Koenig@amd.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: David(ChunMing) Zhou <David1.Zhou@amd.com> --- drivers/gpu/drm/drm_internal.h | 2 ++ drivers/gpu/drm/drm_ioctl.c | 3 ++ drivers/gpu/drm/drm_syncobj.c | 63 +++++++++++++++++++++++++++++++--- include/drm/drm_syncobj.h | 9 +++++ include/uapi/drm/drm.h | 5 ++- 5 files changed, 77 insertions(+), 5 deletions(-)