Message ID | 20220824150834.427572-4-contact@emersion.fr (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Add support for atomic async page-flips | expand |
On Wed, Aug 24, 2022 at 03:08:55PM +0000, Simon Ser wrote: > This new kernel capability indicates whether async page-flips are > supported via the atomic uAPI. DRM clients can use it to check > for support before feeding DRM_MODE_PAGE_FLIP_ASYNC to the kernel. I think we'd need to clarify the semantics of the async flag for atomic commits. Eg. on Intel hw only pure page flips are possible async, if you do anything else (change plane size/pos/scaling/etc.) you will need to do a sync update. Technically not even all page flips (from the uapi POV) might be possible as the exact scanout source address is specified via two registers, only one of which can be update async. So technically the two framebuffers might be laid out just slightly differently which could prevent an async flip. Also only some subset of planes actually support async flips. And on hw where multiple planes support it on the same crtc, only one plane can do it at a time. Well, more accurately we can only select one plane at a time to give us the "flip done" interrupt. I guess if the user wants to async flip multiple planes at the same time we could do them serially as opposed to in parallel to make sure all the flips actually happened before we signal completion of the entire commit. Async flips of multiple planes probably won't happen atomically anyway so doing them serially seems fine. ATM in i915 we probably don't have sufficient state checks in place to catch all the restrictions, and instead in part we rely on the limited scope of the legacy async flip ioctl to make sure the operation doesn't attempt something the hw can't do. > Make it clear that DRM_CAP_ASYNC_PAGE_FLIP is for legacy uAPI only. > > Signed-off-by: Simon Ser <contact@emersion.fr> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch> > Cc: Joshua Ashton <joshua@froggi.es> > Cc: Melissa Wen <mwen@igalia.com> > Cc: Alex Deucher <alexander.deucher@amd.com> > Cc: Harry Wentland <hwentlan@amd.com> > Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> > --- > drivers/gpu/drm/drm_ioctl.c | 5 +++++ > include/uapi/drm/drm.h | 10 +++++++++- > 2 files changed, 14 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c > index ca2a6e6101dc..5b1591e2b46c 100644 > --- a/drivers/gpu/drm/drm_ioctl.c > +++ b/drivers/gpu/drm/drm_ioctl.c > @@ -302,6 +302,11 @@ static int drm_getcap(struct drm_device *dev, void *data, struct drm_file *file_ > case DRM_CAP_CRTC_IN_VBLANK_EVENT: > req->value = 1; > break; > + case DRM_CAP_ATOMIC_ASYNC_PAGE_FLIP: > + req->value = drm_core_check_feature(dev, DRIVER_ATOMIC) && > + dev->mode_config.async_page_flip && > + !dev->mode_config.atomic_async_page_flip_not_supported; > + break; > default: > return -EINVAL; > } > diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h > index 642808520d92..b1962628ecda 100644 > --- a/include/uapi/drm/drm.h > +++ b/include/uapi/drm/drm.h > @@ -706,7 +706,8 @@ struct drm_gem_open { > /** > * DRM_CAP_ASYNC_PAGE_FLIP > * > - * If set to 1, the driver supports &DRM_MODE_PAGE_FLIP_ASYNC. > + * If set to 1, the driver supports &DRM_MODE_PAGE_FLIP_ASYNC for legacy > + * page-flips. > */ > #define DRM_CAP_ASYNC_PAGE_FLIP 0x7 > /** > @@ -767,6 +768,13 @@ struct drm_gem_open { > * Documentation/gpu/drm-mm.rst, section "DRM Sync Objects". > */ > #define DRM_CAP_SYNCOBJ_TIMELINE 0x14 > +/** > + * DRM_CAP_ATOMIC_ASYNC_PAGE_FLIP > + * > + * If set to 1, the driver supports &DRM_MODE_PAGE_FLIP_ASYNC for atomic > + * commits. > + */ > +#define DRM_CAP_ATOMIC_ASYNC_PAGE_FLIP 0x15 > > /* DRM_IOCTL_GET_CAP ioctl argument type */ > struct drm_get_cap { > -- > 2.37.2 >
On Friday, August 26th, 2022 at 10:19, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: > On Wed, Aug 24, 2022 at 03:08:55PM +0000, Simon Ser wrote: > > This new kernel capability indicates whether async page-flips are > > supported via the atomic uAPI. DRM clients can use it to check > > for support before feeding DRM_MODE_PAGE_FLIP_ASYNC to the kernel. > > I think we'd need to clarify the semantics of the async flag > for atomic commits. > > Eg. on Intel hw only pure page flips are possible async, if you do > anything else (change plane size/pos/scaling/etc.) you will need > to do a sync update. Technically not even all page flips (from the > uapi POV) might be possible as the exact scanout source address > is specified via two registers, only one of which can be update > async. So technically the two framebuffers might be laid out > just slightly differently which could prevent an async flip. > Also only some subset of planes actually support async flips. Also IIRC some format modifiers don't support async flip at all (CCS)? > And on hw where multiple planes support it on the same crtc, only one > plane can do it at a time. Well, more accurately we can only select > one plane at a time to give us the "flip done" interrupt. I guess > if the user wants to async flip multiple planes at the same time > we could do them serially as opposed to in parallel to make sure > all the flips actually happened before we signal completion of the > entire commit. Async flips of multiple planes probably won't > happen atomically anyway so doing them serially seems fine. > > ATM in i915 we probably don't have sufficient state checks in > place to catch all the restrictions, and instead in part we rely > on the limited scope of the legacy async flip ioctl to make sure > the operation doesn't attempt something the hw can't do. Yeah, that makes sense. In the documentation patch discussion [1], it appears it's not clear what drivers should do when async flip isn't possible with the legacy uAPI. For the atomic uAPI, we need to pick on of these two approaches: 1. Let the kernel fall back to a sync flip if async isn't possible. This simplifies user-space, but then user-space has no reliable way to figure out what really happened (sync or async?). That could be fixed with a new read-only CRTC prop indicating whether the last flip was async or not. However, maybe someone will come up in the future with user-space which needs to only apply an update if async flip is possible, in which case this approach falls short. 2. Make the kernel return EINVAL if async flip isn't possible. This adds more complexity to user-space, but enables a more reliable and deterministic uAPI. This is also more consistent with the rest of the existing atomic uAPI. Note, current user-space would only need to opportunistically enable async flip. IOW, I think that for current user-space plans "async if possible, otherwise sync" is good enough. That behavior maps well to the Vulkan present modes as well (which says that "this mode *may* result in visible tearing", but doesn't guarantee it). Another possible shortcoming of the proposed new uAPI here is that user-space cannot submit a single atomic commit which updates multiple CRTCs, and individually select which CRTC does an async flip. This could be fixed with a "ASYNC_FLIP" CRTC prop which the kernel always resets to 0 on commit. I'm not sure we want/need to cross that bridge right now, it would be easy enough to add as a second step if some user-space would benefit from it. What do you think? [1]: https://lore.kernel.org/dri-devel/ASSNOUe9wtsXskZjVlf1X4pl53T7pVE0MfEzkQ_h4cX0tjnF7e3cxpwGpRNPudmIHoRuW4kz_v1AeTpXgouLpTYcI8q-lPTzc1YMLR8JiJM=@emersion.fr/
On Mon, Aug 29, 2022 at 04:01:44PM +0000, Simon Ser wrote: > On Friday, August 26th, 2022 at 10:19, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: > > > On Wed, Aug 24, 2022 at 03:08:55PM +0000, Simon Ser wrote: > > > This new kernel capability indicates whether async page-flips are > > > supported via the atomic uAPI. DRM clients can use it to check > > > for support before feeding DRM_MODE_PAGE_FLIP_ASYNC to the kernel. > > > > I think we'd need to clarify the semantics of the async flag > > for atomic commits. > > > > Eg. on Intel hw only pure page flips are possible async, if you do > > anything else (change plane size/pos/scaling/etc.) you will need > > to do a sync update. Technically not even all page flips (from the > > uapi POV) might be possible as the exact scanout source address > > is specified via two registers, only one of which can be update > > async. So technically the two framebuffers might be laid out > > just slightly differently which could prevent an async flip. > > Also only some subset of planes actually support async flips. > > Also IIRC some format modifiers don't support async flip at all (CCS)? Yeah, that too. Also planar YUV formats aren't allowed. > > > And on hw where multiple planes support it on the same crtc, only one > > plane can do it at a time. Well, more accurately we can only select > > one plane at a time to give us the "flip done" interrupt. I guess > > if the user wants to async flip multiple planes at the same time > > we could do them serially as opposed to in parallel to make sure > > all the flips actually happened before we signal completion of the > > entire commit. Async flips of multiple planes probably won't > > happen atomically anyway so doing them serially seems fine. > > > > ATM in i915 we probably don't have sufficient state checks in > > place to catch all the restrictions, and instead in part we rely > > on the limited scope of the legacy async flip ioctl to make sure > > the operation doesn't attempt something the hw can't do. > > Yeah, that makes sense. > > In the documentation patch discussion [1], it appears it's not clear what > drivers should do when async flip isn't possible with the legacy uAPI. > > For the atomic uAPI, we need to pick on of these two approaches: > > 1. Let the kernel fall back to a sync flip if async isn't possible. This > simplifies user-space, but then user-space has no reliable way to figure out > what really happened (sync or async?). That could be fixed with a new > read-only CRTC prop indicating whether the last flip was async or not. > However, maybe someone will come up in the future with user-space which > needs to only apply an update if async flip is possible, in which case this > approach falls short. > 2. Make the kernel return EINVAL if async flip isn't possible. This adds more > complexity to user-space, but enables a more reliable and deterministic > uAPI. This is also more consistent with the rest of the existing atomic > uAPI. The current behaviour is somewhat a combination of the two. We return an error if async flip is not possible at all given the current state. When async flip is possible we return success, but may still internally fall back to a sync flip for the first flip. That is required on some borked hardware that can't switch from sync flips to async flips without doing an extra sync flip. Also on some other hardware we intentionally fall back to a sync flip for the first async flip, so that we can reprogram some display FIFO stuff (aimed to make the subsequent async flips faster). > > Note, current user-space would only need to opportunistically enable async > flip. IOW, I think that for current user-space plans "async if possible, > otherwise sync" is good enough. That behavior maps well to the Vulkan present > modes as well (which says that "this mode *may* result in visible tearing", but > doesn't guarantee it). The current behaviour is to fall back to a blit if the async flip fails. So you still get the same effective behaviour, just not as efficient. I think that's a reasonable way to handle it. > > Another possible shortcoming of the proposed new uAPI here is that user-space > cannot submit a single atomic commit which updates multiple CRTCs, and > individually select which CRTC does an async flip. This could be fixed with > a "ASYNC_FLIP" CRTC prop which the kernel always resets to 0 on commit. I'm not > sure we want/need to cross that bridge right now, it would be easy enough to > add as a second step if some user-space would benefit from it. Technically it should really be per-plane since that is what does the flip. But I have a feeling that allowing a mix of async and sync in the same commit is just going to make everything more complicated without really helping anything (async flips won't happen atomically anyway with anything else). One (crazy?) idea I had for the atomic api is that we could even reject most of the properties already on the uapi level before anyone gets to examine the final state. Ie. as soon as the atomic ioctl sees eg. a gamma LUT property change it would just immediately return an error if async flip is also requested. > > What do you think? > > [1]: https://lore.kernel.org/dri-devel/ASSNOUe9wtsXskZjVlf1X4pl53T7pVE0MfEzkQ_h4cX0tjnF7e3cxpwGpRNPudmIHoRuW4kz_v1AeTpXgouLpTYcI8q-lPTzc1YMLR8JiJM=@emersion.fr/
On Tue, 30 Aug 2022 11:08:22 +0300 Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: > On Mon, Aug 29, 2022 at 04:01:44PM +0000, Simon Ser wrote: > > On Friday, August 26th, 2022 at 10:19, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: > > > > > On Wed, Aug 24, 2022 at 03:08:55PM +0000, Simon Ser wrote: > > > > This new kernel capability indicates whether async page-flips are > > > > supported via the atomic uAPI. DRM clients can use it to check > > > > for support before feeding DRM_MODE_PAGE_FLIP_ASYNC to the kernel. > > > > > > I think we'd need to clarify the semantics of the async flag > > > for atomic commits. > > > > > > Eg. on Intel hw only pure page flips are possible async, if you do > > > anything else (change plane size/pos/scaling/etc.) you will need > > > to do a sync update. Technically not even all page flips (from the > > > uapi POV) might be possible as the exact scanout source address > > > is specified via two registers, only one of which can be update > > > async. So technically the two framebuffers might be laid out > > > just slightly differently which could prevent an async flip. > > > Also only some subset of planes actually support async flips. > > > > Also IIRC some format modifiers don't support async flip at all (CCS)? > > Yeah, that too. Also planar YUV formats aren't allowed. > > > > > > And on hw where multiple planes support it on the same crtc, only one > > > plane can do it at a time. Well, more accurately we can only select > > > one plane at a time to give us the "flip done" interrupt. I guess > > > if the user wants to async flip multiple planes at the same time > > > we could do them serially as opposed to in parallel to make sure > > > all the flips actually happened before we signal completion of the > > > entire commit. Async flips of multiple planes probably won't > > > happen atomically anyway so doing them serially seems fine. > > > > > > ATM in i915 we probably don't have sufficient state checks in > > > place to catch all the restrictions, and instead in part we rely > > > on the limited scope of the legacy async flip ioctl to make sure > > > the operation doesn't attempt something the hw can't do. > > > > Yeah, that makes sense. > > > > In the documentation patch discussion [1], it appears it's not clear what > > drivers should do when async flip isn't possible with the legacy uAPI. > > > > For the atomic uAPI, we need to pick on of these two approaches: > > > > 1. Let the kernel fall back to a sync flip if async isn't possible. This > > simplifies user-space, but then user-space has no reliable way to figure out > > what really happened (sync or async?). That could be fixed with a new > > read-only CRTC prop indicating whether the last flip was async or not. > > However, maybe someone will come up in the future with user-space which > > needs to only apply an update if async flip is possible, in which case this > > approach falls short. There is the pageflip completion timestamp in the DRM event. If userspace knows the phase and period of the scanout cycle, the completion timestamp should tell quite reliably if the update was tearing. For the phase, one can query KMS for the last vblank timestamp. This should work also for VRR I assume. For the period, fixed-frequency video mode has it straight. VRR gives only a range or a minimum period. > > 2. Make the kernel return EINVAL if async flip isn't possible. This adds more > > complexity to user-space, but enables a more reliable and deterministic > > uAPI. This is also more consistent with the rest of the existing atomic > > uAPI. > > The current behaviour is somewhat a combination of the two. We return > an error if async flip is not possible at all given the current state. > > When async flip is possible we return success, but may still internally > fall back to a sync flip for the first flip. That is required on some > borked hardware that can't switch from sync flips to async flips without > doing an extra sync flip. Also on some other hardware we intentionally > fall back to a sync flip for the first async flip, so that we can > reprogram some display FIFO stuff (aimed to make the subsequent async > flips faster). Oh, so userspace should expect to run async for long periods of time, and not use async this frame, sync next, then async again depending on per-frame timings. That seems important to note. It's almost like the async flag should be a KMS property instead of a commit ioctl flag. > > Note, current user-space would only need to opportunistically enable async > > flip. IOW, I think that for current user-space plans "async if possible, > > otherwise sync" is good enough. That behavior maps well to the Vulkan present > > modes as well (which says that "this mode *may* result in visible tearing", but > > doesn't guarantee it). > > The current behaviour is to fall back to a blit if the async > flip fails. So you still get the same effective behaviour, just > not as efficient. I think that's a reasonable way to handle it. That's purely an Xorg thing, right? Should Wayland compositors implement the same thing is a good question. > > Another possible shortcoming of the proposed new uAPI here is that user-space > > cannot submit a single atomic commit which updates multiple CRTCs, and > > individually select which CRTC does an async flip. This could be fixed with I would think that you can just do per-CRTC atomic commits in that case. You would do per-CRTC atomic commits anyway when the vblanks do not coincide. I expect tearing updates to have unpredictable latency, especially if they can silently fall back to sync flips, so doing multi-CRTC async flips is not useful. > > a "ASYNC_FLIP" CRTC prop which the kernel always resets to 0 on commit. I'm not > > sure we want/need to cross that bridge right now, it would be easy enough to > > add as a second step if some user-space would benefit from it. > > Technically it should really be per-plane since that is what does > the flip. But I have a feeling that allowing a mix of async and > sync in the same commit is just going to make everything more > complicated without really helping anything (async flips won't > happen atomically anyway with anything else). > > One (crazy?) idea I had for the atomic api is that we could even > reject most of the properties already on the uapi level before anyone > gets to examine the final state. Ie. as soon as the atomic ioctl sees > eg. a gamma LUT property change it would just immediately return > an error if async flip is also requested. I agree with these two paragraphs. What about limiting async flag to atomic commits that update only a single KMS plane (regardless of how many planes are active)? Maybe that would be a good first step? > > > > > What do you think? > > > > [1]: https://lore.kernel.org/dri-devel/ASSNOUe9wtsXskZjVlf1X4pl53T7pVE0MfEzkQ_h4cX0tjnF7e3cxpwGpRNPudmIHoRuW4kz_v1AeTpXgouLpTYcI8q-lPTzc1YMLR8JiJM=@emersion.fr/ > Thanks, pq
On 2022-08-29 18:01, Simon Ser wrote: > On Friday, August 26th, 2022 at 10:19, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: >> On Wed, Aug 24, 2022 at 03:08:55PM +0000, Simon Ser wrote: >>> This new kernel capability indicates whether async page-flips are >>> supported via the atomic uAPI. DRM clients can use it to check >>> for support before feeding DRM_MODE_PAGE_FLIP_ASYNC to the kernel. >> >> I think we'd need to clarify the semantics of the async flag >> for atomic commits. >> >> Eg. on Intel hw only pure page flips are possible async, if you do >> anything else (change plane size/pos/scaling/etc.) you will need >> to do a sync update. Technically not even all page flips (from the >> uapi POV) might be possible as the exact scanout source address >> is specified via two registers, only one of which can be update >> async. So technically the two framebuffers might be laid out >> just slightly differently which could prevent an async flip. >> Also only some subset of planes actually support async flips. > > Also IIRC some format modifiers don't support async flip at all (CCS)? > >> And on hw where multiple planes support it on the same crtc, only one >> plane can do it at a time. Well, more accurately we can only select >> one plane at a time to give us the "flip done" interrupt. I guess >> if the user wants to async flip multiple planes at the same time >> we could do them serially as opposed to in parallel to make sure >> all the flips actually happened before we signal completion of the >> entire commit. Async flips of multiple planes probably won't >> happen atomically anyway so doing them serially seems fine. >> >> ATM in i915 we probably don't have sufficient state checks in >> place to catch all the restrictions, and instead in part we rely >> on the limited scope of the legacy async flip ioctl to make sure >> the operation doesn't attempt something the hw can't do. > > Yeah, that makes sense. > > In the documentation patch discussion [1], it appears it's not clear what > drivers should do when async flip isn't possible with the legacy uAPI. > > For the atomic uAPI, we need to pick on of these two approaches: > > 1. Let the kernel fall back to a sync flip if async isn't possible. This > simplifies user-space, but then user-space has no reliable way to figure out > what really happened (sync or async?). That could be fixed with a new > read-only CRTC prop indicating whether the last flip was async or not. > However, maybe someone will come up in the future with user-space which > needs to only apply an update if async flip is possible, in which case this > approach falls short. The future is now. :) As I described in the documentation patch discussion, this approach would make it tricky for a Wayland compositor to decide if it should use an async commit (which needs to be done ASAP to serve the intended purpose) or not (in which case the compositor may want to delay the commit as long as possible for minimal latency). > Another possible shortcoming of the proposed new uAPI here is that user-space > cannot submit a single atomic commit which updates multiple CRTCs, and > individually select which CRTC does an async flip. This could be fixed with > a "ASYNC_FLIP" CRTC prop which the kernel always resets to 0 on commit. I'm not > sure we want/need to cross that bridge right now, it would be easy enough to > add as a second step if some user-space would benefit from it. I thought about this as well, but I came to the conclusion it shouldn't be needed. User space can do one commit for the sync CRTC/s planes first and another commit for the async ones afterwards.
On Tue, Aug 30, 2022 at 11:40:10AM +0300, Pekka Paalanen wrote: > On Tue, 30 Aug 2022 11:08:22 +0300 > Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: > > > On Mon, Aug 29, 2022 at 04:01:44PM +0000, Simon Ser wrote: > > > On Friday, August 26th, 2022 at 10:19, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: > > > > > > > On Wed, Aug 24, 2022 at 03:08:55PM +0000, Simon Ser wrote: > > > > > This new kernel capability indicates whether async page-flips are > > > > > supported via the atomic uAPI. DRM clients can use it to check > > > > > for support before feeding DRM_MODE_PAGE_FLIP_ASYNC to the kernel. > > > > > > > > I think we'd need to clarify the semantics of the async flag > > > > for atomic commits. > > > > > > > > Eg. on Intel hw only pure page flips are possible async, if you do > > > > anything else (change plane size/pos/scaling/etc.) you will need > > > > to do a sync update. Technically not even all page flips (from the > > > > uapi POV) might be possible as the exact scanout source address > > > > is specified via two registers, only one of which can be update > > > > async. So technically the two framebuffers might be laid out > > > > just slightly differently which could prevent an async flip. > > > > Also only some subset of planes actually support async flips. > > > > > > Also IIRC some format modifiers don't support async flip at all (CCS)? > > > > Yeah, that too. Also planar YUV formats aren't allowed. > > > > > > > > > And on hw where multiple planes support it on the same crtc, only one > > > > plane can do it at a time. Well, more accurately we can only select > > > > one plane at a time to give us the "flip done" interrupt. I guess > > > > if the user wants to async flip multiple planes at the same time > > > > we could do them serially as opposed to in parallel to make sure > > > > all the flips actually happened before we signal completion of the > > > > entire commit. Async flips of multiple planes probably won't > > > > happen atomically anyway so doing them serially seems fine. > > > > > > > > ATM in i915 we probably don't have sufficient state checks in > > > > place to catch all the restrictions, and instead in part we rely > > > > on the limited scope of the legacy async flip ioctl to make sure > > > > the operation doesn't attempt something the hw can't do. > > > > > > Yeah, that makes sense. > > > > > > In the documentation patch discussion [1], it appears it's not clear what > > > drivers should do when async flip isn't possible with the legacy uAPI. > > > > > > For the atomic uAPI, we need to pick on of these two approaches: > > > > > > 1. Let the kernel fall back to a sync flip if async isn't possible. This > > > simplifies user-space, but then user-space has no reliable way to figure out > > > what really happened (sync or async?). That could be fixed with a new > > > read-only CRTC prop indicating whether the last flip was async or not. > > > However, maybe someone will come up in the future with user-space which > > > needs to only apply an update if async flip is possible, in which case this > > > approach falls short. > > There is the pageflip completion timestamp in the DRM event. If > userspace knows the phase and period of the scanout cycle, the > completion timestamp should tell quite reliably if the update was > tearing. > > For the phase, one can query KMS for the last vblank timestamp. This > should work also for VRR I assume. > > For the period, fixed-frequency video mode has it straight. VRR gives > only a range or a minimum period. > > > > 2. Make the kernel return EINVAL if async flip isn't possible. This adds more > > > complexity to user-space, but enables a more reliable and deterministic > > > uAPI. This is also more consistent with the rest of the existing atomic > > > uAPI. > > > > The current behaviour is somewhat a combination of the two. We return > > an error if async flip is not possible at all given the current state. > > > > When async flip is possible we return success, but may still internally > > fall back to a sync flip for the first flip. That is required on some > > borked hardware that can't switch from sync flips to async flips without > > doing an extra sync flip. Also on some other hardware we intentionally > > fall back to a sync flip for the first async flip, so that we can > > reprogram some display FIFO stuff (aimed to make the subsequent async > > flips faster). > > Oh, so userspace should expect to run async for long periods of time, > and not use async this frame, sync next, then async again depending on > per-frame timings. > > That seems important to note. Yeah, our hw can't really do the "tear if late, otherwise do a sync flip" behaviour. GLX_swap_control_tear I think was the glx extension for that. > > It's almost like the async flag should be a KMS property instead of a > commit ioctl flag. I guess that would be one way to implement it. > > > > Note, current user-space would only need to opportunistically enable async > > > flip. IOW, I think that for current user-space plans "async if possible, > > > otherwise sync" is good enough. That behavior maps well to the Vulkan present > > > modes as well (which says that "this mode *may* result in visible tearing", but > > > doesn't guarantee it). > > > > The current behaviour is to fall back to a blit if the async > > flip fails. So you still get the same effective behaviour, just > > not as efficient. I think that's a reasonable way to handle it. > > That's purely an Xorg thing, right? Yes. > > Should Wayland compositors implement the same thing is a good question. Is the whole tear+wayland situation more or less up in the air still? > > > > Another possible shortcoming of the proposed new uAPI here is that user-space > > > cannot submit a single atomic commit which updates multiple CRTCs, and > > > individually select which CRTC does an async flip. This could be fixed with > > I would think that you can just do per-CRTC atomic commits in that > case. You would do per-CRTC atomic commits anyway when the vblanks do > not coincide. I expect tearing updates to have unpredictable latency, > especially if they can silently fall back to sync flips, so doing > multi-CRTC async flips is not useful. > > > > a "ASYNC_FLIP" CRTC prop which the kernel always resets to 0 on commit. I'm not > > > sure we want/need to cross that bridge right now, it would be easy enough to > > > add as a second step if some user-space would benefit from it. > > > > Technically it should really be per-plane since that is what does > > the flip. But I have a feeling that allowing a mix of async and > > sync in the same commit is just going to make everything more > > complicated without really helping anything (async flips won't > > happen atomically anyway with anything else). > > > > One (crazy?) idea I had for the atomic api is that we could even > > reject most of the properties already on the uapi level before anyone > > gets to examine the final state. Ie. as soon as the atomic ioctl sees > > eg. a gamma LUT property change it would just immediately return > > an error if async flip is also requested. > > I agree with these two paragraphs. > > What about limiting async flag to atomic commits that update only a > single KMS plane (regardless of how many planes are active)? Maybe that > would be a good first step? Sure, that might help a bit. But I'm rather more worried about all the state that you can now include in the atomic commit but aren't actually allowed to change. Eg. probably all crtc/connector properties, and most plane properties. The legacy ioctl blocks most of that implicitly, but the atomic ioctl does not. Hence my idea about rejecting basically everything except the plane fb property changes already at the ioctl level. Would avoid that stuff leaking in by accident if the driver forgets to check absolutely everything.
On Tuesday, August 30th, 2022 at 10:08, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: > > In the documentation patch discussion [1], it appears it's not clear what > > drivers should do when async flip isn't possible with the legacy uAPI. > > > > For the atomic uAPI, we need to pick on of these two approaches: > > > > 1. Let the kernel fall back to a sync flip if async isn't possible. This > > simplifies user-space, but then user-space has no reliable way to figure out > > what really happened (sync or async?). That could be fixed with a new > > read-only CRTC prop indicating whether the last flip was async or not. > > However, maybe someone will come up in the future with user-space which > > needs to only apply an update if async flip is possible, in which case this > > approach falls short. > > 2. Make the kernel return EINVAL if async flip isn't possible. This adds more > > complexity to user-space, but enables a more reliable and deterministic > > uAPI. This is also more consistent with the rest of the existing atomic > > uAPI. > > The current behaviour is somewhat a combination of the two. We return > an error if async flip is not possible at all given the current state. > > When async flip is possible we return success, but may still internally > fall back to a sync flip for the first flip. That is required on some > borked hardware that can't switch from sync flips to async flips without > doing an extra sync flip. Also on some other hardware we intentionally > fall back to a sync flip for the first async flip, so that we can > reprogram some display FIFO stuff (aimed to make the subsequent async > flips faster). Hm. Would it be possible for the atomic uAPI to return EINVAL in this case too, to let user-space know what really happened? I suppose user-space could then (mistakenly) assume that async flip is never possible, and never try again? > > Note, current user-space would only need to opportunistically enable async > > flip. IOW, I think that for current user-space plans "async if possible, > > otherwise sync" is good enough. That behavior maps well to the Vulkan present > > modes as well (which says that "this mode *may* result in visible tearing", but > > doesn't guarantee it). > > The current behaviour is to fall back to a blit if the async > flip fails. So you still get the same effective behaviour, just > not as efficient. I think that's a reasonable way to handle it. After some discussion on IRC: the above describes Xorg's behavior. To reproduce this behavior with the atomic uAPI, it is necessary to use approach (2): make the atomic commit fail if async flip isn't possible, to let user-space fall back to a blit. > > Another possible shortcoming of the proposed new uAPI here is that user-space > > cannot submit a single atomic commit which updates multiple CRTCs, and > > individually select which CRTC does an async flip. This could be fixed with > > a "ASYNC_FLIP" CRTC prop which the kernel always resets to 0 on commit. I'm not > > sure we want/need to cross that bridge right now, it would be easy enough to > > add as a second step if some user-space would benefit from it. > > Technically it should really be per-plane since that is what does > the flip. But I have a feeling that allowing a mix of async and > sync in the same commit is just going to make everything more > complicated without really helping anything (async flips won't > happen atomically anyway with anything else). Agreed. > One (crazy?) idea I had for the atomic api is that we could even > reject most of the properties already on the uapi level before anyone > gets to examine the final state. Ie. as soon as the atomic ioctl sees > eg. a gamma LUT property change it would just immediately return > an error if async flip is also requested. Hm, I guess... Although amdgpu doesn't really need this, it already has all of the logic checking for stuff like gamma LUT property change + async. Could maybe be a DRM helper if other drivers need it.
On Tuesday, August 30th, 2022 at 12:24, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: > > > The current behaviour is to fall back to a blit if the async > > > flip fails. So you still get the same effective behaviour, just > > > not as efficient. I think that's a reasonable way to handle it. > > > > That's purely an Xorg thing, right? > > Yes. > > > > > Should Wayland compositors implement the same thing is a good question. > > Is the whole tear+wayland situation more or less up in the air still? The goal of the patches I'm sending is to allow Wayland compositors to request tearing. We also have a WIP protocol extension for this [1]. [1]: https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/65
(Oops, I replied to the wrong thread. Re-sending to the correct one.) On Tuesday, August 30th, 2022 at 10:41, Michel Dänzer <michel.daenzer@mailbox.org> wrote: > > For the atomic uAPI, we need to pick on of these two approaches: > > > > 1. Let the kernel fall back to a sync flip if async isn't possible. This > > simplifies user-space, but then user-space has no reliable way to figure out > > what really happened (sync or async?). That could be fixed with a new > > read-only CRTC prop indicating whether the last flip was async or not. > > However, maybe someone will come up in the future with user-space which > > needs to only apply an update if async flip is possible, in which case this > > approach falls short. > > The future is now. :) > > As I described in the documentation patch discussion, this approach would > make it tricky for a Wayland compositor to decide if it should use an async > commit (which needs to be done ASAP to serve the intended purpose) or not (in > which case the compositor may want to delay the commit as long as possible > for minimal latency). Ah right. If an async page-flip is not possible, then a Wayland compositor could want to wait the "last moment" before the next vblank to see if a more up-to-date buffer is available. With that + Xorg usage, I think we have a rather solid case for failing async flips rather than silently falling back to vsync.
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index ca2a6e6101dc..5b1591e2b46c 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -302,6 +302,11 @@ static int drm_getcap(struct drm_device *dev, void *data, struct drm_file *file_ case DRM_CAP_CRTC_IN_VBLANK_EVENT: req->value = 1; break; + case DRM_CAP_ATOMIC_ASYNC_PAGE_FLIP: + req->value = drm_core_check_feature(dev, DRIVER_ATOMIC) && + dev->mode_config.async_page_flip && + !dev->mode_config.atomic_async_page_flip_not_supported; + break; default: return -EINVAL; } diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h index 642808520d92..b1962628ecda 100644 --- a/include/uapi/drm/drm.h +++ b/include/uapi/drm/drm.h @@ -706,7 +706,8 @@ struct drm_gem_open { /** * DRM_CAP_ASYNC_PAGE_FLIP * - * If set to 1, the driver supports &DRM_MODE_PAGE_FLIP_ASYNC. + * If set to 1, the driver supports &DRM_MODE_PAGE_FLIP_ASYNC for legacy + * page-flips. */ #define DRM_CAP_ASYNC_PAGE_FLIP 0x7 /** @@ -767,6 +768,13 @@ struct drm_gem_open { * Documentation/gpu/drm-mm.rst, section "DRM Sync Objects". */ #define DRM_CAP_SYNCOBJ_TIMELINE 0x14 +/** + * DRM_CAP_ATOMIC_ASYNC_PAGE_FLIP + * + * If set to 1, the driver supports &DRM_MODE_PAGE_FLIP_ASYNC for atomic + * commits. + */ +#define DRM_CAP_ATOMIC_ASYNC_PAGE_FLIP 0x15 /* DRM_IOCTL_GET_CAP ioctl argument type */ struct drm_get_cap {
This new kernel capability indicates whether async page-flips are supported via the atomic uAPI. DRM clients can use it to check for support before feeding DRM_MODE_PAGE_FLIP_ASYNC to the kernel. Make it clear that DRM_CAP_ASYNC_PAGE_FLIP is for legacy uAPI only. Signed-off-by: Simon Ser <contact@emersion.fr> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Joshua Ashton <joshua@froggi.es> Cc: Melissa Wen <mwen@igalia.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Harry Wentland <hwentlan@amd.com> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> --- drivers/gpu/drm/drm_ioctl.c | 5 +++++ include/uapi/drm/drm.h | 10 +++++++++- 2 files changed, 14 insertions(+), 1 deletion(-)