Message ID | 20190211032225.9488-3-mario.kleiner.de@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/3] drm/amdgpu: Fix get_crtc_scanoutpos behavior in vrr when vpos >= vtotal. | expand |
On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote: > The pageflip completion timestamps transmitted to userspace > via pageflip completion events are supposed to describe the > time at which the first pixel of the new post-pageflip scanout > buffer leaves the video output of the gpu. This time is > identical to end of vblank, when active scanout starts. > > For a crtc in standard fixed refresh rate, the end of vblank > is identical to the vblank timestamps calculated by > drm_update_vblank_count() at each vblank interrupt, or each > vblank dis-/enable. Therefore pageflip events just carry > that vblank timestamp as their pageflip timestamp. > > For a crtc switched to variable refresh rate mode (vrr), the > pageflip completion timestamps are identical to the vblank > timestamps iff the pageflip was executed early in vblank, > before the minimum vblank duration elapsed. In this case > the time of display onset is identical to when the crtc > is running in fixed refresh rate. > > However, if a pageflip completes later in the vblank, inside > the "extended front porch" in vrr mode, then the vblank will > terminate at a fixed (back porch) duration after flip, so > the display onset time is delayed correspondingly. In this > case the vblank timestamp computed at vblank irq time would > be too early, and we need a way to calculate an estimated > pageflip timestamp that will be later than the vblank timestamp. > > How a driver determines such a "late flip" timestamp is hw > and driver specific, but this patch adds a new helper function > that allows the driver to propose such an alternate "late flip" > timestamp for use in pageflip events: > > drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); > > When sending out pageflip events, we now compare that proposed > flip_timestamp against the vblank timestamp of the current > vblank of flip completion and choose to send out the greater/ > later timestamp as flip completion timestamp. > > The most simple way for a kms driver to supply a suitable > flip_timestamp in vrr mode would be to simply take a timestamp > at start of the pageflip completion handler, e.g., pageflip > irq handler: flip_timestamp = ktime_get(); and then set that > as proposed "late" alternative timestamp via ... > drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); > > More clever approaches could try to add some corrective offset > for fixed back porch duration, or ideally use hardware features > like hw timestamps to calculate the exact end time of vblank. > > Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> > Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> > Cc: Harry Wentland <harry.wentland@amd.com> > Cc: Alex Deucher <alexander.deucher@amd.com> Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only give us the right timestampe, once? With this I guess if you do a vblank query in between the wrong and the right vblank you'll get the bogus value. Not really great for userspace. -Daniel > --- > drivers/gpu/drm/drm_vblank.c | 49 +++++++++++++++++++++++++++++++++++- > include/drm/drm_vblank.h | 8 ++++++ > 2 files changed, 56 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c > index 98e091175921..4b3a4c38fabe 100644 > --- a/drivers/gpu/drm/drm_vblank.c > +++ b/drivers/gpu/drm/drm_vblank.c > @@ -814,10 +814,21 @@ static void send_vblank_event(struct drm_device *dev, > u64 seq, ktime_t now) > { > struct timespec64 tv; > + ktime_t alt_flip_time; > > switch (e->event.base.type) { > - case DRM_EVENT_VBLANK: > case DRM_EVENT_FLIP_COMPLETE: > + /* > + * For flip completion events, override "now" time > + * with alt_flip_time provided by the driver via > + * drm_crtc_set_vrr_pageflip_timestamp() in VRR mode > + * if that time is later than given "now" vblank time. > + */ > + alt_flip_time = dev->vblank[e->pipe].alt_flip_time; > + if (alt_flip_time > now) > + now = alt_flip_time; > + /* Fallthrough */ > + case DRM_EVENT_VBLANK: > tv = ktime_to_timespec64(now); > e->event.vbl.sequence = seq; > /* > @@ -916,11 +927,47 @@ void drm_crtc_send_vblank_event(struct drm_crtc *crtc, > > now = ktime_get(); > } > + > e->pipe = pipe; > send_vblank_event(dev, e, seq, now); > } > EXPORT_SYMBOL(drm_crtc_send_vblank_event); > > +/** > + * drm_crtc_set_vrr_pageflip_timestamp - helper to set alternate pageflip time > + * @crtc: the source CRTC of the pageflip completion event > + * @flip_time: The alternate pageflip completion timestamp in VRR mode > + * > + * In variable refresh rate mode (VRR), a pageflip completion timestamp carried > + * by the pageflip event can never be earlier than the vblank timestamp of the > + * vblank of flip completion, as that vblank timestamp defines the end of the > + * shortest possible vblank duration. In case of a delayed flip completion > + * inside the extended VRR front porch however, the end of vblank can be much > + * later, so the driver must assign an estimated timestamp of that later end of > + * vblank. For a CRTC in VRR mode, the driver should use this helper function to > + * set an alternate flip completion timestamp in case of late flip completions > + * in extended vblank. In the most simple case, this @flip_time timestamp could > + * simply be a ktime_get() timestamp taken at the start of the pageflip > + * completion routine, with some constant duration of the back porch interval > + * added, although more precise estimates may be possible on some hardware if > + * the hardware provides some means of timestamping the true end of vblank. > + * > + * When sending out pageflip events, e.g., via drm_crtc_send_vblank_event(), it > + * will use either the standard vblank timestamp, calculated for a minimum > + * duration vblank, or the provided @flip_time if that time is later than the > + * vblank timestamp, to get the best possible estimate of start of display of > + * the new post-pageflip scanout buffer. > + */ > +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, > + ktime_t flip_time) > +{ > + struct drm_device *dev = crtc->dev; > + struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)]; > + > + vblank->alt_flip_time = flip_time; > +} > +EXPORT_SYMBOL(drm_crtc_set_vrr_pageflip_timestamp); > + > static int __enable_vblank(struct drm_device *dev, unsigned int pipe) > { > if (drm_core_check_feature(dev, DRIVER_MODESET)) { > diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h > index 6ad9630d4f48..aacf44694ab6 100644 > --- a/include/drm/drm_vblank.h > +++ b/include/drm/drm_vblank.h > @@ -117,6 +117,12 @@ struct drm_vblank_crtc { > * @time: Vblank timestamp corresponding to @count. > */ > ktime_t time; > + /** > + * @alt_flip_time: Vblank timestamp for end of extended vblank due to > + * a late pageflip completion in variable refresh rate mode. Pageflip > + * events will carry the later one of @time and @alt_flip_time. > + */ > + ktime_t alt_flip_time; > > /** > * @refcount: Number of users/waiters of the vblank interrupt. Only when > @@ -179,6 +185,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs); > u64 drm_crtc_vblank_count(struct drm_crtc *crtc); > u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc, > ktime_t *vblanktime); > +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, > + ktime_t flip_time); > void drm_crtc_send_vblank_event(struct drm_crtc *crtc, > struct drm_pending_vblank_event *e); > void drm_crtc_arm_vblank_event(struct drm_crtc *crtc, > -- > 2.17.1 > > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel
On 2/11/19 3:35 AM, Daniel Vetter wrote: > On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote: >> The pageflip completion timestamps transmitted to userspace >> via pageflip completion events are supposed to describe the >> time at which the first pixel of the new post-pageflip scanout >> buffer leaves the video output of the gpu. This time is >> identical to end of vblank, when active scanout starts. >> >> For a crtc in standard fixed refresh rate, the end of vblank >> is identical to the vblank timestamps calculated by >> drm_update_vblank_count() at each vblank interrupt, or each >> vblank dis-/enable. Therefore pageflip events just carry >> that vblank timestamp as their pageflip timestamp. >> >> For a crtc switched to variable refresh rate mode (vrr), the >> pageflip completion timestamps are identical to the vblank >> timestamps iff the pageflip was executed early in vblank, >> before the minimum vblank duration elapsed. In this case >> the time of display onset is identical to when the crtc >> is running in fixed refresh rate. >> >> However, if a pageflip completes later in the vblank, inside >> the "extended front porch" in vrr mode, then the vblank will >> terminate at a fixed (back porch) duration after flip, so >> the display onset time is delayed correspondingly. In this >> case the vblank timestamp computed at vblank irq time would >> be too early, and we need a way to calculate an estimated >> pageflip timestamp that will be later than the vblank timestamp. >> >> How a driver determines such a "late flip" timestamp is hw >> and driver specific, but this patch adds a new helper function >> that allows the driver to propose such an alternate "late flip" >> timestamp for use in pageflip events: >> >> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); >> >> When sending out pageflip events, we now compare that proposed >> flip_timestamp against the vblank timestamp of the current >> vblank of flip completion and choose to send out the greater/ >> later timestamp as flip completion timestamp. >> >> The most simple way for a kms driver to supply a suitable >> flip_timestamp in vrr mode would be to simply take a timestamp >> at start of the pageflip completion handler, e.g., pageflip >> irq handler: flip_timestamp = ktime_get(); and then set that >> as proposed "late" alternative timestamp via ... >> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); >> >> More clever approaches could try to add some corrective offset >> for fixed back porch duration, or ideally use hardware features >> like hw timestamps to calculate the exact end time of vblank. >> >> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> >> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> >> Cc: Harry Wentland <harry.wentland@amd.com> >> Cc: Alex Deucher <alexander.deucher@amd.com> > > Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only give us > the right timestampe, once? With this I guess if you do a vblank query in > between the wrong and the right vblank you'll get the bogus value. Not > really great for userspace. > -Daniel I think we calculate the timestamp and send the vblank event both within the pageflip IRQ handler so calculating the right pageflip timestamp once could probably be done. I'm not sure if it's easier than proposing a later flip time with an API like this though. The actual scanout time should be known from the page-flip handler so the semantics for VRR on/off remain the same. This is because the page-flip triggers entering the back porch if we're in the extended front porch. But scanout time from vblank events for something like DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only treated as estimates. If we're in the regular front porch then the timing to scanout is based on the fixed duration front porch for the current mode. If we're in the extended back porch then it's technically driver defined but the most reasonable guess is to assume that the front porch is going to end at any moment, so just return the length of the back porch for getting the scanout time. Proposing the late timestamp shouldn't affect vblank event in the DRM_IOCTL_WAIT_VBLANK case and should only be used in the page-flip event case. I'm not sure if that's what's guaranteed to happen with this patch though. There doesn't seem to be any locking on either dev->vblank_time_lock or the vblank->seqlock so while it's likely to get the same vblank event back as the one just stored I don't think it's guaranteed. Nicholas Kazlauskas > >> --- >> drivers/gpu/drm/drm_vblank.c | 49 +++++++++++++++++++++++++++++++++++- >> include/drm/drm_vblank.h | 8 ++++++ >> 2 files changed, 56 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c >> index 98e091175921..4b3a4c38fabe 100644 >> --- a/drivers/gpu/drm/drm_vblank.c >> +++ b/drivers/gpu/drm/drm_vblank.c >> @@ -814,10 +814,21 @@ static void send_vblank_event(struct drm_device *dev, >> u64 seq, ktime_t now) >> { >> struct timespec64 tv; >> + ktime_t alt_flip_time; >> >> switch (e->event.base.type) { >> - case DRM_EVENT_VBLANK: >> case DRM_EVENT_FLIP_COMPLETE: >> + /* >> + * For flip completion events, override "now" time >> + * with alt_flip_time provided by the driver via >> + * drm_crtc_set_vrr_pageflip_timestamp() in VRR mode >> + * if that time is later than given "now" vblank time. >> + */ >> + alt_flip_time = dev->vblank[e->pipe].alt_flip_time; >> + if (alt_flip_time > now) >> + now = alt_flip_time; >> + /* Fallthrough */ >> + case DRM_EVENT_VBLANK: >> tv = ktime_to_timespec64(now); >> e->event.vbl.sequence = seq; >> /* >> @@ -916,11 +927,47 @@ void drm_crtc_send_vblank_event(struct drm_crtc *crtc, >> >> now = ktime_get(); >> } >> + >> e->pipe = pipe; >> send_vblank_event(dev, e, seq, now); >> } >> EXPORT_SYMBOL(drm_crtc_send_vblank_event); >> >> +/** >> + * drm_crtc_set_vrr_pageflip_timestamp - helper to set alternate pageflip time >> + * @crtc: the source CRTC of the pageflip completion event >> + * @flip_time: The alternate pageflip completion timestamp in VRR mode >> + * >> + * In variable refresh rate mode (VRR), a pageflip completion timestamp carried >> + * by the pageflip event can never be earlier than the vblank timestamp of the >> + * vblank of flip completion, as that vblank timestamp defines the end of the >> + * shortest possible vblank duration. In case of a delayed flip completion >> + * inside the extended VRR front porch however, the end of vblank can be much >> + * later, so the driver must assign an estimated timestamp of that later end of >> + * vblank. For a CRTC in VRR mode, the driver should use this helper function to >> + * set an alternate flip completion timestamp in case of late flip completions >> + * in extended vblank. In the most simple case, this @flip_time timestamp could >> + * simply be a ktime_get() timestamp taken at the start of the pageflip >> + * completion routine, with some constant duration of the back porch interval >> + * added, although more precise estimates may be possible on some hardware if >> + * the hardware provides some means of timestamping the true end of vblank. >> + * >> + * When sending out pageflip events, e.g., via drm_crtc_send_vblank_event(), it >> + * will use either the standard vblank timestamp, calculated for a minimum >> + * duration vblank, or the provided @flip_time if that time is later than the >> + * vblank timestamp, to get the best possible estimate of start of display of >> + * the new post-pageflip scanout buffer. >> + */ >> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, >> + ktime_t flip_time) >> +{ >> + struct drm_device *dev = crtc->dev; >> + struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)]; >> + >> + vblank->alt_flip_time = flip_time; >> +} >> +EXPORT_SYMBOL(drm_crtc_set_vrr_pageflip_timestamp); >> + >> static int __enable_vblank(struct drm_device *dev, unsigned int pipe) >> { >> if (drm_core_check_feature(dev, DRIVER_MODESET)) { >> diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h >> index 6ad9630d4f48..aacf44694ab6 100644 >> --- a/include/drm/drm_vblank.h >> +++ b/include/drm/drm_vblank.h >> @@ -117,6 +117,12 @@ struct drm_vblank_crtc { >> * @time: Vblank timestamp corresponding to @count. >> */ >> ktime_t time; >> + /** >> + * @alt_flip_time: Vblank timestamp for end of extended vblank due to >> + * a late pageflip completion in variable refresh rate mode. Pageflip >> + * events will carry the later one of @time and @alt_flip_time. >> + */ >> + ktime_t alt_flip_time; >> >> /** >> * @refcount: Number of users/waiters of the vblank interrupt. Only when >> @@ -179,6 +185,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs); >> u64 drm_crtc_vblank_count(struct drm_crtc *crtc); >> u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc, >> ktime_t *vblanktime); >> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, >> + ktime_t flip_time); >> void drm_crtc_send_vblank_event(struct drm_crtc *crtc, >> struct drm_pending_vblank_event *e); >> void drm_crtc_arm_vblank_event(struct drm_crtc *crtc, >> -- >> 2.17.1 >> >> _______________________________________________ >> dri-devel mailing list >> dri-devel@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/dri-devel >
On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas <Nicholas.Kazlauskas@amd.com> wrote: > > On 2/11/19 3:35 AM, Daniel Vetter wrote: > > On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote: > >> The pageflip completion timestamps transmitted to userspace > >> via pageflip completion events are supposed to describe the > >> time at which the first pixel of the new post-pageflip scanout > >> buffer leaves the video output of the gpu. This time is > >> identical to end of vblank, when active scanout starts. > >> > >> For a crtc in standard fixed refresh rate, the end of vblank > >> is identical to the vblank timestamps calculated by > >> drm_update_vblank_count() at each vblank interrupt, or each > >> vblank dis-/enable. Therefore pageflip events just carry > >> that vblank timestamp as their pageflip timestamp. > >> > >> For a crtc switched to variable refresh rate mode (vrr), the > >> pageflip completion timestamps are identical to the vblank > >> timestamps iff the pageflip was executed early in vblank, > >> before the minimum vblank duration elapsed. In this case > >> the time of display onset is identical to when the crtc > >> is running in fixed refresh rate. > >> > >> However, if a pageflip completes later in the vblank, inside > >> the "extended front porch" in vrr mode, then the vblank will > >> terminate at a fixed (back porch) duration after flip, so > >> the display onset time is delayed correspondingly. In this > >> case the vblank timestamp computed at vblank irq time would > >> be too early, and we need a way to calculate an estimated > >> pageflip timestamp that will be later than the vblank timestamp. > >> > >> How a driver determines such a "late flip" timestamp is hw > >> and driver specific, but this patch adds a new helper function > >> that allows the driver to propose such an alternate "late flip" > >> timestamp for use in pageflip events: > >> > >> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); > >> > >> When sending out pageflip events, we now compare that proposed > >> flip_timestamp against the vblank timestamp of the current > >> vblank of flip completion and choose to send out the greater/ > >> later timestamp as flip completion timestamp. > >> > >> The most simple way for a kms driver to supply a suitable > >> flip_timestamp in vrr mode would be to simply take a timestamp > >> at start of the pageflip completion handler, e.g., pageflip > >> irq handler: flip_timestamp = ktime_get(); and then set that > >> as proposed "late" alternative timestamp via ... > >> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); > >> > >> More clever approaches could try to add some corrective offset > >> for fixed back porch duration, or ideally use hardware features > >> like hw timestamps to calculate the exact end time of vblank. > >> > >> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> > >> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> > >> Cc: Harry Wentland <harry.wentland@amd.com> > >> Cc: Alex Deucher <alexander.deucher@amd.com> > > > > Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only give us > > the right timestampe, once? With this I guess if you do a vblank query in > > between the wrong and the right vblank you'll get the bogus value. Not > > really great for userspace. > > -Daniel > > I think we calculate the timestamp and send the vblank event both within > the pageflip IRQ handler so calculating the right pageflip timestamp > once could probably be done. I'm not sure if it's easier than proposing > a later flip time with an API like this though. > > The actual scanout time should be known from the page-flip handler so > the semantics for VRR on/off remain the same. This is because the > page-flip triggers entering the back porch if we're in the extended > front porch. > > But scanout time from vblank events for something like > DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only > treated as estimates. If we're in the regular front porch then the > timing to scanout is based on the fixed duration front porch for the > current mode. If we're in the extended back porch then it's technically > driver defined but the most reasonable guess is to assume that the front > porch is going to end at any moment, so just return the length of the > back porch for getting the scanout time. > > Proposing the late timestamp shouldn't affect vblank event in the > DRM_IOCTL_WAIT_VBLANK case and should only be used in the page-flip > event case. I'm not sure if that's what's guaranteed to happen with this > patch though. There doesn't seem to be any locking on either > dev->vblank_time_lock or the vblank->seqlock so while it's likely to get > the same vblank event back as the one just stored I don't think it's > guaranteed. That's the inconsistency I mean to highlight - the timestamp for the same frame as observed through flip complete and through the wait_vblank ioctl can differ. Which they really shouldn't. Now added complication is that amdgpu sends out vblank events really early, which is used by userspace to do frontbuffer rendering in the vblank time. But I don't think anyone wants to do both VRR and frontbuffer rendering, hence I think it should be possible to create correct vblank timestamps for the VRR case, while leaving the current logic in place for everything else. But that means moving the entire vblank processing to where you know the next frame's start time for VRR, both for page flips and for all other vblank events. -Daniel > > Nicholas Kazlauskas > > > > >> --- > >> drivers/gpu/drm/drm_vblank.c | 49 +++++++++++++++++++++++++++++++++++- > >> include/drm/drm_vblank.h | 8 ++++++ > >> 2 files changed, 56 insertions(+), 1 deletion(-) > >> > >> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c > >> index 98e091175921..4b3a4c38fabe 100644 > >> --- a/drivers/gpu/drm/drm_vblank.c > >> +++ b/drivers/gpu/drm/drm_vblank.c > >> @@ -814,10 +814,21 @@ static void send_vblank_event(struct drm_device *dev, > >> u64 seq, ktime_t now) > >> { > >> struct timespec64 tv; > >> + ktime_t alt_flip_time; > >> > >> switch (e->event.base.type) { > >> - case DRM_EVENT_VBLANK: > >> case DRM_EVENT_FLIP_COMPLETE: > >> + /* > >> + * For flip completion events, override "now" time > >> + * with alt_flip_time provided by the driver via > >> + * drm_crtc_set_vrr_pageflip_timestamp() in VRR mode > >> + * if that time is later than given "now" vblank time. > >> + */ > >> + alt_flip_time = dev->vblank[e->pipe].alt_flip_time; > >> + if (alt_flip_time > now) > >> + now = alt_flip_time; > >> + /* Fallthrough */ > >> + case DRM_EVENT_VBLANK: > >> tv = ktime_to_timespec64(now); > >> e->event.vbl.sequence = seq; > >> /* > >> @@ -916,11 +927,47 @@ void drm_crtc_send_vblank_event(struct drm_crtc *crtc, > >> > >> now = ktime_get(); > >> } > >> + > >> e->pipe = pipe; > >> send_vblank_event(dev, e, seq, now); > >> } > >> EXPORT_SYMBOL(drm_crtc_send_vblank_event); > >> > >> +/** > >> + * drm_crtc_set_vrr_pageflip_timestamp - helper to set alternate pageflip time > >> + * @crtc: the source CRTC of the pageflip completion event > >> + * @flip_time: The alternate pageflip completion timestamp in VRR mode > >> + * > >> + * In variable refresh rate mode (VRR), a pageflip completion timestamp carried > >> + * by the pageflip event can never be earlier than the vblank timestamp of the > >> + * vblank of flip completion, as that vblank timestamp defines the end of the > >> + * shortest possible vblank duration. In case of a delayed flip completion > >> + * inside the extended VRR front porch however, the end of vblank can be much > >> + * later, so the driver must assign an estimated timestamp of that later end of > >> + * vblank. For a CRTC in VRR mode, the driver should use this helper function to > >> + * set an alternate flip completion timestamp in case of late flip completions > >> + * in extended vblank. In the most simple case, this @flip_time timestamp could > >> + * simply be a ktime_get() timestamp taken at the start of the pageflip > >> + * completion routine, with some constant duration of the back porch interval > >> + * added, although more precise estimates may be possible on some hardware if > >> + * the hardware provides some means of timestamping the true end of vblank. > >> + * > >> + * When sending out pageflip events, e.g., via drm_crtc_send_vblank_event(), it > >> + * will use either the standard vblank timestamp, calculated for a minimum > >> + * duration vblank, or the provided @flip_time if that time is later than the > >> + * vblank timestamp, to get the best possible estimate of start of display of > >> + * the new post-pageflip scanout buffer. > >> + */ > >> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, > >> + ktime_t flip_time) > >> +{ > >> + struct drm_device *dev = crtc->dev; > >> + struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)]; > >> + > >> + vblank->alt_flip_time = flip_time; > >> +} > >> +EXPORT_SYMBOL(drm_crtc_set_vrr_pageflip_timestamp); > >> + > >> static int __enable_vblank(struct drm_device *dev, unsigned int pipe) > >> { > >> if (drm_core_check_feature(dev, DRIVER_MODESET)) { > >> diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h > >> index 6ad9630d4f48..aacf44694ab6 100644 > >> --- a/include/drm/drm_vblank.h > >> +++ b/include/drm/drm_vblank.h > >> @@ -117,6 +117,12 @@ struct drm_vblank_crtc { > >> * @time: Vblank timestamp corresponding to @count. > >> */ > >> ktime_t time; > >> + /** > >> + * @alt_flip_time: Vblank timestamp for end of extended vblank due to > >> + * a late pageflip completion in variable refresh rate mode. Pageflip > >> + * events will carry the later one of @time and @alt_flip_time. > >> + */ > >> + ktime_t alt_flip_time; > >> > >> /** > >> * @refcount: Number of users/waiters of the vblank interrupt. Only when > >> @@ -179,6 +185,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs); > >> u64 drm_crtc_vblank_count(struct drm_crtc *crtc); > >> u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc, > >> ktime_t *vblanktime); > >> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, > >> + ktime_t flip_time); > >> void drm_crtc_send_vblank_event(struct drm_crtc *crtc, > >> struct drm_pending_vblank_event *e); > >> void drm_crtc_arm_vblank_event(struct drm_crtc *crtc, > >> -- > >> 2.17.1 > >> > >> _______________________________________________ > >> dri-devel mailing list > >> dri-devel@lists.freedesktop.org > >> https://lists.freedesktop.org/mailman/listinfo/dri-devel > > >
On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas > <Nicholas.Kazlauskas@amd.com> wrote: > > > > On 2/11/19 3:35 AM, Daniel Vetter wrote: > > > On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote: > > >> The pageflip completion timestamps transmitted to userspace > > >> via pageflip completion events are supposed to describe the > > >> time at which the first pixel of the new post-pageflip scanout > > >> buffer leaves the video output of the gpu. This time is > > >> identical to end of vblank, when active scanout starts. > > >> > > >> For a crtc in standard fixed refresh rate, the end of vblank > > >> is identical to the vblank timestamps calculated by > > >> drm_update_vblank_count() at each vblank interrupt, or each > > >> vblank dis-/enable. Therefore pageflip events just carry > > >> that vblank timestamp as their pageflip timestamp. > > >> > > >> For a crtc switched to variable refresh rate mode (vrr), the > > >> pageflip completion timestamps are identical to the vblank > > >> timestamps iff the pageflip was executed early in vblank, > > >> before the minimum vblank duration elapsed. In this case > > >> the time of display onset is identical to when the crtc > > >> is running in fixed refresh rate. > > >> > > >> However, if a pageflip completes later in the vblank, inside > > >> the "extended front porch" in vrr mode, then the vblank will > > >> terminate at a fixed (back porch) duration after flip, so > > >> the display onset time is delayed correspondingly. In this > > >> case the vblank timestamp computed at vblank irq time would > > >> be too early, and we need a way to calculate an estimated > > >> pageflip timestamp that will be later than the vblank timestamp. > > >> > > >> How a driver determines such a "late flip" timestamp is hw > > >> and driver specific, but this patch adds a new helper function > > >> that allows the driver to propose such an alternate "late flip" > > >> timestamp for use in pageflip events: > > >> > > >> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); > > >> > > >> When sending out pageflip events, we now compare that proposed > > >> flip_timestamp against the vblank timestamp of the current > > >> vblank of flip completion and choose to send out the greater/ > > >> later timestamp as flip completion timestamp. > > >> > > >> The most simple way for a kms driver to supply a suitable > > >> flip_timestamp in vrr mode would be to simply take a timestamp > > >> at start of the pageflip completion handler, e.g., pageflip > > >> irq handler: flip_timestamp = ktime_get(); and then set that > > >> as proposed "late" alternative timestamp via ... > > >> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); > > >> > > >> More clever approaches could try to add some corrective offset > > >> for fixed back porch duration, or ideally use hardware features > > >> like hw timestamps to calculate the exact end time of vblank. > > >> > > >> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> > > >> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> > > >> Cc: Harry Wentland <harry.wentland@amd.com> > > >> Cc: Alex Deucher <alexander.deucher@amd.com> > > > > > > Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only give us > > > the right timestampe, once? With this I guess if you do a vblank query in > > > between the wrong and the right vblank you'll get the bogus value. Not > > > really great for userspace. > > > -Daniel > > > > I think we calculate the timestamp and send the vblank event both within > > the pageflip IRQ handler so calculating the right pageflip timestamp > > once could probably be done. I'm not sure if it's easier than proposing > > a later flip time with an API like this though. > > > > The actual scanout time should be known from the page-flip handler so > > the semantics for VRR on/off remain the same. This is because the > > page-flip triggers entering the back porch if we're in the extended > > front porch. > > > > But scanout time from vblank events for something like > > DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only > > treated as estimates. If we're in the regular front porch then the > > timing to scanout is based on the fixed duration front porch for the > > current mode. If we're in the extended back porch then it's technically > > driver defined but the most reasonable guess is to assume that the front > > porch is going to end at any moment, so just return the length of the > > back porch for getting the scanout time. > > > > Proposing the late timestamp shouldn't affect vblank event in the > > DRM_IOCTL_WAIT_VBLANK case and should only be used in the page-flip > > event case. I'm not sure if that's what's guaranteed to happen with this > > patch though. There doesn't seem to be any locking on either > > dev->vblank_time_lock or the vblank->seqlock so while it's likely to get > > the same vblank event back as the one just stored I don't think it's > > guaranteed. > > That's the inconsistency I mean to highlight - the timestamp for the > same frame as observed through flip complete and through the > wait_vblank ioctl can differ. Which they really shouldn't. > Ideally they shouldn't differ. The kernel docs for drm_crtc_state say that vblank and pageflip timestamps should always match. But then the kernel docs for "Variable refresh properties" in drm_connector.c for vblank timestamps were changed for the VRR implementation in Linux 5.0-rc to redefine them when in VRR mode. They are defined, but probably rather useless for any practical purpose, like this: "The semantics for the vertical blank timestamp differ when variable refresh rate is active. The vertical blank timestamp is defined to be an estimate using the current mode's fixed refresh rate timings. The semantics for the page-flip event timestamp remain the same." So our docs contradict each other as of Linux 5.0-rc. Certainly having useful vblank timetamps would be useful. > Now added complication is that amdgpu sends out vblank events really > early, which is used by userspace to do frontbuffer rendering in the > vblank time. But I don't think anyone wants to do both VRR and I think all kms drivers try to call drm_crtc_handle_vblank() at start of vblank to give Mesa the most time for frontbuffer rendering for classic X. But vblank events are also used for scheduling bufferswaps or other stuff for redirected windowed rendering, or via api's like OML_sync_controls glXWaitForMscOML, so there might be other things affected by a more delayed vblank handling. > frontbuffer rendering, hence I think it should be possible to create > correct vblank timestamps for the VRR case, while leaving the current > logic in place for everything else. But that means moving the entire > vblank processing to where you know the next frame's start time for > VRR, both for page flips and for all other vblank events. > -Daniel > I think for amdgpu it would be doable by calling drm_handle_vblank from the pageflip irq handler in VRR mode, and then additionally from the vblank interrupt handler like now. Vblank irq's are triggered via vline interrupts with programmable vertical trigger line position, so we could program the trigger line to be start of back-porch instead of start of front-porch. In the pageflip case we do vblank handling + timestamping + pflip completion from pflip irq, and have the regular back-porch vblank handling for refresh cycles in which no pageflip/pflip irq happened. Not sure if/how that would mess with below-vrr-refresh-rate-range support for stutter reduction at low fps though? Or with performance, given vblank event dispatch could be delayed a lot in VRR compared to normal mode? Or with all the vblank accounting during modesets or crtc en/disable? Also, could other drivers like Intel easily do this delayed vblank processing in the non-pageflip case? From my user perspective as a developer of a neuroscience research application that has a couple of exciting/important use cases for VRR mode, i absolutely do need trustworthy and precise pageflip event timestamps that represent reality, otherwise VRR mode would be less than useless for me. I can do without meaningful vblank timestamps in VRR mode though, and expected to do without them, as any scheduling has to happen time-based instead of based on vblank counts anyway, given that vblank counts are quite meaningless if every frame can have a different duration. I'd expect that situation to be similar for other apps that would want to use timestamps in VRR mode like video games, movie players or VR/AR applications. -mario > > > > Nicholas Kazlauskas > > > > > > > >> --- > > >> drivers/gpu/drm/drm_vblank.c | 49 +++++++++++++++++++++++++++++++++++- > > >> include/drm/drm_vblank.h | 8 ++++++ > > >> 2 files changed, 56 insertions(+), 1 deletion(-) > > >> > > >> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c > > >> index 98e091175921..4b3a4c38fabe 100644 > > >> --- a/drivers/gpu/drm/drm_vblank.c > > >> +++ b/drivers/gpu/drm/drm_vblank.c > > >> @@ -814,10 +814,21 @@ static void send_vblank_event(struct drm_device *dev, > > >> u64 seq, ktime_t now) > > >> { > > >> struct timespec64 tv; > > >> + ktime_t alt_flip_time; > > >> > > >> switch (e->event.base.type) { > > >> - case DRM_EVENT_VBLANK: > > >> case DRM_EVENT_FLIP_COMPLETE: > > >> + /* > > >> + * For flip completion events, override "now" time > > >> + * with alt_flip_time provided by the driver via > > >> + * drm_crtc_set_vrr_pageflip_timestamp() in VRR mode > > >> + * if that time is later than given "now" vblank time. > > >> + */ > > >> + alt_flip_time = dev->vblank[e->pipe].alt_flip_time; > > >> + if (alt_flip_time > now) > > >> + now = alt_flip_time; > > >> + /* Fallthrough */ > > >> + case DRM_EVENT_VBLANK: > > >> tv = ktime_to_timespec64(now); > > >> e->event.vbl.sequence = seq; > > >> /* > > >> @@ -916,11 +927,47 @@ void drm_crtc_send_vblank_event(struct drm_crtc *crtc, > > >> > > >> now = ktime_get(); > > >> } > > >> + > > >> e->pipe = pipe; > > >> send_vblank_event(dev, e, seq, now); > > >> } > > >> EXPORT_SYMBOL(drm_crtc_send_vblank_event); > > >> > > >> +/** > > >> + * drm_crtc_set_vrr_pageflip_timestamp - helper to set alternate pageflip time > > >> + * @crtc: the source CRTC of the pageflip completion event > > >> + * @flip_time: The alternate pageflip completion timestamp in VRR mode > > >> + * > > >> + * In variable refresh rate mode (VRR), a pageflip completion timestamp carried > > >> + * by the pageflip event can never be earlier than the vblank timestamp of the > > >> + * vblank of flip completion, as that vblank timestamp defines the end of the > > >> + * shortest possible vblank duration. In case of a delayed flip completion > > >> + * inside the extended VRR front porch however, the end of vblank can be much > > >> + * later, so the driver must assign an estimated timestamp of that later end of > > >> + * vblank. For a CRTC in VRR mode, the driver should use this helper function to > > >> + * set an alternate flip completion timestamp in case of late flip completions > > >> + * in extended vblank. In the most simple case, this @flip_time timestamp could > > >> + * simply be a ktime_get() timestamp taken at the start of the pageflip > > >> + * completion routine, with some constant duration of the back porch interval > > >> + * added, although more precise estimates may be possible on some hardware if > > >> + * the hardware provides some means of timestamping the true end of vblank. > > >> + * > > >> + * When sending out pageflip events, e.g., via drm_crtc_send_vblank_event(), it > > >> + * will use either the standard vblank timestamp, calculated for a minimum > > >> + * duration vblank, or the provided @flip_time if that time is later than the > > >> + * vblank timestamp, to get the best possible estimate of start of display of > > >> + * the new post-pageflip scanout buffer. > > >> + */ > > >> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, > > >> + ktime_t flip_time) > > >> +{ > > >> + struct drm_device *dev = crtc->dev; > > >> + struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)]; > > >> + > > >> + vblank->alt_flip_time = flip_time; > > >> +} > > >> +EXPORT_SYMBOL(drm_crtc_set_vrr_pageflip_timestamp); > > >> + > > >> static int __enable_vblank(struct drm_device *dev, unsigned int pipe) > > >> { > > >> if (drm_core_check_feature(dev, DRIVER_MODESET)) { > > >> diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h > > >> index 6ad9630d4f48..aacf44694ab6 100644 > > >> --- a/include/drm/drm_vblank.h > > >> +++ b/include/drm/drm_vblank.h > > >> @@ -117,6 +117,12 @@ struct drm_vblank_crtc { > > >> * @time: Vblank timestamp corresponding to @count. > > >> */ > > >> ktime_t time; > > >> + /** > > >> + * @alt_flip_time: Vblank timestamp for end of extended vblank due to > > >> + * a late pageflip completion in variable refresh rate mode. Pageflip > > >> + * events will carry the later one of @time and @alt_flip_time. > > >> + */ > > >> + ktime_t alt_flip_time; > > >> > > >> /** > > >> * @refcount: Number of users/waiters of the vblank interrupt. Only when > > >> @@ -179,6 +185,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs); > > >> u64 drm_crtc_vblank_count(struct drm_crtc *crtc); > > >> u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc, > > >> ktime_t *vblanktime); > > >> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, > > >> + ktime_t flip_time); > > >> void drm_crtc_send_vblank_event(struct drm_crtc *crtc, > > >> struct drm_pending_vblank_event *e); > > >> void drm_crtc_arm_vblank_event(struct drm_crtc *crtc, > > >> -- > > >> 2.17.1 > > >> > > >> _______________________________________________ > > >> dri-devel mailing list > > >> dri-devel@lists.freedesktop.org > > >> https://lists.freedesktop.org/mailman/listinfo/dri-devel > > > > > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote: > On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas > > <Nicholas.Kazlauskas@amd.com> wrote: > > > > > > On 2/11/19 3:35 AM, Daniel Vetter wrote: > > > > On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote: > > > >> The pageflip completion timestamps transmitted to userspace > > > >> via pageflip completion events are supposed to describe the > > > >> time at which the first pixel of the new post-pageflip scanout > > > >> buffer leaves the video output of the gpu. This time is > > > >> identical to end of vblank, when active scanout starts. > > > >> > > > >> For a crtc in standard fixed refresh rate, the end of vblank > > > >> is identical to the vblank timestamps calculated by > > > >> drm_update_vblank_count() at each vblank interrupt, or each > > > >> vblank dis-/enable. Therefore pageflip events just carry > > > >> that vblank timestamp as their pageflip timestamp. > > > >> > > > >> For a crtc switched to variable refresh rate mode (vrr), the > > > >> pageflip completion timestamps are identical to the vblank > > > >> timestamps iff the pageflip was executed early in vblank, > > > >> before the minimum vblank duration elapsed. In this case > > > >> the time of display onset is identical to when the crtc > > > >> is running in fixed refresh rate. > > > >> > > > >> However, if a pageflip completes later in the vblank, inside > > > >> the "extended front porch" in vrr mode, then the vblank will > > > >> terminate at a fixed (back porch) duration after flip, so > > > >> the display onset time is delayed correspondingly. In this > > > >> case the vblank timestamp computed at vblank irq time would > > > >> be too early, and we need a way to calculate an estimated > > > >> pageflip timestamp that will be later than the vblank timestamp. > > > >> > > > >> How a driver determines such a "late flip" timestamp is hw > > > >> and driver specific, but this patch adds a new helper function > > > >> that allows the driver to propose such an alternate "late flip" > > > >> timestamp for use in pageflip events: > > > >> > > > >> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); > > > >> > > > >> When sending out pageflip events, we now compare that proposed > > > >> flip_timestamp against the vblank timestamp of the current > > > >> vblank of flip completion and choose to send out the greater/ > > > >> later timestamp as flip completion timestamp. > > > >> > > > >> The most simple way for a kms driver to supply a suitable > > > >> flip_timestamp in vrr mode would be to simply take a timestamp > > > >> at start of the pageflip completion handler, e.g., pageflip > > > >> irq handler: flip_timestamp = ktime_get(); and then set that > > > >> as proposed "late" alternative timestamp via ... > > > >> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); > > > >> > > > >> More clever approaches could try to add some corrective offset > > > >> for fixed back porch duration, or ideally use hardware features > > > >> like hw timestamps to calculate the exact end time of vblank. > > > >> > > > >> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> > > > >> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> > > > >> Cc: Harry Wentland <harry.wentland@amd.com> > > > >> Cc: Alex Deucher <alexander.deucher@amd.com> > > > > > > > > Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only give us > > > > the right timestampe, once? With this I guess if you do a vblank query in > > > > between the wrong and the right vblank you'll get the bogus value. Not > > > > really great for userspace. > > > > -Daniel > > > > > > I think we calculate the timestamp and send the vblank event both within > > > the pageflip IRQ handler so calculating the right pageflip timestamp > > > once could probably be done. I'm not sure if it's easier than proposing > > > a later flip time with an API like this though. > > > > > > The actual scanout time should be known from the page-flip handler so > > > the semantics for VRR on/off remain the same. This is because the > > > page-flip triggers entering the back porch if we're in the extended > > > front porch. > > > > > > But scanout time from vblank events for something like > > > DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only > > > treated as estimates. If we're in the regular front porch then the > > > timing to scanout is based on the fixed duration front porch for the > > > current mode. If we're in the extended back porch then it's technically > > > driver defined but the most reasonable guess is to assume that the front > > > porch is going to end at any moment, so just return the length of the > > > back porch for getting the scanout time. > > > > > > Proposing the late timestamp shouldn't affect vblank event in the > > > DRM_IOCTL_WAIT_VBLANK case and should only be used in the page-flip > > > event case. I'm not sure if that's what's guaranteed to happen with this > > > patch though. There doesn't seem to be any locking on either > > > dev->vblank_time_lock or the vblank->seqlock so while it's likely to get > > > the same vblank event back as the one just stored I don't think it's > > > guaranteed. > > > > That's the inconsistency I mean to highlight - the timestamp for the > > same frame as observed through flip complete and through the > > wait_vblank ioctl can differ. Which they really shouldn't. > > > > Ideally they shouldn't differ. The kernel docs for drm_crtc_state say > that vblank and pageflip timestamps should always match. But then the > kernel docs for "Variable refresh properties" in drm_connector.c for > vblank timestamps were changed for the VRR implementation in Linux > 5.0-rc to redefine them when in VRR mode. They are defined, but > probably rather useless for any practical purpose, like this: > > "The semantics for the vertical blank timestamp differ when > variable refresh rate is active. The vertical blank timestamp > is defined to be an estimate using the current mode's fixed > refresh rate timings. The semantics for the page-flip event > timestamp remain the same." Uh I missed that. That sounds like nonsense tbh. > So our docs contradict each other as of Linux 5.0-rc. Certainly having > useful vblank timetamps would be useful. Yup, imo vblank should still match page_flip. Otherwise I expect a lot of hilarity will ensue. > > Now added complication is that amdgpu sends out vblank events really > > early, which is used by userspace to do frontbuffer rendering in the > > vblank time. But I don't think anyone wants to do both VRR and > > I think all kms drivers try to call drm_crtc_handle_vblank() at start > of vblank to give Mesa the most time for frontbuffer rendering for > classic X. But vblank events are also used for scheduling bufferswaps > or other stuff for redirected windowed rendering, or via api's like > OML_sync_controls glXWaitForMscOML, so there might be other things > affected by a more delayed vblank handling. The frontbuffer rendering is very much X driver specific, and I think -amdgpu/radeon is the only one that requires this. No i915 driver ever used the vblank interrupt to schedule frontbuffer blits, we use some CS-side stalls. Wrt scheduling pageflips: The rule is that right after you've received the vblank for frame X, then an immediately schedule pageflip should hit X+1, but not X. amdgpu had this broken, but it's fixed since a while. > > frontbuffer rendering, hence I think it should be possible to create > > correct vblank timestamps for the VRR case, while leaving the current > > logic in place for everything else. But that means moving the entire > > vblank processing to where you know the next frame's start time for > > VRR, both for page flips and for all other vblank events. > > -Daniel > > > > I think for amdgpu it would be doable by calling drm_handle_vblank > from the pageflip irq handler in VRR mode, and then additionally from > the vblank interrupt handler like now. Vblank irq's are triggered via > vline interrupts with programmable vertical trigger line position, so > we could program the trigger line to be start of back-porch instead of > start of front-porch. In the pageflip case we do vblank handling + > timestamping + pflip completion from pflip irq, and have the regular > back-porch vblank handling for refresh cycles in which no > pageflip/pflip irq happened. Hm, can't we simply program the vblank to occur in the back porch region for VRR? The vblank timestamping should automatically correct the timestamp in that case. Would require a lot less code changes. Plus update the documentation ofc. > Not sure if/how that would mess with below-vrr-refresh-rate-range > support for stutter reduction at low fps though? Or with performance, > given vblank event dispatch could be delayed a lot in VRR compared to > normal mode? Or with all the vblank accounting during modesets or crtc > en/disable? Currently VRR and explicit frame scheduling don't mix. We've discussed this, and consens seems to be that we need a new property for page_flip, similar to the target frame, but instead of frames a timestamp. Then the driver can hit a specific time for the next frame. And since compositors use the vblank ioctl just to do frame scheduling, I don't think we have to worry hugely about interactions there. Much better to aim for clean semantics (i.e. vblank and flip complete timestamps better match). > Also, could other drivers like Intel easily do this delayed vblank > processing in the non-pageflip case? Intel's problem :-) But yeah we have page flip timestamp registers, so worst case all we need to do is shuffle code around a bit. And we have interrupts for both start of vblank and and of vblank. It might be a bit more work because we use the vblank interrupt to avoid races in atomic updates, but that's not too terrible to fix. > From my user perspective as a developer of a neuroscience research > application that has a couple of exciting/important use cases for VRR > mode, i absolutely do need trustworthy and precise pageflip event > timestamps that represent reality, otherwise VRR mode would be less > than useless for me. I can do without meaningful vblank timestamps in > VRR mode though, and expected to do without them, as any scheduling > has to happen time-based instead of based on vblank counts anyway, > given that vblank counts are quite meaningless if every frame can have > a different duration. I'd expect that situation to be similar for > other apps that would want to use timestamps in VRR mode like video > games, movie players or VR/AR applications. Yeah for that you want to schedule your frames with a target frame number. Would still be nice if vblank timestamps wouldn't be an outright lie (and the timestamp is still useful information, even if the frame counter isn't - many displays have min/max frame rates for VRR, which is information we could perhaps expose). -Daniel > > -mario > > > > > > > Nicholas Kazlauskas > > > > > > > > > > >> --- > > > >> drivers/gpu/drm/drm_vblank.c | 49 +++++++++++++++++++++++++++++++++++- > > > >> include/drm/drm_vblank.h | 8 ++++++ > > > >> 2 files changed, 56 insertions(+), 1 deletion(-) > > > >> > > > >> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c > > > >> index 98e091175921..4b3a4c38fabe 100644 > > > >> --- a/drivers/gpu/drm/drm_vblank.c > > > >> +++ b/drivers/gpu/drm/drm_vblank.c > > > >> @@ -814,10 +814,21 @@ static void send_vblank_event(struct drm_device *dev, > > > >> u64 seq, ktime_t now) > > > >> { > > > >> struct timespec64 tv; > > > >> + ktime_t alt_flip_time; > > > >> > > > >> switch (e->event.base.type) { > > > >> - case DRM_EVENT_VBLANK: > > > >> case DRM_EVENT_FLIP_COMPLETE: > > > >> + /* > > > >> + * For flip completion events, override "now" time > > > >> + * with alt_flip_time provided by the driver via > > > >> + * drm_crtc_set_vrr_pageflip_timestamp() in VRR mode > > > >> + * if that time is later than given "now" vblank time. > > > >> + */ > > > >> + alt_flip_time = dev->vblank[e->pipe].alt_flip_time; > > > >> + if (alt_flip_time > now) > > > >> + now = alt_flip_time; > > > >> + /* Fallthrough */ > > > >> + case DRM_EVENT_VBLANK: > > > >> tv = ktime_to_timespec64(now); > > > >> e->event.vbl.sequence = seq; > > > >> /* > > > >> @@ -916,11 +927,47 @@ void drm_crtc_send_vblank_event(struct drm_crtc *crtc, > > > >> > > > >> now = ktime_get(); > > > >> } > > > >> + > > > >> e->pipe = pipe; > > > >> send_vblank_event(dev, e, seq, now); > > > >> } > > > >> EXPORT_SYMBOL(drm_crtc_send_vblank_event); > > > >> > > > >> +/** > > > >> + * drm_crtc_set_vrr_pageflip_timestamp - helper to set alternate pageflip time > > > >> + * @crtc: the source CRTC of the pageflip completion event > > > >> + * @flip_time: The alternate pageflip completion timestamp in VRR mode > > > >> + * > > > >> + * In variable refresh rate mode (VRR), a pageflip completion timestamp carried > > > >> + * by the pageflip event can never be earlier than the vblank timestamp of the > > > >> + * vblank of flip completion, as that vblank timestamp defines the end of the > > > >> + * shortest possible vblank duration. In case of a delayed flip completion > > > >> + * inside the extended VRR front porch however, the end of vblank can be much > > > >> + * later, so the driver must assign an estimated timestamp of that later end of > > > >> + * vblank. For a CRTC in VRR mode, the driver should use this helper function to > > > >> + * set an alternate flip completion timestamp in case of late flip completions > > > >> + * in extended vblank. In the most simple case, this @flip_time timestamp could > > > >> + * simply be a ktime_get() timestamp taken at the start of the pageflip > > > >> + * completion routine, with some constant duration of the back porch interval > > > >> + * added, although more precise estimates may be possible on some hardware if > > > >> + * the hardware provides some means of timestamping the true end of vblank. > > > >> + * > > > >> + * When sending out pageflip events, e.g., via drm_crtc_send_vblank_event(), it > > > >> + * will use either the standard vblank timestamp, calculated for a minimum > > > >> + * duration vblank, or the provided @flip_time if that time is later than the > > > >> + * vblank timestamp, to get the best possible estimate of start of display of > > > >> + * the new post-pageflip scanout buffer. > > > >> + */ > > > >> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, > > > >> + ktime_t flip_time) > > > >> +{ > > > >> + struct drm_device *dev = crtc->dev; > > > >> + struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)]; > > > >> + > > > >> + vblank->alt_flip_time = flip_time; > > > >> +} > > > >> +EXPORT_SYMBOL(drm_crtc_set_vrr_pageflip_timestamp); > > > >> + > > > >> static int __enable_vblank(struct drm_device *dev, unsigned int pipe) > > > >> { > > > >> if (drm_core_check_feature(dev, DRIVER_MODESET)) { > > > >> diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h > > > >> index 6ad9630d4f48..aacf44694ab6 100644 > > > >> --- a/include/drm/drm_vblank.h > > > >> +++ b/include/drm/drm_vblank.h > > > >> @@ -117,6 +117,12 @@ struct drm_vblank_crtc { > > > >> * @time: Vblank timestamp corresponding to @count. > > > >> */ > > > >> ktime_t time; > > > >> + /** > > > >> + * @alt_flip_time: Vblank timestamp for end of extended vblank due to > > > >> + * a late pageflip completion in variable refresh rate mode. Pageflip > > > >> + * events will carry the later one of @time and @alt_flip_time. > > > >> + */ > > > >> + ktime_t alt_flip_time; > > > >> > > > >> /** > > > >> * @refcount: Number of users/waiters of the vblank interrupt. Only when > > > >> @@ -179,6 +185,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs); > > > >> u64 drm_crtc_vblank_count(struct drm_crtc *crtc); > > > >> u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc, > > > >> ktime_t *vblanktime); > > > >> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, > > > >> + ktime_t flip_time); > > > >> void drm_crtc_send_vblank_event(struct drm_crtc *crtc, > > > >> struct drm_pending_vblank_event *e); > > > >> void drm_crtc_arm_vblank_event(struct drm_crtc *crtc, > > > >> -- > > > >> 2.17.1 > > > >> > > > >> _______________________________________________ > > > >> dri-devel mailing list > > > >> dri-devel@lists.freedesktop.org > > > >> https://lists.freedesktop.org/mailman/listinfo/dri-devel > > > > > > > > > > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
Quoting Daniel Vetter (2019-02-13 09:50:55) > On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote: > > I think all kms drivers try to call drm_crtc_handle_vblank() at start > > of vblank to give Mesa the most time for frontbuffer rendering for > > classic X. But vblank events are also used for scheduling bufferswaps > > or other stuff for redirected windowed rendering, or via api's like > > OML_sync_controls glXWaitForMscOML, so there might be other things > > affected by a more delayed vblank handling. > > The frontbuffer rendering is very much X driver specific, and I think > -amdgpu/radeon is the only one that requires this. No i915 driver ever > used the vblank interrupt to schedule frontbuffer blits, we use some > CS-side stalls. Fwiw, the Present midlayer does use vblank scheduling for inplace copy updates. Not that I wish to encourage anyone to use frontbuffer rendering. -Chris
On Wed, Feb 13, 2019 at 10:56 AM Chris Wilson <chris@chris-wilson.co.uk> wrote: > > Quoting Daniel Vetter (2019-02-13 09:50:55) > > On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote: > > > I think all kms drivers try to call drm_crtc_handle_vblank() at start > > > of vblank to give Mesa the most time for frontbuffer rendering for > > > classic X. But vblank events are also used for scheduling bufferswaps > > > or other stuff for redirected windowed rendering, or via api's like > > > OML_sync_controls glXWaitForMscOML, so there might be other things > > > affected by a more delayed vblank handling. > > > > The frontbuffer rendering is very much X driver specific, and I think > > -amdgpu/radeon is the only one that requires this. No i915 driver ever > > used the vblank interrupt to schedule frontbuffer blits, we use some > > CS-side stalls. > > Fwiw, the Present midlayer does use vblank scheduling for inplace copy > updates. Not that I wish to encourage anyone to use frontbuffer > rendering. > -Chris Yes, that's what i meant. Under DRI2 at least AMD, Intel and nouveau have throttling based on CS stalls to avoid tearing and do throttling. DRI3/Present last time i checked just waited for a vblank event and then triggered the blit - something that causes tearing even when triggered at start of vblank. -mario
On Wed, Feb 13, 2019 at 10:50 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote: > > On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas > > > <Nicholas.Kazlauskas@amd.com> wrote: > > > > > > > > On 2/11/19 3:35 AM, Daniel Vetter wrote: > > > > > On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote: > > > > >> The pageflip completion timestamps transmitted to userspace ... > > > > > > > > > > Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only give us > > > > > the right timestampe, once? With this I guess if you do a vblank query in > > > > > between the wrong and the right vblank you'll get the bogus value. Not > > > > > really great for userspace. > > > > > -Daniel > > > > > > > > I think we calculate the timestamp and send the vblank event both within > > > > the pageflip IRQ handler so calculating the right pageflip timestamp > > > > once could probably be done. I'm not sure if it's easier than proposing > > > > a later flip time with an API like this though. > > > > > > > > The actual scanout time should be known from the page-flip handler so > > > > the semantics for VRR on/off remain the same. This is because the > > > > page-flip triggers entering the back porch if we're in the extended > > > > front porch. > > > > > > > > But scanout time from vblank events for something like > > > > DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only > > > > treated as estimates. If we're in the regular front porch then the > > > > timing to scanout is based on the fixed duration front porch for the > > > > current mode. If we're in the extended back porch then it's technically > > > > driver defined but the most reasonable guess is to assume that the front > > > > porch is going to end at any moment, so just return the length of the > > > > back porch for getting the scanout time. > > > > > > > > Proposing the late timestamp shouldn't affect vblank event in the > > > > DRM_IOCTL_WAIT_VBLANK case and should only be used in the page-flip > > > > event case. I'm not sure if that's what's guaranteed to happen with this > > > > patch though. There doesn't seem to be any locking on either > > > > dev->vblank_time_lock or the vblank->seqlock so while it's likely to get > > > > the same vblank event back as the one just stored I don't think it's > > > > guaranteed. > > > > > > That's the inconsistency I mean to highlight - the timestamp for the > > > same frame as observed through flip complete and through the > > > wait_vblank ioctl can differ. Which they really shouldn't. > > > > > > > Ideally they shouldn't differ. The kernel docs for drm_crtc_state say > > that vblank and pageflip timestamps should always match. But then the > > kernel docs for "Variable refresh properties" in drm_connector.c for > > vblank timestamps were changed for the VRR implementation in Linux > > 5.0-rc to redefine them when in VRR mode. They are defined, but > > probably rather useless for any practical purpose, like this: > > > > "The semantics for the vertical blank timestamp differ when > > variable refresh rate is active. The vertical blank timestamp > > is defined to be an estimate using the current mode's fixed > > refresh rate timings. The semantics for the page-flip event > > timestamp remain the same." > > Uh I missed that. That sounds like nonsense tbh. > > > So our docs contradict each other as of Linux 5.0-rc. Certainly having > > useful vblank timetamps would be useful. > > Yup, imo vblank should still match page_flip. Otherwise I expect a lot of > hilarity will ensue. > > > > Now added complication is that amdgpu sends out vblank events really > > > early, which is used by userspace to do frontbuffer rendering in the > > > vblank time. But I don't think anyone wants to do both VRR and > > > > I think all kms drivers try to call drm_crtc_handle_vblank() at start > > of vblank to give Mesa the most time for frontbuffer rendering for > > classic X. But vblank events are also used for scheduling bufferswaps > > or other stuff for redirected windowed rendering, or via api's like > > OML_sync_controls glXWaitForMscOML, so there might be other things > > affected by a more delayed vblank handling. > > The frontbuffer rendering is very much X driver specific, and I think > -amdgpu/radeon is the only one that requires this. No i915 driver ever > used the vblank interrupt to schedule frontbuffer blits, we use some > CS-side stalls. > > Wrt scheduling pageflips: The rule is that right after you've received the > vblank for frame X, then an immediately schedule pageflip should hit X+1, > but not X. amdgpu had this broken, but it's fixed since a while. > > > > frontbuffer rendering, hence I think it should be possible to create > > > correct vblank timestamps for the VRR case, while leaving the current > > > logic in place for everything else. But that means moving the entire > > > vblank processing to where you know the next frame's start time for > > > VRR, both for page flips and for all other vblank events. > > > -Daniel > > > > > > > I think for amdgpu it would be doable by calling drm_handle_vblank > > from the pageflip irq handler in VRR mode, and then additionally from > > the vblank interrupt handler like now. Vblank irq's are triggered via > > vline interrupts with programmable vertical trigger line position, so > > we could program the trigger line to be start of back-porch instead of > > start of front-porch. In the pageflip case we do vblank handling + > > timestamping + pflip completion from pflip irq, and have the regular > > back-porch vblank handling for refresh cycles in which no > > pageflip/pflip irq happened. > > Hm, can't we simply program the vblank to occur in the back porch region > for VRR? The vblank timestamping should automatically correct the > timestamp in that case. Would require a lot less code changes. > > Plus update the documentation ofc. > Thanks to an unexpected bus driver strike i got stuck in the lab instead of my bed for the last 6 extra hours, so i had some time to think about this and hack up a proof-of-concept. Yes, moving vblank irq ~ drm_handle_vblank() to the start of back-porch seems to do the trick in VRR mode pretty well, at least as far as 30 minutes of testing with my own application Psychtoolbox show. This may work - More sleep and more testing required :) As a side-effect of the shifted point of drm_update_vblank_counter() update in back-porch or flip-irq though, the flip throttling in VRR needed some work, which makes it even more hacky/messy - Michel will not be pleased ;). Might need a timestamp-based throttling in VRR mode to make it less messy. -mario > > Not sure if/how that would mess with below-vrr-refresh-rate-range > > support for stutter reduction at low fps though? Or with performance, > > given vblank event dispatch could be delayed a lot in VRR compared to > > normal mode? Or with all the vblank accounting during modesets or crtc > > en/disable? > > Currently VRR and explicit frame scheduling don't mix. We've discussed > this, and consens seems to be that we need a new property for page_flip, > similar to the target frame, but instead of frames a timestamp. Then the > driver can hit a specific time for the next frame. > > And since compositors use the vblank ioctl just to do frame scheduling, I > don't think we have to worry hugely about interactions there. Much better > to aim for clean semantics (i.e. vblank and flip complete timestamps > better match). > > > Also, could other drivers like Intel easily do this delayed vblank > > processing in the non-pageflip case? > > Intel's problem :-) But yeah we have page flip timestamp registers, so > worst case all we need to do is shuffle code around a bit. And we have > interrupts for both start of vblank and and of vblank. It might be a bit > more work because we use the vblank interrupt to avoid races in atomic > updates, but that's not too terrible to fix. > > > From my user perspective as a developer of a neuroscience research > > application that has a couple of exciting/important use cases for VRR > > mode, i absolutely do need trustworthy and precise pageflip event > > timestamps that represent reality, otherwise VRR mode would be less > > than useless for me. I can do without meaningful vblank timestamps in > > VRR mode though, and expected to do without them, as any scheduling > > has to happen time-based instead of based on vblank counts anyway, > > given that vblank counts are quite meaningless if every frame can have > > a different duration. I'd expect that situation to be similar for > > other apps that would want to use timestamps in VRR mode like video > > games, movie players or VR/AR applications. > > Yeah for that you want to schedule your frames with a target frame number. > Would still be nice if vblank timestamps wouldn't be an outright lie (and > the timestamp is still useful information, even if the frame counter > isn't - many displays have min/max frame rates for VRR, which is > information we could perhaps expose). > -Daniel > > > > > -mario > > > > > > > > > > Nicholas Kazlauskas > > > > > > > > > > > > > >> --- > > > > >> drivers/gpu/drm/drm_vblank.c | 49 +++++++++++++++++++++++++++++++++++- > > > > >> include/drm/drm_vblank.h | 8 ++++++ > > > > >> 2 files changed, 56 insertions(+), 1 deletion(-) > > > > >> > > > > >> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c > > > > >> index 98e091175921..4b3a4c38fabe 100644 > > > > >> --- a/drivers/gpu/drm/drm_vblank.c > > > > >> +++ b/drivers/gpu/drm/drm_vblank.c > > > > >> @@ -814,10 +814,21 @@ static void send_vblank_event(struct drm_device *dev, > > > > >> u64 seq, ktime_t now) > > > > >> { > > > > >> struct timespec64 tv; > > > > >> + ktime_t alt_flip_time; > > > > >> > > > > >> switch (e->event.base.type) { > > > > >> - case DRM_EVENT_VBLANK: > > > > >> case DRM_EVENT_FLIP_COMPLETE: > > > > >> + /* > > > > >> + * For flip completion events, override "now" time > > > > >> + * with alt_flip_time provided by the driver via > > > > >> + * drm_crtc_set_vrr_pageflip_timestamp() in VRR mode > > > > >> + * if that time is later than given "now" vblank time. > > > > >> + */ > > > > >> + alt_flip_time = dev->vblank[e->pipe].alt_flip_time; > > > > >> + if (alt_flip_time > now) > > > > >> + now = alt_flip_time; > > > > >> + /* Fallthrough */ > > > > >> + case DRM_EVENT_VBLANK: > > > > >> tv = ktime_to_timespec64(now); > > > > >> e->event.vbl.sequence = seq; > > > > >> /* > > > > >> @@ -916,11 +927,47 @@ void drm_crtc_send_vblank_event(struct drm_crtc *crtc, > > > > >> > > > > >> now = ktime_get(); > > > > >> } > > > > >> + > > > > >> e->pipe = pipe; > > > > >> send_vblank_event(dev, e, seq, now); > > > > >> } > > > > >> EXPORT_SYMBOL(drm_crtc_send_vblank_event); > > > > >> > > > > >> +/** > > > > >> + * drm_crtc_set_vrr_pageflip_timestamp - helper to set alternate pageflip time > > > > >> + * @crtc: the source CRTC of the pageflip completion event > > > > >> + * @flip_time: The alternate pageflip completion timestamp in VRR mode > > > > >> + * > > > > >> + * In variable refresh rate mode (VRR), a pageflip completion timestamp carried > > > > >> + * by the pageflip event can never be earlier than the vblank timestamp of the > > > > >> + * vblank of flip completion, as that vblank timestamp defines the end of the > > > > >> + * shortest possible vblank duration. In case of a delayed flip completion > > > > >> + * inside the extended VRR front porch however, the end of vblank can be much > > > > >> + * later, so the driver must assign an estimated timestamp of that later end of > > > > >> + * vblank. For a CRTC in VRR mode, the driver should use this helper function to > > > > >> + * set an alternate flip completion timestamp in case of late flip completions > > > > >> + * in extended vblank. In the most simple case, this @flip_time timestamp could > > > > >> + * simply be a ktime_get() timestamp taken at the start of the pageflip > > > > >> + * completion routine, with some constant duration of the back porch interval > > > > >> + * added, although more precise estimates may be possible on some hardware if > > > > >> + * the hardware provides some means of timestamping the true end of vblank. > > > > >> + * > > > > >> + * When sending out pageflip events, e.g., via drm_crtc_send_vblank_event(), it > > > > >> + * will use either the standard vblank timestamp, calculated for a minimum > > > > >> + * duration vblank, or the provided @flip_time if that time is later than the > > > > >> + * vblank timestamp, to get the best possible estimate of start of display of > > > > >> + * the new post-pageflip scanout buffer. > > > > >> + */ > > > > >> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, > > > > >> + ktime_t flip_time) > > > > >> +{ > > > > >> + struct drm_device *dev = crtc->dev; > > > > >> + struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)]; > > > > >> + > > > > >> + vblank->alt_flip_time = flip_time; > > > > >> +} > > > > >> +EXPORT_SYMBOL(drm_crtc_set_vrr_pageflip_timestamp); > > > > >> + > > > > >> static int __enable_vblank(struct drm_device *dev, unsigned int pipe) > > > > >> { > > > > >> if (drm_core_check_feature(dev, DRIVER_MODESET)) { > > > > >> diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h > > > > >> index 6ad9630d4f48..aacf44694ab6 100644 > > > > >> --- a/include/drm/drm_vblank.h > > > > >> +++ b/include/drm/drm_vblank.h > > > > >> @@ -117,6 +117,12 @@ struct drm_vblank_crtc { > > > > >> * @time: Vblank timestamp corresponding to @count. > > > > >> */ > > > > >> ktime_t time; > > > > >> + /** > > > > >> + * @alt_flip_time: Vblank timestamp for end of extended vblank due to > > > > >> + * a late pageflip completion in variable refresh rate mode. Pageflip > > > > >> + * events will carry the later one of @time and @alt_flip_time. > > > > >> + */ > > > > >> + ktime_t alt_flip_time; > > > > >> > > > > >> /** > > > > >> * @refcount: Number of users/waiters of the vblank interrupt. Only when > > > > >> @@ -179,6 +185,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs); > > > > >> u64 drm_crtc_vblank_count(struct drm_crtc *crtc); > > > > >> u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc, > > > > >> ktime_t *vblanktime); > > > > >> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, > > > > >> + ktime_t flip_time); > > > > >> void drm_crtc_send_vblank_event(struct drm_crtc *crtc, > > > > >> struct drm_pending_vblank_event *e); > > > > >> void drm_crtc_arm_vblank_event(struct drm_crtc *crtc, > > > > >> -- > > > > >> 2.17.1 > > > > >> > > > > >> _______________________________________________ > > > > >> dri-devel mailing list > > > > >> dri-devel@lists.freedesktop.org > > > > >> https://lists.freedesktop.org/mailman/listinfo/dri-devel > > > > > > > > > > > > > > > > > > -- > > > Daniel Vetter > > > Software Engineer, Intel Corporation > > > +41 (0) 79 365 57 48 - http://blog.ffwll.ch > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
On Wed, Feb 13, 2019 at 12:05 PM Mario Kleiner <mario.kleiner.de@gmail.com> wrote: > > On Wed, Feb 13, 2019 at 10:56 AM Chris Wilson <chris@chris-wilson.co.uk> wrote: > > > > Quoting Daniel Vetter (2019-02-13 09:50:55) > > > On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote: > > > > I think all kms drivers try to call drm_crtc_handle_vblank() at start > > > > of vblank to give Mesa the most time for frontbuffer rendering for > > > > classic X. But vblank events are also used for scheduling bufferswaps > > > > or other stuff for redirected windowed rendering, or via api's like > > > > OML_sync_controls glXWaitForMscOML, so there might be other things > > > > affected by a more delayed vblank handling. > > > > > > The frontbuffer rendering is very much X driver specific, and I think > > > -amdgpu/radeon is the only one that requires this. No i915 driver ever > > > used the vblank interrupt to schedule frontbuffer blits, we use some > > > CS-side stalls. > > > > Fwiw, the Present midlayer does use vblank scheduling for inplace copy > > updates. Not that I wish to encourage anyone to use frontbuffer > > rendering. > > -Chris > > Yes, that's what i meant. Under DRI2 at least AMD, Intel and nouveau > have throttling based on CS stalls to avoid tearing and do throttling. > DRI3/Present last time i checked just waited for a vblank event and > then triggered the blit - something that causes tearing even when > triggered at start of vblank. Hm, might be good to document all that stuff somewhere. But in the end I'd say the answer is "don't do frontbuffer rendering". I'm not sure we managed to document all the rules for existing vblank vs. flip interactions anywhere. At least I didn't find anything with a look at our docs. -Daniel
On 2/13/19 4:50 AM, Daniel Vetter wrote: > On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote: >> On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter <daniel@ffwll.ch> wrote: >>> >>> On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas >>> <Nicholas.Kazlauskas@amd.com> wrote: >>>> >>>> On 2/11/19 3:35 AM, Daniel Vetter wrote: >>>>> On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote: >>>>>> The pageflip completion timestamps transmitted to userspace >>>>>> via pageflip completion events are supposed to describe the >>>>>> time at which the first pixel of the new post-pageflip scanout >>>>>> buffer leaves the video output of the gpu. This time is >>>>>> identical to end of vblank, when active scanout starts. >>>>>> >>>>>> For a crtc in standard fixed refresh rate, the end of vblank >>>>>> is identical to the vblank timestamps calculated by >>>>>> drm_update_vblank_count() at each vblank interrupt, or each >>>>>> vblank dis-/enable. Therefore pageflip events just carry >>>>>> that vblank timestamp as their pageflip timestamp. >>>>>> >>>>>> For a crtc switched to variable refresh rate mode (vrr), the >>>>>> pageflip completion timestamps are identical to the vblank >>>>>> timestamps iff the pageflip was executed early in vblank, >>>>>> before the minimum vblank duration elapsed. In this case >>>>>> the time of display onset is identical to when the crtc >>>>>> is running in fixed refresh rate. >>>>>> >>>>>> However, if a pageflip completes later in the vblank, inside >>>>>> the "extended front porch" in vrr mode, then the vblank will >>>>>> terminate at a fixed (back porch) duration after flip, so >>>>>> the display onset time is delayed correspondingly. In this >>>>>> case the vblank timestamp computed at vblank irq time would >>>>>> be too early, and we need a way to calculate an estimated >>>>>> pageflip timestamp that will be later than the vblank timestamp. >>>>>> >>>>>> How a driver determines such a "late flip" timestamp is hw >>>>>> and driver specific, but this patch adds a new helper function >>>>>> that allows the driver to propose such an alternate "late flip" >>>>>> timestamp for use in pageflip events: >>>>>> >>>>>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); >>>>>> >>>>>> When sending out pageflip events, we now compare that proposed >>>>>> flip_timestamp against the vblank timestamp of the current >>>>>> vblank of flip completion and choose to send out the greater/ >>>>>> later timestamp as flip completion timestamp. >>>>>> >>>>>> The most simple way for a kms driver to supply a suitable >>>>>> flip_timestamp in vrr mode would be to simply take a timestamp >>>>>> at start of the pageflip completion handler, e.g., pageflip >>>>>> irq handler: flip_timestamp = ktime_get(); and then set that >>>>>> as proposed "late" alternative timestamp via ... >>>>>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); >>>>>> >>>>>> More clever approaches could try to add some corrective offset >>>>>> for fixed back porch duration, or ideally use hardware features >>>>>> like hw timestamps to calculate the exact end time of vblank. >>>>>> >>>>>> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> >>>>>> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> >>>>>> Cc: Harry Wentland <harry.wentland@amd.com> >>>>>> Cc: Alex Deucher <alexander.deucher@amd.com> >>>>> >>>>> Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only give us >>>>> the right timestampe, once? With this I guess if you do a vblank query in >>>>> between the wrong and the right vblank you'll get the bogus value. Not >>>>> really great for userspace. >>>>> -Daniel >>>> >>>> I think we calculate the timestamp and send the vblank event both within >>>> the pageflip IRQ handler so calculating the right pageflip timestamp >>>> once could probably be done. I'm not sure if it's easier than proposing >>>> a later flip time with an API like this though. >>>> >>>> The actual scanout time should be known from the page-flip handler so >>>> the semantics for VRR on/off remain the same. This is because the >>>> page-flip triggers entering the back porch if we're in the extended >>>> front porch. >>>> >>>> But scanout time from vblank events for something like >>>> DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only >>>> treated as estimates. If we're in the regular front porch then the >>>> timing to scanout is based on the fixed duration front porch for the >>>> current mode. If we're in the extended back porch then it's technically >>>> driver defined but the most reasonable guess is to assume that the front >>>> porch is going to end at any moment, so just return the length of the >>>> back porch for getting the scanout time. >>>> >>>> Proposing the late timestamp shouldn't affect vblank event in the >>>> DRM_IOCTL_WAIT_VBLANK case and should only be used in the page-flip >>>> event case. I'm not sure if that's what's guaranteed to happen with this >>>> patch though. There doesn't seem to be any locking on either >>>> dev->vblank_time_lock or the vblank->seqlock so while it's likely to get >>>> the same vblank event back as the one just stored I don't think it's >>>> guaranteed. >>> >>> That's the inconsistency I mean to highlight - the timestamp for the >>> same frame as observed through flip complete and through the >>> wait_vblank ioctl can differ. Which they really shouldn't. >>> >> >> Ideally they shouldn't differ. The kernel docs for drm_crtc_state say >> that vblank and pageflip timestamps should always match. But then the >> kernel docs for "Variable refresh properties" in drm_connector.c for >> vblank timestamps were changed for the VRR implementation in Linux >> 5.0-rc to redefine them when in VRR mode. They are defined, but >> probably rather useless for any practical purpose, like this: >> >> "The semantics for the vertical blank timestamp differ when >> variable refresh rate is active. The vertical blank timestamp >> is defined to be an estimate using the current mode's fixed >> refresh rate timings. The semantics for the page-flip event >> timestamp remain the same." > > Uh I missed that. That sounds like nonsense tbh. > >> So our docs contradict each other as of Linux 5.0-rc. Certainly having >> useful vblank timetamps would be useful. > > Yup, imo vblank should still match page_flip. Otherwise I expect a lot of > hilarity will ensue. I would imagine you would see more breakage by changing userspace expectations for when the event is actually received. The IOCTL is "wait for vblank", but if we start sending the event near the end of vblank it becomes more "wait for pageflip" (near scanout) or "wait 15ms while the VRR front porch times out and causes flickering". I don't speak for all userspace but it seems more useful to me to have the event sent at the start of vblank. VRR in its current state is useful for applications that aren't completely timing sensitive (like most games). This is because the driver has free reign in restricting the min/max bounds of the range - allowing for enhancements like low framerate compensation / BTR. For a concrete example, imagine you're not doing frontbuffer rendering but you're still trying to target a specific scanout time. So even if we were to send the vblank event at vpos 0 you'd get an accurate scanout timestamp for the next scanout after the vback porch ends. You're not going to be doing any rendering or preparation in this time because it's simply too short. So you'd target the scanout period after this one, which you'd need to calculate somehow in userspace. Taking the deltas between two vblank events is completely useless here because you'll get back the vrr min refresh delta. So you can take your source content rate or some fixed rate you'd like to target within the range but even then you're still not guaranteed to have that flip end up at that scanout time. The current solution is to just return the scanout pos at vrr max, or the current mode's fixed refresh rate. If you're flip before the fixed duration front porch ends then that scanout time will likely be accurate since you can't flip faster than vrr max. Is there actually userspace that cares that these timestamps need to match? Like said below, it seems that wait for vblank is mostly about scheduling and for that I think having the event sent back at the start of vblank is more important here. > >>> Now added complication is that amdgpu sends out vblank events really >>> early, which is used by userspace to do frontbuffer rendering in the >>> vblank time. But I don't think anyone wants to do both VRR and >> >> I think all kms drivers try to call drm_crtc_handle_vblank() at start >> of vblank to give Mesa the most time for frontbuffer rendering for >> classic X. But vblank events are also used for scheduling bufferswaps >> or other stuff for redirected windowed rendering, or via api's like >> OML_sync_controls glXWaitForMscOML, so there might be other things >> affected by a more delayed vblank handling. > > The frontbuffer rendering is very much X driver specific, and I think > -amdgpu/radeon is the only one that requires this. No i915 driver ever > used the vblank interrupt to schedule frontbuffer blits, we use some > CS-side stalls. > > Wrt scheduling pageflips: The rule is that right after you've received the > vblank for frame X, then an immediately schedule pageflip should hit X+1, > but not X. amdgpu had this broken, but it's fixed since a while. > >>> frontbuffer rendering, hence I think it should be possible to create >>> correct vblank timestamps for the VRR case, while leaving the current >>> logic in place for everything else. But that means moving the entire >>> vblank processing to where you know the next frame's start time for >>> VRR, both for page flips and for all other vblank events. >>> -Daniel >>> >> >> I think for amdgpu it would be doable by calling drm_handle_vblank >> from the pageflip irq handler in VRR mode, and then additionally from >> the vblank interrupt handler like now. Vblank irq's are triggered via >> vline interrupts with programmable vertical trigger line position, so >> we could program the trigger line to be start of back-porch instead of >> start of front-porch. In the pageflip case we do vblank handling + >> timestamping + pflip completion from pflip irq, and have the regular >> back-porch vblank handling for refresh cycles in which no >> pageflip/pflip irq happened. > > Hm, can't we simply program the vblank to occur in the back porch region > for VRR? The vblank timestamping should automatically correct the > timestamp in that case. Would require a lot less code changes. > > Plus update the documentation ofc. > >> Not sure if/how that would mess with below-vrr-refresh-rate-range >> support for stutter reduction at low fps though? Or with performance, >> given vblank event dispatch could be delayed a lot in VRR compared to >> normal mode? Or with all the vblank accounting during modesets or crtc >> en/disable? I think we have extra VLINEs that can be programmed at vpos 0 that could resolve this but I'm still not really sold on the idea. > > Currently VRR and explicit frame scheduling don't mix. We've discussed > this, and consens seems to be that we need a new property for page_flip, > similar to the target frame, but instead of frames a timestamp. Then the > driver can hit a specific time for the next frame. > > And since compositors use the vblank ioctl just to do frame scheduling, I > don't think we have to worry hugely about interactions there. Much better > to aim for clean semantics (i.e. vblank and flip complete timestamps > better match). > >> Also, could other drivers like Intel easily do this delayed vblank >> processing in the non-pageflip case? > > Intel's problem :-) But yeah we have page flip timestamp registers, so > worst case all we need to do is shuffle code around a bit. And we have > interrupts for both start of vblank and and of vblank. It might be a bit > more work because we use the vblank interrupt to avoid races in atomic > updates, but that's not too terrible to fix. > >> From my user perspective as a developer of a neuroscience research >> application that has a couple of exciting/important use cases for VRR >> mode, i absolutely do need trustworthy and precise pageflip event >> timestamps that represent reality, otherwise VRR mode would be less >> than useless for me. I can do without meaningful vblank timestamps in >> VRR mode though, and expected to do without them, as any scheduling >> has to happen time-based instead of based on vblank counts anyway, >> given that vblank counts are quite meaningless if every frame can have >> a different duration. I'd expect that situation to be similar for >> other apps that would want to use timestamps in VRR mode like video >> games, movie players or VR/AR applications. > > Yeah for that you want to schedule your frames with a target frame number. > Would still be nice if vblank timestamps wouldn't be an outright lie (and > the timestamp is still useful information, even if the frame counter > isn't - many displays have min/max frame rates for VRR, which is > information we could perhaps expose). > -Daniel > >> >> -mario >> >>>> >>>> Nicholas Kazlauskas >>>> >>>>> >>>>>> --- >>>>>> drivers/gpu/drm/drm_vblank.c | 49 +++++++++++++++++++++++++++++++++++- >>>>>> include/drm/drm_vblank.h | 8 ++++++ >>>>>> 2 files changed, 56 insertions(+), 1 deletion(-) >>>>>> >>>>>> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c >>>>>> index 98e091175921..4b3a4c38fabe 100644 >>>>>> --- a/drivers/gpu/drm/drm_vblank.c >>>>>> +++ b/drivers/gpu/drm/drm_vblank.c >>>>>> @@ -814,10 +814,21 @@ static void send_vblank_event(struct drm_device *dev, >>>>>> u64 seq, ktime_t now) >>>>>> { >>>>>> struct timespec64 tv; >>>>>> + ktime_t alt_flip_time; >>>>>> >>>>>> switch (e->event.base.type) { >>>>>> - case DRM_EVENT_VBLANK: >>>>>> case DRM_EVENT_FLIP_COMPLETE: >>>>>> + /* >>>>>> + * For flip completion events, override "now" time >>>>>> + * with alt_flip_time provided by the driver via >>>>>> + * drm_crtc_set_vrr_pageflip_timestamp() in VRR mode >>>>>> + * if that time is later than given "now" vblank time. >>>>>> + */ >>>>>> + alt_flip_time = dev->vblank[e->pipe].alt_flip_time; >>>>>> + if (alt_flip_time > now) >>>>>> + now = alt_flip_time; >>>>>> + /* Fallthrough */ >>>>>> + case DRM_EVENT_VBLANK: >>>>>> tv = ktime_to_timespec64(now); >>>>>> e->event.vbl.sequence = seq; >>>>>> /* >>>>>> @@ -916,11 +927,47 @@ void drm_crtc_send_vblank_event(struct drm_crtc *crtc, >>>>>> >>>>>> now = ktime_get(); >>>>>> } >>>>>> + >>>>>> e->pipe = pipe; >>>>>> send_vblank_event(dev, e, seq, now); >>>>>> } >>>>>> EXPORT_SYMBOL(drm_crtc_send_vblank_event); >>>>>> >>>>>> +/** >>>>>> + * drm_crtc_set_vrr_pageflip_timestamp - helper to set alternate pageflip time >>>>>> + * @crtc: the source CRTC of the pageflip completion event >>>>>> + * @flip_time: The alternate pageflip completion timestamp in VRR mode >>>>>> + * >>>>>> + * In variable refresh rate mode (VRR), a pageflip completion timestamp carried >>>>>> + * by the pageflip event can never be earlier than the vblank timestamp of the >>>>>> + * vblank of flip completion, as that vblank timestamp defines the end of the >>>>>> + * shortest possible vblank duration. In case of a delayed flip completion >>>>>> + * inside the extended VRR front porch however, the end of vblank can be much >>>>>> + * later, so the driver must assign an estimated timestamp of that later end of >>>>>> + * vblank. For a CRTC in VRR mode, the driver should use this helper function to >>>>>> + * set an alternate flip completion timestamp in case of late flip completions >>>>>> + * in extended vblank. In the most simple case, this @flip_time timestamp could >>>>>> + * simply be a ktime_get() timestamp taken at the start of the pageflip >>>>>> + * completion routine, with some constant duration of the back porch interval >>>>>> + * added, although more precise estimates may be possible on some hardware if >>>>>> + * the hardware provides some means of timestamping the true end of vblank. >>>>>> + * >>>>>> + * When sending out pageflip events, e.g., via drm_crtc_send_vblank_event(), it >>>>>> + * will use either the standard vblank timestamp, calculated for a minimum >>>>>> + * duration vblank, or the provided @flip_time if that time is later than the >>>>>> + * vblank timestamp, to get the best possible estimate of start of display of >>>>>> + * the new post-pageflip scanout buffer. >>>>>> + */ >>>>>> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, >>>>>> + ktime_t flip_time) >>>>>> +{ >>>>>> + struct drm_device *dev = crtc->dev; >>>>>> + struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)]; >>>>>> + >>>>>> + vblank->alt_flip_time = flip_time; >>>>>> +} >>>>>> +EXPORT_SYMBOL(drm_crtc_set_vrr_pageflip_timestamp); >>>>>> + >>>>>> static int __enable_vblank(struct drm_device *dev, unsigned int pipe) >>>>>> { >>>>>> if (drm_core_check_feature(dev, DRIVER_MODESET)) { >>>>>> diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h >>>>>> index 6ad9630d4f48..aacf44694ab6 100644 >>>>>> --- a/include/drm/drm_vblank.h >>>>>> +++ b/include/drm/drm_vblank.h >>>>>> @@ -117,6 +117,12 @@ struct drm_vblank_crtc { >>>>>> * @time: Vblank timestamp corresponding to @count. >>>>>> */ >>>>>> ktime_t time; >>>>>> + /** >>>>>> + * @alt_flip_time: Vblank timestamp for end of extended vblank due to >>>>>> + * a late pageflip completion in variable refresh rate mode. Pageflip >>>>>> + * events will carry the later one of @time and @alt_flip_time. >>>>>> + */ >>>>>> + ktime_t alt_flip_time; >>>>>> >>>>>> /** >>>>>> * @refcount: Number of users/waiters of the vblank interrupt. Only when >>>>>> @@ -179,6 +185,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs); >>>>>> u64 drm_crtc_vblank_count(struct drm_crtc *crtc); >>>>>> u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc, >>>>>> ktime_t *vblanktime); >>>>>> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, >>>>>> + ktime_t flip_time); >>>>>> void drm_crtc_send_vblank_event(struct drm_crtc *crtc, >>>>>> struct drm_pending_vblank_event *e); >>>>>> void drm_crtc_arm_vblank_event(struct drm_crtc *crtc, >>>>>> -- >>>>>> 2.17.1 >>>>>> >>>>>> _______________________________________________ >>>>>> dri-devel mailing list >>>>>> dri-devel@lists.freedesktop.org >>>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel >>>>> >>>> >>> >>> >>> -- >>> Daniel Vetter >>> Software Engineer, Intel Corporation >>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch > Nicholas Kazlauskas
On Wed, Feb 13, 2019 at 3:33 PM Kazlauskas, Nicholas <Nicholas.Kazlauskas@amd.com> wrote: > > On 2/13/19 4:50 AM, Daniel Vetter wrote: > > On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote: > >> On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter <daniel@ffwll.ch> wrote: > >>> > >>> On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas > >>> <Nicholas.Kazlauskas@amd.com> wrote: > >>>> > >>>> On 2/11/19 3:35 AM, Daniel Vetter wrote: > >>>>> On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote: > >>>>>> The pageflip completion timestamps transmitted to userspace > >>>>>> via pageflip completion events are supposed to describe the > >>>>>> time at which the first pixel of the new post-pageflip scanout > >>>>>> buffer leaves the video output of the gpu. This time is > >>>>>> identical to end of vblank, when active scanout starts. > >>>>>> > >>>>>> For a crtc in standard fixed refresh rate, the end of vblank > >>>>>> is identical to the vblank timestamps calculated by > >>>>>> drm_update_vblank_count() at each vblank interrupt, or each > >>>>>> vblank dis-/enable. Therefore pageflip events just carry > >>>>>> that vblank timestamp as their pageflip timestamp. > >>>>>> > >>>>>> For a crtc switched to variable refresh rate mode (vrr), the > >>>>>> pageflip completion timestamps are identical to the vblank > >>>>>> timestamps iff the pageflip was executed early in vblank, > >>>>>> before the minimum vblank duration elapsed. In this case > >>>>>> the time of display onset is identical to when the crtc > >>>>>> is running in fixed refresh rate. > >>>>>> > >>>>>> However, if a pageflip completes later in the vblank, inside > >>>>>> the "extended front porch" in vrr mode, then the vblank will > >>>>>> terminate at a fixed (back porch) duration after flip, so > >>>>>> the display onset time is delayed correspondingly. In this > >>>>>> case the vblank timestamp computed at vblank irq time would > >>>>>> be too early, and we need a way to calculate an estimated > >>>>>> pageflip timestamp that will be later than the vblank timestamp. > >>>>>> > >>>>>> How a driver determines such a "late flip" timestamp is hw > >>>>>> and driver specific, but this patch adds a new helper function > >>>>>> that allows the driver to propose such an alternate "late flip" > >>>>>> timestamp for use in pageflip events: > >>>>>> > >>>>>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); > >>>>>> > >>>>>> When sending out pageflip events, we now compare that proposed > >>>>>> flip_timestamp against the vblank timestamp of the current > >>>>>> vblank of flip completion and choose to send out the greater/ > >>>>>> later timestamp as flip completion timestamp. > >>>>>> > >>>>>> The most simple way for a kms driver to supply a suitable > >>>>>> flip_timestamp in vrr mode would be to simply take a timestamp > >>>>>> at start of the pageflip completion handler, e.g., pageflip > >>>>>> irq handler: flip_timestamp = ktime_get(); and then set that > >>>>>> as proposed "late" alternative timestamp via ... > >>>>>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); > >>>>>> > >>>>>> More clever approaches could try to add some corrective offset > >>>>>> for fixed back porch duration, or ideally use hardware features > >>>>>> like hw timestamps to calculate the exact end time of vblank. > >>>>>> > >>>>>> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> > >>>>>> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> > >>>>>> Cc: Harry Wentland <harry.wentland@amd.com> > >>>>>> Cc: Alex Deucher <alexander.deucher@amd.com> > >>>>> > >>>>> Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only give us > >>>>> the right timestampe, once? With this I guess if you do a vblank query in > >>>>> between the wrong and the right vblank you'll get the bogus value. Not > >>>>> really great for userspace. > >>>>> -Daniel > >>>> > >>>> I think we calculate the timestamp and send the vblank event both within > >>>> the pageflip IRQ handler so calculating the right pageflip timestamp > >>>> once could probably be done. I'm not sure if it's easier than proposing > >>>> a later flip time with an API like this though. > >>>> > >>>> The actual scanout time should be known from the page-flip handler so > >>>> the semantics for VRR on/off remain the same. This is because the > >>>> page-flip triggers entering the back porch if we're in the extended > >>>> front porch. > >>>> > >>>> But scanout time from vblank events for something like > >>>> DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only > >>>> treated as estimates. If we're in the regular front porch then the > >>>> timing to scanout is based on the fixed duration front porch for the > >>>> current mode. If we're in the extended back porch then it's technically > >>>> driver defined but the most reasonable guess is to assume that the front > >>>> porch is going to end at any moment, so just return the length of the > >>>> back porch for getting the scanout time. > >>>> > >>>> Proposing the late timestamp shouldn't affect vblank event in the > >>>> DRM_IOCTL_WAIT_VBLANK case and should only be used in the page-flip > >>>> event case. I'm not sure if that's what's guaranteed to happen with this > >>>> patch though. There doesn't seem to be any locking on either > >>>> dev->vblank_time_lock or the vblank->seqlock so while it's likely to get > >>>> the same vblank event back as the one just stored I don't think it's > >>>> guaranteed. > >>> > >>> That's the inconsistency I mean to highlight - the timestamp for the > >>> same frame as observed through flip complete and through the > >>> wait_vblank ioctl can differ. Which they really shouldn't. > >>> > >> > >> Ideally they shouldn't differ. The kernel docs for drm_crtc_state say > >> that vblank and pageflip timestamps should always match. But then the > >> kernel docs for "Variable refresh properties" in drm_connector.c for > >> vblank timestamps were changed for the VRR implementation in Linux > >> 5.0-rc to redefine them when in VRR mode. They are defined, but > >> probably rather useless for any practical purpose, like this: > >> > >> "The semantics for the vertical blank timestamp differ when > >> variable refresh rate is active. The vertical blank timestamp > >> is defined to be an estimate using the current mode's fixed > >> refresh rate timings. The semantics for the page-flip event > >> timestamp remain the same." > > > > Uh I missed that. That sounds like nonsense tbh. > > > >> So our docs contradict each other as of Linux 5.0-rc. Certainly having > >> useful vblank timetamps would be useful. > > > > Yup, imo vblank should still match page_flip. Otherwise I expect a lot of > > hilarity will ensue. > > I would imagine you would see more breakage by changing userspace > expectations for when the event is actually received. > > The IOCTL is "wait for vblank", but if we start sending the event near > the end of vblank it becomes more "wait for pageflip" (near scanout) or > "wait 15ms while the VRR front porch times out and causes flickering". > > I don't speak for all userspace but it seems more useful to me to have > the event sent at the start of vblank. Maybe we need to things. The problem is that the timestampe is defined to be correct for start-of-frame. And right now the amdgpu implementation of VRR doesn't even give an accurate timestampe for the flip, so most likely no one yet thought about this properly. Definitely no testcases (or it would have been caught), whereas we have tons of testcases for fixed refresh. > VRR in its current state is useful for applications that aren't > completely timing sensitive (like most games). This is because the > driver has free reign in restricting the min/max bounds of the range - > allowing for enhancements like low framerate compensation / BTR. > > For a concrete example, imagine you're not doing frontbuffer rendering > but you're still trying to target a specific scanout time. So even if we > were to send the vblank event at vpos 0 you'd get an accurate scanout > timestamp for the next scanout after the vback porch ends. You're not > going to be doing any rendering or preparation in this time because it's > simply too short. So you'd target the scanout period after this one, > which you'd need to calculate somehow in userspace. Taking the deltas > between two vblank events is completely useless here because you'll get > back the vrr min refresh delta. So you can take your source content rate > or some fixed rate you'd like to target within the range but even then > you're still not guaranteed to have that flip end up at that scanout time. > > The current solution is to just return the scanout pos at vrr max, or > the current mode's fixed refresh rate. If you're flip before the fixed > duration front porch ends then that scanout time will likely be accurate > since you can't flip faster than vrr max. > > Is there actually userspace that cares that these timestamps need to > match? Like said below, it seems that wait for vblank is mostly about > scheduling and for that I think having the event sent back at the start > of vblank is more important here. There's 3 cases: - DRI3: Waits for vblank for frontbuffer blits. If you combine that with VRR then imo you deserve to keep all the pieces. - generic compositors. Use VRR for scheduling, expect that a page_flip queued immediately after a page_flip will hit the next frame, and not the current frame. They also expect that this will not give you the slowest possible refresh rate, but I guess VRR plus frame scheduling is currently out of scope. It's kinda just for as-fast-as-possible games. - flip timestamp. Currently the entire codebase assumes that vblank ts = flip ts. I don't really see a point in breaking that. - better scheduling of VRR: what compositors actually do isn't wait for the vblank, that only latches the page_flip. They set up a hrtimer, with a fixed/auto-tuned head-start, from which they start their rendering. The same could be done for VRR, if we expose the earliest/latest vblank end times (and how much they can shift). vblank ioctl is only used for syncing that up and making sure we're hitting the right frame. This is all for pageflipping compositors. This is also what they'll probably keep doing when we give them a desired flip time (instead of a target frame), they still need an hrtimer to fire a little bit ahead to get the frame rendered. So taking all together, we have 2 use-cases I think: - frontbuffer rendering. Kinda wants vblank event to fire at the beginning of vblank. Won't ever do VRR - pageflipping compositors. Don't care about the vblank event, either they have an hrtimer timer (which gets synced with the vblank ioctl), or just flip as fast as possible. They don't care one bit when the vblank event happens, as long as a pageflip scheduled immediately after the event shows up only hits the next frame (which is the delay code that currently stretches vblank unecessarily with VRR if you're already a bit late). Making sure that vblank and page flip timestamp match the real userspace seems like the cleanest semantics, and the least surprising to userspace. E.g. what happens if you implement VRR in a compositor, but then tune your hrtimer within the vblank, when the vblank timestamp is totally not reflecting your real pageflip? I just don't see what's the upside of having a vblank timestamp that's not actually reflecting the real flip. Also, if your app wants the start of vblank, that's very easy to compute: Just add the scanout time to the last vblank timestamp. Figuring out the start of frame otoh isn't doable if you can't query, and constantly pageflipping just to have an accurate start-of-frame (for tuning your hrtimer) seems rather wasteful - it outright defeats some of the use-cases for VRR. So what benefit do you see in having vblank ts and pageflip ts not agree? What's the use-case this solves - aside from "less typing" :-) If the only use-case is VRR for frontbuffer rendering, then that doesn't even work right now (aside from that I think it's something we shouldn't support and definitely not encourage). -Daniel > >>> Now added complication is that amdgpu sends out vblank events really > >>> early, which is used by userspace to do frontbuffer rendering in the > >>> vblank time. But I don't think anyone wants to do both VRR and > >> > >> I think all kms drivers try to call drm_crtc_handle_vblank() at start > >> of vblank to give Mesa the most time for frontbuffer rendering for > >> classic X. But vblank events are also used for scheduling bufferswaps > >> or other stuff for redirected windowed rendering, or via api's like > >> OML_sync_controls glXWaitForMscOML, so there might be other things > >> affected by a more delayed vblank handling. > > > > The frontbuffer rendering is very much X driver specific, and I think > > -amdgpu/radeon is the only one that requires this. No i915 driver ever > > used the vblank interrupt to schedule frontbuffer blits, we use some > > CS-side stalls. > > > > Wrt scheduling pageflips: The rule is that right after you've received the > > vblank for frame X, then an immediately schedule pageflip should hit X+1, > > but not X. amdgpu had this broken, but it's fixed since a while. > > > >>> frontbuffer rendering, hence I think it should be possible to create > >>> correct vblank timestamps for the VRR case, while leaving the current > >>> logic in place for everything else. But that means moving the entire > >>> vblank processing to where you know the next frame's start time for > >>> VRR, both for page flips and for all other vblank events. > >>> -Daniel > >>> > >> > >> I think for amdgpu it would be doable by calling drm_handle_vblank > >> from the pageflip irq handler in VRR mode, and then additionally from > >> the vblank interrupt handler like now. Vblank irq's are triggered via > >> vline interrupts with programmable vertical trigger line position, so > >> we could program the trigger line to be start of back-porch instead of > >> start of front-porch. In the pageflip case we do vblank handling + > >> timestamping + pflip completion from pflip irq, and have the regular > >> back-porch vblank handling for refresh cycles in which no > >> pageflip/pflip irq happened. > > > > Hm, can't we simply program the vblank to occur in the back porch region > > for VRR? The vblank timestamping should automatically correct the > > timestamp in that case. Would require a lot less code changes. > > > > Plus update the documentation ofc. > > > >> Not sure if/how that would mess with below-vrr-refresh-rate-range > >> support for stutter reduction at low fps though? Or with performance, > >> given vblank event dispatch could be delayed a lot in VRR compared to > >> normal mode? Or with all the vblank accounting during modesets or crtc > >> en/disable? > > I think we have extra VLINEs that can be programmed at vpos 0 that could > resolve this but I'm still not really sold on the idea. > > > > > Currently VRR and explicit frame scheduling don't mix. We've discussed > > this, and consens seems to be that we need a new property for page_flip, > > similar to the target frame, but instead of frames a timestamp. Then the > > driver can hit a specific time for the next frame. > > > > And since compositors use the vblank ioctl just to do frame scheduling, I > > don't think we have to worry hugely about interactions there. Much better > > to aim for clean semantics (i.e. vblank and flip complete timestamps > > better match). > > > >> Also, could other drivers like Intel easily do this delayed vblank > >> processing in the non-pageflip case? > > > > Intel's problem :-) But yeah we have page flip timestamp registers, so > > worst case all we need to do is shuffle code around a bit. And we have > > interrupts for both start of vblank and and of vblank. It might be a bit > > more work because we use the vblank interrupt to avoid races in atomic > > updates, but that's not too terrible to fix. > > > >> From my user perspective as a developer of a neuroscience research > >> application that has a couple of exciting/important use cases for VRR > >> mode, i absolutely do need trustworthy and precise pageflip event > >> timestamps that represent reality, otherwise VRR mode would be less > >> than useless for me. I can do without meaningful vblank timestamps in > >> VRR mode though, and expected to do without them, as any scheduling > >> has to happen time-based instead of based on vblank counts anyway, > >> given that vblank counts are quite meaningless if every frame can have > >> a different duration. I'd expect that situation to be similar for > >> other apps that would want to use timestamps in VRR mode like video > >> games, movie players or VR/AR applications. > > > > Yeah for that you want to schedule your frames with a target frame number. > > Would still be nice if vblank timestamps wouldn't be an outright lie (and > > the timestamp is still useful information, even if the frame counter > > isn't - many displays have min/max frame rates for VRR, which is > > information we could perhaps expose). > > -Daniel > > >> > >> -mario > >> > >>>> > >>>> Nicholas Kazlauskas > >>>> > >>>>> > >>>>>> --- > >>>>>> drivers/gpu/drm/drm_vblank.c | 49 +++++++++++++++++++++++++++++++++++- > >>>>>> include/drm/drm_vblank.h | 8 ++++++ > >>>>>> 2 files changed, 56 insertions(+), 1 deletion(-) > >>>>>> > >>>>>> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c > >>>>>> index 98e091175921..4b3a4c38fabe 100644 > >>>>>> --- a/drivers/gpu/drm/drm_vblank.c > >>>>>> +++ b/drivers/gpu/drm/drm_vblank.c > >>>>>> @@ -814,10 +814,21 @@ static void send_vblank_event(struct drm_device *dev, > >>>>>> u64 seq, ktime_t now) > >>>>>> { > >>>>>> struct timespec64 tv; > >>>>>> + ktime_t alt_flip_time; > >>>>>> > >>>>>> switch (e->event.base.type) { > >>>>>> - case DRM_EVENT_VBLANK: > >>>>>> case DRM_EVENT_FLIP_COMPLETE: > >>>>>> + /* > >>>>>> + * For flip completion events, override "now" time > >>>>>> + * with alt_flip_time provided by the driver via > >>>>>> + * drm_crtc_set_vrr_pageflip_timestamp() in VRR mode > >>>>>> + * if that time is later than given "now" vblank time. > >>>>>> + */ > >>>>>> + alt_flip_time = dev->vblank[e->pipe].alt_flip_time; > >>>>>> + if (alt_flip_time > now) > >>>>>> + now = alt_flip_time; > >>>>>> + /* Fallthrough */ > >>>>>> + case DRM_EVENT_VBLANK: > >>>>>> tv = ktime_to_timespec64(now); > >>>>>> e->event.vbl.sequence = seq; > >>>>>> /* > >>>>>> @@ -916,11 +927,47 @@ void drm_crtc_send_vblank_event(struct drm_crtc *crtc, > >>>>>> > >>>>>> now = ktime_get(); > >>>>>> } > >>>>>> + > >>>>>> e->pipe = pipe; > >>>>>> send_vblank_event(dev, e, seq, now); > >>>>>> } > >>>>>> EXPORT_SYMBOL(drm_crtc_send_vblank_event); > >>>>>> > >>>>>> +/** > >>>>>> + * drm_crtc_set_vrr_pageflip_timestamp - helper to set alternate pageflip time > >>>>>> + * @crtc: the source CRTC of the pageflip completion event > >>>>>> + * @flip_time: The alternate pageflip completion timestamp in VRR mode > >>>>>> + * > >>>>>> + * In variable refresh rate mode (VRR), a pageflip completion timestamp carried > >>>>>> + * by the pageflip event can never be earlier than the vblank timestamp of the > >>>>>> + * vblank of flip completion, as that vblank timestamp defines the end of the > >>>>>> + * shortest possible vblank duration. In case of a delayed flip completion > >>>>>> + * inside the extended VRR front porch however, the end of vblank can be much > >>>>>> + * later, so the driver must assign an estimated timestamp of that later end of > >>>>>> + * vblank. For a CRTC in VRR mode, the driver should use this helper function to > >>>>>> + * set an alternate flip completion timestamp in case of late flip completions > >>>>>> + * in extended vblank. In the most simple case, this @flip_time timestamp could > >>>>>> + * simply be a ktime_get() timestamp taken at the start of the pageflip > >>>>>> + * completion routine, with some constant duration of the back porch interval > >>>>>> + * added, although more precise estimates may be possible on some hardware if > >>>>>> + * the hardware provides some means of timestamping the true end of vblank. > >>>>>> + * > >>>>>> + * When sending out pageflip events, e.g., via drm_crtc_send_vblank_event(), it > >>>>>> + * will use either the standard vblank timestamp, calculated for a minimum > >>>>>> + * duration vblank, or the provided @flip_time if that time is later than the > >>>>>> + * vblank timestamp, to get the best possible estimate of start of display of > >>>>>> + * the new post-pageflip scanout buffer. > >>>>>> + */ > >>>>>> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, > >>>>>> + ktime_t flip_time) > >>>>>> +{ > >>>>>> + struct drm_device *dev = crtc->dev; > >>>>>> + struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)]; > >>>>>> + > >>>>>> + vblank->alt_flip_time = flip_time; > >>>>>> +} > >>>>>> +EXPORT_SYMBOL(drm_crtc_set_vrr_pageflip_timestamp); > >>>>>> + > >>>>>> static int __enable_vblank(struct drm_device *dev, unsigned int pipe) > >>>>>> { > >>>>>> if (drm_core_check_feature(dev, DRIVER_MODESET)) { > >>>>>> diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h > >>>>>> index 6ad9630d4f48..aacf44694ab6 100644 > >>>>>> --- a/include/drm/drm_vblank.h > >>>>>> +++ b/include/drm/drm_vblank.h > >>>>>> @@ -117,6 +117,12 @@ struct drm_vblank_crtc { > >>>>>> * @time: Vblank timestamp corresponding to @count. > >>>>>> */ > >>>>>> ktime_t time; > >>>>>> + /** > >>>>>> + * @alt_flip_time: Vblank timestamp for end of extended vblank due to > >>>>>> + * a late pageflip completion in variable refresh rate mode. Pageflip > >>>>>> + * events will carry the later one of @time and @alt_flip_time. > >>>>>> + */ > >>>>>> + ktime_t alt_flip_time; > >>>>>> > >>>>>> /** > >>>>>> * @refcount: Number of users/waiters of the vblank interrupt. Only when > >>>>>> @@ -179,6 +185,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs); > >>>>>> u64 drm_crtc_vblank_count(struct drm_crtc *crtc); > >>>>>> u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc, > >>>>>> ktime_t *vblanktime); > >>>>>> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, > >>>>>> + ktime_t flip_time); > >>>>>> void drm_crtc_send_vblank_event(struct drm_crtc *crtc, > >>>>>> struct drm_pending_vblank_event *e); > >>>>>> void drm_crtc_arm_vblank_event(struct drm_crtc *crtc, > >>>>>> -- > >>>>>> 2.17.1 > >>>>>> > >>>>>> _______________________________________________ > >>>>>> dri-devel mailing list > >>>>>> dri-devel@lists.freedesktop.org > >>>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel > >>>>> > >>>> > >>> > >>> > >>> -- > >>> Daniel Vetter > >>> Software Engineer, Intel Corporation > >>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch > > > > Nicholas Kazlauskas >
On 2/13/19 10:14 AM, Daniel Vetter wrote: > On Wed, Feb 13, 2019 at 3:33 PM Kazlauskas, Nicholas > <Nicholas.Kazlauskas@amd.com> wrote: >> >> On 2/13/19 4:50 AM, Daniel Vetter wrote: >>> On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote: >>>> On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter <daniel@ffwll.ch> wrote: >>>>> >>>>> On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas >>>>> <Nicholas.Kazlauskas@amd.com> wrote: >>>>>> >>>>>> On 2/11/19 3:35 AM, Daniel Vetter wrote: >>>>>>> On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote: >>>>>>>> The pageflip completion timestamps transmitted to userspace >>>>>>>> via pageflip completion events are supposed to describe the >>>>>>>> time at which the first pixel of the new post-pageflip scanout >>>>>>>> buffer leaves the video output of the gpu. This time is >>>>>>>> identical to end of vblank, when active scanout starts. >>>>>>>> >>>>>>>> For a crtc in standard fixed refresh rate, the end of vblank >>>>>>>> is identical to the vblank timestamps calculated by >>>>>>>> drm_update_vblank_count() at each vblank interrupt, or each >>>>>>>> vblank dis-/enable. Therefore pageflip events just carry >>>>>>>> that vblank timestamp as their pageflip timestamp. >>>>>>>> >>>>>>>> For a crtc switched to variable refresh rate mode (vrr), the >>>>>>>> pageflip completion timestamps are identical to the vblank >>>>>>>> timestamps iff the pageflip was executed early in vblank, >>>>>>>> before the minimum vblank duration elapsed. In this case >>>>>>>> the time of display onset is identical to when the crtc >>>>>>>> is running in fixed refresh rate. >>>>>>>> >>>>>>>> However, if a pageflip completes later in the vblank, inside >>>>>>>> the "extended front porch" in vrr mode, then the vblank will >>>>>>>> terminate at a fixed (back porch) duration after flip, so >>>>>>>> the display onset time is delayed correspondingly. In this >>>>>>>> case the vblank timestamp computed at vblank irq time would >>>>>>>> be too early, and we need a way to calculate an estimated >>>>>>>> pageflip timestamp that will be later than the vblank timestamp. >>>>>>>> >>>>>>>> How a driver determines such a "late flip" timestamp is hw >>>>>>>> and driver specific, but this patch adds a new helper function >>>>>>>> that allows the driver to propose such an alternate "late flip" >>>>>>>> timestamp for use in pageflip events: >>>>>>>> >>>>>>>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); >>>>>>>> >>>>>>>> When sending out pageflip events, we now compare that proposed >>>>>>>> flip_timestamp against the vblank timestamp of the current >>>>>>>> vblank of flip completion and choose to send out the greater/ >>>>>>>> later timestamp as flip completion timestamp. >>>>>>>> >>>>>>>> The most simple way for a kms driver to supply a suitable >>>>>>>> flip_timestamp in vrr mode would be to simply take a timestamp >>>>>>>> at start of the pageflip completion handler, e.g., pageflip >>>>>>>> irq handler: flip_timestamp = ktime_get(); and then set that >>>>>>>> as proposed "late" alternative timestamp via ... >>>>>>>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); >>>>>>>> >>>>>>>> More clever approaches could try to add some corrective offset >>>>>>>> for fixed back porch duration, or ideally use hardware features >>>>>>>> like hw timestamps to calculate the exact end time of vblank. >>>>>>>> >>>>>>>> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> >>>>>>>> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> >>>>>>>> Cc: Harry Wentland <harry.wentland@amd.com> >>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com> >>>>>>> >>>>>>> Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only give us >>>>>>> the right timestampe, once? With this I guess if you do a vblank query in >>>>>>> between the wrong and the right vblank you'll get the bogus value. Not >>>>>>> really great for userspace. >>>>>>> -Daniel >>>>>> >>>>>> I think we calculate the timestamp and send the vblank event both within >>>>>> the pageflip IRQ handler so calculating the right pageflip timestamp >>>>>> once could probably be done. I'm not sure if it's easier than proposing >>>>>> a later flip time with an API like this though. >>>>>> >>>>>> The actual scanout time should be known from the page-flip handler so >>>>>> the semantics for VRR on/off remain the same. This is because the >>>>>> page-flip triggers entering the back porch if we're in the extended >>>>>> front porch. >>>>>> >>>>>> But scanout time from vblank events for something like >>>>>> DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only >>>>>> treated as estimates. If we're in the regular front porch then the >>>>>> timing to scanout is based on the fixed duration front porch for the >>>>>> current mode. If we're in the extended back porch then it's technically >>>>>> driver defined but the most reasonable guess is to assume that the front >>>>>> porch is going to end at any moment, so just return the length of the >>>>>> back porch for getting the scanout time. >>>>>> >>>>>> Proposing the late timestamp shouldn't affect vblank event in the >>>>>> DRM_IOCTL_WAIT_VBLANK case and should only be used in the page-flip >>>>>> event case. I'm not sure if that's what's guaranteed to happen with this >>>>>> patch though. There doesn't seem to be any locking on either >>>>>> dev->vblank_time_lock or the vblank->seqlock so while it's likely to get >>>>>> the same vblank event back as the one just stored I don't think it's >>>>>> guaranteed. >>>>> >>>>> That's the inconsistency I mean to highlight - the timestamp for the >>>>> same frame as observed through flip complete and through the >>>>> wait_vblank ioctl can differ. Which they really shouldn't. >>>>> >>>> >>>> Ideally they shouldn't differ. The kernel docs for drm_crtc_state say >>>> that vblank and pageflip timestamps should always match. But then the >>>> kernel docs for "Variable refresh properties" in drm_connector.c for >>>> vblank timestamps were changed for the VRR implementation in Linux >>>> 5.0-rc to redefine them when in VRR mode. They are defined, but >>>> probably rather useless for any practical purpose, like this: >>>> >>>> "The semantics for the vertical blank timestamp differ when >>>> variable refresh rate is active. The vertical blank timestamp >>>> is defined to be an estimate using the current mode's fixed >>>> refresh rate timings. The semantics for the page-flip event >>>> timestamp remain the same." >>> >>> Uh I missed that. That sounds like nonsense tbh. >>> >>>> So our docs contradict each other as of Linux 5.0-rc. Certainly having >>>> useful vblank timetamps would be useful. >>> >>> Yup, imo vblank should still match page_flip. Otherwise I expect a lot of >>> hilarity will ensue. >> >> I would imagine you would see more breakage by changing userspace >> expectations for when the event is actually received. >> >> The IOCTL is "wait for vblank", but if we start sending the event near >> the end of vblank it becomes more "wait for pageflip" (near scanout) or >> "wait 15ms while the VRR front porch times out and causes flickering". >> >> I don't speak for all userspace but it seems more useful to me to have >> the event sent at the start of vblank. > > Maybe we need to things. The problem is that the timestampe is defined > to be correct for start-of-frame. And right now the amdgpu > implementation of VRR doesn't even give an accurate timestampe for the > flip, so most likely no one yet thought about this properly. > Definitely no testcases (or it would have been caught), whereas we > have tons of testcases for fixed refresh. I agree that it would be good to have documentation for what waiting for vblank is supposed to mean here. There's a big difference between getting the event at the start of vsync vs the start of the back porch near the end of it. > >> VRR in its current state is useful for applications that aren't >> completely timing sensitive (like most games). This is because the >> driver has free reign in restricting the min/max bounds of the range - >> allowing for enhancements like low framerate compensation / BTR. >> >> For a concrete example, imagine you're not doing frontbuffer rendering >> but you're still trying to target a specific scanout time. So even if we >> were to send the vblank event at vpos 0 you'd get an accurate scanout >> timestamp for the next scanout after the vback porch ends. You're not >> going to be doing any rendering or preparation in this time because it's >> simply too short. So you'd target the scanout period after this one, >> which you'd need to calculate somehow in userspace. Taking the deltas >> between two vblank events is completely useless here because you'll get >> back the vrr min refresh delta. So you can take your source content rate >> or some fixed rate you'd like to target within the range but even then >> you're still not guaranteed to have that flip end up at that scanout time. >> >> The current solution is to just return the scanout pos at vrr max, or >> the current mode's fixed refresh rate. If you're flip before the fixed >> duration front porch ends then that scanout time will likely be accurate >> since you can't flip faster than vrr max. >> >> Is there actually userspace that cares that these timestamps need to >> match? Like said below, it seems that wait for vblank is mostly about >> scheduling and for that I think having the event sent back at the start >> of vblank is more important here. > > There's 3 cases: > - DRI3: Waits for vblank for frontbuffer blits. If you combine that > with VRR then imo you deserve to keep all the pieces. > - generic compositors. Use VRR for scheduling, expect that a page_flip > queued immediately after a page_flip will hit the next frame, and not > the current frame. They also expect that this will not give you the > slowest possible refresh rate, but I guess VRR plus frame scheduling > is currently out of scope. It's kinda just for as-fast-as-possible > games. > - flip timestamp. Currently the entire codebase assumes that vblank ts > = flip ts. I don't really see a point in breaking that. > - better scheduling of VRR: what compositors actually do isn't wait > for the vblank, that only latches the page_flip. They set up a > hrtimer, with a fixed/auto-tuned head-start, from which they start > their rendering. The same could be done for VRR, if we expose the > earliest/latest vblank end times (and how much they can shift). vblank > ioctl is only used for syncing that up and making sure we're hitting > the right frame. This is all for pageflipping compositors. This is > also what they'll probably keep doing when we give them a desired flip > time (instead of a target frame), they still need an hrtimer to fire a > little bit ahead to get the frame rendered. > > So taking all together, we have 2 use-cases I think: > - frontbuffer rendering. Kinda wants vblank event to fire at the > beginning of vblank. Won't ever do VRR > - pageflipping compositors. Don't care about the vblank event, either > they have an hrtimer timer (which gets synced with the vblank ioctl), > or just flip as fast as possible. They don't care one bit when the > vblank event happens, as long as a pageflip scheduled immediately > after the event shows up only hits the next frame (which is the delay > code that currently stretches vblank unecessarily with VRR if you're > already a bit late). > > Making sure that vblank and page flip timestamp match the real > userspace seems like the cleanest semantics, and the least surprising > to userspace. E.g. what happens if you implement VRR in a compositor, > but then tune your hrtimer within the vblank, when the vblank > timestamp is totally not reflecting your real pageflip? I just don't > see what's the upside of having a vblank timestamp that's not actually > reflecting the real flip. > > Also, if your app wants the start of vblank, that's very easy to > compute: Just add the scanout time to the last vblank timestamp. > Figuring out the start of frame otoh isn't doable if you can't query, > and constantly pageflipping just to have an accurate start-of-frame > (for tuning your hrtimer) seems rather wasteful - it outright defeats > some of the use-cases for VRR. > > So what benefit do you see in having vblank ts and pageflip ts not > agree? What's the use-case this solves - aside from "less typing" :-) > If the only use-case is VRR for frontbuffer rendering, then that > doesn't even work right now (aside from that I think it's something we > shouldn't support and definitely not encourage). > -Daniel I'm actually fine with moving the vblank event back to the back porch but I'd just be worried about userspace breakages that relied on it happening the other way. It'd be especially confusing if the VLINE was different for fixed vs variable refresh. I wouldn't want to move it to vpos 0 unless the event is also getting moved for fixed refresh as well. But like you said, the only thing I can think of that should need the event right away is low latency / framebuffer rendering. I can think at least one potential use case where you would want to do front buffer rendering for low latency with VRR - emulation of old console hardware. There's not really any modern display that has fixed refresh modes that match the render frequency of some consoles. But this use case isn't out there in the wild right now at least. Regarding compositors: I don't really recommend VRR for this use case in general since you'll want a fixed refresh rate frequency for input to feel consistent and to avoid any flickering / luminance changes. The fastest refresh rate will be the most responsive to the user and you can just used the fixed refresh rate mode in this case. Lowering the refresh rate to save power sounds nice in theory but I'd imagine it'd feel terrible in practice. Nicholas Kazlauskas > >>>>> Now added complication is that amdgpu sends out vblank events really >>>>> early, which is used by userspace to do frontbuffer rendering in the >>>>> vblank time. But I don't think anyone wants to do both VRR and >>>> >>>> I think all kms drivers try to call drm_crtc_handle_vblank() at start >>>> of vblank to give Mesa the most time for frontbuffer rendering for >>>> classic X. But vblank events are also used for scheduling bufferswaps >>>> or other stuff for redirected windowed rendering, or via api's like >>>> OML_sync_controls glXWaitForMscOML, so there might be other things >>>> affected by a more delayed vblank handling. >>> >>> The frontbuffer rendering is very much X driver specific, and I think >>> -amdgpu/radeon is the only one that requires this. No i915 driver ever >>> used the vblank interrupt to schedule frontbuffer blits, we use some >>> CS-side stalls. >>> >>> Wrt scheduling pageflips: The rule is that right after you've received the >>> vblank for frame X, then an immediately schedule pageflip should hit X+1, >>> but not X. amdgpu had this broken, but it's fixed since a while. >>> >>>>> frontbuffer rendering, hence I think it should be possible to create >>>>> correct vblank timestamps for the VRR case, while leaving the current >>>>> logic in place for everything else. But that means moving the entire >>>>> vblank processing to where you know the next frame's start time for >>>>> VRR, both for page flips and for all other vblank events. >>>>> -Daniel >>>>> >>>> >>>> I think for amdgpu it would be doable by calling drm_handle_vblank >>>> from the pageflip irq handler in VRR mode, and then additionally from >>>> the vblank interrupt handler like now. Vblank irq's are triggered via >>>> vline interrupts with programmable vertical trigger line position, so >>>> we could program the trigger line to be start of back-porch instead of >>>> start of front-porch. In the pageflip case we do vblank handling + >>>> timestamping + pflip completion from pflip irq, and have the regular >>>> back-porch vblank handling for refresh cycles in which no >>>> pageflip/pflip irq happened. >>> >>> Hm, can't we simply program the vblank to occur in the back porch region >>> for VRR? The vblank timestamping should automatically correct the >>> timestamp in that case. Would require a lot less code changes. >>> >>> Plus update the documentation ofc. >>> >>>> Not sure if/how that would mess with below-vrr-refresh-rate-range >>>> support for stutter reduction at low fps though? Or with performance, >>>> given vblank event dispatch could be delayed a lot in VRR compared to >>>> normal mode? Or with all the vblank accounting during modesets or crtc >>>> en/disable? >> >> I think we have extra VLINEs that can be programmed at vpos 0 that could >> resolve this but I'm still not really sold on the idea. >> >>> >>> Currently VRR and explicit frame scheduling don't mix. We've discussed >>> this, and consens seems to be that we need a new property for page_flip, >>> similar to the target frame, but instead of frames a timestamp. Then the >>> driver can hit a specific time for the next frame. >>> >>> And since compositors use the vblank ioctl just to do frame scheduling, I >>> don't think we have to worry hugely about interactions there. Much better >>> to aim for clean semantics (i.e. vblank and flip complete timestamps >>> better match). >>> >>>> Also, could other drivers like Intel easily do this delayed vblank >>>> processing in the non-pageflip case? >>> >>> Intel's problem :-) But yeah we have page flip timestamp registers, so >>> worst case all we need to do is shuffle code around a bit. And we have >>> interrupts for both start of vblank and and of vblank. It might be a bit >>> more work because we use the vblank interrupt to avoid races in atomic >>> updates, but that's not too terrible to fix. >>> >>>> From my user perspective as a developer of a neuroscience research >>>> application that has a couple of exciting/important use cases for VRR >>>> mode, i absolutely do need trustworthy and precise pageflip event >>>> timestamps that represent reality, otherwise VRR mode would be less >>>> than useless for me. I can do without meaningful vblank timestamps in >>>> VRR mode though, and expected to do without them, as any scheduling >>>> has to happen time-based instead of based on vblank counts anyway, >>>> given that vblank counts are quite meaningless if every frame can have >>>> a different duration. I'd expect that situation to be similar for >>>> other apps that would want to use timestamps in VRR mode like video >>>> games, movie players or VR/AR applications. >>> >>> Yeah for that you want to schedule your frames with a target frame number. >>> Would still be nice if vblank timestamps wouldn't be an outright lie (and >>> the timestamp is still useful information, even if the frame counter >>> isn't - many displays have min/max frame rates for VRR, which is >>> information we could perhaps expose). >>> -Daniel > >>>> >>>> -mario >>>> >>>>>> >>>>>> Nicholas Kazlauskas >>>>>> >>>>>>> >>>>>>>> --- >>>>>>>> drivers/gpu/drm/drm_vblank.c | 49 +++++++++++++++++++++++++++++++++++- >>>>>>>> include/drm/drm_vblank.h | 8 ++++++ >>>>>>>> 2 files changed, 56 insertions(+), 1 deletion(-) >>>>>>>> >>>>>>>> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c >>>>>>>> index 98e091175921..4b3a4c38fabe 100644 >>>>>>>> --- a/drivers/gpu/drm/drm_vblank.c >>>>>>>> +++ b/drivers/gpu/drm/drm_vblank.c >>>>>>>> @@ -814,10 +814,21 @@ static void send_vblank_event(struct drm_device *dev, >>>>>>>> u64 seq, ktime_t now) >>>>>>>> { >>>>>>>> struct timespec64 tv; >>>>>>>> + ktime_t alt_flip_time; >>>>>>>> >>>>>>>> switch (e->event.base.type) { >>>>>>>> - case DRM_EVENT_VBLANK: >>>>>>>> case DRM_EVENT_FLIP_COMPLETE: >>>>>>>> + /* >>>>>>>> + * For flip completion events, override "now" time >>>>>>>> + * with alt_flip_time provided by the driver via >>>>>>>> + * drm_crtc_set_vrr_pageflip_timestamp() in VRR mode >>>>>>>> + * if that time is later than given "now" vblank time. >>>>>>>> + */ >>>>>>>> + alt_flip_time = dev->vblank[e->pipe].alt_flip_time; >>>>>>>> + if (alt_flip_time > now) >>>>>>>> + now = alt_flip_time; >>>>>>>> + /* Fallthrough */ >>>>>>>> + case DRM_EVENT_VBLANK: >>>>>>>> tv = ktime_to_timespec64(now); >>>>>>>> e->event.vbl.sequence = seq; >>>>>>>> /* >>>>>>>> @@ -916,11 +927,47 @@ void drm_crtc_send_vblank_event(struct drm_crtc *crtc, >>>>>>>> >>>>>>>> now = ktime_get(); >>>>>>>> } >>>>>>>> + >>>>>>>> e->pipe = pipe; >>>>>>>> send_vblank_event(dev, e, seq, now); >>>>>>>> } >>>>>>>> EXPORT_SYMBOL(drm_crtc_send_vblank_event); >>>>>>>> >>>>>>>> +/** >>>>>>>> + * drm_crtc_set_vrr_pageflip_timestamp - helper to set alternate pageflip time >>>>>>>> + * @crtc: the source CRTC of the pageflip completion event >>>>>>>> + * @flip_time: The alternate pageflip completion timestamp in VRR mode >>>>>>>> + * >>>>>>>> + * In variable refresh rate mode (VRR), a pageflip completion timestamp carried >>>>>>>> + * by the pageflip event can never be earlier than the vblank timestamp of the >>>>>>>> + * vblank of flip completion, as that vblank timestamp defines the end of the >>>>>>>> + * shortest possible vblank duration. In case of a delayed flip completion >>>>>>>> + * inside the extended VRR front porch however, the end of vblank can be much >>>>>>>> + * later, so the driver must assign an estimated timestamp of that later end of >>>>>>>> + * vblank. For a CRTC in VRR mode, the driver should use this helper function to >>>>>>>> + * set an alternate flip completion timestamp in case of late flip completions >>>>>>>> + * in extended vblank. In the most simple case, this @flip_time timestamp could >>>>>>>> + * simply be a ktime_get() timestamp taken at the start of the pageflip >>>>>>>> + * completion routine, with some constant duration of the back porch interval >>>>>>>> + * added, although more precise estimates may be possible on some hardware if >>>>>>>> + * the hardware provides some means of timestamping the true end of vblank. >>>>>>>> + * >>>>>>>> + * When sending out pageflip events, e.g., via drm_crtc_send_vblank_event(), it >>>>>>>> + * will use either the standard vblank timestamp, calculated for a minimum >>>>>>>> + * duration vblank, or the provided @flip_time if that time is later than the >>>>>>>> + * vblank timestamp, to get the best possible estimate of start of display of >>>>>>>> + * the new post-pageflip scanout buffer. >>>>>>>> + */ >>>>>>>> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, >>>>>>>> + ktime_t flip_time) >>>>>>>> +{ >>>>>>>> + struct drm_device *dev = crtc->dev; >>>>>>>> + struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)]; >>>>>>>> + >>>>>>>> + vblank->alt_flip_time = flip_time; >>>>>>>> +} >>>>>>>> +EXPORT_SYMBOL(drm_crtc_set_vrr_pageflip_timestamp); >>>>>>>> + >>>>>>>> static int __enable_vblank(struct drm_device *dev, unsigned int pipe) >>>>>>>> { >>>>>>>> if (drm_core_check_feature(dev, DRIVER_MODESET)) { >>>>>>>> diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h >>>>>>>> index 6ad9630d4f48..aacf44694ab6 100644 >>>>>>>> --- a/include/drm/drm_vblank.h >>>>>>>> +++ b/include/drm/drm_vblank.h >>>>>>>> @@ -117,6 +117,12 @@ struct drm_vblank_crtc { >>>>>>>> * @time: Vblank timestamp corresponding to @count. >>>>>>>> */ >>>>>>>> ktime_t time; >>>>>>>> + /** >>>>>>>> + * @alt_flip_time: Vblank timestamp for end of extended vblank due to >>>>>>>> + * a late pageflip completion in variable refresh rate mode. Pageflip >>>>>>>> + * events will carry the later one of @time and @alt_flip_time. >>>>>>>> + */ >>>>>>>> + ktime_t alt_flip_time; >>>>>>>> >>>>>>>> /** >>>>>>>> * @refcount: Number of users/waiters of the vblank interrupt. Only when >>>>>>>> @@ -179,6 +185,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs); >>>>>>>> u64 drm_crtc_vblank_count(struct drm_crtc *crtc); >>>>>>>> u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc, >>>>>>>> ktime_t *vblanktime); >>>>>>>> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, >>>>>>>> + ktime_t flip_time); >>>>>>>> void drm_crtc_send_vblank_event(struct drm_crtc *crtc, >>>>>>>> struct drm_pending_vblank_event *e); >>>>>>>> void drm_crtc_arm_vblank_event(struct drm_crtc *crtc, >>>>>>>> -- >>>>>>>> 2.17.1 >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> dri-devel mailing list >>>>>>>> dri-devel@lists.freedesktop.org >>>>>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel >>>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Daniel Vetter >>>>> Software Engineer, Intel Corporation >>>>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch >>> >> >> Nicholas Kazlauskas >> > >
On Wed, Feb 13, 2019 at 4:46 PM Kazlauskas, Nicholas <Nicholas.Kazlauskas@amd.com> wrote: > > On 2/13/19 10:14 AM, Daniel Vetter wrote: > > On Wed, Feb 13, 2019 at 3:33 PM Kazlauskas, Nicholas > > <Nicholas.Kazlauskas@amd.com> wrote: > >> > >> On 2/13/19 4:50 AM, Daniel Vetter wrote: > >>> On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote: > >>>> On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter <daniel@ffwll.ch> wrote: > >>>>> > >>>>> On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas > >>>>> <Nicholas.Kazlauskas@amd.com> wrote: > >>>>>> > >>>>>> On 2/11/19 3:35 AM, Daniel Vetter wrote: > >>>>>>> On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote: > >>>>>>>> The pageflip completion timestamps transmitted to userspace > >>>>>>>> via pageflip completion events are supposed to describe the > >>>>>>>> time at which the first pixel of the new post-pageflip scanout > >>>>>>>> buffer leaves the video output of the gpu. This time is > >>>>>>>> identical to end of vblank, when active scanout starts. > >>>>>>>> > >>>>>>>> For a crtc in standard fixed refresh rate, the end of vblank > >>>>>>>> is identical to the vblank timestamps calculated by > >>>>>>>> drm_update_vblank_count() at each vblank interrupt, or each > >>>>>>>> vblank dis-/enable. Therefore pageflip events just carry > >>>>>>>> that vblank timestamp as their pageflip timestamp. > >>>>>>>> > >>>>>>>> For a crtc switched to variable refresh rate mode (vrr), the > >>>>>>>> pageflip completion timestamps are identical to the vblank > >>>>>>>> timestamps iff the pageflip was executed early in vblank, > >>>>>>>> before the minimum vblank duration elapsed. In this case > >>>>>>>> the time of display onset is identical to when the crtc > >>>>>>>> is running in fixed refresh rate. > >>>>>>>> > >>>>>>>> However, if a pageflip completes later in the vblank, inside > >>>>>>>> the "extended front porch" in vrr mode, then the vblank will > >>>>>>>> terminate at a fixed (back porch) duration after flip, so > >>>>>>>> the display onset time is delayed correspondingly. In this > >>>>>>>> case the vblank timestamp computed at vblank irq time would > >>>>>>>> be too early, and we need a way to calculate an estimated > >>>>>>>> pageflip timestamp that will be later than the vblank timestamp. > >>>>>>>> > >>>>>>>> How a driver determines such a "late flip" timestamp is hw > >>>>>>>> and driver specific, but this patch adds a new helper function > >>>>>>>> that allows the driver to propose such an alternate "late flip" > >>>>>>>> timestamp for use in pageflip events: > >>>>>>>> > >>>>>>>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); > >>>>>>>> > >>>>>>>> When sending out pageflip events, we now compare that proposed > >>>>>>>> flip_timestamp against the vblank timestamp of the current > >>>>>>>> vblank of flip completion and choose to send out the greater/ > >>>>>>>> later timestamp as flip completion timestamp. > >>>>>>>> > >>>>>>>> The most simple way for a kms driver to supply a suitable > >>>>>>>> flip_timestamp in vrr mode would be to simply take a timestamp > >>>>>>>> at start of the pageflip completion handler, e.g., pageflip > >>>>>>>> irq handler: flip_timestamp = ktime_get(); and then set that > >>>>>>>> as proposed "late" alternative timestamp via ... > >>>>>>>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); > >>>>>>>> > >>>>>>>> More clever approaches could try to add some corrective offset > >>>>>>>> for fixed back porch duration, or ideally use hardware features > >>>>>>>> like hw timestamps to calculate the exact end time of vblank. > >>>>>>>> > >>>>>>>> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> > >>>>>>>> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> > >>>>>>>> Cc: Harry Wentland <harry.wentland@amd.com> > >>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com> > >>>>>>> > >>>>>>> Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only give us > >>>>>>> the right timestampe, once? With this I guess if you do a vblank query in > >>>>>>> between the wrong and the right vblank you'll get the bogus value. Not > >>>>>>> really great for userspace. > >>>>>>> -Daniel > >>>>>> > >>>>>> I think we calculate the timestamp and send the vblank event both within > >>>>>> the pageflip IRQ handler so calculating the right pageflip timestamp > >>>>>> once could probably be done. I'm not sure if it's easier than proposing > >>>>>> a later flip time with an API like this though. > >>>>>> > >>>>>> The actual scanout time should be known from the page-flip handler so > >>>>>> the semantics for VRR on/off remain the same. This is because the > >>>>>> page-flip triggers entering the back porch if we're in the extended > >>>>>> front porch. > >>>>>> > >>>>>> But scanout time from vblank events for something like > >>>>>> DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only > >>>>>> treated as estimates. If we're in the regular front porch then the > >>>>>> timing to scanout is based on the fixed duration front porch for the > >>>>>> current mode. If we're in the extended back porch then it's technically > >>>>>> driver defined but the most reasonable guess is to assume that the front > >>>>>> porch is going to end at any moment, so just return the length of the > >>>>>> back porch for getting the scanout time. > >>>>>> > >>>>>> Proposing the late timestamp shouldn't affect vblank event in the > >>>>>> DRM_IOCTL_WAIT_VBLANK case and should only be used in the page-flip > >>>>>> event case. I'm not sure if that's what's guaranteed to happen with this > >>>>>> patch though. There doesn't seem to be any locking on either > >>>>>> dev->vblank_time_lock or the vblank->seqlock so while it's likely to get > >>>>>> the same vblank event back as the one just stored I don't think it's > >>>>>> guaranteed. > >>>>> > >>>>> That's the inconsistency I mean to highlight - the timestamp for the > >>>>> same frame as observed through flip complete and through the > >>>>> wait_vblank ioctl can differ. Which they really shouldn't. > >>>>> > >>>> > >>>> Ideally they shouldn't differ. The kernel docs for drm_crtc_state say > >>>> that vblank and pageflip timestamps should always match. But then the > >>>> kernel docs for "Variable refresh properties" in drm_connector.c for > >>>> vblank timestamps were changed for the VRR implementation in Linux > >>>> 5.0-rc to redefine them when in VRR mode. They are defined, but > >>>> probably rather useless for any practical purpose, like this: > >>>> > >>>> "The semantics for the vertical blank timestamp differ when > >>>> variable refresh rate is active. The vertical blank timestamp > >>>> is defined to be an estimate using the current mode's fixed > >>>> refresh rate timings. The semantics for the page-flip event > >>>> timestamp remain the same." > >>> > >>> Uh I missed that. That sounds like nonsense tbh. > >>> > >>>> So our docs contradict each other as of Linux 5.0-rc. Certainly having > >>>> useful vblank timetamps would be useful. > >>> > >>> Yup, imo vblank should still match page_flip. Otherwise I expect a lot of > >>> hilarity will ensue. > >> > >> I would imagine you would see more breakage by changing userspace > >> expectations for when the event is actually received. > >> > >> The IOCTL is "wait for vblank", but if we start sending the event near > >> the end of vblank it becomes more "wait for pageflip" (near scanout) or > >> "wait 15ms while the VRR front porch times out and causes flickering". > >> > >> I don't speak for all userspace but it seems more useful to me to have > >> the event sent at the start of vblank. > > > > Maybe we need to things. The problem is that the timestampe is defined > > to be correct for start-of-frame. And right now the amdgpu > > implementation of VRR doesn't even give an accurate timestampe for the > > flip, so most likely no one yet thought about this properly. > > Definitely no testcases (or it would have been caught), whereas we > > have tons of testcases for fixed refresh. > > I agree that it would be good to have documentation for what waiting for > vblank is supposed to mean here. There's a big difference between > getting the event at the start of vsync vs the start of the back porch > near the end of it. > > > > >> VRR in its current state is useful for applications that aren't > >> completely timing sensitive (like most games). This is because the > >> driver has free reign in restricting the min/max bounds of the range - > >> allowing for enhancements like low framerate compensation / BTR. > >> > >> For a concrete example, imagine you're not doing frontbuffer rendering > >> but you're still trying to target a specific scanout time. So even if we > >> were to send the vblank event at vpos 0 you'd get an accurate scanout > >> timestamp for the next scanout after the vback porch ends. You're not > >> going to be doing any rendering or preparation in this time because it's > >> simply too short. So you'd target the scanout period after this one, > >> which you'd need to calculate somehow in userspace. Taking the deltas > >> between two vblank events is completely useless here because you'll get > >> back the vrr min refresh delta. So you can take your source content rate > >> or some fixed rate you'd like to target within the range but even then > >> you're still not guaranteed to have that flip end up at that scanout time. > >> > >> The current solution is to just return the scanout pos at vrr max, or > >> the current mode's fixed refresh rate. If you're flip before the fixed > >> duration front porch ends then that scanout time will likely be accurate > >> since you can't flip faster than vrr max. > >> > >> Is there actually userspace that cares that these timestamps need to > >> match? Like said below, it seems that wait for vblank is mostly about > >> scheduling and for that I think having the event sent back at the start > >> of vblank is more important here. > > > > There's 3 cases: > > - DRI3: Waits for vblank for frontbuffer blits. If you combine that > > with VRR then imo you deserve to keep all the pieces. > > - generic compositors. Use VRR for scheduling, expect that a page_flip > > queued immediately after a page_flip will hit the next frame, and not > > the current frame. They also expect that this will not give you the > > slowest possible refresh rate, but I guess VRR plus frame scheduling > > is currently out of scope. It's kinda just for as-fast-as-possible > > games. > > - flip timestamp. Currently the entire codebase assumes that vblank ts > > = flip ts. I don't really see a point in breaking that. > > - better scheduling of VRR: what compositors actually do isn't wait > > for the vblank, that only latches the page_flip. They set up a > > hrtimer, with a fixed/auto-tuned head-start, from which they start > > their rendering. The same could be done for VRR, if we expose the > > earliest/latest vblank end times (and how much they can shift). vblank > > ioctl is only used for syncing that up and making sure we're hitting > > the right frame. This is all for pageflipping compositors. This is > > also what they'll probably keep doing when we give them a desired flip > > time (instead of a target frame), they still need an hrtimer to fire a > > little bit ahead to get the frame rendered. > > > > So taking all together, we have 2 use-cases I think: > > - frontbuffer rendering. Kinda wants vblank event to fire at the > > beginning of vblank. Won't ever do VRR > > - pageflipping compositors. Don't care about the vblank event, either > > they have an hrtimer timer (which gets synced with the vblank ioctl), > > or just flip as fast as possible. They don't care one bit when the > > vblank event happens, as long as a pageflip scheduled immediately > > after the event shows up only hits the next frame (which is the delay > > code that currently stretches vblank unecessarily with VRR if you're > > already a bit late). > > > > Making sure that vblank and page flip timestamp match the real > > userspace seems like the cleanest semantics, and the least surprising > > to userspace. E.g. what happens if you implement VRR in a compositor, > > but then tune your hrtimer within the vblank, when the vblank > > timestamp is totally not reflecting your real pageflip? I just don't > > see what's the upside of having a vblank timestamp that's not actually > > reflecting the real flip. > > > > Also, if your app wants the start of vblank, that's very easy to > > compute: Just add the scanout time to the last vblank timestamp. > > Figuring out the start of frame otoh isn't doable if you can't query, > > and constantly pageflipping just to have an accurate start-of-frame > > (for tuning your hrtimer) seems rather wasteful - it outright defeats > > some of the use-cases for VRR. > > > > So what benefit do you see in having vblank ts and pageflip ts not > > agree? What's the use-case this solves - aside from "less typing" :-) > > If the only use-case is VRR for frontbuffer rendering, then that > > doesn't even work right now (aside from that I think it's something we > > shouldn't support and definitely not encourage). > > -Daniel > > I'm actually fine with moving the vblank event back to the back porch > but I'd just be worried about userspace breakages that relied on it > happening the other way. > > It'd be especially confusing if the VLINE was different for fixed vs > variable refresh. I wouldn't want to move it to vpos 0 unless the event > is also getting moved for fixed refresh as well. That would break DRI3. So we can pick between two options: - always send out the vblank early, accept that vblank timestamps change. - for VRR only, send out the vblank only when we know the correct timestamp. For non-VRR we have to send out the vblank early, because that's somewhat become api (but not all drivers do it that way, so it's about as hopeless as frontbuffer rendering in general). I still don't see the benefit of the first option, or the exact use case. > But like you said, the only thing I can think of that should need the > event right away is low latency / framebuffer rendering. > > I can think at least one potential use case where you would want to do > front buffer rendering for low latency with VRR - emulation of old > console hardware. There's not really any modern display that has fixed > refresh modes that match the render frequency of some consoles. But this > use case isn't out there in the wild right now at least. Even on an emulator I'd expect it'd just do regular flips to the frontbuffer. Especially on mobile there's really no such thing as frontbuffer rendering, you always do a flip. The DIRTYFB ioctl (which you need to call instead of doing a page flip) simply does a pageflip behind your back for you. Which I think is worse than actually flipping. And for emulating exact refresh rates your then again go back to hrtimer + issuing the flip approach, but using your desired refresh rate to tune the hrtimer instead of the fixed one. But that needs a correct vblank timestamp for the current frame to work correctly (or you force compositor has to always flip, which defeats the power saving aspects of VRR allowing to stretch refresh rates when nothing changes). > Regarding compositors: I don't really recommend VRR for this use case in > general since you'll want a fixed refresh rate frequency for input to > feel consistent and to avoid any flickering / luminance changes. The > fastest refresh rate will be the most responsive to the user and you can > just used the fixed refresh rate mode in this case. Lowering the refresh > rate to save power sounds nice in theory but I'd imagine it'd feel > terrible in practice. Why is a compositor any different from a game? You aim to render at 60fps, or whatever the refresh rate is. If you're one frame late it's better to stretch the vblank slightly than a full on 1-frame freeze. And on mobile it's unfortunately fairly common that the system fails to clock up quickly enough when starting from an idle state, VRR sounds like really useful here. But like in a game if you don't at least aim to render at a consistent fps, and handle your input at a consistent fps, then the end result will indeed be terrible. -Daniel > Nicholas Kazlauskas > > > > >>>>> Now added complication is that amdgpu sends out vblank events really > >>>>> early, which is used by userspace to do frontbuffer rendering in the > >>>>> vblank time. But I don't think anyone wants to do both VRR and > >>>> > >>>> I think all kms drivers try to call drm_crtc_handle_vblank() at start > >>>> of vblank to give Mesa the most time for frontbuffer rendering for > >>>> classic X. But vblank events are also used for scheduling bufferswaps > >>>> or other stuff for redirected windowed rendering, or via api's like > >>>> OML_sync_controls glXWaitForMscOML, so there might be other things > >>>> affected by a more delayed vblank handling. > >>> > >>> The frontbuffer rendering is very much X driver specific, and I think > >>> -amdgpu/radeon is the only one that requires this. No i915 driver ever > >>> used the vblank interrupt to schedule frontbuffer blits, we use some > >>> CS-side stalls. > >>> > >>> Wrt scheduling pageflips: The rule is that right after you've received the > >>> vblank for frame X, then an immediately schedule pageflip should hit X+1, > >>> but not X. amdgpu had this broken, but it's fixed since a while. > >>> > >>>>> frontbuffer rendering, hence I think it should be possible to create > >>>>> correct vblank timestamps for the VRR case, while leaving the current > >>>>> logic in place for everything else. But that means moving the entire > >>>>> vblank processing to where you know the next frame's start time for > >>>>> VRR, both for page flips and for all other vblank events. > >>>>> -Daniel > >>>>> > >>>> > >>>> I think for amdgpu it would be doable by calling drm_handle_vblank > >>>> from the pageflip irq handler in VRR mode, and then additionally from > >>>> the vblank interrupt handler like now. Vblank irq's are triggered via > >>>> vline interrupts with programmable vertical trigger line position, so > >>>> we could program the trigger line to be start of back-porch instead of > >>>> start of front-porch. In the pageflip case we do vblank handling + > >>>> timestamping + pflip completion from pflip irq, and have the regular > >>>> back-porch vblank handling for refresh cycles in which no > >>>> pageflip/pflip irq happened. > >>> > >>> Hm, can't we simply program the vblank to occur in the back porch region > >>> for VRR? The vblank timestamping should automatically correct the > >>> timestamp in that case. Would require a lot less code changes. > >>> > >>> Plus update the documentation ofc. > >>> > >>>> Not sure if/how that would mess with below-vrr-refresh-rate-range > >>>> support for stutter reduction at low fps though? Or with performance, > >>>> given vblank event dispatch could be delayed a lot in VRR compared to > >>>> normal mode? Or with all the vblank accounting during modesets or crtc > >>>> en/disable? > >> > >> I think we have extra VLINEs that can be programmed at vpos 0 that could > >> resolve this but I'm still not really sold on the idea. > >> > >>> > >>> Currently VRR and explicit frame scheduling don't mix. We've discussed > >>> this, and consens seems to be that we need a new property for page_flip, > >>> similar to the target frame, but instead of frames a timestamp. Then the > >>> driver can hit a specific time for the next frame. > >>> > >>> And since compositors use the vblank ioctl just to do frame scheduling, I > >>> don't think we have to worry hugely about interactions there. Much better > >>> to aim for clean semantics (i.e. vblank and flip complete timestamps > >>> better match). > >>> > >>>> Also, could other drivers like Intel easily do this delayed vblank > >>>> processing in the non-pageflip case? > >>> > >>> Intel's problem :-) But yeah we have page flip timestamp registers, so > >>> worst case all we need to do is shuffle code around a bit. And we have > >>> interrupts for both start of vblank and and of vblank. It might be a bit > >>> more work because we use the vblank interrupt to avoid races in atomic > >>> updates, but that's not too terrible to fix. > >>> > >>>> From my user perspective as a developer of a neuroscience research > >>>> application that has a couple of exciting/important use cases for VRR > >>>> mode, i absolutely do need trustworthy and precise pageflip event > >>>> timestamps that represent reality, otherwise VRR mode would be less > >>>> than useless for me. I can do without meaningful vblank timestamps in > >>>> VRR mode though, and expected to do without them, as any scheduling > >>>> has to happen time-based instead of based on vblank counts anyway, > >>>> given that vblank counts are quite meaningless if every frame can have > >>>> a different duration. I'd expect that situation to be similar for > >>>> other apps that would want to use timestamps in VRR mode like video > >>>> games, movie players or VR/AR applications. > >>> > >>> Yeah for that you want to schedule your frames with a target frame number. > >>> Would still be nice if vblank timestamps wouldn't be an outright lie (and > >>> the timestamp is still useful information, even if the frame counter > >>> isn't - many displays have min/max frame rates for VRR, which is > >>> information we could perhaps expose). > >>> -Daniel > > >>>> > >>>> -mario > >>>> > >>>>>> > >>>>>> Nicholas Kazlauskas > >>>>>> > >>>>>>> > >>>>>>>> --- > >>>>>>>> drivers/gpu/drm/drm_vblank.c | 49 +++++++++++++++++++++++++++++++++++- > >>>>>>>> include/drm/drm_vblank.h | 8 ++++++ > >>>>>>>> 2 files changed, 56 insertions(+), 1 deletion(-) > >>>>>>>> > >>>>>>>> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c > >>>>>>>> index 98e091175921..4b3a4c38fabe 100644 > >>>>>>>> --- a/drivers/gpu/drm/drm_vblank.c > >>>>>>>> +++ b/drivers/gpu/drm/drm_vblank.c > >>>>>>>> @@ -814,10 +814,21 @@ static void send_vblank_event(struct drm_device *dev, > >>>>>>>> u64 seq, ktime_t now) > >>>>>>>> { > >>>>>>>> struct timespec64 tv; > >>>>>>>> + ktime_t alt_flip_time; > >>>>>>>> > >>>>>>>> switch (e->event.base.type) { > >>>>>>>> - case DRM_EVENT_VBLANK: > >>>>>>>> case DRM_EVENT_FLIP_COMPLETE: > >>>>>>>> + /* > >>>>>>>> + * For flip completion events, override "now" time > >>>>>>>> + * with alt_flip_time provided by the driver via > >>>>>>>> + * drm_crtc_set_vrr_pageflip_timestamp() in VRR mode > >>>>>>>> + * if that time is later than given "now" vblank time. > >>>>>>>> + */ > >>>>>>>> + alt_flip_time = dev->vblank[e->pipe].alt_flip_time; > >>>>>>>> + if (alt_flip_time > now) > >>>>>>>> + now = alt_flip_time; > >>>>>>>> + /* Fallthrough */ > >>>>>>>> + case DRM_EVENT_VBLANK: > >>>>>>>> tv = ktime_to_timespec64(now); > >>>>>>>> e->event.vbl.sequence = seq; > >>>>>>>> /* > >>>>>>>> @@ -916,11 +927,47 @@ void drm_crtc_send_vblank_event(struct drm_crtc *crtc, > >>>>>>>> > >>>>>>>> now = ktime_get(); > >>>>>>>> } > >>>>>>>> + > >>>>>>>> e->pipe = pipe; > >>>>>>>> send_vblank_event(dev, e, seq, now); > >>>>>>>> } > >>>>>>>> EXPORT_SYMBOL(drm_crtc_send_vblank_event); > >>>>>>>> > >>>>>>>> +/** > >>>>>>>> + * drm_crtc_set_vrr_pageflip_timestamp - helper to set alternate pageflip time > >>>>>>>> + * @crtc: the source CRTC of the pageflip completion event > >>>>>>>> + * @flip_time: The alternate pageflip completion timestamp in VRR mode > >>>>>>>> + * > >>>>>>>> + * In variable refresh rate mode (VRR), a pageflip completion timestamp carried > >>>>>>>> + * by the pageflip event can never be earlier than the vblank timestamp of the > >>>>>>>> + * vblank of flip completion, as that vblank timestamp defines the end of the > >>>>>>>> + * shortest possible vblank duration. In case of a delayed flip completion > >>>>>>>> + * inside the extended VRR front porch however, the end of vblank can be much > >>>>>>>> + * later, so the driver must assign an estimated timestamp of that later end of > >>>>>>>> + * vblank. For a CRTC in VRR mode, the driver should use this helper function to > >>>>>>>> + * set an alternate flip completion timestamp in case of late flip completions > >>>>>>>> + * in extended vblank. In the most simple case, this @flip_time timestamp could > >>>>>>>> + * simply be a ktime_get() timestamp taken at the start of the pageflip > >>>>>>>> + * completion routine, with some constant duration of the back porch interval > >>>>>>>> + * added, although more precise estimates may be possible on some hardware if > >>>>>>>> + * the hardware provides some means of timestamping the true end of vblank. > >>>>>>>> + * > >>>>>>>> + * When sending out pageflip events, e.g., via drm_crtc_send_vblank_event(), it > >>>>>>>> + * will use either the standard vblank timestamp, calculated for a minimum > >>>>>>>> + * duration vblank, or the provided @flip_time if that time is later than the > >>>>>>>> + * vblank timestamp, to get the best possible estimate of start of display of > >>>>>>>> + * the new post-pageflip scanout buffer. > >>>>>>>> + */ > >>>>>>>> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, > >>>>>>>> + ktime_t flip_time) > >>>>>>>> +{ > >>>>>>>> + struct drm_device *dev = crtc->dev; > >>>>>>>> + struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)]; > >>>>>>>> + > >>>>>>>> + vblank->alt_flip_time = flip_time; > >>>>>>>> +} > >>>>>>>> +EXPORT_SYMBOL(drm_crtc_set_vrr_pageflip_timestamp); > >>>>>>>> + > >>>>>>>> static int __enable_vblank(struct drm_device *dev, unsigned int pipe) > >>>>>>>> { > >>>>>>>> if (drm_core_check_feature(dev, DRIVER_MODESET)) { > >>>>>>>> diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h > >>>>>>>> index 6ad9630d4f48..aacf44694ab6 100644 > >>>>>>>> --- a/include/drm/drm_vblank.h > >>>>>>>> +++ b/include/drm/drm_vblank.h > >>>>>>>> @@ -117,6 +117,12 @@ struct drm_vblank_crtc { > >>>>>>>> * @time: Vblank timestamp corresponding to @count. > >>>>>>>> */ > >>>>>>>> ktime_t time; > >>>>>>>> + /** > >>>>>>>> + * @alt_flip_time: Vblank timestamp for end of extended vblank due to > >>>>>>>> + * a late pageflip completion in variable refresh rate mode. Pageflip > >>>>>>>> + * events will carry the later one of @time and @alt_flip_time. > >>>>>>>> + */ > >>>>>>>> + ktime_t alt_flip_time; > >>>>>>>> > >>>>>>>> /** > >>>>>>>> * @refcount: Number of users/waiters of the vblank interrupt. Only when > >>>>>>>> @@ -179,6 +185,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs); > >>>>>>>> u64 drm_crtc_vblank_count(struct drm_crtc *crtc); > >>>>>>>> u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc, > >>>>>>>> ktime_t *vblanktime); > >>>>>>>> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, > >>>>>>>> + ktime_t flip_time); > >>>>>>>> void drm_crtc_send_vblank_event(struct drm_crtc *crtc, > >>>>>>>> struct drm_pending_vblank_event *e); > >>>>>>>> void drm_crtc_arm_vblank_event(struct drm_crtc *crtc, > >>>>>>>> -- > >>>>>>>> 2.17.1 > >>>>>>>> > >>>>>>>> _______________________________________________ > >>>>>>>> dri-devel mailing list > >>>>>>>> dri-devel@lists.freedesktop.org > >>>>>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel > >>>>>>> > >>>>>> > >>>>> > >>>>> > >>>>> -- > >>>>> Daniel Vetter > >>>>> Software Engineer, Intel Corporation > >>>>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch > >>> > >> > >> Nicholas Kazlauskas > >> > > > > >
On Wed, Feb 13, 2019 at 5:03 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > On Wed, Feb 13, 2019 at 4:46 PM Kazlauskas, Nicholas > <Nicholas.Kazlauskas@amd.com> wrote: > > > > On 2/13/19 10:14 AM, Daniel Vetter wrote: > > > On Wed, Feb 13, 2019 at 3:33 PM Kazlauskas, Nicholas > > > <Nicholas.Kazlauskas@amd.com> wrote: > > >> > > >> On 2/13/19 4:50 AM, Daniel Vetter wrote: > > >>> On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote: > > >>>> On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > >>>>> > > >>>>> On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas > > >>>>> <Nicholas.Kazlauskas@amd.com> wrote: > > >>>>>> > > >>>>>> On 2/11/19 3:35 AM, Daniel Vetter wrote: > > >>>>>>> On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote: > > >>>>>>>> The pageflip completion timestamps transmitted to userspace > > >>>>>>>> via pageflip completion events are supposed to describe the > > >>>>>>>> time at which the first pixel of the new post-pageflip scanout > > >>>>>>>> buffer leaves the video output of the gpu. This time is > > >>>>>>>> identical to end of vblank, when active scanout starts. > > >>>>>>>> > > >>>>>>>> For a crtc in standard fixed refresh rate, the end of vblank > > >>>>>>>> is identical to the vblank timestamps calculated by > > >>>>>>>> drm_update_vblank_count() at each vblank interrupt, or each > > >>>>>>>> vblank dis-/enable. Therefore pageflip events just carry > > >>>>>>>> that vblank timestamp as their pageflip timestamp. > > >>>>>>>> > > >>>>>>>> For a crtc switched to variable refresh rate mode (vrr), the > > >>>>>>>> pageflip completion timestamps are identical to the vblank > > >>>>>>>> timestamps iff the pageflip was executed early in vblank, > > >>>>>>>> before the minimum vblank duration elapsed. In this case > > >>>>>>>> the time of display onset is identical to when the crtc > > >>>>>>>> is running in fixed refresh rate. > > >>>>>>>> > > >>>>>>>> However, if a pageflip completes later in the vblank, inside > > >>>>>>>> the "extended front porch" in vrr mode, then the vblank will > > >>>>>>>> terminate at a fixed (back porch) duration after flip, so > > >>>>>>>> the display onset time is delayed correspondingly. In this > > >>>>>>>> case the vblank timestamp computed at vblank irq time would > > >>>>>>>> be too early, and we need a way to calculate an estimated > > >>>>>>>> pageflip timestamp that will be later than the vblank timestamp. > > >>>>>>>> > > >>>>>>>> How a driver determines such a "late flip" timestamp is hw > > >>>>>>>> and driver specific, but this patch adds a new helper function > > >>>>>>>> that allows the driver to propose such an alternate "late flip" > > >>>>>>>> timestamp for use in pageflip events: > > >>>>>>>> > > >>>>>>>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); > > >>>>>>>> > > >>>>>>>> When sending out pageflip events, we now compare that proposed > > >>>>>>>> flip_timestamp against the vblank timestamp of the current > > >>>>>>>> vblank of flip completion and choose to send out the greater/ > > >>>>>>>> later timestamp as flip completion timestamp. > > >>>>>>>> > > >>>>>>>> The most simple way for a kms driver to supply a suitable > > >>>>>>>> flip_timestamp in vrr mode would be to simply take a timestamp > > >>>>>>>> at start of the pageflip completion handler, e.g., pageflip > > >>>>>>>> irq handler: flip_timestamp = ktime_get(); and then set that > > >>>>>>>> as proposed "late" alternative timestamp via ... > > >>>>>>>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); > > >>>>>>>> > > >>>>>>>> More clever approaches could try to add some corrective offset > > >>>>>>>> for fixed back porch duration, or ideally use hardware features > > >>>>>>>> like hw timestamps to calculate the exact end time of vblank. > > >>>>>>>> > > >>>>>>>> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> > > >>>>>>>> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> > > >>>>>>>> Cc: Harry Wentland <harry.wentland@amd.com> > > >>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com> > > >>>>>>> > > >>>>>>> Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only give us > > >>>>>>> the right timestampe, once? With this I guess if you do a vblank query in > > >>>>>>> between the wrong and the right vblank you'll get the bogus value. Not > > >>>>>>> really great for userspace. > > >>>>>>> -Daniel > > >>>>>> > > >>>>>> I think we calculate the timestamp and send the vblank event both within > > >>>>>> the pageflip IRQ handler so calculating the right pageflip timestamp > > >>>>>> once could probably be done. I'm not sure if it's easier than proposing > > >>>>>> a later flip time with an API like this though. > > >>>>>> > > >>>>>> The actual scanout time should be known from the page-flip handler so > > >>>>>> the semantics for VRR on/off remain the same. This is because the > > >>>>>> page-flip triggers entering the back porch if we're in the extended > > >>>>>> front porch. > > >>>>>> > > >>>>>> But scanout time from vblank events for something like > > >>>>>> DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only > > >>>>>> treated as estimates. If we're in the regular front porch then the > > >>>>>> timing to scanout is based on the fixed duration front porch for the > > >>>>>> current mode. If we're in the extended back porch then it's technically > > >>>>>> driver defined but the most reasonable guess is to assume that the front > > >>>>>> porch is going to end at any moment, so just return the length of the > > >>>>>> back porch for getting the scanout time. > > >>>>>> > > >>>>>> Proposing the late timestamp shouldn't affect vblank event in the > > >>>>>> DRM_IOCTL_WAIT_VBLANK case and should only be used in the page-flip > > >>>>>> event case. I'm not sure if that's what's guaranteed to happen with this > > >>>>>> patch though. There doesn't seem to be any locking on either > > >>>>>> dev->vblank_time_lock or the vblank->seqlock so while it's likely to get > > >>>>>> the same vblank event back as the one just stored I don't think it's > > >>>>>> guaranteed. > > >>>>> > > >>>>> That's the inconsistency I mean to highlight - the timestamp for the > > >>>>> same frame as observed through flip complete and through the > > >>>>> wait_vblank ioctl can differ. Which they really shouldn't. > > >>>>> > > >>>> > > >>>> Ideally they shouldn't differ. The kernel docs for drm_crtc_state say > > >>>> that vblank and pageflip timestamps should always match. But then the > > >>>> kernel docs for "Variable refresh properties" in drm_connector.c for > > >>>> vblank timestamps were changed for the VRR implementation in Linux > > >>>> 5.0-rc to redefine them when in VRR mode. They are defined, but > > >>>> probably rather useless for any practical purpose, like this: > > >>>> > > >>>> "The semantics for the vertical blank timestamp differ when > > >>>> variable refresh rate is active. The vertical blank timestamp > > >>>> is defined to be an estimate using the current mode's fixed > > >>>> refresh rate timings. The semantics for the page-flip event > > >>>> timestamp remain the same." > > >>> > > >>> Uh I missed that. That sounds like nonsense tbh. > > >>> > > >>>> So our docs contradict each other as of Linux 5.0-rc. Certainly having > > >>>> useful vblank timetamps would be useful. > > >>> > > >>> Yup, imo vblank should still match page_flip. Otherwise I expect a lot of > > >>> hilarity will ensue. > > >> > > >> I would imagine you would see more breakage by changing userspace > > >> expectations for when the event is actually received. > > >> > > >> The IOCTL is "wait for vblank", but if we start sending the event near > > >> the end of vblank it becomes more "wait for pageflip" (near scanout) or > > >> "wait 15ms while the VRR front porch times out and causes flickering". > > >> > > >> I don't speak for all userspace but it seems more useful to me to have > > >> the event sent at the start of vblank. > > > > > > Maybe we need to things. The problem is that the timestampe is defined > > > to be correct for start-of-frame. And right now the amdgpu > > > implementation of VRR doesn't even give an accurate timestampe for the > > > flip, so most likely no one yet thought about this properly. > > > Definitely no testcases (or it would have been caught), whereas we > > > have tons of testcases for fixed refresh. > > > > I agree that it would be good to have documentation for what waiting for > > vblank is supposed to mean here. There's a big difference between > > getting the event at the start of vsync vs the start of the back porch > > near the end of it. > > > > > > > >> VRR in its current state is useful for applications that aren't > > >> completely timing sensitive (like most games). This is because the > > >> driver has free reign in restricting the min/max bounds of the range - > > >> allowing for enhancements like low framerate compensation / BTR. > > >> > > >> For a concrete example, imagine you're not doing frontbuffer rendering > > >> but you're still trying to target a specific scanout time. So even if we > > >> were to send the vblank event at vpos 0 you'd get an accurate scanout > > >> timestamp for the next scanout after the vback porch ends. You're not > > >> going to be doing any rendering or preparation in this time because it's > > >> simply too short. So you'd target the scanout period after this one, > > >> which you'd need to calculate somehow in userspace. Taking the deltas > > >> between two vblank events is completely useless here because you'll get > > >> back the vrr min refresh delta. So you can take your source content rate > > >> or some fixed rate you'd like to target within the range but even then > > >> you're still not guaranteed to have that flip end up at that scanout time. > > >> > > >> The current solution is to just return the scanout pos at vrr max, or > > >> the current mode's fixed refresh rate. If you're flip before the fixed > > >> duration front porch ends then that scanout time will likely be accurate > > >> since you can't flip faster than vrr max. > > >> > > >> Is there actually userspace that cares that these timestamps need to > > >> match? Like said below, it seems that wait for vblank is mostly about > > >> scheduling and for that I think having the event sent back at the start > > >> of vblank is more important here. > > > > > > There's 3 cases: > > > - DRI3: Waits for vblank for frontbuffer blits. If you combine that > > > with VRR then imo you deserve to keep all the pieces. > > > - generic compositors. Use VRR for scheduling, expect that a page_flip > > > queued immediately after a page_flip will hit the next frame, and not > > > the current frame. They also expect that this will not give you the > > > slowest possible refresh rate, but I guess VRR plus frame scheduling > > > is currently out of scope. It's kinda just for as-fast-as-possible > > > games. > > > - flip timestamp. Currently the entire codebase assumes that vblank ts > > > = flip ts. I don't really see a point in breaking that. > > > - better scheduling of VRR: what compositors actually do isn't wait > > > for the vblank, that only latches the page_flip. They set up a > > > hrtimer, with a fixed/auto-tuned head-start, from which they start > > > their rendering. The same could be done for VRR, if we expose the > > > earliest/latest vblank end times (and how much they can shift). vblank > > > ioctl is only used for syncing that up and making sure we're hitting > > > the right frame. This is all for pageflipping compositors. This is > > > also what they'll probably keep doing when we give them a desired flip > > > time (instead of a target frame), they still need an hrtimer to fire a > > > little bit ahead to get the frame rendered. > > > > > > So taking all together, we have 2 use-cases I think: > > > - frontbuffer rendering. Kinda wants vblank event to fire at the > > > beginning of vblank. Won't ever do VRR > > > - pageflipping compositors. Don't care about the vblank event, either > > > they have an hrtimer timer (which gets synced with the vblank ioctl), > > > or just flip as fast as possible. They don't care one bit when the > > > vblank event happens, as long as a pageflip scheduled immediately > > > after the event shows up only hits the next frame (which is the delay > > > code that currently stretches vblank unecessarily with VRR if you're > > > already a bit late). > > > > > > Making sure that vblank and page flip timestamp match the real > > > userspace seems like the cleanest semantics, and the least surprising > > > to userspace. E.g. what happens if you implement VRR in a compositor, > > > but then tune your hrtimer within the vblank, when the vblank > > > timestamp is totally not reflecting your real pageflip? I just don't > > > see what's the upside of having a vblank timestamp that's not actually > > > reflecting the real flip. > > > > > > Also, if your app wants the start of vblank, that's very easy to > > > compute: Just add the scanout time to the last vblank timestamp. > > > Figuring out the start of frame otoh isn't doable if you can't query, > > > and constantly pageflipping just to have an accurate start-of-frame > > > (for tuning your hrtimer) seems rather wasteful - it outright defeats > > > some of the use-cases for VRR. > > > > > > So what benefit do you see in having vblank ts and pageflip ts not > > > agree? What's the use-case this solves - aside from "less typing" :-) > > > If the only use-case is VRR for frontbuffer rendering, then that > > > doesn't even work right now (aside from that I think it's something we > > > shouldn't support and definitely not encourage). > > > -Daniel > > > > I'm actually fine with moving the vblank event back to the back porch > > but I'd just be worried about userspace breakages that relied on it > > happening the other way. > > > > It'd be especially confusing if the VLINE was different for fixed vs > > variable refresh. I wouldn't want to move it to vpos 0 unless the event > > is also getting moved for fixed refresh as well. > > That would break DRI3. So we can pick between two options: > - always send out the vblank early, accept that vblank timestamps change. > - for VRR only, send out the vblank only when we know the correct > timestamp. For non-VRR we have to send out the vblank early, because > that's somewhat become api (but not all drivers do it that way, so > it's about as hopeless as frontbuffer rendering in general). > > I still don't see the benefit of the first option, or the exact use case. > Still haven't slept so i don't know if i'm adding anything to the discussion that Daniel hasn't stated already, but anyway: I think i agree with Daniel on this. Having a vblank ts that truly reflects the end of last vblank/start of scanout should make time-based scheduling easier. Then we know when scanout of the active frame starts and calculate from known fixed scanout duration when next vblank starts and when it ends the latest. On X11 you can glXGetSyncValuesOML() query that start time to sync up a compositors update cycle or start of an animation or timed presentation sequence with vblank. E.g., Waylands Weston does this when switching from idle desktop to redraw loop. A movie player or my own software could use this info in the same way, whereas i don't see the current vblank timestams being useful. If you get the current vblank event with timestamp at start of vblank ie. a bit into the short fixed duration front porch, you'd have to render and flip within what small amount of time is left unti the end of the fixed front-porch, because once you get into variable extended front-porch your vblank timestamp is already meaningless. I also think if we want to implement something like the ability to flip for a specific target scanout time -- which i'd really like to have -- having the true end of last vblank gives us a baseline timestamp that is easier to reason about inside the kernel, similar to how a compositor can reason/adjust to when to start composing the next frame or submitting a flip to hit some consistent framerate, or in turn to better obey a presentation request of a client for a specific target time. At the same time we need to keep current behaviour for non-VRR mode, with early vblanks, so all the target_msc vblank count based scheduling keeps working. I don't think clients that are written to use VRR should expect exactly the same timing behaviour wrt. when vblank events get sent or timestamps updated. I certainly expect to have separate code-paths in my app for optimal VRR vs. non-VRR presentation scheduling, a compositor or movie player might adapt in the same way, and fullscreen apps can opt-out of VRR if they want. Clients just need to know what the behavior is in VRR mode. I wonder what we should do for the kernel docs for Linux 5.0-rc? I assume we won't have the time to fix this all before release, would it make sense to have some kernel-docs updates via drm-fixes that declare the vblank/pageflip timestamps as "undefined"/"not to be relied on"/"subject to future change" for this release - essentially declare vblank timestamps as unstable api for the moment, so nobody relies on the current quite broken timestamp behaviour? I do now have hacked up proof-of-concept patches here that after torturing them for a few hours with my own tests now seem to be able to calculate correct pageflip timestamps regardless which of the two options we choose, but they need refinement/cleanup/rebasing onto somethign else than Linux 5.0-rc5, and help from Nicholas, because i tend to get lost in all the indirections and file hierarchy inside the DisplayCore code etc., and probably also a few kinks worked out. And see if shifting the vblank irq handling doesn't cause other weird interactions? I haven't looked how Nicholas below-the-range/framerate compensation works in detail and if it matter that it happens at start of front-porch. I can only say that it doesn't seem to interfere with correct timestamping if i run with framerates intentionally below vrr-min, as far as my limited testing goes. Another thing that may be problematic is disabling/reenabling vblank during idle times, given that we can only reliably timestamp from within pageflip irq handler or vblank irq ~ back-porch/active scanout. Maybe we need to drm_crtc_vblank_get() hold a reference while VRR is active to simplify things for the moment. One thing one could merge is that other VRR friendly throttling patch i sent, given that that works and is independent of the timestamp stuff. I also have a backport for Linux 5.0-rc if that makes sense? That VRR throttling also needs to get reworked if/once we wanted to emit matching vblank/pflip timestamps, as that doesn't like the shifted point inside vblank when vblank counter gets updated. Basically i don't know when the deadlines for merging this kind of work are/were wrt. Linux 5.0, 5.1, 5.2? I'll look more into cleaning/testing those patches tomorrow after lots of sleep and then see what Nicholas thinks of them. One problem with the VRR stuff is that the hardware testing equipment, that i'd usually use to verify if timestamps are really correct and accurate, doesn't work with DisplayPort or HDMI signals, so i can only test for self-consistency of timestamps atm., and look if plots of timestamps look how i would expect them to look if stuff is mostly correct. -mario > > But like you said, the only thing I can think of that should need the > > event right away is low latency / framebuffer rendering. > > > > I can think at least one potential use case where you would want to do > > front buffer rendering for low latency with VRR - emulation of old > > console hardware. There's not really any modern display that has fixed > > refresh modes that match the render frequency of some consoles. But this > > use case isn't out there in the wild right now at least. > > Even on an emulator I'd expect it'd just do regular flips to the > frontbuffer. Especially on mobile there's really no such thing as > frontbuffer rendering, you always do a flip. The DIRTYFB ioctl (which > you need to call instead of doing a page flip) simply does a pageflip > behind your back for you. Which I think is worse than actually > flipping. > > And for emulating exact refresh rates your then again go back to > hrtimer + issuing the flip approach, but using your desired refresh > rate to tune the hrtimer instead of the fixed one. But that needs a > correct vblank timestamp for the current frame to work correctly (or > you force compositor has to always flip, which defeats the power > saving aspects of VRR allowing to stretch refresh rates when nothing > changes). > > > Regarding compositors: I don't really recommend VRR for this use case in > > general since you'll want a fixed refresh rate frequency for input to > > feel consistent and to avoid any flickering / luminance changes. The > > fastest refresh rate will be the most responsive to the user and you can > > just used the fixed refresh rate mode in this case. Lowering the refresh > > rate to save power sounds nice in theory but I'd imagine it'd feel > > terrible in practice. > > Why is a compositor any different from a game? You aim to render at > 60fps, or whatever the refresh rate is. If you're one frame late it's > better to stretch the vblank slightly than a full on 1-frame freeze. > And on mobile it's unfortunately fairly common that the system fails > to clock up quickly enough when starting from an idle state, VRR > sounds like really useful here. > > But like in a game if you don't at least aim to render at a consistent > fps, and handle your input at a consistent fps, then the end result > will indeed be terrible. > -Daniel > > > Nicholas Kazlauskas > > > > > > > >>>>> Now added complication is that amdgpu sends out vblank events really > > >>>>> early, which is used by userspace to do frontbuffer rendering in the > > >>>>> vblank time. But I don't think anyone wants to do both VRR and > > >>>> > > >>>> I think all kms drivers try to call drm_crtc_handle_vblank() at start > > >>>> of vblank to give Mesa the most time for frontbuffer rendering for > > >>>> classic X. But vblank events are also used for scheduling bufferswaps > > >>>> or other stuff for redirected windowed rendering, or via api's like > > >>>> OML_sync_controls glXWaitForMscOML, so there might be other things > > >>>> affected by a more delayed vblank handling. > > >>> > > >>> The frontbuffer rendering is very much X driver specific, and I think > > >>> -amdgpu/radeon is the only one that requires this. No i915 driver ever > > >>> used the vblank interrupt to schedule frontbuffer blits, we use some > > >>> CS-side stalls. > > >>> > > >>> Wrt scheduling pageflips: The rule is that right after you've received the > > >>> vblank for frame X, then an immediately schedule pageflip should hit X+1, > > >>> but not X. amdgpu had this broken, but it's fixed since a while. > > >>> > > >>>>> frontbuffer rendering, hence I think it should be possible to create > > >>>>> correct vblank timestamps for the VRR case, while leaving the current > > >>>>> logic in place for everything else. But that means moving the entire > > >>>>> vblank processing to where you know the next frame's start time for > > >>>>> VRR, both for page flips and for all other vblank events. > > >>>>> -Daniel > > >>>>> > > >>>> > > >>>> I think for amdgpu it would be doable by calling drm_handle_vblank > > >>>> from the pageflip irq handler in VRR mode, and then additionally from > > >>>> the vblank interrupt handler like now. Vblank irq's are triggered via > > >>>> vline interrupts with programmable vertical trigger line position, so > > >>>> we could program the trigger line to be start of back-porch instead of > > >>>> start of front-porch. In the pageflip case we do vblank handling + > > >>>> timestamping + pflip completion from pflip irq, and have the regular > > >>>> back-porch vblank handling for refresh cycles in which no > > >>>> pageflip/pflip irq happened. > > >>> > > >>> Hm, can't we simply program the vblank to occur in the back porch region > > >>> for VRR? The vblank timestamping should automatically correct the > > >>> timestamp in that case. Would require a lot less code changes. > > >>> > > >>> Plus update the documentation ofc. > > >>> > > >>>> Not sure if/how that would mess with below-vrr-refresh-rate-range > > >>>> support for stutter reduction at low fps though? Or with performance, > > >>>> given vblank event dispatch could be delayed a lot in VRR compared to > > >>>> normal mode? Or with all the vblank accounting during modesets or crtc > > >>>> en/disable? > > >> > > >> I think we have extra VLINEs that can be programmed at vpos 0 that could > > >> resolve this but I'm still not really sold on the idea. > > >> > > >>> > > >>> Currently VRR and explicit frame scheduling don't mix. We've discussed > > >>> this, and consens seems to be that we need a new property for page_flip, > > >>> similar to the target frame, but instead of frames a timestamp. Then the > > >>> driver can hit a specific time for the next frame. > > >>> > > >>> And since compositors use the vblank ioctl just to do frame scheduling, I > > >>> don't think we have to worry hugely about interactions there. Much better > > >>> to aim for clean semantics (i.e. vblank and flip complete timestamps > > >>> better match). > > >>> > > >>>> Also, could other drivers like Intel easily do this delayed vblank > > >>>> processing in the non-pageflip case? > > >>> > > >>> Intel's problem :-) But yeah we have page flip timestamp registers, so > > >>> worst case all we need to do is shuffle code around a bit. And we have > > >>> interrupts for both start of vblank and and of vblank. It might be a bit > > >>> more work because we use the vblank interrupt to avoid races in atomic > > >>> updates, but that's not too terrible to fix. > > >>> > > >>>> From my user perspective as a developer of a neuroscience research > > >>>> application that has a couple of exciting/important use cases for VRR > > >>>> mode, i absolutely do need trustworthy and precise pageflip event > > >>>> timestamps that represent reality, otherwise VRR mode would be less > > >>>> than useless for me. I can do without meaningful vblank timestamps in > > >>>> VRR mode though, and expected to do without them, as any scheduling > > >>>> has to happen time-based instead of based on vblank counts anyway, > > >>>> given that vblank counts are quite meaningless if every frame can have > > >>>> a different duration. I'd expect that situation to be similar for > > >>>> other apps that would want to use timestamps in VRR mode like video > > >>>> games, movie players or VR/AR applications. > > >>> > > >>> Yeah for that you want to schedule your frames with a target frame number. > > >>> Would still be nice if vblank timestamps wouldn't be an outright lie (and > > >>> the timestamp is still useful information, even if the frame counter > > >>> isn't - many displays have min/max frame rates for VRR, which is > > >>> information we could perhaps expose). > > >>> -Daniel > > > >>>> > > >>>> -mario > > >>>> > > >>>>>> > > >>>>>> Nicholas Kazlauskas > > >>>>>> > > >>>>>>> > > >>>>>>>> --- > > >>>>>>>> drivers/gpu/drm/drm_vblank.c | 49 +++++++++++++++++++++++++++++++++++- > > >>>>>>>> include/drm/drm_vblank.h | 8 ++++++ > > >>>>>>>> 2 files changed, 56 insertions(+), 1 deletion(-) > > >>>>>>>> > > >>>>>>>> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c > > >>>>>>>> index 98e091175921..4b3a4c38fabe 100644 > > >>>>>>>> --- a/drivers/gpu/drm/drm_vblank.c > > >>>>>>>> +++ b/drivers/gpu/drm/drm_vblank.c > > >>>>>>>> @@ -814,10 +814,21 @@ static void send_vblank_event(struct drm_device *dev, > > >>>>>>>> u64 seq, ktime_t now) > > >>>>>>>> { > > >>>>>>>> struct timespec64 tv; > > >>>>>>>> + ktime_t alt_flip_time; > > >>>>>>>> > > >>>>>>>> switch (e->event.base.type) { > > >>>>>>>> - case DRM_EVENT_VBLANK: > > >>>>>>>> case DRM_EVENT_FLIP_COMPLETE: > > >>>>>>>> + /* > > >>>>>>>> + * For flip completion events, override "now" time > > >>>>>>>> + * with alt_flip_time provided by the driver via > > >>>>>>>> + * drm_crtc_set_vrr_pageflip_timestamp() in VRR mode > > >>>>>>>> + * if that time is later than given "now" vblank time. > > >>>>>>>> + */ > > >>>>>>>> + alt_flip_time = dev->vblank[e->pipe].alt_flip_time; > > >>>>>>>> + if (alt_flip_time > now) > > >>>>>>>> + now = alt_flip_time; > > >>>>>>>> + /* Fallthrough */ > > >>>>>>>> + case DRM_EVENT_VBLANK: > > >>>>>>>> tv = ktime_to_timespec64(now); > > >>>>>>>> e->event.vbl.sequence = seq; > > >>>>>>>> /* > > >>>>>>>> @@ -916,11 +927,47 @@ void drm_crtc_send_vblank_event(struct drm_crtc *crtc, > > >>>>>>>> > > >>>>>>>> now = ktime_get(); > > >>>>>>>> } > > >>>>>>>> + > > >>>>>>>> e->pipe = pipe; > > >>>>>>>> send_vblank_event(dev, e, seq, now); > > >>>>>>>> } > > >>>>>>>> EXPORT_SYMBOL(drm_crtc_send_vblank_event); > > >>>>>>>> > > >>>>>>>> +/** > > >>>>>>>> + * drm_crtc_set_vrr_pageflip_timestamp - helper to set alternate pageflip time > > >>>>>>>> + * @crtc: the source CRTC of the pageflip completion event > > >>>>>>>> + * @flip_time: The alternate pageflip completion timestamp in VRR mode > > >>>>>>>> + * > > >>>>>>>> + * In variable refresh rate mode (VRR), a pageflip completion timestamp carried > > >>>>>>>> + * by the pageflip event can never be earlier than the vblank timestamp of the > > >>>>>>>> + * vblank of flip completion, as that vblank timestamp defines the end of the > > >>>>>>>> + * shortest possible vblank duration. In case of a delayed flip completion > > >>>>>>>> + * inside the extended VRR front porch however, the end of vblank can be much > > >>>>>>>> + * later, so the driver must assign an estimated timestamp of that later end of > > >>>>>>>> + * vblank. For a CRTC in VRR mode, the driver should use this helper function to > > >>>>>>>> + * set an alternate flip completion timestamp in case of late flip completions > > >>>>>>>> + * in extended vblank. In the most simple case, this @flip_time timestamp could > > >>>>>>>> + * simply be a ktime_get() timestamp taken at the start of the pageflip > > >>>>>>>> + * completion routine, with some constant duration of the back porch interval > > >>>>>>>> + * added, although more precise estimates may be possible on some hardware if > > >>>>>>>> + * the hardware provides some means of timestamping the true end of vblank. > > >>>>>>>> + * > > >>>>>>>> + * When sending out pageflip events, e.g., via drm_crtc_send_vblank_event(), it > > >>>>>>>> + * will use either the standard vblank timestamp, calculated for a minimum > > >>>>>>>> + * duration vblank, or the provided @flip_time if that time is later than the > > >>>>>>>> + * vblank timestamp, to get the best possible estimate of start of display of > > >>>>>>>> + * the new post-pageflip scanout buffer. > > >>>>>>>> + */ > > >>>>>>>> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, > > >>>>>>>> + ktime_t flip_time) > > >>>>>>>> +{ > > >>>>>>>> + struct drm_device *dev = crtc->dev; > > >>>>>>>> + struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)]; > > >>>>>>>> + > > >>>>>>>> + vblank->alt_flip_time = flip_time; > > >>>>>>>> +} > > >>>>>>>> +EXPORT_SYMBOL(drm_crtc_set_vrr_pageflip_timestamp); > > >>>>>>>> + > > >>>>>>>> static int __enable_vblank(struct drm_device *dev, unsigned int pipe) > > >>>>>>>> { > > >>>>>>>> if (drm_core_check_feature(dev, DRIVER_MODESET)) { > > >>>>>>>> diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h > > >>>>>>>> index 6ad9630d4f48..aacf44694ab6 100644 > > >>>>>>>> --- a/include/drm/drm_vblank.h > > >>>>>>>> +++ b/include/drm/drm_vblank.h > > >>>>>>>> @@ -117,6 +117,12 @@ struct drm_vblank_crtc { > > >>>>>>>> * @time: Vblank timestamp corresponding to @count. > > >>>>>>>> */ > > >>>>>>>> ktime_t time; > > >>>>>>>> + /** > > >>>>>>>> + * @alt_flip_time: Vblank timestamp for end of extended vblank due to > > >>>>>>>> + * a late pageflip completion in variable refresh rate mode. Pageflip > > >>>>>>>> + * events will carry the later one of @time and @alt_flip_time. > > >>>>>>>> + */ > > >>>>>>>> + ktime_t alt_flip_time; > > >>>>>>>> > > >>>>>>>> /** > > >>>>>>>> * @refcount: Number of users/waiters of the vblank interrupt. Only when > > >>>>>>>> @@ -179,6 +185,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs); > > >>>>>>>> u64 drm_crtc_vblank_count(struct drm_crtc *crtc); > > >>>>>>>> u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc, > > >>>>>>>> ktime_t *vblanktime); > > >>>>>>>> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, > > >>>>>>>> + ktime_t flip_time); > > >>>>>>>> void drm_crtc_send_vblank_event(struct drm_crtc *crtc, > > >>>>>>>> struct drm_pending_vblank_event *e); > > >>>>>>>> void drm_crtc_arm_vblank_event(struct drm_crtc *crtc, > > >>>>>>>> -- > > >>>>>>>> 2.17.1 > > >>>>>>>> > > >>>>>>>> _______________________________________________ > > >>>>>>>> dri-devel mailing listjj > > >>>>>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel > > >>>>>>> > > >>>>>> > > >>>>> > > >>>>> > > >>>>> -- > > >>>>> Daniel Vetter > > >>>>> Software Engineer, Intel Corporation > > >>>>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch > > >>> > > >> > > >> Nicholas Kazlauskas > > >> > > > > > > > > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
On 2/13/19 1:10 PM, Mario Kleiner wrote: > On Wed, Feb 13, 2019 at 5:03 PM Daniel Vetter <daniel@ffwll.ch> wrote: >> >> On Wed, Feb 13, 2019 at 4:46 PM Kazlauskas, Nicholas >> <Nicholas.Kazlauskas@amd.com> wrote: >>> >>> On 2/13/19 10:14 AM, Daniel Vetter wrote: >>>> On Wed, Feb 13, 2019 at 3:33 PM Kazlauskas, Nicholas >>>> <Nicholas.Kazlauskas@amd.com> wrote: >>>>> >>>>> On 2/13/19 4:50 AM, Daniel Vetter wrote: >>>>>> On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote: >>>>>>> On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter <daniel@ffwll.ch> wrote: >>>>>>>> >>>>>>>> On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas >>>>>>>> <Nicholas.Kazlauskas@amd.com> wrote: >>>>>>>>> >>>>>>>>> On 2/11/19 3:35 AM, Daniel Vetter wrote: >>>>>>>>>> On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote: >>>>>>>>>>> The pageflip completion timestamps transmitted to userspace >>>>>>>>>>> via pageflip completion events are supposed to describe the >>>>>>>>>>> time at which the first pixel of the new post-pageflip scanout >>>>>>>>>>> buffer leaves the video output of the gpu. This time is >>>>>>>>>>> identical to end of vblank, when active scanout starts. >>>>>>>>>>> >>>>>>>>>>> For a crtc in standard fixed refresh rate, the end of vblank >>>>>>>>>>> is identical to the vblank timestamps calculated by >>>>>>>>>>> drm_update_vblank_count() at each vblank interrupt, or each >>>>>>>>>>> vblank dis-/enable. Therefore pageflip events just carry >>>>>>>>>>> that vblank timestamp as their pageflip timestamp. >>>>>>>>>>> >>>>>>>>>>> For a crtc switched to variable refresh rate mode (vrr), the >>>>>>>>>>> pageflip completion timestamps are identical to the vblank >>>>>>>>>>> timestamps iff the pageflip was executed early in vblank, >>>>>>>>>>> before the minimum vblank duration elapsed. In this case >>>>>>>>>>> the time of display onset is identical to when the crtc >>>>>>>>>>> is running in fixed refresh rate. >>>>>>>>>>> >>>>>>>>>>> However, if a pageflip completes later in the vblank, inside >>>>>>>>>>> the "extended front porch" in vrr mode, then the vblank will >>>>>>>>>>> terminate at a fixed (back porch) duration after flip, so >>>>>>>>>>> the display onset time is delayed correspondingly. In this >>>>>>>>>>> case the vblank timestamp computed at vblank irq time would >>>>>>>>>>> be too early, and we need a way to calculate an estimated >>>>>>>>>>> pageflip timestamp that will be later than the vblank timestamp. >>>>>>>>>>> >>>>>>>>>>> How a driver determines such a "late flip" timestamp is hw >>>>>>>>>>> and driver specific, but this patch adds a new helper function >>>>>>>>>>> that allows the driver to propose such an alternate "late flip" >>>>>>>>>>> timestamp for use in pageflip events: >>>>>>>>>>> >>>>>>>>>>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); >>>>>>>>>>> >>>>>>>>>>> When sending out pageflip events, we now compare that proposed >>>>>>>>>>> flip_timestamp against the vblank timestamp of the current >>>>>>>>>>> vblank of flip completion and choose to send out the greater/ >>>>>>>>>>> later timestamp as flip completion timestamp. >>>>>>>>>>> >>>>>>>>>>> The most simple way for a kms driver to supply a suitable >>>>>>>>>>> flip_timestamp in vrr mode would be to simply take a timestamp >>>>>>>>>>> at start of the pageflip completion handler, e.g., pageflip >>>>>>>>>>> irq handler: flip_timestamp = ktime_get(); and then set that >>>>>>>>>>> as proposed "late" alternative timestamp via ... >>>>>>>>>>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); >>>>>>>>>>> >>>>>>>>>>> More clever approaches could try to add some corrective offset >>>>>>>>>>> for fixed back porch duration, or ideally use hardware features >>>>>>>>>>> like hw timestamps to calculate the exact end time of vblank. >>>>>>>>>>> >>>>>>>>>>> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> >>>>>>>>>>> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> >>>>>>>>>>> Cc: Harry Wentland <harry.wentland@amd.com> >>>>>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com> >>>>>>>>>> >>>>>>>>>> Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only give us >>>>>>>>>> the right timestampe, once? With this I guess if you do a vblank query in >>>>>>>>>> between the wrong and the right vblank you'll get the bogus value. Not >>>>>>>>>> really great for userspace. >>>>>>>>>> -Daniel >>>>>>>>> >>>>>>>>> I think we calculate the timestamp and send the vblank event both within >>>>>>>>> the pageflip IRQ handler so calculating the right pageflip timestamp >>>>>>>>> once could probably be done. I'm not sure if it's easier than proposing >>>>>>>>> a later flip time with an API like this though. >>>>>>>>> >>>>>>>>> The actual scanout time should be known from the page-flip handler so >>>>>>>>> the semantics for VRR on/off remain the same. This is because the >>>>>>>>> page-flip triggers entering the back porch if we're in the extended >>>>>>>>> front porch. >>>>>>>>> >>>>>>>>> But scanout time from vblank events for something like >>>>>>>>> DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only >>>>>>>>> treated as estimates. If we're in the regular front porch then the >>>>>>>>> timing to scanout is based on the fixed duration front porch for the >>>>>>>>> current mode. If we're in the extended back porch then it's technically >>>>>>>>> driver defined but the most reasonable guess is to assume that the front >>>>>>>>> porch is going to end at any moment, so just return the length of the >>>>>>>>> back porch for getting the scanout time. >>>>>>>>> >>>>>>>>> Proposing the late timestamp shouldn't affect vblank event in the >>>>>>>>> DRM_IOCTL_WAIT_VBLANK case and should only be used in the page-flip >>>>>>>>> event case. I'm not sure if that's what's guaranteed to happen with this >>>>>>>>> patch though. There doesn't seem to be any locking on either >>>>>>>>> dev->vblank_time_lock or the vblank->seqlock so while it's likely to get >>>>>>>>> the same vblank event back as the one just stored I don't think it's >>>>>>>>> guaranteed. >>>>>>>> >>>>>>>> That's the inconsistency I mean to highlight - the timestamp for the >>>>>>>> same frame as observed through flip complete and through the >>>>>>>> wait_vblank ioctl can differ. Which they really shouldn't. >>>>>>>> >>>>>>> >>>>>>> Ideally they shouldn't differ. The kernel docs for drm_crtc_state say >>>>>>> that vblank and pageflip timestamps should always match. But then the >>>>>>> kernel docs for "Variable refresh properties" in drm_connector.c for >>>>>>> vblank timestamps were changed for the VRR implementation in Linux >>>>>>> 5.0-rc to redefine them when in VRR mode. They are defined, but >>>>>>> probably rather useless for any practical purpose, like this: >>>>>>> >>>>>>> "The semantics for the vertical blank timestamp differ when >>>>>>> variable refresh rate is active. The vertical blank timestamp >>>>>>> is defined to be an estimate using the current mode's fixed >>>>>>> refresh rate timings. The semantics for the page-flip event >>>>>>> timestamp remain the same." >>>>>> >>>>>> Uh I missed that. That sounds like nonsense tbh. >>>>>> >>>>>>> So our docs contradict each other as of Linux 5.0-rc. Certainly having >>>>>>> useful vblank timetamps would be useful. >>>>>> >>>>>> Yup, imo vblank should still match page_flip. Otherwise I expect a lot of >>>>>> hilarity will ensue. >>>>> >>>>> I would imagine you would see more breakage by changing userspace >>>>> expectations for when the event is actually received. >>>>> >>>>> The IOCTL is "wait for vblank", but if we start sending the event near >>>>> the end of vblank it becomes more "wait for pageflip" (near scanout) or >>>>> "wait 15ms while the VRR front porch times out and causes flickering". >>>>> >>>>> I don't speak for all userspace but it seems more useful to me to have >>>>> the event sent at the start of vblank. >>>> >>>> Maybe we need to things. The problem is that the timestampe is defined >>>> to be correct for start-of-frame. And right now the amdgpu >>>> implementation of VRR doesn't even give an accurate timestampe for the >>>> flip, so most likely no one yet thought about this properly. >>>> Definitely no testcases (or it would have been caught), whereas we >>>> have tons of testcases for fixed refresh. >>> >>> I agree that it would be good to have documentation for what waiting for >>> vblank is supposed to mean here. There's a big difference between >>> getting the event at the start of vsync vs the start of the back porch >>> near the end of it. >>> >>>> >>>>> VRR in its current state is useful for applications that aren't >>>>> completely timing sensitive (like most games). This is because the >>>>> driver has free reign in restricting the min/max bounds of the range - >>>>> allowing for enhancements like low framerate compensation / BTR. >>>>> >>>>> For a concrete example, imagine you're not doing frontbuffer rendering >>>>> but you're still trying to target a specific scanout time. So even if we >>>>> were to send the vblank event at vpos 0 you'd get an accurate scanout >>>>> timestamp for the next scanout after the vback porch ends. You're not >>>>> going to be doing any rendering or preparation in this time because it's >>>>> simply too short. So you'd target the scanout period after this one, >>>>> which you'd need to calculate somehow in userspace. Taking the deltas >>>>> between two vblank events is completely useless here because you'll get >>>>> back the vrr min refresh delta. So you can take your source content rate >>>>> or some fixed rate you'd like to target within the range but even then >>>>> you're still not guaranteed to have that flip end up at that scanout time. >>>>> >>>>> The current solution is to just return the scanout pos at vrr max, or >>>>> the current mode's fixed refresh rate. If you're flip before the fixed >>>>> duration front porch ends then that scanout time will likely be accurate >>>>> since you can't flip faster than vrr max. >>>>> >>>>> Is there actually userspace that cares that these timestamps need to >>>>> match? Like said below, it seems that wait for vblank is mostly about >>>>> scheduling and for that I think having the event sent back at the start >>>>> of vblank is more important here. >>>> >>>> There's 3 cases: >>>> - DRI3: Waits for vblank for frontbuffer blits. If you combine that >>>> with VRR then imo you deserve to keep all the pieces. >>>> - generic compositors. Use VRR for scheduling, expect that a page_flip >>>> queued immediately after a page_flip will hit the next frame, and not >>>> the current frame. They also expect that this will not give you the >>>> slowest possible refresh rate, but I guess VRR plus frame scheduling >>>> is currently out of scope. It's kinda just for as-fast-as-possible >>>> games. >>>> - flip timestamp. Currently the entire codebase assumes that vblank ts >>>> = flip ts. I don't really see a point in breaking that. >>>> - better scheduling of VRR: what compositors actually do isn't wait >>>> for the vblank, that only latches the page_flip. They set up a >>>> hrtimer, with a fixed/auto-tuned head-start, from which they start >>>> their rendering. The same could be done for VRR, if we expose the >>>> earliest/latest vblank end times (and how much they can shift). vblank >>>> ioctl is only used for syncing that up and making sure we're hitting >>>> the right frame. This is all for pageflipping compositors. This is >>>> also what they'll probably keep doing when we give them a desired flip >>>> time (instead of a target frame), they still need an hrtimer to fire a >>>> little bit ahead to get the frame rendered. >>>> >>>> So taking all together, we have 2 use-cases I think: >>>> - frontbuffer rendering. Kinda wants vblank event to fire at the >>>> beginning of vblank. Won't ever do VRR >>>> - pageflipping compositors. Don't care about the vblank event, either >>>> they have an hrtimer timer (which gets synced with the vblank ioctl), >>>> or just flip as fast as possible. They don't care one bit when the >>>> vblank event happens, as long as a pageflip scheduled immediately >>>> after the event shows up only hits the next frame (which is the delay >>>> code that currently stretches vblank unecessarily with VRR if you're >>>> already a bit late). >>>> >>>> Making sure that vblank and page flip timestamp match the real >>>> userspace seems like the cleanest semantics, and the least surprising >>>> to userspace. E.g. what happens if you implement VRR in a compositor, >>>> but then tune your hrtimer within the vblank, when the vblank >>>> timestamp is totally not reflecting your real pageflip? I just don't >>>> see what's the upside of having a vblank timestamp that's not actually >>>> reflecting the real flip. >>>> >>>> Also, if your app wants the start of vblank, that's very easy to >>>> compute: Just add the scanout time to the last vblank timestamp. >>>> Figuring out the start of frame otoh isn't doable if you can't query, >>>> and constantly pageflipping just to have an accurate start-of-frame >>>> (for tuning your hrtimer) seems rather wasteful - it outright defeats >>>> some of the use-cases for VRR. >>>> >>>> So what benefit do you see in having vblank ts and pageflip ts not >>>> agree? What's the use-case this solves - aside from "less typing" :-) >>>> If the only use-case is VRR for frontbuffer rendering, then that >>>> doesn't even work right now (aside from that I think it's something we >>>> shouldn't support and definitely not encourage). >>>> -Daniel >>> >>> I'm actually fine with moving the vblank event back to the back porch >>> but I'd just be worried about userspace breakages that relied on it >>> happening the other way. >>> >>> It'd be especially confusing if the VLINE was different for fixed vs >>> variable refresh. I wouldn't want to move it to vpos 0 unless the event >>> is also getting moved for fixed refresh as well. >> >> That would break DRI3. So we can pick between two options: >> - always send out the vblank early, accept that vblank timestamps change. >> - for VRR only, send out the vblank only when we know the correct >> timestamp. For non-VRR we have to send out the vblank early, because >> that's somewhat become api (but not all drivers do it that way, so >> it's about as hopeless as frontbuffer rendering in general). >> >> I still don't see the benefit of the first option, or the exact use case. >> > > Still haven't slept so i don't know if i'm adding anything to the > discussion that Daniel hasn't stated already, but anyway: > > I think i agree with Daniel on this. Having a vblank ts that truly > reflects the end of last vblank/start of scanout should make > time-based scheduling easier. Then we know when scanout of the active > frame starts and calculate from known fixed scanout duration when next > vblank starts and when it ends the latest. On X11 you can > glXGetSyncValuesOML() query that start time to sync up a compositors > update cycle or start of an animation or timed presentation sequence > with vblank. E.g., Waylands Weston does this when switching from idle > desktop to redraw loop. A movie player or my own software could use > this info in the same way, whereas i don't see the current vblank > timestams being useful. If you get the current vblank event with > timestamp at start of vblank ie. a bit into the short fixed duration > front porch, you'd have to render and flip within what small amount of > time is left unti the end of the fixed front-porch, because once you > get into variable extended front-porch your vblank timestamp is > already meaningless. Yeah, the timestamp is more useful for preparing animation/content/etc for the scanout after next. I agree that it should be the same, but there are the challenges in not breaking userspace or the driver. Wait for vblank is really poorly defined as is. Defining it explicitly for VRR as the start of back porch is fine as long as it is documented somewhere. Implementation wise it's going to be a pain since we'd need separate interrupts for vpos=vfront and vpos=0 for proper BTR support, but I think all the ASICs are capable of this. This is simple enough to enforce in IGT testing as well, wait for vblank, grab the timestamp, then do the flip targeting something within the vrr range and grab the timestamp and see if the values are within some sane threshold. > > I also think if we want to implement something like the ability to > flip for a specific target scanout time -- which i'd really like to > have -- having the true end of last vblank gives us a baseline > timestamp that is easier to reason about inside the kernel, similar to > how a compositor can reason/adjust to when to start composing the next > frame or submitting a flip to hit some consistent framerate, or in > turn to better obey a presentation request of a client for a specific > target time. Yes, the target presentation timestamp is ideal here. It probably should have been part of the original proposal but there wasn't anything in userspace that would have been using it. It would have been nice for IGT testing though. > > At the same time we need to keep current behaviour for non-VRR mode, > with early vblanks, so all the target_msc vblank count based > scheduling keeps working. I don't think clients that are written to > use VRR should expect exactly the same timing behaviour wrt. when > vblank events get sent or timestamps updated. I certainly expect to > have separate code-paths in my app for optimal VRR vs. non-VRR > presentation scheduling, a compositor or movie player might adapt in > the same way, and fullscreen apps can opt-out of VRR if they want. FWIW most compositors/movie players/web browsers are blocked from using VRR right not in a blacklist in mesa. So maybe we don't have to worry about this yet at least. > > Clients just need to know what the behavior is in VRR mode. I wonder > what we should do for the kernel docs for Linux 5.0-rc? I assume we > won't have the time to fix this all before release, would it make > sense to have some kernel-docs updates via drm-fixes that declare the > vblank/pageflip timestamps as "undefined"/"not to be relied > on"/"subject to future change" for this release - essentially declare > vblank timestamps as unstable api for the moment, so nobody relies on > the current quite broken timestamp behaviour? > > I do now have hacked up proof-of-concept patches here that after > torturing them for a few hours with my own tests now seem to be able > to calculate correct pageflip timestamps regardless which of the two > options we choose, but they need refinement/cleanup/rebasing onto > somethign else than Linux 5.0-rc5, and help from Nicholas, because i > tend to get lost in all the indirections and file hierarchy inside the > DisplayCore code etc., and probably also a few kinks worked out. And > see if shifting the vblank irq handling doesn't cause other weird > interactions? I haven't looked how Nicholas below-the-range/framerate > compensation works in detail and if it matter that it happens at start > of front-porch. I can only say that it doesn't seem to interfere with > correct timestamping if i run with framerates intentionally below > vrr-min, as far as my limited testing goes. Another thing that may be > problematic is disabling/reenabling vblank during idle times, given > that we can only reliably timestamp from within pageflip irq handler > or vblank irq ~ back-porch/active scanout. Maybe we need to > drm_crtc_vblank_get() hold a reference while VRR is active to simplify > things for the moment. BTR support has the bit in the vblank IRQ handler that adjusts the min/max duration for the display and if we're doing that in the back porch it's already too late. Another vline interrupt could be installed here and only enabled when VRR goes on. If VRR is on then that should be the source for updating the vblank count. This would probably have some sort of messy interaction with the amdgpu kms counter bit I mentioned in another email though, but I haven't really looked into this in depth yet. I think just calling the wait for vblank IOCTL will grab the reference to the vblank and it should enable the interrupts. I think holding an extra reference could be a simple solution if that's a problem though. > > One thing one could merge is that other VRR friendly throttling patch > i sent, given that that works and is independent of the timestamp > stuff. I also have a backport for Linux 5.0-rc if that makes sense? > That VRR throttling also needs to get reworked if/once we wanted to > emit matching vblank/pflip timestamps, as that doesn't like the > shifted point inside vblank when vblank counter gets updated. > > Basically i don't know when the deadlines for merging this kind of > work are/were wrt. Linux 5.0, 5.1, 5.2? > > I'll look more into cleaning/testing those patches tomorrow after lots > of sleep and then see what Nicholas thinks of them. > > One problem with the VRR stuff is that the hardware testing equipment, > that i'd usually use to verify if timestamps are really correct and > accurate, doesn't work with DisplayPort or HDMI signals, so i can only > test for self-consistency of timestamps atm., and look if plots of > timestamps look how i would expect them to look if stuff is mostly > correct. > > -mario The throttling patch is probably fine to merge. There are a few emails on that chain that I need to read through, however. Thanks for looking into this. Nicholas Kazlauskas > >>> But like you said, the only thing I can think of that should need the >>> event right away is low latency / framebuffer rendering. >>> >>> I can think at least one potential use case where you would want to do >>> front buffer rendering for low latency with VRR - emulation of old >>> console hardware. There's not really any modern display that has fixed >>> refresh modes that match the render frequency of some consoles. But this >>> use case isn't out there in the wild right now at least. >> >> Even on an emulator I'd expect it'd just do regular flips to the >> frontbuffer. Especially on mobile there's really no such thing as >> frontbuffer rendering, you always do a flip. The DIRTYFB ioctl (which >> you need to call instead of doing a page flip) simply does a pageflip >> behind your back for you. Which I think is worse than actually >> flipping. >> >> And for emulating exact refresh rates your then again go back to >> hrtimer + issuing the flip approach, but using your desired refresh >> rate to tune the hrtimer instead of the fixed one. But that needs a >> correct vblank timestamp for the current frame to work correctly (or >> you force compositor has to always flip, which defeats the power >> saving aspects of VRR allowing to stretch refresh rates when nothing >> changes). >> >>> Regarding compositors: I don't really recommend VRR for this use case in >>> general since you'll want a fixed refresh rate frequency for input to >>> feel consistent and to avoid any flickering / luminance changes. The >>> fastest refresh rate will be the most responsive to the user and you can >>> just used the fixed refresh rate mode in this case. Lowering the refresh >>> rate to save power sounds nice in theory but I'd imagine it'd feel >>> terrible in practice. >> >> Why is a compositor any different from a game? You aim to render at >> 60fps, or whatever the refresh rate is. If you're one frame late it's >> better to stretch the vblank slightly than a full on 1-frame freeze. >> And on mobile it's unfortunately fairly common that the system fails >> to clock up quickly enough when starting from an idle state, VRR >> sounds like really useful here. >> >> But like in a game if you don't at least aim to render at a consistent >> fps, and handle your input at a consistent fps, then the end result >> will indeed be terrible. >> -Daniel >> >>> Nicholas Kazlauskas >>> >>>> >>>>>>>> Now added complication is that amdgpu sends out vblank events really >>>>>>>> early, which is used by userspace to do frontbuffer rendering in the >>>>>>>> vblank time. But I don't think anyone wants to do both VRR and >>>>>>> >>>>>>> I think all kms drivers try to call drm_crtc_handle_vblank() at start >>>>>>> of vblank to give Mesa the most time for frontbuffer rendering for >>>>>>> classic X. But vblank events are also used for scheduling bufferswaps >>>>>>> or other stuff for redirected windowed rendering, or via api's like >>>>>>> OML_sync_controls glXWaitForMscOML, so there might be other things >>>>>>> affected by a more delayed vblank handling. >>>>>> >>>>>> The frontbuffer rendering is very much X driver specific, and I think >>>>>> -amdgpu/radeon is the only one that requires this. No i915 driver ever >>>>>> used the vblank interrupt to schedule frontbuffer blits, we use some >>>>>> CS-side stalls. >>>>>> >>>>>> Wrt scheduling pageflips: The rule is that right after you've received the >>>>>> vblank for frame X, then an immediately schedule pageflip should hit X+1, >>>>>> but not X. amdgpu had this broken, but it's fixed since a while. >>>>>> >>>>>>>> frontbuffer rendering, hence I think it should be possible to create >>>>>>>> correct vblank timestamps for the VRR case, while leaving the current >>>>>>>> logic in place for everything else. But that means moving the entire >>>>>>>> vblank processing to where you know the next frame's start time for >>>>>>>> VRR, both for page flips and for all other vblank events. >>>>>>>> -Daniel >>>>>>>> >>>>>>> >>>>>>> I think for amdgpu it would be doable by calling drm_handle_vblank >>>>>>> from the pageflip irq handler in VRR mode, and then additionally from >>>>>>> the vblank interrupt handler like now. Vblank irq's are triggered via >>>>>>> vline interrupts with programmable vertical trigger line position, so >>>>>>> we could program the trigger line to be start of back-porch instead of >>>>>>> start of front-porch. In the pageflip case we do vblank handling + >>>>>>> timestamping + pflip completion from pflip irq, and have the regular >>>>>>> back-porch vblank handling for refresh cycles in which no >>>>>>> pageflip/pflip irq happened. >>>>>> >>>>>> Hm, can't we simply program the vblank to occur in the back porch region >>>>>> for VRR? The vblank timestamping should automatically correct the >>>>>> timestamp in that case. Would require a lot less code changes. >>>>>> >>>>>> Plus update the documentation ofc. >>>>>> >>>>>>> Not sure if/how that would mess with below-vrr-refresh-rate-range >>>>>>> support for stutter reduction at low fps though? Or with performance, >>>>>>> given vblank event dispatch could be delayed a lot in VRR compared to >>>>>>> normal mode? Or with all the vblank accounting during modesets or crtc >>>>>>> en/disable? >>>>> >>>>> I think we have extra VLINEs that can be programmed at vpos 0 that could >>>>> resolve this but I'm still not really sold on the idea. >>>>> >>>>>> >>>>>> Currently VRR and explicit frame scheduling don't mix. We've discussed >>>>>> this, and consens seems to be that we need a new property for page_flip, >>>>>> similar to the target frame, but instead of frames a timestamp. Then the >>>>>> driver can hit a specific time for the next frame. >>>>>> >>>>>> And since compositors use the vblank ioctl just to do frame scheduling, I >>>>>> don't think we have to worry hugely about interactions there. Much better >>>>>> to aim for clean semantics (i.e. vblank and flip complete timestamps >>>>>> better match). >>>>>> >>>>>>> Also, could other drivers like Intel easily do this delayed vblank >>>>>>> processing in the non-pageflip case? >>>>>> >>>>>> Intel's problem :-) But yeah we have page flip timestamp registers, so >>>>>> worst case all we need to do is shuffle code around a bit. And we have >>>>>> interrupts for both start of vblank and and of vblank. It might be a bit >>>>>> more work because we use the vblank interrupt to avoid races in atomic >>>>>> updates, but that's not too terrible to fix. >>>>>> >>>>>>> From my user perspective as a developer of a neuroscience research >>>>>>> application that has a couple of exciting/important use cases for VRR >>>>>>> mode, i absolutely do need trustworthy and precise pageflip event >>>>>>> timestamps that represent reality, otherwise VRR mode would be less >>>>>>> than useless for me. I can do without meaningful vblank timestamps in >>>>>>> VRR mode though, and expected to do without them, as any scheduling >>>>>>> has to happen time-based instead of based on vblank counts anyway, >>>>>>> given that vblank counts are quite meaningless if every frame can have >>>>>>> a different duration. I'd expect that situation to be similar for >>>>>>> other apps that would want to use timestamps in VRR mode like video >>>>>>> games, movie players or VR/AR applications. >>>>>> >>>>>> Yeah for that you want to schedule your frames with a target frame number. >>>>>> Would still be nice if vblank timestamps wouldn't be an outright lie (and >>>>>> the timestamp is still useful information, even if the frame counter >>>>>> isn't - many displays have min/max frame rates for VRR, which is >>>>>> information we could perhaps expose). >>>>>> -Daniel > >>>>>>> >>>>>>> -mario >>>>>>> >>>>>>>>> >>>>>>>>> Nicholas Kazlauskas >>>>>>>>> >>>>>>>>>> >>>>>>>>>>> --- >>>>>>>>>>> drivers/gpu/drm/drm_vblank.c | 49 +++++++++++++++++++++++++++++++++++- >>>>>>>>>>> include/drm/drm_vblank.h | 8 ++++++ >>>>>>>>>>> 2 files changed, 56 insertions(+), 1 deletion(-) >>>>>>>>>>> >>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c >>>>>>>>>>> index 98e091175921..4b3a4c38fabe 100644 >>>>>>>>>>> --- a/drivers/gpu/drm/drm_vblank.c >>>>>>>>>>> +++ b/drivers/gpu/drm/drm_vblank.c >>>>>>>>>>> @@ -814,10 +814,21 @@ static void send_vblank_event(struct drm_device *dev, >>>>>>>>>>> u64 seq, ktime_t now) >>>>>>>>>>> { >>>>>>>>>>> struct timespec64 tv; >>>>>>>>>>> + ktime_t alt_flip_time; >>>>>>>>>>> >>>>>>>>>>> switch (e->event.base.type) { >>>>>>>>>>> - case DRM_EVENT_VBLANK: >>>>>>>>>>> case DRM_EVENT_FLIP_COMPLETE: >>>>>>>>>>> + /* >>>>>>>>>>> + * For flip completion events, override "now" time >>>>>>>>>>> + * with alt_flip_time provided by the driver via >>>>>>>>>>> + * drm_crtc_set_vrr_pageflip_timestamp() in VRR mode >>>>>>>>>>> + * if that time is later than given "now" vblank time. >>>>>>>>>>> + */ >>>>>>>>>>> + alt_flip_time = dev->vblank[e->pipe].alt_flip_time; >>>>>>>>>>> + if (alt_flip_time > now) >>>>>>>>>>> + now = alt_flip_time; >>>>>>>>>>> + /* Fallthrough */ >>>>>>>>>>> + case DRM_EVENT_VBLANK: >>>>>>>>>>> tv = ktime_to_timespec64(now); >>>>>>>>>>> e->event.vbl.sequence = seq; >>>>>>>>>>> /* >>>>>>>>>>> @@ -916,11 +927,47 @@ void drm_crtc_send_vblank_event(struct drm_crtc *crtc, >>>>>>>>>>> >>>>>>>>>>> now = ktime_get(); >>>>>>>>>>> } >>>>>>>>>>> + >>>>>>>>>>> e->pipe = pipe; >>>>>>>>>>> send_vblank_event(dev, e, seq, now); >>>>>>>>>>> } >>>>>>>>>>> EXPORT_SYMBOL(drm_crtc_send_vblank_event); >>>>>>>>>>> >>>>>>>>>>> +/** >>>>>>>>>>> + * drm_crtc_set_vrr_pageflip_timestamp - helper to set alternate pageflip time >>>>>>>>>>> + * @crtc: the source CRTC of the pageflip completion event >>>>>>>>>>> + * @flip_time: The alternate pageflip completion timestamp in VRR mode >>>>>>>>>>> + * >>>>>>>>>>> + * In variable refresh rate mode (VRR), a pageflip completion timestamp carried >>>>>>>>>>> + * by the pageflip event can never be earlier than the vblank timestamp of the >>>>>>>>>>> + * vblank of flip completion, as that vblank timestamp defines the end of the >>>>>>>>>>> + * shortest possible vblank duration. In case of a delayed flip completion >>>>>>>>>>> + * inside the extended VRR front porch however, the end of vblank can be much >>>>>>>>>>> + * later, so the driver must assign an estimated timestamp of that later end of >>>>>>>>>>> + * vblank. For a CRTC in VRR mode, the driver should use this helper function to >>>>>>>>>>> + * set an alternate flip completion timestamp in case of late flip completions >>>>>>>>>>> + * in extended vblank. In the most simple case, this @flip_time timestamp could >>>>>>>>>>> + * simply be a ktime_get() timestamp taken at the start of the pageflip >>>>>>>>>>> + * completion routine, with some constant duration of the back porch interval >>>>>>>>>>> + * added, although more precise estimates may be possible on some hardware if >>>>>>>>>>> + * the hardware provides some means of timestamping the true end of vblank. >>>>>>>>>>> + * >>>>>>>>>>> + * When sending out pageflip events, e.g., via drm_crtc_send_vblank_event(), it >>>>>>>>>>> + * will use either the standard vblank timestamp, calculated for a minimum >>>>>>>>>>> + * duration vblank, or the provided @flip_time if that time is later than the >>>>>>>>>>> + * vblank timestamp, to get the best possible estimate of start of display of >>>>>>>>>>> + * the new post-pageflip scanout buffer. >>>>>>>>>>> + */ >>>>>>>>>>> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, >>>>>>>>>>> + ktime_t flip_time) >>>>>>>>>>> +{ >>>>>>>>>>> + struct drm_device *dev = crtc->dev; >>>>>>>>>>> + struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)]; >>>>>>>>>>> + >>>>>>>>>>> + vblank->alt_flip_time = flip_time; >>>>>>>>>>> +} >>>>>>>>>>> +EXPORT_SYMBOL(drm_crtc_set_vrr_pageflip_timestamp); >>>>>>>>>>> + >>>>>>>>>>> static int __enable_vblank(struct drm_device *dev, unsigned int pipe) >>>>>>>>>>> { >>>>>>>>>>> if (drm_core_check_feature(dev, DRIVER_MODESET)) { >>>>>>>>>>> diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h >>>>>>>>>>> index 6ad9630d4f48..aacf44694ab6 100644 >>>>>>>>>>> --- a/include/drm/drm_vblank.h >>>>>>>>>>> +++ b/include/drm/drm_vblank.h >>>>>>>>>>> @@ -117,6 +117,12 @@ struct drm_vblank_crtc { >>>>>>>>>>> * @time: Vblank timestamp corresponding to @count. >>>>>>>>>>> */ >>>>>>>>>>> ktime_t time; >>>>>>>>>>> + /** >>>>>>>>>>> + * @alt_flip_time: Vblank timestamp for end of extended vblank due to >>>>>>>>>>> + * a late pageflip completion in variable refresh rate mode. Pageflip >>>>>>>>>>> + * events will carry the later one of @time and @alt_flip_time. >>>>>>>>>>> + */ >>>>>>>>>>> + ktime_t alt_flip_time; >>>>>>>>>>> >>>>>>>>>>> /** >>>>>>>>>>> * @refcount: Number of users/waiters of the vblank interrupt. Only when >>>>>>>>>>> @@ -179,6 +185,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs); >>>>>>>>>>> u64 drm_crtc_vblank_count(struct drm_crtc *crtc); >>>>>>>>>>> u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc, >>>>>>>>>>> ktime_t *vblanktime); >>>>>>>>>>> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, >>>>>>>>>>> + ktime_t flip_time); >>>>>>>>>>> void drm_crtc_send_vblank_event(struct drm_crtc *crtc, >>>>>>>>>>> struct drm_pending_vblank_event *e); >>>>>>>>>>> void drm_crtc_arm_vblank_event(struct drm_crtc *crtc, >>>>>>>>>>> -- >>>>>>>>>>> 2.17.1 >>>>>>>>>>> >>>>>>>>>>> _______________________________________________ >>>>>>>>>>> dri-devel mailing listjj >>>>>>>>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Daniel Vetter >>>>>>>> Software Engineer, Intel Corporation >>>>>>>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch >>>>>> >>>>> >>>>> Nicholas Kazlauskas >>>>> >>>> >>>> >>> >> >> >> -- >> Daniel Vetter >> Software Engineer, Intel Corporation >> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
On Wed, Feb 13, 2019 at 07:10:00PM +0100, Mario Kleiner wrote: > On Wed, Feb 13, 2019 at 5:03 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > On Wed, Feb 13, 2019 at 4:46 PM Kazlauskas, Nicholas > > <Nicholas.Kazlauskas@amd.com> wrote: > > > > > > On 2/13/19 10:14 AM, Daniel Vetter wrote: > > > > On Wed, Feb 13, 2019 at 3:33 PM Kazlauskas, Nicholas > > > > <Nicholas.Kazlauskas@amd.com> wrote: > > > >> > > > >> On 2/13/19 4:50 AM, Daniel Vetter wrote: > > > >>> On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote: > > > >>>> On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > > >>>>> > > > >>>>> On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas > > > >>>>> <Nicholas.Kazlauskas@amd.com> wrote: > > > >>>>>> > > > >>>>>> On 2/11/19 3:35 AM, Daniel Vetter wrote: > > > >>>>>>> On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote: > > > >>>>>>>> The pageflip completion timestamps transmitted to userspace > > > >>>>>>>> via pageflip completion events are supposed to describe the > > > >>>>>>>> time at which the first pixel of the new post-pageflip scanout > > > >>>>>>>> buffer leaves the video output of the gpu. This time is > > > >>>>>>>> identical to end of vblank, when active scanout starts. > > > >>>>>>>> > > > >>>>>>>> For a crtc in standard fixed refresh rate, the end of vblank > > > >>>>>>>> is identical to the vblank timestamps calculated by > > > >>>>>>>> drm_update_vblank_count() at each vblank interrupt, or each > > > >>>>>>>> vblank dis-/enable. Therefore pageflip events just carry > > > >>>>>>>> that vblank timestamp as their pageflip timestamp. > > > >>>>>>>> > > > >>>>>>>> For a crtc switched to variable refresh rate mode (vrr), the > > > >>>>>>>> pageflip completion timestamps are identical to the vblank > > > >>>>>>>> timestamps iff the pageflip was executed early in vblank, > > > >>>>>>>> before the minimum vblank duration elapsed. In this case > > > >>>>>>>> the time of display onset is identical to when the crtc > > > >>>>>>>> is running in fixed refresh rate. > > > >>>>>>>> > > > >>>>>>>> However, if a pageflip completes later in the vblank, inside > > > >>>>>>>> the "extended front porch" in vrr mode, then the vblank will > > > >>>>>>>> terminate at a fixed (back porch) duration after flip, so > > > >>>>>>>> the display onset time is delayed correspondingly. In this > > > >>>>>>>> case the vblank timestamp computed at vblank irq time would > > > >>>>>>>> be too early, and we need a way to calculate an estimated > > > >>>>>>>> pageflip timestamp that will be later than the vblank timestamp. > > > >>>>>>>> > > > >>>>>>>> How a driver determines such a "late flip" timestamp is hw > > > >>>>>>>> and driver specific, but this patch adds a new helper function > > > >>>>>>>> that allows the driver to propose such an alternate "late flip" > > > >>>>>>>> timestamp for use in pageflip events: > > > >>>>>>>> > > > >>>>>>>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); > > > >>>>>>>> > > > >>>>>>>> When sending out pageflip events, we now compare that proposed > > > >>>>>>>> flip_timestamp against the vblank timestamp of the current > > > >>>>>>>> vblank of flip completion and choose to send out the greater/ > > > >>>>>>>> later timestamp as flip completion timestamp. > > > >>>>>>>> > > > >>>>>>>> The most simple way for a kms driver to supply a suitable > > > >>>>>>>> flip_timestamp in vrr mode would be to simply take a timestamp > > > >>>>>>>> at start of the pageflip completion handler, e.g., pageflip > > > >>>>>>>> irq handler: flip_timestamp = ktime_get(); and then set that > > > >>>>>>>> as proposed "late" alternative timestamp via ... > > > >>>>>>>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); > > > >>>>>>>> > > > >>>>>>>> More clever approaches could try to add some corrective offset > > > >>>>>>>> for fixed back porch duration, or ideally use hardware features > > > >>>>>>>> like hw timestamps to calculate the exact end time of vblank. > > > >>>>>>>> > > > >>>>>>>> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> > > > >>>>>>>> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> > > > >>>>>>>> Cc: Harry Wentland <harry.wentland@amd.com> > > > >>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com> > > > >>>>>>> > > > >>>>>>> Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only give us > > > >>>>>>> the right timestampe, once? With this I guess if you do a vblank query in > > > >>>>>>> between the wrong and the right vblank you'll get the bogus value. Not > > > >>>>>>> really great for userspace. > > > >>>>>>> -Daniel > > > >>>>>> > > > >>>>>> I think we calculate the timestamp and send the vblank event both within > > > >>>>>> the pageflip IRQ handler so calculating the right pageflip timestamp > > > >>>>>> once could probably be done. I'm not sure if it's easier than proposing > > > >>>>>> a later flip time with an API like this though. > > > >>>>>> > > > >>>>>> The actual scanout time should be known from the page-flip handler so > > > >>>>>> the semantics for VRR on/off remain the same. This is because the > > > >>>>>> page-flip triggers entering the back porch if we're in the extended > > > >>>>>> front porch. > > > >>>>>> > > > >>>>>> But scanout time from vblank events for something like > > > >>>>>> DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only > > > >>>>>> treated as estimates. If we're in the regular front porch then the > > > >>>>>> timing to scanout is based on the fixed duration front porch for the > > > >>>>>> current mode. If we're in the extended back porch then it's technically > > > >>>>>> driver defined but the most reasonable guess is to assume that the front > > > >>>>>> porch is going to end at any moment, so just return the length of the > > > >>>>>> back porch for getting the scanout time. > > > >>>>>> > > > >>>>>> Proposing the late timestamp shouldn't affect vblank event in the > > > >>>>>> DRM_IOCTL_WAIT_VBLANK case and should only be used in the page-flip > > > >>>>>> event case. I'm not sure if that's what's guaranteed to happen with this > > > >>>>>> patch though. There doesn't seem to be any locking on either > > > >>>>>> dev->vblank_time_lock or the vblank->seqlock so while it's likely to get > > > >>>>>> the same vblank event back as the one just stored I don't think it's > > > >>>>>> guaranteed. > > > >>>>> > > > >>>>> That's the inconsistency I mean to highlight - the timestamp for the > > > >>>>> same frame as observed through flip complete and through the > > > >>>>> wait_vblank ioctl can differ. Which they really shouldn't. > > > >>>>> > > > >>>> > > > >>>> Ideally they shouldn't differ. The kernel docs for drm_crtc_state say > > > >>>> that vblank and pageflip timestamps should always match. But then the > > > >>>> kernel docs for "Variable refresh properties" in drm_connector.c for > > > >>>> vblank timestamps were changed for the VRR implementation in Linux > > > >>>> 5.0-rc to redefine them when in VRR mode. They are defined, but > > > >>>> probably rather useless for any practical purpose, like this: > > > >>>> > > > >>>> "The semantics for the vertical blank timestamp differ when > > > >>>> variable refresh rate is active. The vertical blank timestamp > > > >>>> is defined to be an estimate using the current mode's fixed > > > >>>> refresh rate timings. The semantics for the page-flip event > > > >>>> timestamp remain the same." > > > >>> > > > >>> Uh I missed that. That sounds like nonsense tbh. > > > >>> > > > >>>> So our docs contradict each other as of Linux 5.0-rc. Certainly having > > > >>>> useful vblank timetamps would be useful. > > > >>> > > > >>> Yup, imo vblank should still match page_flip. Otherwise I expect a lot of > > > >>> hilarity will ensue. > > > >> > > > >> I would imagine you would see more breakage by changing userspace > > > >> expectations for when the event is actually received. > > > >> > > > >> The IOCTL is "wait for vblank", but if we start sending the event near > > > >> the end of vblank it becomes more "wait for pageflip" (near scanout) or > > > >> "wait 15ms while the VRR front porch times out and causes flickering". > > > >> > > > >> I don't speak for all userspace but it seems more useful to me to have > > > >> the event sent at the start of vblank. > > > > > > > > Maybe we need to things. The problem is that the timestampe is defined > > > > to be correct for start-of-frame. And right now the amdgpu > > > > implementation of VRR doesn't even give an accurate timestampe for the > > > > flip, so most likely no one yet thought about this properly. > > > > Definitely no testcases (or it would have been caught), whereas we > > > > have tons of testcases for fixed refresh. > > > > > > I agree that it would be good to have documentation for what waiting for > > > vblank is supposed to mean here. There's a big difference between > > > getting the event at the start of vsync vs the start of the back porch > > > near the end of it. > > > > > > > > > > >> VRR in its current state is useful for applications that aren't > > > >> completely timing sensitive (like most games). This is because the > > > >> driver has free reign in restricting the min/max bounds of the range - > > > >> allowing for enhancements like low framerate compensation / BTR. > > > >> > > > >> For a concrete example, imagine you're not doing frontbuffer rendering > > > >> but you're still trying to target a specific scanout time. So even if we > > > >> were to send the vblank event at vpos 0 you'd get an accurate scanout > > > >> timestamp for the next scanout after the vback porch ends. You're not > > > >> going to be doing any rendering or preparation in this time because it's > > > >> simply too short. So you'd target the scanout period after this one, > > > >> which you'd need to calculate somehow in userspace. Taking the deltas > > > >> between two vblank events is completely useless here because you'll get > > > >> back the vrr min refresh delta. So you can take your source content rate > > > >> or some fixed rate you'd like to target within the range but even then > > > >> you're still not guaranteed to have that flip end up at that scanout time. > > > >> > > > >> The current solution is to just return the scanout pos at vrr max, or > > > >> the current mode's fixed refresh rate. If you're flip before the fixed > > > >> duration front porch ends then that scanout time will likely be accurate > > > >> since you can't flip faster than vrr max. > > > >> > > > >> Is there actually userspace that cares that these timestamps need to > > > >> match? Like said below, it seems that wait for vblank is mostly about > > > >> scheduling and for that I think having the event sent back at the start > > > >> of vblank is more important here. > > > > > > > > There's 3 cases: > > > > - DRI3: Waits for vblank for frontbuffer blits. If you combine that > > > > with VRR then imo you deserve to keep all the pieces. > > > > - generic compositors. Use VRR for scheduling, expect that a page_flip > > > > queued immediately after a page_flip will hit the next frame, and not > > > > the current frame. They also expect that this will not give you the > > > > slowest possible refresh rate, but I guess VRR plus frame scheduling > > > > is currently out of scope. It's kinda just for as-fast-as-possible > > > > games. > > > > - flip timestamp. Currently the entire codebase assumes that vblank ts > > > > = flip ts. I don't really see a point in breaking that. > > > > - better scheduling of VRR: what compositors actually do isn't wait > > > > for the vblank, that only latches the page_flip. They set up a > > > > hrtimer, with a fixed/auto-tuned head-start, from which they start > > > > their rendering. The same could be done for VRR, if we expose the > > > > earliest/latest vblank end times (and how much they can shift). vblank > > > > ioctl is only used for syncing that up and making sure we're hitting > > > > the right frame. This is all for pageflipping compositors. This is > > > > also what they'll probably keep doing when we give them a desired flip > > > > time (instead of a target frame), they still need an hrtimer to fire a > > > > little bit ahead to get the frame rendered. > > > > > > > > So taking all together, we have 2 use-cases I think: > > > > - frontbuffer rendering. Kinda wants vblank event to fire at the > > > > beginning of vblank. Won't ever do VRR > > > > - pageflipping compositors. Don't care about the vblank event, either > > > > they have an hrtimer timer (which gets synced with the vblank ioctl), > > > > or just flip as fast as possible. They don't care one bit when the > > > > vblank event happens, as long as a pageflip scheduled immediately > > > > after the event shows up only hits the next frame (which is the delay > > > > code that currently stretches vblank unecessarily with VRR if you're > > > > already a bit late). > > > > > > > > Making sure that vblank and page flip timestamp match the real > > > > userspace seems like the cleanest semantics, and the least surprising > > > > to userspace. E.g. what happens if you implement VRR in a compositor, > > > > but then tune your hrtimer within the vblank, when the vblank > > > > timestamp is totally not reflecting your real pageflip? I just don't > > > > see what's the upside of having a vblank timestamp that's not actually > > > > reflecting the real flip. > > > > > > > > Also, if your app wants the start of vblank, that's very easy to > > > > compute: Just add the scanout time to the last vblank timestamp. > > > > Figuring out the start of frame otoh isn't doable if you can't query, > > > > and constantly pageflipping just to have an accurate start-of-frame > > > > (for tuning your hrtimer) seems rather wasteful - it outright defeats > > > > some of the use-cases for VRR. > > > > > > > > So what benefit do you see in having vblank ts and pageflip ts not > > > > agree? What's the use-case this solves - aside from "less typing" :-) > > > > If the only use-case is VRR for frontbuffer rendering, then that > > > > doesn't even work right now (aside from that I think it's something we > > > > shouldn't support and definitely not encourage). > > > > -Daniel > > > > > > I'm actually fine with moving the vblank event back to the back porch > > > but I'd just be worried about userspace breakages that relied on it > > > happening the other way. > > > > > > It'd be especially confusing if the VLINE was different for fixed vs > > > variable refresh. I wouldn't want to move it to vpos 0 unless the event > > > is also getting moved for fixed refresh as well. > > > > That would break DRI3. So we can pick between two options: > > - always send out the vblank early, accept that vblank timestamps change. > > - for VRR only, send out the vblank only when we know the correct > > timestamp. For non-VRR we have to send out the vblank early, because > > that's somewhat become api (but not all drivers do it that way, so > > it's about as hopeless as frontbuffer rendering in general). > > > > I still don't see the benefit of the first option, or the exact use case. > > > > Still haven't slept so i don't know if i'm adding anything to the > discussion that Daniel hasn't stated already, but anyway: > > I think i agree with Daniel on this. Having a vblank ts that truly > reflects the end of last vblank/start of scanout should make > time-based scheduling easier. Then we know when scanout of the active > frame starts and calculate from known fixed scanout duration when next > vblank starts and when it ends the latest. On X11 you can > glXGetSyncValuesOML() query that start time to sync up a compositors > update cycle or start of an animation or timed presentation sequence > with vblank. E.g., Waylands Weston does this when switching from idle > desktop to redraw loop. A movie player or my own software could use > this info in the same way, whereas i don't see the current vblank > timestams being useful. If you get the current vblank event with > timestamp at start of vblank ie. a bit into the short fixed duration > front porch, you'd have to render and flip within what small amount of > time is left unti the end of the fixed front-porch, because once you > get into variable extended front-porch your vblank timestamp is > already meaningless. > > I also think if we want to implement something like the ability to > flip for a specific target scanout time -- which i'd really like to > have -- having the true end of last vblank gives us a baseline > timestamp that is easier to reason about inside the kernel, similar to > how a compositor can reason/adjust to when to start composing the next > frame or submitting a flip to hit some consistent framerate, or in > turn to better obey a presentation request of a client for a specific > target time. > > At the same time we need to keep current behaviour for non-VRR mode, > with early vblanks, so all the target_msc vblank count based > scheduling keeps working. I don't think clients that are written to > use VRR should expect exactly the same timing behaviour wrt. when > vblank events get sent or timestamps updated. I certainly expect to > have separate code-paths in my app for optimal VRR vs. non-VRR > presentation scheduling, a compositor or movie player might adapt in > the same way, and fullscreen apps can opt-out of VRR if they want. > > Clients just need to know what the behavior is in VRR mode. I wonder > what we should do for the kernel docs for Linux 5.0-rc? I assume we > won't have the time to fix this all before release, would it make > sense to have some kernel-docs updates via drm-fixes that declare the > vblank/pageflip timestamps as "undefined"/"not to be relied > on"/"subject to future change" for this release - essentially declare > vblank timestamps as unstable api for the moment, so nobody relies on > the current quite broken timestamp behaviour? > > I do now have hacked up proof-of-concept patches here that after > torturing them for a few hours with my own tests now seem to be able > to calculate correct pageflip timestamps regardless which of the two > options we choose, but they need refinement/cleanup/rebasing onto > somethign else than Linux 5.0-rc5, and help from Nicholas, because i > tend to get lost in all the indirections and file hierarchy inside the > DisplayCore code etc., and probably also a few kinks worked out. And > see if shifting the vblank irq handling doesn't cause other weird > interactions? I haven't looked how Nicholas below-the-range/framerate > compensation works in detail and if it matter that it happens at start > of front-porch. I can only say that it doesn't seem to interfere with > correct timestamping if i run with framerates intentionally below > vrr-min, as far as my limited testing goes. Another thing that may be > problematic is disabling/reenabling vblank during idle times, given > that we can only reliably timestamp from within pageflip irq handler > or vblank irq ~ back-porch/active scanout. Maybe we need to > drm_crtc_vblank_get() hold a reference while VRR is active to simplify > things for the moment. I think a solid testcase would be really good. Something that uses an hrtimer to sweept across the entire vblank, from 0 until it notices that the kernel doesn't increase the vblank any more (or 1 second or something, we want to stop eventually) should hit all the corner cases, and then making sure all the timestamps we get back on the sw side are at least self-consistent. For non-VRR we have this in igt with kms_flip, and it tends to find lots of bugs. I think it'd be great if we could have something like this for VRR too, once we've agreed on the exact semantics. Plus docs patch, bugfixes and everything ofc. > One thing one could merge is that other VRR friendly throttling patch > i sent, given that that works and is independent of the timestamp > stuff. I also have a backport for Linux 5.0-rc if that makes sense? > That VRR throttling also needs to get reworked if/once we wanted to > emit matching vblank/pflip timestamps, as that doesn't like the > shifted point inside vblank when vblank counter gets updated. > > Basically i don't know when the deadlines for merging this kind of > work are/were wrt. Linux 5.0, 5.1, 5.2? Given how much we're still discussing, and how much needs to be done, I'd say 5.2. And we shrug 5.0 and 5.1 simply off as "welp, broken kernel for VRR if you care about timestamps" if backporting the final fix is too much. > I'll look more into cleaning/testing those patches tomorrow after lots > of sleep and then see what Nicholas thinks of them. > > One problem with the VRR stuff is that the hardware testing equipment, > that i'd usually use to verify if timestamps are really correct and > accurate, doesn't work with DisplayPort or HDMI signals, so i can only > test for self-consistency of timestamps atm., and look if plots of > timestamps look how i would expect them to look if stuff is mostly > correct. Yeah I think self-consistency is already really useful. As mentioned above, kms_flip (which only tests self-consistency) catches tons of stuff. -Daniel > > -mario > > > > But like you said, the only thing I can think of that should need the > > > event right away is low latency / framebuffer rendering. > > > > > > I can think at least one potential use case where you would want to do > > > front buffer rendering for low latency with VRR - emulation of old > > > console hardware. There's not really any modern display that has fixed > > > refresh modes that match the render frequency of some consoles. But this > > > use case isn't out there in the wild right now at least. > > > > Even on an emulator I'd expect it'd just do regular flips to the > > frontbuffer. Especially on mobile there's really no such thing as > > frontbuffer rendering, you always do a flip. The DIRTYFB ioctl (which > > you need to call instead of doing a page flip) simply does a pageflip > > behind your back for you. Which I think is worse than actually > > flipping. > > > > And for emulating exact refresh rates your then again go back to > > hrtimer + issuing the flip approach, but using your desired refresh > > rate to tune the hrtimer instead of the fixed one. But that needs a > > correct vblank timestamp for the current frame to work correctly (or > > you force compositor has to always flip, which defeats the power > > saving aspects of VRR allowing to stretch refresh rates when nothing > > changes). > > > > > Regarding compositors: I don't really recommend VRR for this use case in > > > general since you'll want a fixed refresh rate frequency for input to > > > feel consistent and to avoid any flickering / luminance changes. The > > > fastest refresh rate will be the most responsive to the user and you can > > > just used the fixed refresh rate mode in this case. Lowering the refresh > > > rate to save power sounds nice in theory but I'd imagine it'd feel > > > terrible in practice. > > > > Why is a compositor any different from a game? You aim to render at > > 60fps, or whatever the refresh rate is. If you're one frame late it's > > better to stretch the vblank slightly than a full on 1-frame freeze. > > And on mobile it's unfortunately fairly common that the system fails > > to clock up quickly enough when starting from an idle state, VRR > > sounds like really useful here. > > > > But like in a game if you don't at least aim to render at a consistent > > fps, and handle your input at a consistent fps, then the end result > > will indeed be terrible. > > -Daniel > > > > > Nicholas Kazlauskas > > > > > > > > > > >>>>> Now added complication is that amdgpu sends out vblank events really > > > >>>>> early, which is used by userspace to do frontbuffer rendering in the > > > >>>>> vblank time. But I don't think anyone wants to do both VRR and > > > >>>> > > > >>>> I think all kms drivers try to call drm_crtc_handle_vblank() at start > > > >>>> of vblank to give Mesa the most time for frontbuffer rendering for > > > >>>> classic X. But vblank events are also used for scheduling bufferswaps > > > >>>> or other stuff for redirected windowed rendering, or via api's like > > > >>>> OML_sync_controls glXWaitForMscOML, so there might be other things > > > >>>> affected by a more delayed vblank handling. > > > >>> > > > >>> The frontbuffer rendering is very much X driver specific, and I think > > > >>> -amdgpu/radeon is the only one that requires this. No i915 driver ever > > > >>> used the vblank interrupt to schedule frontbuffer blits, we use some > > > >>> CS-side stalls. > > > >>> > > > >>> Wrt scheduling pageflips: The rule is that right after you've received the > > > >>> vblank for frame X, then an immediately schedule pageflip should hit X+1, > > > >>> but not X. amdgpu had this broken, but it's fixed since a while. > > > >>> > > > >>>>> frontbuffer rendering, hence I think it should be possible to create > > > >>>>> correct vblank timestamps for the VRR case, while leaving the current > > > >>>>> logic in place for everything else. But that means moving the entire > > > >>>>> vblank processing to where you know the next frame's start time for > > > >>>>> VRR, both for page flips and for all other vblank events. > > > >>>>> -Daniel > > > >>>>> > > > >>>> > > > >>>> I think for amdgpu it would be doable by calling drm_handle_vblank > > > >>>> from the pageflip irq handler in VRR mode, and then additionally from > > > >>>> the vblank interrupt handler like now. Vblank irq's are triggered via > > > >>>> vline interrupts with programmable vertical trigger line position, so > > > >>>> we could program the trigger line to be start of back-porch instead of > > > >>>> start of front-porch. In the pageflip case we do vblank handling + > > > >>>> timestamping + pflip completion from pflip irq, and have the regular > > > >>>> back-porch vblank handling for refresh cycles in which no > > > >>>> pageflip/pflip irq happened. > > > >>> > > > >>> Hm, can't we simply program the vblank to occur in the back porch region > > > >>> for VRR? The vblank timestamping should automatically correct the > > > >>> timestamp in that case. Would require a lot less code changes. > > > >>> > > > >>> Plus update the documentation ofc. > > > >>> > > > >>>> Not sure if/how that would mess with below-vrr-refresh-rate-range > > > >>>> support for stutter reduction at low fps though? Or with performance, > > > >>>> given vblank event dispatch could be delayed a lot in VRR compared to > > > >>>> normal mode? Or with all the vblank accounting during modesets or crtc > > > >>>> en/disable? > > > >> > > > >> I think we have extra VLINEs that can be programmed at vpos 0 that could > > > >> resolve this but I'm still not really sold on the idea. > > > >> > > > >>> > > > >>> Currently VRR and explicit frame scheduling don't mix. We've discussed > > > >>> this, and consens seems to be that we need a new property for page_flip, > > > >>> similar to the target frame, but instead of frames a timestamp. Then the > > > >>> driver can hit a specific time for the next frame. > > > >>> > > > >>> And since compositors use the vblank ioctl just to do frame scheduling, I > > > >>> don't think we have to worry hugely about interactions there. Much better > > > >>> to aim for clean semantics (i.e. vblank and flip complete timestamps > > > >>> better match). > > > >>> > > > >>>> Also, could other drivers like Intel easily do this delayed vblank > > > >>>> processing in the non-pageflip case? > > > >>> > > > >>> Intel's problem :-) But yeah we have page flip timestamp registers, so > > > >>> worst case all we need to do is shuffle code around a bit. And we have > > > >>> interrupts for both start of vblank and and of vblank. It might be a bit > > > >>> more work because we use the vblank interrupt to avoid races in atomic > > > >>> updates, but that's not too terrible to fix. > > > >>> > > > >>>> From my user perspective as a developer of a neuroscience research > > > >>>> application that has a couple of exciting/important use cases for VRR > > > >>>> mode, i absolutely do need trustworthy and precise pageflip event > > > >>>> timestamps that represent reality, otherwise VRR mode would be less > > > >>>> than useless for me. I can do without meaningful vblank timestamps in > > > >>>> VRR mode though, and expected to do without them, as any scheduling > > > >>>> has to happen time-based instead of based on vblank counts anyway, > > > >>>> given that vblank counts are quite meaningless if every frame can have > > > >>>> a different duration. I'd expect that situation to be similar for > > > >>>> other apps that would want to use timestamps in VRR mode like video > > > >>>> games, movie players or VR/AR applications. > > > >>> > > > >>> Yeah for that you want to schedule your frames with a target frame number. > > > >>> Would still be nice if vblank timestamps wouldn't be an outright lie (and > > > >>> the timestamp is still useful information, even if the frame counter > > > >>> isn't - many displays have min/max frame rates for VRR, which is > > > >>> information we could perhaps expose). > > > >>> -Daniel > > > > >>>> > > > >>>> -mario > > > >>>> > > > >>>>>> > > > >>>>>> Nicholas Kazlauskas > > > >>>>>> > > > >>>>>>> > > > >>>>>>>> --- > > > >>>>>>>> drivers/gpu/drm/drm_vblank.c | 49 +++++++++++++++++++++++++++++++++++- > > > >>>>>>>> include/drm/drm_vblank.h | 8 ++++++ > > > >>>>>>>> 2 files changed, 56 insertions(+), 1 deletion(-) > > > >>>>>>>> > > > >>>>>>>> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c > > > >>>>>>>> index 98e091175921..4b3a4c38fabe 100644 > > > >>>>>>>> --- a/drivers/gpu/drm/drm_vblank.c > > > >>>>>>>> +++ b/drivers/gpu/drm/drm_vblank.c > > > >>>>>>>> @@ -814,10 +814,21 @@ static void send_vblank_event(struct drm_device *dev, > > > >>>>>>>> u64 seq, ktime_t now) > > > >>>>>>>> { > > > >>>>>>>> struct timespec64 tv; > > > >>>>>>>> + ktime_t alt_flip_time; > > > >>>>>>>> > > > >>>>>>>> switch (e->event.base.type) { > > > >>>>>>>> - case DRM_EVENT_VBLANK: > > > >>>>>>>> case DRM_EVENT_FLIP_COMPLETE: > > > >>>>>>>> + /* > > > >>>>>>>> + * For flip completion events, override "now" time > > > >>>>>>>> + * with alt_flip_time provided by the driver via > > > >>>>>>>> + * drm_crtc_set_vrr_pageflip_timestamp() in VRR mode > > > >>>>>>>> + * if that time is later than given "now" vblank time. > > > >>>>>>>> + */ > > > >>>>>>>> + alt_flip_time = dev->vblank[e->pipe].alt_flip_time; > > > >>>>>>>> + if (alt_flip_time > now) > > > >>>>>>>> + now = alt_flip_time; > > > >>>>>>>> + /* Fallthrough */ > > > >>>>>>>> + case DRM_EVENT_VBLANK: > > > >>>>>>>> tv = ktime_to_timespec64(now); > > > >>>>>>>> e->event.vbl.sequence = seq; > > > >>>>>>>> /* > > > >>>>>>>> @@ -916,11 +927,47 @@ void drm_crtc_send_vblank_event(struct drm_crtc *crtc, > > > >>>>>>>> > > > >>>>>>>> now = ktime_get(); > > > >>>>>>>> } > > > >>>>>>>> + > > > >>>>>>>> e->pipe = pipe; > > > >>>>>>>> send_vblank_event(dev, e, seq, now); > > > >>>>>>>> } > > > >>>>>>>> EXPORT_SYMBOL(drm_crtc_send_vblank_event); > > > >>>>>>>> > > > >>>>>>>> +/** > > > >>>>>>>> + * drm_crtc_set_vrr_pageflip_timestamp - helper to set alternate pageflip time > > > >>>>>>>> + * @crtc: the source CRTC of the pageflip completion event > > > >>>>>>>> + * @flip_time: The alternate pageflip completion timestamp in VRR mode > > > >>>>>>>> + * > > > >>>>>>>> + * In variable refresh rate mode (VRR), a pageflip completion timestamp carried > > > >>>>>>>> + * by the pageflip event can never be earlier than the vblank timestamp of the > > > >>>>>>>> + * vblank of flip completion, as that vblank timestamp defines the end of the > > > >>>>>>>> + * shortest possible vblank duration. In case of a delayed flip completion > > > >>>>>>>> + * inside the extended VRR front porch however, the end of vblank can be much > > > >>>>>>>> + * later, so the driver must assign an estimated timestamp of that later end of > > > >>>>>>>> + * vblank. For a CRTC in VRR mode, the driver should use this helper function to > > > >>>>>>>> + * set an alternate flip completion timestamp in case of late flip completions > > > >>>>>>>> + * in extended vblank. In the most simple case, this @flip_time timestamp could > > > >>>>>>>> + * simply be a ktime_get() timestamp taken at the start of the pageflip > > > >>>>>>>> + * completion routine, with some constant duration of the back porch interval > > > >>>>>>>> + * added, although more precise estimates may be possible on some hardware if > > > >>>>>>>> + * the hardware provides some means of timestamping the true end of vblank. > > > >>>>>>>> + * > > > >>>>>>>> + * When sending out pageflip events, e.g., via drm_crtc_send_vblank_event(), it > > > >>>>>>>> + * will use either the standard vblank timestamp, calculated for a minimum > > > >>>>>>>> + * duration vblank, or the provided @flip_time if that time is later than the > > > >>>>>>>> + * vblank timestamp, to get the best possible estimate of start of display of > > > >>>>>>>> + * the new post-pageflip scanout buffer. > > > >>>>>>>> + */ > > > >>>>>>>> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, > > > >>>>>>>> + ktime_t flip_time) > > > >>>>>>>> +{ > > > >>>>>>>> + struct drm_device *dev = crtc->dev; > > > >>>>>>>> + struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)]; > > > >>>>>>>> + > > > >>>>>>>> + vblank->alt_flip_time = flip_time; > > > >>>>>>>> +} > > > >>>>>>>> +EXPORT_SYMBOL(drm_crtc_set_vrr_pageflip_timestamp); > > > >>>>>>>> + > > > >>>>>>>> static int __enable_vblank(struct drm_device *dev, unsigned int pipe) > > > >>>>>>>> { > > > >>>>>>>> if (drm_core_check_feature(dev, DRIVER_MODESET)) { > > > >>>>>>>> diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h > > > >>>>>>>> index 6ad9630d4f48..aacf44694ab6 100644 > > > >>>>>>>> --- a/include/drm/drm_vblank.h > > > >>>>>>>> +++ b/include/drm/drm_vblank.h > > > >>>>>>>> @@ -117,6 +117,12 @@ struct drm_vblank_crtc { > > > >>>>>>>> * @time: Vblank timestamp corresponding to @count. > > > >>>>>>>> */ > > > >>>>>>>> ktime_t time; > > > >>>>>>>> + /** > > > >>>>>>>> + * @alt_flip_time: Vblank timestamp for end of extended vblank due to > > > >>>>>>>> + * a late pageflip completion in variable refresh rate mode. Pageflip > > > >>>>>>>> + * events will carry the later one of @time and @alt_flip_time. > > > >>>>>>>> + */ > > > >>>>>>>> + ktime_t alt_flip_time; > > > >>>>>>>> > > > >>>>>>>> /** > > > >>>>>>>> * @refcount: Number of users/waiters of the vblank interrupt. Only when > > > >>>>>>>> @@ -179,6 +185,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs); > > > >>>>>>>> u64 drm_crtc_vblank_count(struct drm_crtc *crtc); > > > >>>>>>>> u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc, > > > >>>>>>>> ktime_t *vblanktime); > > > >>>>>>>> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, > > > >>>>>>>> + ktime_t flip_time); > > > >>>>>>>> void drm_crtc_send_vblank_event(struct drm_crtc *crtc, > > > >>>>>>>> struct drm_pending_vblank_event *e); > > > >>>>>>>> void drm_crtc_arm_vblank_event(struct drm_crtc *crtc, > > > >>>>>>>> -- > > > >>>>>>>> 2.17.1 > > > >>>>>>>> > > > >>>>>>>> _______________________________________________ > > > >>>>>>>> dri-devel mailing listjj > > > >>>>>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel > > > >>>>>>> > > > >>>>>> > > > >>>>> > > > >>>>> > > > >>>>> -- > > > >>>>> Daniel Vetter > > > >>>>> Software Engineer, Intel Corporation > > > >>>>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch > > > >>> > > > >> > > > >> Nicholas Kazlauskas > > > >> > > > > > > > > > > > > > > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c index 98e091175921..4b3a4c38fabe 100644 --- a/drivers/gpu/drm/drm_vblank.c +++ b/drivers/gpu/drm/drm_vblank.c @@ -814,10 +814,21 @@ static void send_vblank_event(struct drm_device *dev, u64 seq, ktime_t now) { struct timespec64 tv; + ktime_t alt_flip_time; switch (e->event.base.type) { - case DRM_EVENT_VBLANK: case DRM_EVENT_FLIP_COMPLETE: + /* + * For flip completion events, override "now" time + * with alt_flip_time provided by the driver via + * drm_crtc_set_vrr_pageflip_timestamp() in VRR mode + * if that time is later than given "now" vblank time. + */ + alt_flip_time = dev->vblank[e->pipe].alt_flip_time; + if (alt_flip_time > now) + now = alt_flip_time; + /* Fallthrough */ + case DRM_EVENT_VBLANK: tv = ktime_to_timespec64(now); e->event.vbl.sequence = seq; /* @@ -916,11 +927,47 @@ void drm_crtc_send_vblank_event(struct drm_crtc *crtc, now = ktime_get(); } + e->pipe = pipe; send_vblank_event(dev, e, seq, now); } EXPORT_SYMBOL(drm_crtc_send_vblank_event); +/** + * drm_crtc_set_vrr_pageflip_timestamp - helper to set alternate pageflip time + * @crtc: the source CRTC of the pageflip completion event + * @flip_time: The alternate pageflip completion timestamp in VRR mode + * + * In variable refresh rate mode (VRR), a pageflip completion timestamp carried + * by the pageflip event can never be earlier than the vblank timestamp of the + * vblank of flip completion, as that vblank timestamp defines the end of the + * shortest possible vblank duration. In case of a delayed flip completion + * inside the extended VRR front porch however, the end of vblank can be much + * later, so the driver must assign an estimated timestamp of that later end of + * vblank. For a CRTC in VRR mode, the driver should use this helper function to + * set an alternate flip completion timestamp in case of late flip completions + * in extended vblank. In the most simple case, this @flip_time timestamp could + * simply be a ktime_get() timestamp taken at the start of the pageflip + * completion routine, with some constant duration of the back porch interval + * added, although more precise estimates may be possible on some hardware if + * the hardware provides some means of timestamping the true end of vblank. + * + * When sending out pageflip events, e.g., via drm_crtc_send_vblank_event(), it + * will use either the standard vblank timestamp, calculated for a minimum + * duration vblank, or the provided @flip_time if that time is later than the + * vblank timestamp, to get the best possible estimate of start of display of + * the new post-pageflip scanout buffer. + */ +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, + ktime_t flip_time) +{ + struct drm_device *dev = crtc->dev; + struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)]; + + vblank->alt_flip_time = flip_time; +} +EXPORT_SYMBOL(drm_crtc_set_vrr_pageflip_timestamp); + static int __enable_vblank(struct drm_device *dev, unsigned int pipe) { if (drm_core_check_feature(dev, DRIVER_MODESET)) { diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h index 6ad9630d4f48..aacf44694ab6 100644 --- a/include/drm/drm_vblank.h +++ b/include/drm/drm_vblank.h @@ -117,6 +117,12 @@ struct drm_vblank_crtc { * @time: Vblank timestamp corresponding to @count. */ ktime_t time; + /** + * @alt_flip_time: Vblank timestamp for end of extended vblank due to + * a late pageflip completion in variable refresh rate mode. Pageflip + * events will carry the later one of @time and @alt_flip_time. + */ + ktime_t alt_flip_time; /** * @refcount: Number of users/waiters of the vblank interrupt. Only when @@ -179,6 +185,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs); u64 drm_crtc_vblank_count(struct drm_crtc *crtc); u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc, ktime_t *vblanktime); +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc, + ktime_t flip_time); void drm_crtc_send_vblank_event(struct drm_crtc *crtc, struct drm_pending_vblank_event *e); void drm_crtc_arm_vblank_event(struct drm_crtc *crtc,
The pageflip completion timestamps transmitted to userspace via pageflip completion events are supposed to describe the time at which the first pixel of the new post-pageflip scanout buffer leaves the video output of the gpu. This time is identical to end of vblank, when active scanout starts. For a crtc in standard fixed refresh rate, the end of vblank is identical to the vblank timestamps calculated by drm_update_vblank_count() at each vblank interrupt, or each vblank dis-/enable. Therefore pageflip events just carry that vblank timestamp as their pageflip timestamp. For a crtc switched to variable refresh rate mode (vrr), the pageflip completion timestamps are identical to the vblank timestamps iff the pageflip was executed early in vblank, before the minimum vblank duration elapsed. In this case the time of display onset is identical to when the crtc is running in fixed refresh rate. However, if a pageflip completes later in the vblank, inside the "extended front porch" in vrr mode, then the vblank will terminate at a fixed (back porch) duration after flip, so the display onset time is delayed correspondingly. In this case the vblank timestamp computed at vblank irq time would be too early, and we need a way to calculate an estimated pageflip timestamp that will be later than the vblank timestamp. How a driver determines such a "late flip" timestamp is hw and driver specific, but this patch adds a new helper function that allows the driver to propose such an alternate "late flip" timestamp for use in pageflip events: drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); When sending out pageflip events, we now compare that proposed flip_timestamp against the vblank timestamp of the current vblank of flip completion and choose to send out the greater/ later timestamp as flip completion timestamp. The most simple way for a kms driver to supply a suitable flip_timestamp in vrr mode would be to simply take a timestamp at start of the pageflip completion handler, e.g., pageflip irq handler: flip_timestamp = ktime_get(); and then set that as proposed "late" alternative timestamp via ... drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp); More clever approaches could try to add some corrective offset for fixed back porch duration, or ideally use hardware features like hw timestamps to calculate the exact end time of vblank. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> --- drivers/gpu/drm/drm_vblank.c | 49 +++++++++++++++++++++++++++++++++++- include/drm/drm_vblank.h | 8 ++++++ 2 files changed, 56 insertions(+), 1 deletion(-)