[3/6] drm/i915: reset eDP timestamps on resume

Message ID	1389989862-1904-1-git-send-email-przanoni@gmail.com (mailing list archive)
State	New, archived
Headers
Return-Path: <intel-gfx-bounces@lists.freedesktop.org>
From: Paulo Zanoni <przanoni@gmail.com>
To: intel-gfx@lists.freedesktop.org
Date: Fri, 17 Jan 2014 18:17:42 -0200
Message-Id: <1389989862-1904-1-git-send-email-przanoni@gmail.com>
In-Reply-To: <CA+gsUGSKkuBLrNXYo39Cn1XmMovN+R_7c3fzVNQcJBhT9D7-5g@mail.gmail.com>
References: <CA+gsUGSKkuBLrNXYo39Cn1XmMovN+R_7c3fzVNQcJBhT9D7-5g@mail.gmail.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Subject: [Intel-gfx] [PATCH 3/6] drm/i915: reset eDP timestamps on resume
Precedence: list
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: intel-gfx-bounces@lists.freedesktop.org
Errors-To: intel-gfx-bounces@lists.freedesktop.org

Paulo Zanoni Jan. 17, 2014, 8:17 p.m. UTC

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

The eDP code records a few timestamps containing the last time we took
some actions, because we need to wait before doing some other actions.
The problem is that if we store a timestamp when suspending and then
look at it when resuming, we'll ignore the unknown amount of time we
actually were suspended.

This happens with the panel power cycle delay: it's 500ms on my
machine, and it's delaying the resume sequence by 200ms due to a
timestamp we recorded before suspending. This patch should solve this
problem by resetting the timestamps.

v2: - Fix the mandatory jiffies/milliseconds bug.
v3: - We can use drm_connector->reset after Daniel's recent refactor.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/intel_dp.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)
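
The effect described in the commit message can be sketched in userspace C. This is only an illustration, not the code from intel_dp.c: the function name and the millisecond tick type are hypothetical, and the tick counter is assumed not to advance while the machine is suspended.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel tick counter, in milliseconds.
 * Assumption: it does not advance while the machine is suspended. */
typedef uint64_t tick_ms_t;

/* Milliseconds still to wait before the panel may be powered on again,
 * judged only by the tick counter. */
uint64_t remaining_wait_ms(tick_ms_t last_power_cycle, tick_ms_t now,
                           uint64_t delay_ms)
{
    uint64_t elapsed = now - last_power_cycle;

    return elapsed >= delay_ms ? 0 : delay_ms - elapsed;
}
```

With a 500ms power cycle delay and a timestamp taken 300 ticks-worth of milliseconds before suspend, `remaining_wait_ms(1000, 1300, 500)` still reports 200ms on resume, no matter how long the machine was actually asleep.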

Chris Wilson Jan. 17, 2014, 8:22 p.m. UTC | #1

On Fri, Jan 17, 2014 at 06:17:42PM -0200, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> 
> The eDP code records a few timestamps containing the last time we took
> some actions, because we need to wait before doing some other actions.
> The problem is that if we store a timestamp when suspending and then
> look at it when resuming, we'll ignore the unknown amount of time we
> actually were suspended.
> 
> This happens with the panel power cycle delay: it's 500ms on my
> machine, and it's delaying the resume sequence by 200ms due to a
> timestamp we recorded before suspending. This patch should solve this
> problem by resetting the timestamps.

But you don't explain why this is safe. The code nerfs the timeouts so
that they are ignored, yet the delays are independent. Should this be
based on realtime rather than jiffies?
-Chris

Paulo Zanoni Jan. 17, 2014, 9:11 p.m. UTC | #2

2014/1/17 Chris Wilson <chris@chris-wilson.co.uk>:
> On Fri, Jan 17, 2014 at 06:17:42PM -0200, Paulo Zanoni wrote:
>> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>>
>> The eDP code records a few timestamps containing the last time we took
>> some actions, because we need to wait before doing some other actions.
>> The problem is that if we store a timestamp when suspending and then
>> look at it when resuming, we'll ignore the unknown amount of time we
>> actually were suspended.
>>
>> This happens with the panel power cycle delay: it's 500ms on my
>> machine, and it's delaying the resume sequence by 200ms due to a
>> timestamp we recorded before suspending. This patch should solve this
>> problem by resetting the timestamps.
>
> But you don't explain why this is safe. The code nerfs the timeouts so
> that they are ignored, yet the delays are independent. Should this be
> based on realtime rather than jiffies?

I'm not sure I understand your question. What's the problem you see exactly?

Thanks,
Paulo

> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre

Chris Wilson Jan. 17, 2014, 9:21 p.m. UTC | #3

On Fri, Jan 17, 2014 at 07:11:14PM -0200, Paulo Zanoni wrote:
> 2014/1/17 Chris Wilson <chris@chris-wilson.co.uk>:
> > On Fri, Jan 17, 2014 at 06:17:42PM -0200, Paulo Zanoni wrote:
> >> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >>
> >> The eDP code records a few timestamps containing the last time we took
> >> some actions, because we need to wait before doing some other actions.
> >> The problem is that if we store a timestamp when suspending and then
> >> look at it when resuming, we'll ignore the unknown amount of time we
> >> actually were suspended.
> >>
> >> This happens with the panel power cycle delay: it's 500ms on my
> >> machine, and it's delaying the resume sequence by 200ms due to a
> >> timestamp we recorded before suspending. This patch should solve this
> >> problem by resetting the timestamps.
> >
> > But you don't explain why this is safe. The code nerfs the timeouts so
> > that they are ignored, yet the delays are independent. Should this be
> > based on realtime rather than jiffies?
> 
> I'm not sure I understand your question. What's the problem you see exactly?

Given the fast suspend & resume, we will not have waited the required
panel off time before poking it again etc. What makes that safe?
-Chris

Daniel Vetter Jan. 17, 2014, 9:34 p.m. UTC | #4

On Fri, Jan 17, 2014 at 10:21 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Fri, Jan 17, 2014 at 07:11:14PM -0200, Paulo Zanoni wrote:
>> 2014/1/17 Chris Wilson <chris@chris-wilson.co.uk>:
>> > On Fri, Jan 17, 2014 at 06:17:42PM -0200, Paulo Zanoni wrote:
>> >> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> >>
>> >> The eDP code records a few timestamps containing the last time we took
>> >> some actions, because we need to wait before doing some other actions.
>> >> The problem is that if we store a timestamp when suspending and then
>> >> look at it when resuming, we'll ignore the unknown amount of time we
>> >> actually were suspended.
>> >>
>> >> This happens with the panel power cycle delay: it's 500ms on my
>> >> machine, and it's delaying the resume sequence by 200ms due to a
>> >> timestamp we recorded before suspending. This patch should solve this
>> >> problem by resetting the timestamps.
>> >
>> > But you don't explain why this is safe. The code nerfs the timeouts so
>> > that they are ignored, yet the delays are independent. Should this be
>> > based on realtime rather than jiffies?
>>
>> I'm not sure I understand your question. What's the problem you see exactly?
>
> Given the fast suspend & resume, we will not have waited the required
> panel off time before poking it again etc. What makes that safe?

Even worse the kernel might abort the suspend due to some issue and
we'll immediately resume. Also, an immediate thaw operation after
freezing is how hibernate works. Iirc the hw always enforces the full
power off delay after a power reset for exactly this reason (at least
on current platforms afaik). But with the minimal delays in patch 6
that won't help any more, either.

I fear we need a bit more smarts here using a realtime clock source to
figure out whether sufficient time has actually elapsed :(
-Daniel
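
A rough sketch of the realtime idea, assuming wall-clock timestamps in milliseconds that keep advancing across suspend. The type and function names are hypothetical, and treating a backwards clock step as "nothing elapsed" is a design choice of this sketch, not something stated in the thread.

```c
#include <stdint.h>

/* Hypothetical wall-clock time in milliseconds. Unlike a tick counter,
 * it is assumed to keep advancing while the machine is suspended. */
typedef int64_t wall_ms_t;

/* Milliseconds the panel must still stay off, based on wall-clock time. */
int64_t panel_wait_ms(wall_ms_t powered_off_at, wall_ms_t now,
                      int64_t delay_ms)
{
    int64_t elapsed = now - powered_off_at;

    if (elapsed < 0)            /* clock was set backwards: wait in full */
        return delay_ms;
    if (elapsed >= delay_ms)    /* slept long enough: no wait needed */
        return 0;
    return delay_ms - elapsed;  /* aborted or very short suspend */
}
```

An aborted suspend that resumes 50ms after panel power-off still waits 450ms of a 500ms delay, while a resume after an hour waits nothing. A wall clock can be stepped by the user or by time synchronization, which is why the sketch falls back to the full delay when the elapsed time comes out negative.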

Paulo Zanoni Jan. 20, 2014, 3:47 p.m. UTC | #5

2014/1/17 Daniel Vetter <daniel@ffwll.ch>:
> On Fri, Jan 17, 2014 at 10:21 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>> On Fri, Jan 17, 2014 at 07:11:14PM -0200, Paulo Zanoni wrote:
>>> 2014/1/17 Chris Wilson <chris@chris-wilson.co.uk>:
>>> > On Fri, Jan 17, 2014 at 06:17:42PM -0200, Paulo Zanoni wrote:
>>> >> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>>> >>
>>> >> The eDP code records a few timestamps containing the last time we took
>>> >> some actions, because we need to wait before doing some other actions.
>>> >> The problem is that if we store a timestamp when suspending and then
>>> >> look at it when resuming, we'll ignore the unknown amount of time we
>>> >> actually were suspended.
>>> >>
>>> >> This happens with the panel power cycle delay: it's 500ms on my
>>> >> machine, and it's delaying the resume sequence by 200ms due to a
>>> >> timestamp we recorded before suspending. This patch should solve this
>>> >> problem by resetting the timestamps.
>>> >
>>> > But you don't explain why this is safe. The code nerfs the timeouts so
>>> > that they are ignored, yet the delays are independent. Should this be
>>> > based on realtime rather than jiffies?
>>>
>>> I'm not sure I understand your question. What's the problem you see exactly?
>>
>> Given the fast suspend & resume, we will not have waited the required
>> panel off time before poking it again etc. What makes that safe?
>
> Even worse the kernel might abort the suspend due to some issue and
> we'll immediately resume. Also, an immediate thaw operation after
> freezing is how hibernate works. Iirc the hw always enforces the full
> power off delay after a power reset for exactly this reason (at least
> on current platforms afaik).

Oh, now I get it. My bad. A brief experiment here shows that
do_gettimeofday or current_kernel_time can probably be used to fix the
problem. But since that requires properly getting all the timestamps
in the correct units, maybe rewriting some functions, I'll wait a few
days before I come back to this problem.

> But with the minimal delays in patch 6
> that won't help any more, either.

Patch 6 should be independent of this. This patch is just to save us
some time in the resume cases, but patch 6 is to correct the amount of
time we wait while we disable the panel. I don't see a reason to block
patch 6 on this one.

Thanks for the reviews,
Paulo

>
> I fear we need a bit more smarts here using a realtime clock source to
> figure out whether sufficient time has actually elapsed :(
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

Daniel Vetter Jan. 20, 2014, 4:10 p.m. UTC | #6

On Mon, Jan 20, 2014 at 01:47:51PM -0200, Paulo Zanoni wrote:
> 2014/1/17 Daniel Vetter <daniel@ffwll.ch>:
> > On Fri, Jan 17, 2014 at 10:21 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >> On Fri, Jan 17, 2014 at 07:11:14PM -0200, Paulo Zanoni wrote:
> >>> 2014/1/17 Chris Wilson <chris@chris-wilson.co.uk>:
> >>> > On Fri, Jan 17, 2014 at 06:17:42PM -0200, Paulo Zanoni wrote:
> >>> >> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >>> >>
> >>> >> The eDP code records a few timestamps containing the last time we took
> >>> >> some actions, because we need to wait before doing some other actions.
> >>> >> The problem is that if we store a timestamp when suspending and then
> >>> >> look at it when resuming, we'll ignore the unknown amount of time we
> >>> >> actually were suspended.
> >>> >>
> >>> >> This happens with the panel power cycle delay: it's 500ms on my
> >>> >> machine, and it's delaying the resume sequence by 200ms due to a
> >>> >> timestamp we recorded before suspending. This patch should solve this
> >>> >> problem by resetting the timestamps.
> >>> >
> >>> > But you don't explain why this is safe. The code nerfs the timeouts so
> >>> > that they are ignored, yet the delays are independent. Should this be
> >>> > based on realtime rather than jiffies?
> >>>
> >>> I'm not sure I understand your question. What's the problem you see exactly?
> >>
> >> Given the fast suspend & resume, we will not have waited the required
> >> panel off time before poking it again etc. What makes that safe?
> >
> > Even worse the kernel might abort the suspend due to some issue and
> > we'll immediately resume. Also, an immediate thaw operation after
> > freezing is how hibernate works. Iirc the hw always enforces the full
> > power off delay after a power reset for exactly this reason (at least
> > on current platforms afaik).
> 
> Oh, now I get it. My bad. A brief experiment here shows that
> do_gettimeofday or current_kernel_time can probably be used to fix the
> problem. But since that requires properly getting all the timestamps
> in the correct units, maybe rewriting some functions, I'll wait a few
> days before I come back to this problem.

A simpler solution might be to grab timestamps in our suspend/resume code,
both right at the end of our freeze function and at the beginning of thaw.
Then the driver code could convert this once into
dev->jiffies_elapsed_in_suspend or something and the dp->reset function
could simply use that to adjust the timeout values a bit (instead of
completely resetting everything).

The tricky bit should only be how to get the off-by-one stuff right: For
timeouts/waits we need to round up (+1), for this elapsed time we need to
round down one jiffy i.e. -1, but not less than 0 ofc ;-)

So I think you should be able to fix up your patch with very little work
(and no changes to the logic you've already created & tested).

> > But with the minimal delays in patch 6
> > that won't help any more, either.
> 
> Patch 6 should be independent of this. This patch is just to save us
> some time in the resume cases, but patch 6 is to correct the amount of
> time we wait while we disable the panel. I don't see a reason to block
> patch 6 on this one.

Hm yeah, with the new logic (but without the reset on resume) we should
still be safe. I'll merge it, together with my prep patch to move the
reset callbacks around a bit.
-Daniel
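
One possible reading of the rounding rules above, as a userspace sketch. The tick length, the function names, and the idea of moving the stored timestamp backwards are assumptions made for illustration; they are not taken from the patch.

```c
#include <stdint.h>

#define TICK_MS 10  /* assumed tick length (HZ=100); hypothetical */

/* Timeouts must never come out too short: round up, then add one tick. */
uint64_t timeout_ms_to_ticks(uint64_t ms)
{
    return (ms + TICK_MS - 1) / TICK_MS + 1;
}

/* Time credited as already elapsed must never come out too long:
 * round down, then subtract one tick, clamping at zero. */
uint64_t suspended_ms_to_ticks(uint64_t ms)
{
    uint64_t ticks = ms / TICK_MS;

    return ticks > 0 ? ticks - 1 : 0;
}

/* Move a recorded timestamp back by the ticks spent suspended, so that
 * "now - timestamp" also counts the time spent suspended. Unsigned
 * arithmetic keeps this well defined if the counter wraps. */
uint64_t adjust_timestamp(uint64_t timestamp, uint64_t ticks_in_suspend)
{
    return timestamp - ticks_in_suspend;
}
```

With these assumptions, a 500ms timeout becomes 51 ticks, while 305ms spent suspended is credited as only 29 ticks. Both errors lean towards waiting slightly longer than strictly necessary, never shorter.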
