| Message ID | 20171201172032.47357-3-seanpaul@chromium.org (mailing list archive) |
|---|---|
| State | New, archived |
Quoting Sean Paul (2017-12-01 17:20:24)
> /**
> - * _wait_for - magic (register) wait macro
> + * __wait_for - magic wait macro
>   *
> - * Does the right thing for modeset paths when run under kdgb or similar atomic
> - * contexts. Note that it's important that we check the condition again after
> + * Macro to help avoid open coding check/wait/timeout patterns, will do the
> + * right think wrt to choosing msleep vs usleep_range based on how long the wait
> + * interval is. Note that it's important that we check the condition again after
>   * having timed out, since the timeout could be due to preemption or similar and
>   * we've never had a chance to check the condition before the timeout.
>   */
> -#define _wait_for(COND, US, W) ({ \
> +#define __wait_for(OP, COND, US, W) ({ \
> 	unsigned long timeout__ = jiffies + usecs_to_jiffies(US) + 1; \
> 	int ret__; \
> 	might_sleep(); \
> 	for (;;) { \
> 		bool expired__ = time_after(jiffies, timeout__); \
> +		OP; \
> 		if (COND) { \
> 			ret__ = 0; \
> 			break; \
> @@ -62,11 +64,16 @@
> 			ret__ = -ETIMEDOUT; \
> 			break; \
> 		} \
> -		usleep_range((W), (W) * 2); \
> +		if (W > (20 * 1000)) \
> +			msleep(W / 1000); \
> +		else \
> +			usleep_range((W), (W) * 2); \

The current wait_for() is a little more complicated nowadays (variable
W).

Are ms intervals going to be that common? Using a state-machine springs
to mind, but you could argue that msleep() is just a yield. Using msleep
though is going to leave D processes visible and a bump in load :|

> 	} \
> 	ret__; \
> })
>
> +#define _wait_for(COND, US, W) __wait_for(;,(COND), US, W)
> +
> #define wait_for(COND, MS) _wait_for((COND), (MS) * 1000, 1000)
>
> /* If CONFIG_PREEMPT_COUNT is disabled, in_atomic() always reports false. */
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index b4621271e7a2..c851b0c0602d 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -1770,12 +1770,14 @@ int __intel_wait_for_register_fw(struct drm_i915_private *dev_priv,
> }
>
> /**
> - * intel_wait_for_register - wait until register matches expected state
> + * __intel_wait_for_register - wait until register matches expected state
>   * @dev_priv: the i915 device
>   * @reg: the register to read
>   * @mask: mask to apply to register value
>   * @value: expected value
> - * @timeout_ms: timeout in millisecond
> + * @fast_timeout_us: fast timeout in microsecond for atomic/tight wait
> + * @slow_timeout_ms: slow timeout in millisecond
> + * @out_value: optional placeholder to hold registry value
>   *
>   * This routine waits until the target register @reg contains the expected
>   * @value after applying the @mask, i.e. it waits until ::
> @@ -1786,15 +1788,18 @@ int __intel_wait_for_register_fw(struct drm_i915_private *dev_priv,
>   *
>   * Returns 0 if the register matches the desired condition, or -ETIMEOUT.
>   */
> -int intel_wait_for_register(struct drm_i915_private *dev_priv,
> +int __intel_wait_for_register(struct drm_i915_private *dev_priv,
> 			    i915_reg_t reg,
> 			    u32 mask,
> 			    u32 value,
> -			    unsigned int timeout_ms)
> +			    unsigned int fast_timeout_us,
> +			    unsigned int slow_timeout_ms,
> +			    u32 *out_value)
> {
> 	unsigned fw =
> 		intel_uncore_forcewake_for_reg(dev_priv, reg, FW_REG_READ);
> 	int ret;
> +	u32 reg_value;
>
> 	might_sleep();
>
> @@ -1803,14 +1808,18 @@ int intel_wait_for_register(struct drm_i915_private *dev_priv,
>
> 	ret = __intel_wait_for_register_fw(dev_priv,
> 					   reg, mask, value,
> -					   2, 0, NULL);
> +					   fast_timeout_us, 0, &reg_value);
>
> 	intel_uncore_forcewake_put__locked(dev_priv, fw);
> 	spin_unlock_irq(&dev_priv->uncore.lock);
>
> 	if (ret)
> -		ret = wait_for((I915_READ_NOTRACE(reg) & mask) == value,
> -			       timeout_ms);
> +		ret = __wait_for(reg_value = I915_READ_NOTRACE(reg),
> +				 (reg_value & mask) == value,
> +				 slow_timeout_ms * 1000, 1000);
> +
> +	if (out_value)
> +		*out_value = reg_value;

Looks good.
-Chris
On Fri, Dec 1, 2017 at 12:44 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> Quoting Sean Paul (2017-12-01 17:20:24)
>> /**
>> - * _wait_for - magic (register) wait macro
>> + * __wait_for - magic wait macro
>>   *
>> - * Does the right thing for modeset paths when run under kdgb or similar atomic
>> - * contexts. Note that it's important that we check the condition again after
>> + * Macro to help avoid open coding check/wait/timeout patterns, will do the
>> + * right think wrt to choosing msleep vs usleep_range based on how long the wait
>> + * interval is. Note that it's important that we check the condition again after
>>   * having timed out, since the timeout could be due to preemption or similar and
>>   * we've never had a chance to check the condition before the timeout.
>>   */
>> -#define _wait_for(COND, US, W) ({ \
>> +#define __wait_for(OP, COND, US, W) ({ \
>> 	unsigned long timeout__ = jiffies + usecs_to_jiffies(US) + 1; \
>> 	int ret__; \
>> 	might_sleep(); \
>> 	for (;;) { \
>> 		bool expired__ = time_after(jiffies, timeout__); \
>> +		OP; \
>> 		if (COND) { \
>> 			ret__ = 0; \
>> 			break; \
>> @@ -62,11 +64,16 @@
>> 			ret__ = -ETIMEDOUT; \
>> 			break; \
>> 		} \
>> -		usleep_range((W), (W) * 2); \
>> +		if (W > (20 * 1000)) \
>> +			msleep(W / 1000); \
>> +		else \
>> +			usleep_range((W), (W) * 2); \
>
> The current wait_for() is a little more complicated nowadays (variable
> W).
>

Hmm, am I based off the wrong tree? I'm using drm-intel-next.

> Are ms intervals going to be that common? Using a state-machine springs
> to mind, but you could argue that msleep() is just a yield. Using msleep
> though is going to leave D processes visible and a bump in load :|
>

Probably uncommon, but at the very least, I need one. I wouldn't feel
comfortable handling such a large wait using usleep_range.

Sean

>> 	} \
>> 	ret__; \
>> })
>>
>> +#define _wait_for(COND, US, W) __wait_for(;,(COND), US, W)
>> +
>> #define wait_for(COND, MS) _wait_for((COND), (MS) * 1000, 1000)
>>
>> /* If CONFIG_PREEMPT_COUNT is disabled, in_atomic() always reports false. */
>> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
>> index b4621271e7a2..c851b0c0602d 100644
>> --- a/drivers/gpu/drm/i915/intel_uncore.c
>> +++ b/drivers/gpu/drm/i915/intel_uncore.c
>> @@ -1770,12 +1770,14 @@ int __intel_wait_for_register_fw(struct drm_i915_private *dev_priv,
>> }
>>
>> /**
>> - * intel_wait_for_register - wait until register matches expected state
>> + * __intel_wait_for_register - wait until register matches expected state
>>   * @dev_priv: the i915 device
>>   * @reg: the register to read
>>   * @mask: mask to apply to register value
>>   * @value: expected value
>> - * @timeout_ms: timeout in millisecond
>> + * @fast_timeout_us: fast timeout in microsecond for atomic/tight wait
>> + * @slow_timeout_ms: slow timeout in millisecond
>> + * @out_value: optional placeholder to hold registry value
>>   *
>>   * This routine waits until the target register @reg contains the expected
>>   * @value after applying the @mask, i.e. it waits until ::
>> @@ -1786,15 +1788,18 @@ int __intel_wait_for_register_fw(struct drm_i915_private *dev_priv,
>>   *
>>   * Returns 0 if the register matches the desired condition, or -ETIMEOUT.
>>   */
>> -int intel_wait_for_register(struct drm_i915_private *dev_priv,
>> +int __intel_wait_for_register(struct drm_i915_private *dev_priv,
>> 			    i915_reg_t reg,
>> 			    u32 mask,
>> 			    u32 value,
>> -			    unsigned int timeout_ms)
>> +			    unsigned int fast_timeout_us,
>> +			    unsigned int slow_timeout_ms,
>> +			    u32 *out_value)
>> {
>> 	unsigned fw =
>> 		intel_uncore_forcewake_for_reg(dev_priv, reg, FW_REG_READ);
>> 	int ret;
>> +	u32 reg_value;
>>
>> 	might_sleep();
>>
>> @@ -1803,14 +1808,18 @@ int intel_wait_for_register(struct drm_i915_private *dev_priv,
>>
>> 	ret = __intel_wait_for_register_fw(dev_priv,
>> 					   reg, mask, value,
>> -					   2, 0, NULL);
>> +					   fast_timeout_us, 0, &reg_value);
>>
>> 	intel_uncore_forcewake_put__locked(dev_priv, fw);
>> 	spin_unlock_irq(&dev_priv->uncore.lock);
>>
>> 	if (ret)
>> -		ret = wait_for((I915_READ_NOTRACE(reg) & mask) == value,
>> -			       timeout_ms);
>> +		ret = __wait_for(reg_value = I915_READ_NOTRACE(reg),
>> +				 (reg_value & mask) == value,
>> +				 slow_timeout_ms * 1000, 1000);
>> +
>> +	if (out_value)
>> +		*out_value = reg_value;
>
> Looks good.
> -Chris
Quoting Sean Paul (2017-12-01 17:48:17)
> On Fri, Dec 1, 2017 at 12:44 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > Quoting Sean Paul (2017-12-01 17:20:24)
> >> /**
> >> - * _wait_for - magic (register) wait macro
> >> + * __wait_for - magic wait macro
> >>   *
> >> - * Does the right thing for modeset paths when run under kdgb or similar atomic
> >> - * contexts. Note that it's important that we check the condition again after
> >> + * Macro to help avoid open coding check/wait/timeout patterns, will do the
> >> + * right think wrt to choosing msleep vs usleep_range based on how long the wait
> >> + * interval is. Note that it's important that we check the condition again after
> >>   * having timed out, since the timeout could be due to preemption or similar and
> >>   * we've never had a chance to check the condition before the timeout.
> >>   */
> >> -#define _wait_for(COND, US, W) ({ \
> >> +#define __wait_for(OP, COND, US, W) ({ \
> >> 	unsigned long timeout__ = jiffies + usecs_to_jiffies(US) + 1; \
> >> 	int ret__; \
> >> 	might_sleep(); \
> >> 	for (;;) { \
> >> 		bool expired__ = time_after(jiffies, timeout__); \
> >> +		OP; \
> >> 		if (COND) { \
> >> 			ret__ = 0; \
> >> 			break; \
> >> @@ -62,11 +64,16 @@
> >> 			ret__ = -ETIMEDOUT; \
> >> 			break; \
> >> 		} \
> >> -		usleep_range((W), (W) * 2); \
> >> +		if (W > (20 * 1000)) \
> >> +			msleep(W / 1000); \
> >> +		else \
> >> +			usleep_range((W), (W) * 2); \
> >
> > The current wait_for() is a little more complicated nowadays (variable
> > W).
> >
>
> Hmm, am I based off the wrong tree? I'm using drm-intel-next.

drm-intel-next is what was sent as a PR; drm-intel-next-queued is always
current. To be sure CI doesn't complain about merge conflicts, base on
drm-tip.

> > Are ms intervals going to be that common? Using a state-machine springs
> > to mind, but you could argue that msleep() is just a yield. Using msleep
> > though is going to leave D processes visible and a bump in load :|
> >
>
> Probably uncommon, but at the very least, I need one. I wouldn't feel
> comfortable handling such a large wait using usleep_range.

We can use the constants to remove the predicate for when it never will
be needed, so it's not an issue -- either wait will produce a long
uninterruptible sleep. It's just that if we were to frequently hit that
long sleep, there would be some push back against using wait_for() on
that path, as no one likes an "idle" system with load 1.
-Chris
Quoting Chris Wilson (2017-12-01 17:55:15)
> Quoting Sean Paul (2017-12-01 17:48:17)
> > On Fri, Dec 1, 2017 at 12:44 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > > The current wait_for() is a little more complicated nowadays (variable
> > > W).
> > >
> >
> > Hmm, am I based off the wrong tree? I'm using drm-intel-next.
>
> drm-intel-next is what was sent as a PR; drm-intel-next-queued is always
> current. To be sure CI doesn't complain about merge conflicts, base on
> drm-tip.

Speaking of CI, do you have any instructions on how we might set up a
test system? Just needs a compatible monitor and some test code?

Could chamelium or something like that be used as a validator?
-Chris
On Fri, Dec 1, 2017 at 12:57 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> Quoting Chris Wilson (2017-12-01 17:55:15)
>> Quoting Sean Paul (2017-12-01 17:48:17)
>> > On Fri, Dec 1, 2017 at 12:44 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>> > > The current wait_for() is a little more complicated nowadays (variable
>> > > W).
>> > >
>> >
>> > Hmm, am I based off the wrong tree? I'm using drm-intel-next.
>>
>> drm-intel-next is what was sent as a PR; drm-intel-next-queued is always
>> current. To be sure CI doesn't complain about merge conflicts, base on
>> drm-tip.
>

Ahhhh, I forgot about -queued. OK, will rebase.

> Speaking of CI, do you have any instructions on how we might set up a
> test system?

I'm working on an igt test for the property now.

> Just needs a compatible monitor and some test code?

Yep. For testing, I exposed the property via sysfs and fiddled with it
that way.

> Could chamelium or something like that be used as a validator?

You would have to implement the receiver side of HDCP on chamelium in
order for this to work. So, probably not.

Sean

> -Chris
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 69aab324aaa1..191c80fc4314 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -41,19 +41,21 @@
 #include <drm/drm_atomic.h>
 
 /**
- * _wait_for - magic (register) wait macro
+ * __wait_for - magic wait macro
  *
- * Does the right thing for modeset paths when run under kdgb or similar atomic
- * contexts. Note that it's important that we check the condition again after
+ * Macro to help avoid open coding check/wait/timeout patterns, will do the
+ * right think wrt to choosing msleep vs usleep_range based on how long the wait
+ * interval is. Note that it's important that we check the condition again after
  * having timed out, since the timeout could be due to preemption or similar and
  * we've never had a chance to check the condition before the timeout.
  */
-#define _wait_for(COND, US, W) ({ \
+#define __wait_for(OP, COND, US, W) ({ \
 	unsigned long timeout__ = jiffies + usecs_to_jiffies(US) + 1; \
 	int ret__; \
 	might_sleep(); \
 	for (;;) { \
 		bool expired__ = time_after(jiffies, timeout__); \
+		OP; \
 		if (COND) { \
 			ret__ = 0; \
 			break; \
@@ -62,11 +64,16 @@
 			ret__ = -ETIMEDOUT; \
 			break; \
 		} \
-		usleep_range((W), (W) * 2); \
+		if (W > (20 * 1000)) \
+			msleep(W / 1000); \
+		else \
+			usleep_range((W), (W) * 2); \
 	} \
 	ret__; \
 })
 
+#define _wait_for(COND, US, W) __wait_for(;,(COND), US, W)
+
 #define wait_for(COND, MS) _wait_for((COND), (MS) * 1000, 1000)
 
 /* If CONFIG_PREEMPT_COUNT is disabled, in_atomic() always reports false. */
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index b4621271e7a2..c851b0c0602d 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1770,12 +1770,14 @@ int __intel_wait_for_register_fw(struct drm_i915_private *dev_priv,
 }
 
 /**
- * intel_wait_for_register - wait until register matches expected state
+ * __intel_wait_for_register - wait until register matches expected state
  * @dev_priv: the i915 device
  * @reg: the register to read
  * @mask: mask to apply to register value
  * @value: expected value
- * @timeout_ms: timeout in millisecond
+ * @fast_timeout_us: fast timeout in microsecond for atomic/tight wait
+ * @slow_timeout_ms: slow timeout in millisecond
+ * @out_value: optional placeholder to hold registry value
  *
  * This routine waits until the target register @reg contains the expected
  * @value after applying the @mask, i.e. it waits until ::
@@ -1786,15 +1788,18 @@ int __intel_wait_for_register_fw(struct drm_i915_private *dev_priv,
  *
  * Returns 0 if the register matches the desired condition, or -ETIMEOUT.
  */
-int intel_wait_for_register(struct drm_i915_private *dev_priv,
+int __intel_wait_for_register(struct drm_i915_private *dev_priv,
 			    i915_reg_t reg,
 			    u32 mask,
 			    u32 value,
-			    unsigned int timeout_ms)
+			    unsigned int fast_timeout_us,
+			    unsigned int slow_timeout_ms,
+			    u32 *out_value)
 {
 	unsigned fw =
 		intel_uncore_forcewake_for_reg(dev_priv, reg, FW_REG_READ);
 	int ret;
+	u32 reg_value;
 
 	might_sleep();
 
@@ -1803,14 +1808,18 @@ int intel_wait_for_register(struct drm_i915_private *dev_priv,
 
 	ret = __intel_wait_for_register_fw(dev_priv,
 					   reg, mask, value,
-					   2, 0, NULL);
+					   fast_timeout_us, 0, &reg_value);
 
 	intel_uncore_forcewake_put__locked(dev_priv, fw);
 	spin_unlock_irq(&dev_priv->uncore.lock);
 
 	if (ret)
-		ret = wait_for((I915_READ_NOTRACE(reg) & mask) == value,
-			       timeout_ms);
+		ret = __wait_for(reg_value = I915_READ_NOTRACE(reg),
+				 (reg_value & mask) == value,
+				 slow_timeout_ms * 1000, 1000);
+
+	if (out_value)
+		*out_value = reg_value;
 
 	return ret;
 }
diff --git a/drivers/gpu/drm/i915/intel_uncore.h b/drivers/gpu/drm/i915/intel_uncore.h
index 9ce079b5dd0d..bed019ef000f 100644
--- a/drivers/gpu/drm/i915/intel_uncore.h
+++ b/drivers/gpu/drm/i915/intel_uncore.h
@@ -163,11 +163,23 @@ void intel_uncore_forcewake_put__locked(struct drm_i915_private *dev_priv,
 void intel_uncore_forcewake_user_get(struct drm_i915_private *dev_priv);
 void intel_uncore_forcewake_user_put(struct drm_i915_private *dev_priv);
 
+int __intel_wait_for_register(struct drm_i915_private *dev_priv,
+			      i915_reg_t reg,
+			      u32 mask,
+			      u32 value,
+			      unsigned int fast_timeout_us,
+			      unsigned int slow_timeout_ms,
+			      u32 *out_value);
+static inline
 int intel_wait_for_register(struct drm_i915_private *dev_priv,
 			    i915_reg_t reg,
 			    u32 mask,
 			    u32 value,
-			    unsigned int timeout_ms);
+			    unsigned int timeout_ms)
+{
+	return __intel_wait_for_register(dev_priv, reg, mask, value, 2,
+					 timeout_ms, NULL);
+}
 
 int __intel_wait_for_register_fw(struct drm_i915_private *dev_priv,
 				 i915_reg_t reg,
 				 u32 mask,
This patch adds a little more control to a couple wait_for routines such
that we can avoid open-coding read/wait/timeout patterns which:

- need the value of the register after the wait_for
- run arbitrary operation for the read portion

This patch also chooses the correct sleep function (based on
timers-howto.txt) for the polling interval the caller specifies.

Changes in v2:
- Added to the series

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
---
 drivers/gpu/drm/i915/intel_drv.h    | 17 ++++++++++++-----
 drivers/gpu/drm/i915/intel_uncore.c | 23 ++++++++++++++++-------
 drivers/gpu/drm/i915/intel_uncore.h | 14 +++++++++++++-
 3 files changed, 41 insertions(+), 13 deletions(-)
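[Editor's note] The two bullet points above — getting the final register value back out and supplying an arbitrary read operation — can be sketched in plain C. This is a hypothetical userspace mock (`read_mock_reg`, `wait_for_mock_reg` and the register values are invented for illustration, not i915 code); it mirrors the shape of the patch's `__wait_for(reg_value = I915_READ_NOTRACE(reg), (reg_value & mask) == value, ...)` call:

```c
#include <errno.h>

/* Hypothetical mock register that reports ready from the third read on. */
static unsigned int mock_reg_reads;
static unsigned int read_mock_reg(void)
{
	return ++mock_reg_reads >= 3 ? 0x80000001u : 0x00000001u;
}

/* The OP step refreshes reg_value on every iteration, the COND step
 * tests it, and the last value read is handed back through *out_value
 * whether or not the wait succeeded -- exactly the contract
 * __intel_wait_for_register() gains in this patch.  A bounded try
 * count stands in for the jiffies-based timeout. */
static int wait_for_mock_reg(unsigned int mask, unsigned int value,
			     unsigned int max_tries,
			     unsigned int *out_value)
{
	unsigned int reg_value = 0;
	int ret = -ETIMEDOUT;
	unsigned int i;

	for (i = 0; i < max_tries; i++) {
		reg_value = read_mock_reg();		/* the OP step */
		if ((reg_value & mask) == value) {	/* the COND step */
			ret = 0;
			break;
		}
	}
	if (out_value)
		*out_value = reg_value;	/* caller sees the final read */
	return ret;
}
```

Even on timeout the caller receives the last observed value, which is what lets HDCP-style callers log or act on the register state they gave up waiting for.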