
[v6] drm/i915: handle uncore spinlock when not available

Message ID 20231130113505.1321348-1-luciano.coelho@intel.com (mailing list archive)
State New, archived

Commit Message

Luca Coelho Nov. 30, 2023, 11:35 a.m. UTC
The uncore code may not always be available (e.g. when we build the
display code with Xe), so we can't always rely on having the uncore's
spinlock.

To handle this, split the spin_lock/unlock_irqsave/restore() into
spin_lock/unlock() followed by a call to local_irq_save/restore() and
create wrapper functions for locking and unlocking the uncore's
spinlock.  In these functions, we have a condition check and only
actually try to lock/unlock the spinlock when I915 is defined, and
thus uncore is available.

This keeps the ifdefs contained in these new functions and all such
logic inside the display code.

Cc: Tvrtko Ursulin <tvrto.ursulin@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
---


In v2:

   * Renamed uncore_spin_*() to intel_spin_*()
   * Corrected the order: save, lock, unlock, restore

In v3:

   * Undid the change to pass drm_i915_private instead of the lock
     itself, since we would have to include i915_drv.h and that pulls
     in a truckload of other includes.

In v4:

   * After a brief attempt to replace this with a different patch,
     we're back to this one;
   * Pass drm_i915_private again, and move the functions to
     intel_vblank.c, so we don't need to include i915_drv.h in a
     header file and it's already included in intel_vblank.c;

In v5:

   * Remove stray include in intel_display.h;
   * Remove unnecessary inline modifiers in the new functions.

In v6:

   * Just removed the umlauts from Ville's name, because patchwork
     didn't catch my patch and I suspect it was some UTF-8 confusion.

 drivers/gpu/drm/i915/display/intel_vblank.c | 49 ++++++++++++++++-----
 1 file changed, 39 insertions(+), 10 deletions(-)

Comments

Tvrtko Ursulin Nov. 30, 2023, 12:21 p.m. UTC | #1
On 30/11/2023 11:35, Luca Coelho wrote:
> The uncore code may not always be available (e.g. when we build the
> display code with Xe), so we can't always rely on having the uncore's
> spinlock.
> 
> To handle this, split the spin_lock/unlock_irqsave/restore() into
> spin_lock/unlock() followed by a call to local_irq_save/restore() and
> create wrapper functions for locking and unlocking the uncore's
> spinlock.  In these functions, we have a condition check and only
> actually try to lock/unlock the spinlock when I915 is defined, and
> thus uncore is available.
> 
> This keeps the ifdefs contained in these new functions and all such
> logic inside the display code.
> 
> Cc: Tvrtko Ursulin <tvrto.ursulin@intel.com>
> Cc: Jani Nikula <jani.nikula@intel.com>
> Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
> ---
> 
> 
> In v2:
> 
>     * Renamed uncore_spin_*() to intel_spin_*()
>     * Corrected the order: save, lock, unlock, restore
> 
> In v3:
> 
>     * Undid the change to pass drm_i915_private instead of the lock
>       itself, since we would have to include i915_drv.h and that pulls
>       in a truckload of other includes.
> 
> In v4:
> 
>     * After a brief attempt to replace this with a different patch,
>       we're back to this one;
>     * Pass drm_i915_private again, and move the functions to
>       intel_vblank.c, so we don't need to include i915_drv.h in a
>       header file and it's already included in intel_vblank.c;
> 
> In v5:
> 
>     * Remove stray include in intel_display.h;
>     * Remove unnecessary inline modifiers in the new functions.
> 
> In v6:
> 
>     * Just removed the umlauts from Ville's name, because patchwork
>       didn't catch my patch and I suspect it was some UTF-8 confusion.
> 
>   drivers/gpu/drm/i915/display/intel_vblank.c | 49 ++++++++++++++++-----
>   1 file changed, 39 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_vblank.c b/drivers/gpu/drm/i915/display/intel_vblank.c
> index 2cec2abf9746..221fcd6bf77b 100644
> --- a/drivers/gpu/drm/i915/display/intel_vblank.c
> +++ b/drivers/gpu/drm/i915/display/intel_vblank.c
> @@ -265,6 +265,30 @@ int intel_crtc_scanline_to_hw(struct intel_crtc *crtc, int scanline)
>   	return (scanline + vtotal - crtc->scanline_offset) % vtotal;
>   }
>   
> +/*
> + * The uncore version of the spin lock functions is used to decide
> + * whether we need to lock the uncore lock or not.  This is only
> + * needed in i915, not in Xe.
> + *
> + * This lock in i915 is needed because some old platforms (at least
> + * IVB and possibly HSW as well), which are not supported in Xe, need
> + * all register accesses to the same cacheline to be serialized,
> + * otherwise they may hang.
> + */
> +static void intel_vblank_section_enter(struct drm_i915_private *i915)
> +{
> +#ifdef I915
> +	spin_lock(&i915->uncore.lock);
> +#endif
> +}
> +
> +static void intel_vblank_section_exit(struct drm_i915_private *i915)
> +{
> +#ifdef I915
> +	spin_unlock(&i915->uncore.lock);
> +#endif
> +}
> +
>   static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
>   				     bool in_vblank_irq,
>   				     int *vpos, int *hpos,
> @@ -302,11 +326,12 @@ static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
>   	}
>   
>   	/*
> -	 * Lock uncore.lock, as we will do multiple timing critical raw
> -	 * register reads, potentially with preemption disabled, so the
> -	 * following code must not block on uncore.lock.
> +	 * Enter vblank critical section, as we will do multiple
> +	 * timing critical raw register reads, potentially with
> +	 * preemption disabled, so the following code must not block.
>   	 */
> -	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
> +	local_irq_save(irqflags);
> +	intel_vblank_section_enter(dev_priv);

Shouldn't local_irq_save() go into intel_vblank_section_enter()? It seems
all callers from both i915 and xe end up doing that anyway, and the
"vblank_start" naming presumed there would be more to the section than the
cacheline mmio bug. I mean that there is some benefit in keeping the
readout timings tight.

Regards,

Tvrtko
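
A minimal userspace sketch of the suggestion above, for illustration only. It assumes the enter/exit helpers would take over the interrupt save/restore. The names model_irq_save(), model_irq_restore(), vblank_section_enter_irqsave() and vblank_section_exit_irqrestore() are hypothetical and are not part of the patch; the two booleans only stand in for the CPU interrupt flag and for uncore.lock.

```c
/*
 * Userspace model only: these helpers are stand-ins, not the kernel's
 * local_irq_save()/local_irq_restore() or spin_lock()/spin_unlock().
 */
#include <assert.h>
#include <stdbool.h>

/* Defined here to model an i915 build; an Xe build would leave it undefined. */
#define I915 1

static bool irqs_enabled = true;   /* models the CPU interrupt-enable flag */
static bool uncore_locked = false; /* models i915->uncore.lock */

static unsigned long model_irq_save(void)
{
	unsigned long flags = irqs_enabled ? 1UL : 0UL;

	irqs_enabled = false;
	return flags;
}

static void model_irq_restore(unsigned long flags)
{
	irqs_enabled = flags != 0;
}

/* Hypothetical enter helper that also disables interrupts: save, then lock. */
static void vblank_section_enter_irqsave(unsigned long *flags)
{
	*flags = model_irq_save();
#ifdef I915
	assert(!uncore_locked);
	uncore_locked = true;
#endif
}

/* Hypothetical exit helper: unlock, then restore. */
static void vblank_section_exit_irqrestore(unsigned long flags)
{
#ifdef I915
	assert(uncore_locked);
	uncore_locked = false;
#endif
	model_irq_restore(flags);
}

/* Models a scanline readout; returns true if it ran with interrupts off. */
static bool read_scanline_in_section(void)
{
	unsigned long flags;
	bool irqs_off_during_read;

	vblank_section_enter_irqsave(&flags);
	irqs_off_during_read = !irqs_enabled;
	vblank_section_exit_irqrestore(flags);

	return irqs_off_during_read;
}
```

With this shape, callers would no longer pair local_irq_save() with the section helper themselves, and the ordering (save, lock, unlock, restore) would live in one place.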

>   
>   	/* preempt_disable_rt() should go right here in PREEMPT_RT patchset. */
>   
> @@ -374,7 +399,8 @@ static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
>   
>   	/* preempt_enable_rt() should go right here in PREEMPT_RT patchset. */
>   
> -	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
> +	intel_vblank_section_exit(dev_priv);
> +	local_irq_restore(irqflags);
>   
>   	/*
>   	 * While in vblank, position will be negative
> @@ -412,9 +438,13 @@ int intel_get_crtc_scanline(struct intel_crtc *crtc)
>   	unsigned long irqflags;
>   	int position;
>   
> -	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
> +	local_irq_save(irqflags);
> +	intel_vblank_section_enter(dev_priv);
> +
>   	position = __intel_get_crtc_scanline(crtc);
> -	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
> +
> +	intel_vblank_section_exit(dev_priv);
> +	local_irq_restore(irqflags);
>   
>   	return position;
>   }
> @@ -537,7 +567,7 @@ void intel_crtc_update_active_timings(const struct intel_crtc_state *crtc_state,
>   	 * Need to audit everything to make sure it's safe.
>   	 */
>   	spin_lock_irqsave(&i915->drm.vblank_time_lock, irqflags);
> -	spin_lock(&i915->uncore.lock);
> +	intel_vblank_section_enter(i915);
>   
>   	drm_calc_timestamping_constants(&crtc->base, &adjusted_mode);
>   
> @@ -546,7 +576,6 @@ void intel_crtc_update_active_timings(const struct intel_crtc_state *crtc_state,
>   	crtc->mode_flags = mode_flags;
>   
>   	crtc->scanline_offset = intel_crtc_scanline_offset(crtc_state);
> -
> -	spin_unlock(&i915->uncore.lock);
> +	intel_vblank_section_exit(i915);
>   	spin_unlock_irqrestore(&i915->drm.vblank_time_lock, irqflags);
>   }
Luca Coelho Nov. 30, 2023, 12:26 p.m. UTC | #2
On Thu, 2023-11-30 at 12:21 +0000, Tvrtko Ursulin wrote:
> On 30/11/2023 11:35, Luca Coelho wrote:
> > The uncore code may not always be available (e.g. when we build the
> > display code with Xe), so we can't always rely on having the uncore's
> > spinlock.
> > 
> > To handle this, split the spin_lock/unlock_irqsave/restore() into
> > spin_lock/unlock() followed by a call to local_irq_save/restore() and
> > create wrapper functions for locking and unlocking the uncore's
> > spinlock.  In these functions, we have a condition check and only
> > actually try to lock/unlock the spinlock when I915 is defined, and
> > thus uncore is available.
> > 
> > This keeps the ifdefs contained in these new functions and all such
> > logic inside the display code.
> > 
> > Cc: Tvrtko Ursulin <tvrto.ursulin@intel.com>
> > Cc: Jani Nikula <jani.nikula@intel.com>
> > Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
> > Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
> > ---
> > 
> > 
> > In v2:
> > 
> >     * Renamed uncore_spin_*() to intel_spin_*()
> >     * Corrected the order: save, lock, unlock, restore
> > 
> > In v3:
> > 
> >     * Undid the change to pass drm_i915_private instead of the lock
> >       itself, since we would have to include i915_drv.h and that pulls
> >       in a truckload of other includes.
> > 
> > In v4:
> > 
> >     * After a brief attempt to replace this with a different patch,
> >       we're back to this one;
> >     * Pass drm_i915_private again, and move the functions to
> >       intel_vblank.c, so we don't need to include i915_drv.h in a
> >       header file and it's already included in intel_vblank.c;
> > 
> > In v5:
> > 
> >     * Remove stray include in intel_display.h;
> >     * Remove unnecessary inline modifiers in the new functions.
> > 
> > In v6:
> > 
> >     * Just removed the umlauts from Ville's name, because patchwork
> >       didn't catch my patch and I suspect it was some UTF-8 confusion.
> > 
> >   drivers/gpu/drm/i915/display/intel_vblank.c | 49 ++++++++++++++++-----
> >   1 file changed, 39 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/display/intel_vblank.c b/drivers/gpu/drm/i915/display/intel_vblank.c
> > index 2cec2abf9746..221fcd6bf77b 100644
> > --- a/drivers/gpu/drm/i915/display/intel_vblank.c
> > +++ b/drivers/gpu/drm/i915/display/intel_vblank.c
> > @@ -265,6 +265,30 @@ int intel_crtc_scanline_to_hw(struct intel_crtc *crtc, int scanline)
> >   	return (scanline + vtotal - crtc->scanline_offset) % vtotal;
> >   }
> >   
> > +/*
> > + * The uncore version of the spin lock functions is used to decide
> > + * whether we need to lock the uncore lock or not.  This is only
> > + * needed in i915, not in Xe.
> > + *
> > + * This lock in i915 is needed because some old platforms (at least
> > + * IVB and possibly HSW as well), which are not supported in Xe, need
> > + * all register accesses to the same cacheline to be serialized,
> > + * otherwise they may hang.
> > + */
> > +static void intel_vblank_section_enter(struct drm_i915_private *i915)
> > +{
> > +#ifdef I915
> > +	spin_lock(&i915->uncore.lock);
> > +#endif
> > +}
> > +
> > +static void intel_vblank_section_exit(struct drm_i915_private *i915)
> > +{
> > +#ifdef I915
> > +	spin_unlock(&i915->uncore.lock);
> > +#endif
> > +}
> > +
> >   static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
> >   				     bool in_vblank_irq,
> >   				     int *vpos, int *hpos,
> > @@ -302,11 +326,12 @@ static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
> >   	}
> >   
> >   	/*
> > -	 * Lock uncore.lock, as we will do multiple timing critical raw
> > -	 * register reads, potentially with preemption disabled, so the
> > -	 * following code must not block on uncore.lock.
> > +	 * Enter vblank critical section, as we will do multiple
> > +	 * timing critical raw register reads, potentially with
> > +	 * preemption disabled, so the following code must not block.
> >   	 */
> > -	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
> > +	local_irq_save(irqflags);
> > +	intel_vblank_section_enter(dev_priv);
> 
> Shouldn't local_irq_save go into intel_vblank_section_enter()? It seems 
> all callers from both i915 and xe end up doing that anyway and naming 
> "vblank_start" was presumed there would be more to the section than 
> cacheline mmio bug. I mean that there is some benefit from keeping the 
> readout timings tight.
> 

The reason is that there is one caller that has already disabled
interrupts when this function is called (see below), so we shouldn't do
it again. 


> >   
> >   	/* preempt_disable_rt() should go right here in PREEMPT_RT patchset. */
> >   
> > @@ -374,7 +399,8 @@ static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
> >   
> >   	/* preempt_enable_rt() should go right here in PREEMPT_RT patchset. */
> >   
> > -	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
> > +	intel_vblank_section_exit(dev_priv);
> > +	local_irq_restore(irqflags);
> >   
> >   	/*
> >   	 * While in vblank, position will be negative
> > @@ -412,9 +438,13 @@ int intel_get_crtc_scanline(struct intel_crtc *crtc)
> >   	unsigned long irqflags;
> >   	int position;
> >   
> > -	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
> > +	local_irq_save(irqflags);
> > +	intel_vblank_section_enter(dev_priv);
> > +
> >   	position = __intel_get_crtc_scanline(crtc);
> > -	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
> > +
> > +	intel_vblank_section_exit(dev_priv);
> > +	local_irq_restore(irqflags);
> >   
> >   	return position;
> >   }
> > @@ -537,7 +567,7 @@ void intel_crtc_update_active_timings(const struct intel_crtc_state *crtc_state,
> >   	 * Need to audit everything to make sure it's safe.
> >   	 */
> >   	spin_lock_irqsave(&i915->drm.vblank_time_lock, irqflags);
> > -	spin_lock(&i915->uncore.lock);
> > +	intel_vblank_section_enter(i915);

Here.

--
Cheers,
Luca.
Tvrtko Ursulin Nov. 30, 2023, 1:24 p.m. UTC | #3
On 30/11/2023 12:26, Coelho, Luciano wrote:
> On Thu, 2023-11-30 at 12:21 +0000, Tvrtko Ursulin wrote:
>> On 30/11/2023 11:35, Luca Coelho wrote:
>>> The uncore code may not always be available (e.g. when we build the
>>> display code with Xe), so we can't always rely on having the uncore's
>>> spinlock.
>>>
>>> To handle this, split the spin_lock/unlock_irqsave/restore() into
>>> spin_lock/unlock() followed by a call to local_irq_save/restore() and
>>> create wrapper functions for locking and unlocking the uncore's
>>> spinlock.  In these functions, we have a condition check and only
>>> actually try to lock/unlock the spinlock when I915 is defined, and
>>> thus uncore is available.
>>>
>>> This keeps the ifdefs contained in these new functions and all such
>>> logic inside the display code.
>>>
>>> Cc: Tvrtko Ursulin <tvrto.ursulin@intel.com>
>>> Cc: Jani Nikula <jani.nikula@intel.com>
>>> Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
>>> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
>>> ---
>>>
>>>
>>> In v2:
>>>
>>>      * Renamed uncore_spin_*() to intel_spin_*()
>>>      * Corrected the order: save, lock, unlock, restore
>>>
>>> In v3:
>>>
>>>      * Undid the change to pass drm_i915_private instead of the lock
>>>        itself, since we would have to include i915_drv.h and that pulls
>>>        in a truckload of other includes.
>>>
>>> In v4:
>>>
>>>      * After a brief attempt to replace this with a different patch,
>>>        we're back to this one;
>>>      * Pass drm_i915_private again, and move the functions to
>>>        intel_vblank.c, so we don't need to include i915_drv.h in a
>>>        header file and it's already included in intel_vblank.c;
>>>
>>> In v5:
>>>
>>>      * Remove stray include in intel_display.h;
>>>      * Remove unnecessary inline modifiers in the new functions.
>>>
>>> In v6:
>>>
>>>      * Just removed the umlauts from Ville's name, because patchwork
>>>        didn't catch my patch and I suspect it was some UTF-8 confusion.
>>>
>>>    drivers/gpu/drm/i915/display/intel_vblank.c | 49 ++++++++++++++++-----
>>>    1 file changed, 39 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/display/intel_vblank.c b/drivers/gpu/drm/i915/display/intel_vblank.c
>>> index 2cec2abf9746..221fcd6bf77b 100644
>>> --- a/drivers/gpu/drm/i915/display/intel_vblank.c
>>> +++ b/drivers/gpu/drm/i915/display/intel_vblank.c
>>> @@ -265,6 +265,30 @@ int intel_crtc_scanline_to_hw(struct intel_crtc *crtc, int scanline)
>>>    	return (scanline + vtotal - crtc->scanline_offset) % vtotal;
>>>    }
>>>    
>>> +/*
>>> + * The uncore version of the spin lock functions is used to decide
>>> + * whether we need to lock the uncore lock or not.  This is only
>>> + * needed in i915, not in Xe.
>>> + *
>>> + * This lock in i915 is needed because some old platforms (at least
>>> + * IVB and possibly HSW as well), which are not supported in Xe, need
>>> + * all register accesses to the same cacheline to be serialized,
>>> + * otherwise they may hang.
>>> + */
>>> +static void intel_vblank_section_enter(struct drm_i915_private *i915)
>>> +{
>>> +#ifdef I915
>>> +	spin_lock(&i915->uncore.lock);
>>> +#endif
>>> +}
>>> +
>>> +static void intel_vblank_section_exit(struct drm_i915_private *i915)
>>> +{
>>> +#ifdef I915
>>> +	spin_unlock(&i915->uncore.lock);
>>> +#endif
>>> +}
>>> +
>>>    static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
>>>    				     bool in_vblank_irq,
>>>    				     int *vpos, int *hpos,
>>> @@ -302,11 +326,12 @@ static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
>>>    	}
>>>    
>>>    	/*
>>> -	 * Lock uncore.lock, as we will do multiple timing critical raw
>>> -	 * register reads, potentially with preemption disabled, so the
>>> -	 * following code must not block on uncore.lock.
>>> +	 * Enter vblank critical section, as we will do multiple
>>> +	 * timing critical raw register reads, potentially with
>>> +	 * preemption disabled, so the following code must not block.
>>>    	 */
>>> -	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
>>> +	local_irq_save(irqflags);
>>> +	intel_vblank_section_enter(dev_priv);
>>
>> Shouldn't local_irq_save go into intel_vblank_section_enter()? It seems
>> all callers from both i915 and xe end up doing that anyway and naming
>> "vblank_start" was presumed there would be more to the section than
>> cacheline mmio bug. I mean that there is some benefit from keeping the
>> readout timings tight.
>>
> 
> The reason is that there is one caller that has already disabled
> interrupts when this function is called (see below), so we shouldn't do
> it again.

Yeah, I saw that, but with irqsave/restore it is safe to nest. So for me
it is more the fundamental question which I raised above.

Regards,

Tvrtko
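
The nesting claim can be illustrated with a small userspace model. This is a sketch under stated assumptions, not kernel code: model_irq_save() and model_irq_restore() are hypothetical stand-ins for local_irq_save()/local_irq_restore(), and a single boolean stands in for the CPU interrupt flag. The outer pair models a caller that already holds a lock taken with spin_lock_irqsave(); the inner pair models a section helper that saves and restores again.

```c
/*
 * Userspace model only: a boolean stands in for the CPU interrupt flag,
 * and the helpers are not the kernel's local_irq_save()/restore().
 */
#include <assert.h>
#include <stdbool.h>

static bool irqs_enabled = true;

/* Records the current state, then disables "interrupts". */
static unsigned long model_irq_save(void)
{
	unsigned long flags = irqs_enabled ? 1UL : 0UL;

	irqs_enabled = false;
	return flags;
}

/* Puts back whatever state was recorded by the matching save. */
static void model_irq_restore(unsigned long flags)
{
	irqs_enabled = flags != 0;
}

/*
 * Returns true if interrupts are still off after the inner restore,
 * i.e. the inner pair did not re-enable them behind the outer caller.
 */
static bool nested_sections_keep_irqs_off(void)
{
	unsigned long outer, inner;
	bool still_off_after_inner;

	outer = model_irq_save(); /* outer caller disables interrupts */
	inner = model_irq_save(); /* inner save records "already disabled" */

	model_irq_restore(inner); /* inner restore keeps them disabled */
	still_off_after_inner = !irqs_enabled;

	model_irq_restore(outer); /* outer restore brings back the original state */

	return still_off_after_inner;
}
```

The inner save records the already-disabled state, so the inner restore leaves interrupts off until the outer restore runs.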

> 
>>>    
>>>    	/* preempt_disable_rt() should go right here in PREEMPT_RT patchset. */
>>>    
>>> @@ -374,7 +399,8 @@ static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
>>>    
>>>    	/* preempt_enable_rt() should go right here in PREEMPT_RT patchset. */
>>>    
>>> -	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
>>> +	intel_vblank_section_exit(dev_priv);
>>> +	local_irq_restore(irqflags);
>>>    
>>>    	/*
>>>    	 * While in vblank, position will be negative
>>> @@ -412,9 +438,13 @@ int intel_get_crtc_scanline(struct intel_crtc *crtc)
>>>    	unsigned long irqflags;
>>>    	int position;
>>>    
>>> -	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
>>> +	local_irq_save(irqflags);
>>> +	intel_vblank_section_enter(dev_priv);
>>> +
>>>    	position = __intel_get_crtc_scanline(crtc);
>>> -	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
>>> +
>>> +	intel_vblank_section_exit(dev_priv);
>>> +	local_irq_restore(irqflags);
>>>    
>>>    	return position;
>>>    }
>>> @@ -537,7 +567,7 @@ void intel_crtc_update_active_timings(const struct intel_crtc_state *crtc_state,
>>>    	 * Need to audit everything to make sure it's safe.
>>>    	 */
>>>    	spin_lock_irqsave(&i915->drm.vblank_time_lock, irqflags);
>>> -	spin_lock(&i915->uncore.lock);
>>> +	intel_vblank_section_enter(i915);
> 
> Here.
> 
> --
> Cheers,
> Luca.
Luca Coelho Nov. 30, 2023, 1:54 p.m. UTC | #4
On Thu, 2023-11-30 at 13:24 +0000, Tvrtko Ursulin wrote:
> On 30/11/2023 12:26, Coelho, Luciano wrote:
> > On Thu, 2023-11-30 at 12:21 +0000, Tvrtko Ursulin wrote:
> > > On 30/11/2023 11:35, Luca Coelho wrote:
> > > > The uncore code may not always be available (e.g. when we build the
> > > > display code with Xe), so we can't always rely on having the uncore's
> > > > spinlock.
> > > > 
> > > > To handle this, split the spin_lock/unlock_irqsave/restore() into
> > > > spin_lock/unlock() followed by a call to local_irq_save/restore() and
> > > > create wrapper functions for locking and unlocking the uncore's
> > > > spinlock.  In these functions, we have a condition check and only
> > > > actually try to lock/unlock the spinlock when I915 is defined, and
> > > > thus uncore is available.
> > > > 
> > > > This keeps the ifdefs contained in these new functions and all such
> > > > logic inside the display code.
> > > > 
> > > > Cc: Tvrtko Ursulin <tvrto.ursulin@intel.com>
> > > > Cc: Jani Nikula <jani.nikula@intel.com>
> > > > Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
> > > > Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
> > > > ---
> > > > 
> > > > 
> > > > In v2:
> > > > 
> > > >      * Renamed uncore_spin_*() to intel_spin_*()
> > > >      * Corrected the order: save, lock, unlock, restore
> > > > 
> > > > In v3:
> > > > 
> > > >      * Undid the change to pass drm_i915_private instead of the lock
> > > >        itself, since we would have to include i915_drv.h and that pulls
> > > >        in a truckload of other includes.
> > > > 
> > > > In v4:
> > > > 
> > > >      * After a brief attempt to replace this with a different patch,
> > > >        we're back to this one;
> > > >      * Pass drm_i915_private again, and move the functions to
> > > >        intel_vblank.c, so we don't need to include i915_drv.h in a
> > > >        header file and it's already included in intel_vblank.c;
> > > > 
> > > > In v5:
> > > > 
> > > >      * Remove stray include in intel_display.h;
> > > >      * Remove unnecessary inline modifiers in the new functions.
> > > > 
> > > > In v6:
> > > > 
> > > >      * Just removed the umlauts from Ville's name, because patchwork
> > > >        didn't catch my patch and I suspect it was some UTF-8 confusion.
> > > > 
> > > >    drivers/gpu/drm/i915/display/intel_vblank.c | 49 ++++++++++++++++-----
> > > >    1 file changed, 39 insertions(+), 10 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/display/intel_vblank.c b/drivers/gpu/drm/i915/display/intel_vblank.c
> > > > index 2cec2abf9746..221fcd6bf77b 100644
> > > > --- a/drivers/gpu/drm/i915/display/intel_vblank.c
> > > > +++ b/drivers/gpu/drm/i915/display/intel_vblank.c
> > > > @@ -265,6 +265,30 @@ int intel_crtc_scanline_to_hw(struct intel_crtc *crtc, int scanline)
> > > >    	return (scanline + vtotal - crtc->scanline_offset) % vtotal;
> > > >    }
> > > >    
> > > > +/*
> > > > + * The uncore version of the spin lock functions is used to decide
> > > > + * whether we need to lock the uncore lock or not.  This is only
> > > > + * needed in i915, not in Xe.
> > > > + *
> > > > + * This lock in i915 is needed because some old platforms (at least
> > > > + * IVB and possibly HSW as well), which are not supported in Xe, need
> > > > + * all register accesses to the same cacheline to be serialized,
> > > > + * otherwise they may hang.
> > > > + */
> > > > +static void intel_vblank_section_enter(struct drm_i915_private *i915)
> > > > +{
> > > > +#ifdef I915
> > > > +	spin_lock(&i915->uncore.lock);
> > > > +#endif
> > > > +}
> > > > +
> > > > +static void intel_vblank_section_exit(struct drm_i915_private *i915)
> > > > +{
> > > > +#ifdef I915
> > > > +	spin_unlock(&i915->uncore.lock);
> > > > +#endif
> > > > +}
> > > > +
> > > >    static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
> > > >    				     bool in_vblank_irq,
> > > >    				     int *vpos, int *hpos,
> > > > @@ -302,11 +326,12 @@ static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
> > > >    	}
> > > >    
> > > >    	/*
> > > > -	 * Lock uncore.lock, as we will do multiple timing critical raw
> > > > -	 * register reads, potentially with preemption disabled, so the
> > > > -	 * following code must not block on uncore.lock.
> > > > +	 * Enter vblank critical section, as we will do multiple
> > > > +	 * timing critical raw register reads, potentially with
> > > > +	 * preemption disabled, so the following code must not block.
> > > >    	 */
> > > > -	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
> > > > +	local_irq_save(irqflags);
> > > > +	intel_vblank_section_enter(dev_priv);
> > > 
> > > Shouldn't local_irq_save go into intel_vblank_section_enter()? It seems
> > > all callers from both i915 and xe end up doing that anyway and naming
> > > "vblank_start" was presumed there would be more to the section than
> > > cacheline mmio bug. I mean that there is some benefit from keeping the
> > > readout timings tight.
> > > 
> > 
> > The reason is that there is one caller that has already disabled
> > interrupts when this function is called (see below), so we shouldn't do
> > it again.
> 
> Yeah I saw that but with irqsave/restore it is safe to nest. So for me 
> it is more a fundamental question which I raise above.

Sure, it should be safe to nest, but it seemed a bit ugly to me.

I can change it, if you prefer, as your point seems valid, but I will
wait to see what Rodrigo says, since he had already given his r-b, lest
we start ping-ponging on this too much.

--
Cheers,
Luca.
Rodrigo Vivi Nov. 30, 2023, 2:31 p.m. UTC | #5
On Thu, Nov 30, 2023 at 01:54:13PM +0000, Coelho, Luciano wrote:
> On Thu, 2023-11-30 at 13:24 +0000, Tvrtko Ursulin wrote:
> > On 30/11/2023 12:26, Coelho, Luciano wrote:
> > > On Thu, 2023-11-30 at 12:21 +0000, Tvrtko Ursulin wrote:
> > > > On 30/11/2023 11:35, Luca Coelho wrote:
> > > > > The uncore code may not always be available (e.g. when we build the
> > > > > display code with Xe), so we can't always rely on having the uncore's
> > > > > spinlock.
> > > > > 
> > > > > To handle this, split the spin_lock/unlock_irqsave/restore() into
> > > > > spin_lock/unlock() followed by a call to local_irq_save/restore() and
> > > > > create wrapper functions for locking and unlocking the uncore's
> > > > > spinlock.  In these functions, we have a condition check and only
> > > > > actually try to lock/unlock the spinlock when I915 is defined, and
> > > > > thus uncore is available.
> > > > > 
> > > > > This keeps the ifdefs contained in these new functions and all such
> > > > > logic inside the display code.
> > > > > 
> > > > > Cc: Tvrtko Ursulin <tvrto.ursulin@intel.com>
> > > > > Cc: Jani Nikula <jani.nikula@intel.com>
> > > > > Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
> > > > > Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > > Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
> > > > > ---
> > > > > 
> > > > > 
> > > > > In v2:
> > > > > 
> > > > >      * Renamed uncore_spin_*() to intel_spin_*()
> > > > >      * Corrected the order: save, lock, unlock, restore
> > > > > 
> > > > > In v3:
> > > > > 
> > > > >      * Undid the change to pass drm_i915_private instead of the lock
> > > > >        itself, since we would have to include i915_drv.h and that pulls
> > > > >        in a truckload of other includes.
> > > > > 
> > > > > In v4:
> > > > > 
> > > > >      * After a brief attempt to replace this with a different patch,
> > > > >        we're back to this one;
> > > > >      * Pass drm_i915_private again, and move the functions to
> > > > >        intel_vblank.c, so we don't need to include i915_drv.h in a
> > > > >        header file and it's already included in intel_vblank.c;
> > > > > 
> > > > > In v5:
> > > > > 
> > > > >      * Remove stray include in intel_display.h;
> > > > >      * Remove unnecessary inline modifiers in the new functions.
> > > > > 
> > > > > In v6:
> > > > > 
> > > > >      * Just removed the umlauts from Ville's name, because patchwork
> > > > >        didn't catch my patch and I suspect it was some UTF-8 confusion.
> > > > > 
> > > > >    drivers/gpu/drm/i915/display/intel_vblank.c | 49 ++++++++++++++++-----
> > > > >    1 file changed, 39 insertions(+), 10 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/display/intel_vblank.c b/drivers/gpu/drm/i915/display/intel_vblank.c
> > > > > index 2cec2abf9746..221fcd6bf77b 100644
> > > > > --- a/drivers/gpu/drm/i915/display/intel_vblank.c
> > > > > +++ b/drivers/gpu/drm/i915/display/intel_vblank.c
> > > > > @@ -265,6 +265,30 @@ int intel_crtc_scanline_to_hw(struct intel_crtc *crtc, int scanline)
> > > > >    	return (scanline + vtotal - crtc->scanline_offset) % vtotal;
> > > > >    }
> > > > >    
> > > > > +/*
> > > > > + * The uncore version of the spin lock functions is used to decide
> > > > > + * whether we need to lock the uncore lock or not.  This is only
> > > > > + * needed in i915, not in Xe.
> > > > > + *
> > > > > + * This lock in i915 is needed because some old platforms (at least
> > > > > + * IVB and possibly HSW as well), which are not supported in Xe, need
> > > > > + * all register accesses to the same cacheline to be serialized,
> > > > > + * otherwise they may hang.
> > > > > + */
> > > > > +static void intel_vblank_section_enter(struct drm_i915_private *i915)
> > > > > +{
> > > > > +#ifdef I915
> > > > > +	spin_lock(&i915->uncore.lock);
> > > > > +#endif
> > > > > +}
> > > > > +
> > > > > +static void intel_vblank_section_exit(struct drm_i915_private *i915)
> > > > > +{
> > > > > +#ifdef I915
> > > > > +	spin_unlock(&i915->uncore.lock);
> > > > > +#endif
> > > > > +}
> > > > > +
> > > > >    static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
> > > > >    				     bool in_vblank_irq,
> > > > >    				     int *vpos, int *hpos,
> > > > > @@ -302,11 +326,12 @@ static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
> > > > >    	}
> > > > >    
> > > > >    	/*
> > > > > -	 * Lock uncore.lock, as we will do multiple timing critical raw
> > > > > -	 * register reads, potentially with preemption disabled, so the
> > > > > -	 * following code must not block on uncore.lock.
> > > > > +	 * Enter vblank critical section, as we will do multiple
> > > > > +	 * timing critical raw register reads, potentially with
> > > > > +	 * preemption disabled, so the following code must not block.
> > > > >    	 */
> > > > > -	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
> > > > > +	local_irq_save(irqflags);
> > > > > +	intel_vblank_section_enter(dev_priv);
> > > > 
> > > > Shouldn't local_irq_save go into intel_vblank_section_enter()? It seems
> > > > all callers from both i915 and xe end up doing that anyway and naming
> > > > "vblank_start" was presumed there would be more to the section than
> > > > cacheline mmio bug. I mean that there is some benefit from keeping the
> > > > readout timings tight.
> > > > 
> > > 
> > > The reason is that there is one caller that has already disabled
> > > interrupts when this function is called (see below), so we shouldn't do
> > > it again.
> > 
> > Yeah I saw that but with irqsave/restore it is safe to nest. So for me 
> > it is more a fundamental question which I raise above.
> 
> Sure, it should be safe to nest, but it seemed a bit ugly to me.
> 
> I can change it, if you prefer, as your point seems valid, but I will
> wait to see what Rodrigo says, since he had already given his r-b, lest
> we start ping-ponging on this too much.

I believe we should go with this patch as is, because this brings absolutely
no code change. Even though we believe the irqsave is a safe thing on that
side it would be a change in behavior.

So, probably a follow-up patch to also convert the other case and moving
everything inside the new vblank_start/end functions?

> 
> --
> Cheers,
> Luca.
Luca Coelho Nov. 30, 2023, 3:44 p.m. UTC | #6
On Thu, 2023-11-30 at 09:31 -0500, Rodrigo Vivi wrote:
> On Thu, Nov 30, 2023 at 01:54:13PM +0000, Coelho, Luciano wrote:
> > On Thu, 2023-11-30 at 13:24 +0000, Tvrtko Ursulin wrote:
> > > On 30/11/2023 12:26, Coelho, Luciano wrote:
> > > > On Thu, 2023-11-30 at 12:21 +0000, Tvrtko Ursulin wrote:
> > > > > On 30/11/2023 11:35, Luca Coelho wrote:
> > > > > > The uncore code may not always be available (e.g. when we build the
> > > > > > display code with Xe), so we can't always rely on having the uncore's
> > > > > > spinlock.
> > > > > > 
> > > > > > To handle this, split the spin_lock/unlock_irqsave/restore() into
> > > > > > spin_lock/unlock() followed by a call to local_irq_save/restore() and
> > > > > > create wrapper functions for locking and unlocking the uncore's
> > > > > > spinlock.  In these functions, we have a condition check and only
> > > > > > actually try to lock/unlock the spinlock when I915 is defined, and
> > > > > > thus uncore is available.
> > > > > > 
> > > > > > This keeps the ifdefs contained in these new functions and all such
> > > > > > logic inside the display code.
> > > > > > 
> > > > > > Cc: Tvrtko Ursulin <tvrto.ursulin@intel.com>
> > > > > > Cc: Jani Nikula <jani.nikula@intel.com>
> > > > > > Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
> > > > > > Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > > > Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
> > > > > > ---
> > > > > > 
> > > > > > 
> > > > > > In v2:
> > > > > > 
> > > > > >      * Renamed uncore_spin_*() to intel_spin_*()
> > > > > >      * Corrected the order: save, lock, unlock, restore
> > > > > > 
> > > > > > In v3:
> > > > > > 
> > > > > >      * Undid the change to pass drm_i915_private instead of the lock
> > > > > >        itself, since we would have to include i915_drv.h and that pulls
> > > > > >        in a truckload of other includes.
> > > > > > 
> > > > > > In v4:
> > > > > > 
> > > > > >      * After a brief attempt to replace this with a different patch,
> > > > > >        we're back to this one;
> > > > > > >      * Pass drm_i915_private again, and move the functions to
> > > > > >        intel_vblank.c, so we don't need to include i915_drv.h in a
> > > > > >        header file and it's already included in intel_vblank.c;
> > > > > > 
> > > > > > In v5:
> > > > > > 
> > > > > >      * Remove stray include in intel_display.h;
> > > > > >      * Remove unnecessary inline modifiers in the new functions.
> > > > > > 
> > > > > > In v6:
> > > > > > 
> > > > > >      * Just removed the umlauts from Ville's name, because patchwork
> > > > > >        didn't catch my patch and I suspect it was some UTF-8 confusion.
> > > > > > 
> > > > > >    drivers/gpu/drm/i915/display/intel_vblank.c | 49 ++++++++++++++++-----
> > > > > >    1 file changed, 39 insertions(+), 10 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/i915/display/intel_vblank.c b/drivers/gpu/drm/i915/display/intel_vblank.c
> > > > > > index 2cec2abf9746..221fcd6bf77b 100644
> > > > > > --- a/drivers/gpu/drm/i915/display/intel_vblank.c
> > > > > > +++ b/drivers/gpu/drm/i915/display/intel_vblank.c
> > > > > > @@ -265,6 +265,30 @@ int intel_crtc_scanline_to_hw(struct intel_crtc *crtc, int scanline)
> > > > > >    	return (scanline + vtotal - crtc->scanline_offset) % vtotal;
> > > > > >    }
> > > > > >    
> > > > > > +/*
> > > > > > + * The uncore version of the spin lock functions is used to decide
> > > > > > + * whether we need to lock the uncore lock or not.  This is only
> > > > > > + * needed in i915, not in Xe.
> > > > > > + *
> > > > > > + * This lock in i915 is needed because some old platforms (at least
> > > > > > + * IVB and possibly HSW as well), which are not supported in Xe, need
> > > > > > + * all register accesses to the same cacheline to be serialized,
> > > > > > + * otherwise they may hang.
> > > > > > + */
> > > > > > +static void intel_vblank_section_enter(struct drm_i915_private *i915)
> > > > > > +{
> > > > > > +#ifdef I915
> > > > > > +	spin_lock(&i915->uncore.lock);
> > > > > > +#endif
> > > > > > +}
> > > > > > +
> > > > > > +static void intel_vblank_section_exit(struct drm_i915_private *i915)
> > > > > > +{
> > > > > > +#ifdef I915
> > > > > > +	spin_unlock(&i915->uncore.lock);
> > > > > > +#endif
> > > > > > +}
> > > > > > +
> > > > > >    static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
> > > > > >    				     bool in_vblank_irq,
> > > > > >    				     int *vpos, int *hpos,
> > > > > > @@ -302,11 +326,12 @@ static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
> > > > > >    	}
> > > > > >    
> > > > > >    	/*
> > > > > > -	 * Lock uncore.lock, as we will do multiple timing critical raw
> > > > > > -	 * register reads, potentially with preemption disabled, so the
> > > > > > -	 * following code must not block on uncore.lock.
> > > > > > +	 * Enter vblank critical section, as we will do multiple
> > > > > > +	 * timing critical raw register reads, potentially with
> > > > > > +	 * preemption disabled, so the following code must not block.
> > > > > >    	 */
> > > > > > -	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
> > > > > > +	local_irq_save(irqflags);
> > > > > > +	intel_vblank_section_enter(dev_priv);
> > > > > 
> > > > > Shouldn't local_irq_save go into intel_vblank_section_enter()? It seems
> > > > > all callers from both i915 and xe end up doing that anyway and naming
> > > > > "vblank_start" was presumed there would be more to the section than
> > > > > cacheline mmio bug. I mean that there is some benefit from keeping the
> > > > > readout timings tight.
> > > > > 
> > > > 
> > > > The reason is that there is one caller that has already disabled
> > > > interrupts when this function is called (see below), so we shouldn't do
> > > > it again.
> > > 
> > > Yeah I saw that but with irqsave/restore it is safe to nest. So for me 
> > > it is more a fundamental question which I raise above.
> > 
> > Sure, it should be safe to nest, but it seemed a bit ugly to me.
> > 
> > I can change it, if you prefer, as your point seems valid, but I will
> > wait to see what Rodrigo says, since he had already given his r-b, lest
> > we start ping-ponging on this too much.
> 
> I believe we should go with this patch as is, because this brings absolutely
> no code change. Even though we believe the irqsave is a safe thing on that
> side it would be a change in behavior.
> 
> So, probably a follow-up patch to also convert the other case and moving
> everything inside the new vblank_start/end functions?

Okay, cool.  So, if someone can merge this patch once it passes CI,
I'll send a follow up patch doing as Tvrtko suggested.
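
For illustration only, here is a small userspace model of the nesting
question discussed above. It is not kernel code and not the actual
follow-up patch: every model_* name is hypothetical, a plain bool stands
in for the CPU interrupt flag, and a counter stands in for uncore.lock.
It only sketches why folding the irq save into the enter helper should
be safe for a caller that has already disabled interrupts.

```c
/* Userspace model only; all model_* names are hypothetical stand-ins. */
#include <assert.h>
#include <stdbool.h>

static bool model_irqs_enabled = true;	/* stands in for the CPU irq flag */
static int model_lock_depth;		/* stands in for uncore.lock */

static void model_local_irq_save(unsigned long *flags)
{
	*flags = model_irqs_enabled;	/* remember the previous state */
	model_irqs_enabled = false;	/* then disable */
}

static void model_local_irq_restore(unsigned long flags)
{
	model_irqs_enabled = flags != 0;	/* put back whatever was saved */
}

/* Assumed follow-up shape: the irq save moves inside the enter helper. */
static void model_section_enter(unsigned long *flags)
{
	model_local_irq_save(flags);
#ifdef I915
	model_lock_depth++;	/* spin_lock(&i915->uncore.lock) in the driver */
#endif
}

static void model_section_exit(unsigned long flags)
{
	assert(!model_irqs_enabled);	/* the section runs with irqs off */
#ifdef I915
	model_lock_depth--;	/* spin_unlock(&i915->uncore.lock) in the driver */
#endif
	model_local_irq_restore(flags);
}

/* Returns 0 when the nested save/restore preserved the outer state. */
static int model_nested_demo(void)
{
	unsigned long outer, inner;

	model_irqs_enabled = true;
	model_local_irq_save(&outer);	/* like the vblank_time_lock irqsave */
	model_section_enter(&inner);	/* nested save records "already off" */
	if (model_irqs_enabled)
		return 1;
	model_section_exit(inner);	/* restores "off", not "on" */
	if (model_irqs_enabled)
		return 2;		/* the outer section must still hold */
	model_local_irq_restore(outer);	/* only now are irqs back on */
	if (model_lock_depth != 0)
		return 3;
	return model_irqs_enabled ? 0 : 4;
}
```

Calling model_nested_demo() walks the same order as the already-disabled
caller in the patch (outer irqsave, enter, exit, outer irqrestore). The
inner restore leaves interrupts off because the inner save recorded them
as off.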

--
Cheers,
Luca.

Patch

diff --git a/drivers/gpu/drm/i915/display/intel_vblank.c b/drivers/gpu/drm/i915/display/intel_vblank.c
index 2cec2abf9746..221fcd6bf77b 100644
--- a/drivers/gpu/drm/i915/display/intel_vblank.c
+++ b/drivers/gpu/drm/i915/display/intel_vblank.c
@@ -265,6 +265,30 @@  int intel_crtc_scanline_to_hw(struct intel_crtc *crtc, int scanline)
 	return (scanline + vtotal - crtc->scanline_offset) % vtotal;
 }
 
+/*
+ * The uncore version of the spin lock functions is used to decide
+ * whether we need to lock the uncore lock or not.  This is only
+ * needed in i915, not in Xe.
+ *
+ * This lock in i915 is needed because some old platforms (at least
+ * IVB and possibly HSW as well), which are not supported in Xe, need
+ * all register accesses to the same cacheline to be serialized,
+ * otherwise they may hang.
+ */
+static void intel_vblank_section_enter(struct drm_i915_private *i915)
+{
+#ifdef I915
+	spin_lock(&i915->uncore.lock);
+#endif
+}
+
+static void intel_vblank_section_exit(struct drm_i915_private *i915)
+{
+#ifdef I915
+	spin_unlock(&i915->uncore.lock);
+#endif
+}
+
 static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
 				     bool in_vblank_irq,
 				     int *vpos, int *hpos,
@@ -302,11 +326,12 @@  static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
 	}
 
 	/*
-	 * Lock uncore.lock, as we will do multiple timing critical raw
-	 * register reads, potentially with preemption disabled, so the
-	 * following code must not block on uncore.lock.
+	 * Enter vblank critical section, as we will do multiple
+	 * timing critical raw register reads, potentially with
+	 * preemption disabled, so the following code must not block.
 	 */
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	local_irq_save(irqflags);
+	intel_vblank_section_enter(dev_priv);
 
 	/* preempt_disable_rt() should go right here in PREEMPT_RT patchset. */
 
@@ -374,7 +399,8 @@  static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
 
 	/* preempt_enable_rt() should go right here in PREEMPT_RT patchset. */
 
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	intel_vblank_section_exit(dev_priv);
+	local_irq_restore(irqflags);
 
 	/*
 	 * While in vblank, position will be negative
@@ -412,9 +438,13 @@  int intel_get_crtc_scanline(struct intel_crtc *crtc)
 	unsigned long irqflags;
 	int position;
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	local_irq_save(irqflags);
+	intel_vblank_section_enter(dev_priv);
+
 	position = __intel_get_crtc_scanline(crtc);
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+
+	intel_vblank_section_exit(dev_priv);
+	local_irq_restore(irqflags);
 
 	return position;
 }
@@ -537,7 +567,7 @@  void intel_crtc_update_active_timings(const struct intel_crtc_state *crtc_state,
 	 * Need to audit everything to make sure it's safe.
 	 */
 	spin_lock_irqsave(&i915->drm.vblank_time_lock, irqflags);
-	spin_lock(&i915->uncore.lock);
+	intel_vblank_section_enter(i915);
 
 	drm_calc_timestamping_constants(&crtc->base, &adjusted_mode);
 
@@ -546,7 +576,6 @@  void intel_crtc_update_active_timings(const struct intel_crtc_state *crtc_state,
 	crtc->mode_flags = mode_flags;
 
 	crtc->scanline_offset = intel_crtc_scanline_offset(crtc_state);
-
-	spin_unlock(&i915->uncore.lock);
+	intel_vblank_section_exit(i915);
 	spin_unlock_irqrestore(&i915->drm.vblank_time_lock, irqflags);
 }