diff mbox series

[v3] drm/i915/gt: Use spin_lock_irqsave() in interruptible context

Message ID pusppq5ybyszau2oocboj3mtj5x574gwij323jlclm5zxvimmu@mnfg6odxbpsv (mailing list archive)
State New, archived
Headers show
Series [v3] drm/i915/gt: Use spin_lock_irqsave() in interruptible context | expand

Commit Message

Krzysztof Karas Jan. 16, 2025, 10:40 a.m. UTC
spin_lock/unlock() functions used in interrupt contexts could
result in a deadlock, as seen in GitLab issue #13399:
https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13399,
which occurs when interrupt comes in while holding a lock.

Try to remedy the problem by saving irq state before spin lock
acquisition.

v2: add irqs' state save/restore calls to all locks/unlocks in
 signal_irq_work() execution (Maciej)

v3: use with spin_lock_irqsave() in guc_lrc_desc_unpin() instead
 of other lock/unlock calls and add Fixes and Cc tags (Tvrtko);
 change title and commit message

Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Fixes: 2f2cc53b5fe7 ("drm/i915/guc: Close deregister-context race against CT-loss")
Cc: <stable@vger.kernel.org> # v6.9+
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Maciej Patelczyk Jan. 16, 2025, 11:22 a.m. UTC | #1
On 16.01.2025 11:40, Krzysztof Karas wrote:

> spin_lock/unlock() functions used in interrupt contexts could
> result in a deadlock, as seen in GitLab issue #13399:
> https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13399,
> which occurs when interrupt comes in while holding a lock.
>
> Try to remedy the problem by saving irq state before spin lock
> acquisition.
>
> v2: add irqs' state save/restore calls to all locks/unlocks in
>   signal_irq_work() execution (Maciej)
>
> v3: use with spin_lock_irqsave() in guc_lrc_desc_unpin() instead
>   of other lock/unlock calls and add Fixes and Cc tags (Tvrtko);
>   change title and commit message
>
> Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
> Fixes: 2f2cc53b5fe7 ("drm/i915/guc: Close deregister-context race against CT-loss")
> Cc: <stable@vger.kernel.org> # v6.9+
> ---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 12f1ba7ca9c1..29d9c81473cc 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -3433,10 +3433,10 @@ static inline int guc_lrc_desc_unpin(struct intel_context *ce)
>   	 */
>   	ret = deregister_context(ce, ce->guc_id.id);
>   	if (ret) {
> -		spin_lock(&ce->guc_state.lock);
> +		spin_lock_irqsave(&ce->guc_state.lock, flags);
>   		set_context_registered(ce);
>   		clr_context_destroyed(ce);
> -		spin_unlock(&ce->guc_state.lock);
> +		spin_unlock_irqrestore(&ce->guc_state.lock, flags);
>   		/*
>   		 * As gt-pm is awake at function entry, intel_wakeref_put_async merely decrements
>   		 * the wakeref immediately but per function spec usage call this after unlock.

Looked at this and wasn't seeing it. There is spin_lock_irqsave() used 
earlier.

Reviewed-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Andi Shyti Feb. 13, 2025, 6:56 p.m. UTC | #2
Hi Krzysztof,

On Thu, Jan 16, 2025 at 10:40:46AM +0000, Krzysztof Karas wrote:
> spin_lock/unlock() functions used in interrupt contexts could
> result in a deadlock, as seen in GitLab issue #13399:
> https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13399,

I moved this link in the tag sectoin as Closes:

> which occurs when interrupt comes in while holding a lock.
> 
> Try to remedy the problem by saving irq state before spin lock
> acquisition.
> 
> v2: add irqs' state save/restore calls to all locks/unlocks in
>  signal_irq_work() execution (Maciej)
> 
> v3: use with spin_lock_irqsave() in guc_lrc_desc_unpin() instead
>  of other lock/unlock calls and add Fixes and Cc tags (Tvrtko);
>  change title and commit message
> 
> Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
> Fixes: 2f2cc53b5fe7 ("drm/i915/guc: Close deregister-context race against CT-loss")

I moved Fixes: above your SoB.

> Cc: <stable@vger.kernel.org> # v6.9+

Anyway, good catch, but please, remember next time to relly add
in Cc the stable kernel mailing list, the guc guys in Cc for GuC
related patches and the author of the patch you are fixing.

Reviewed and merged in drm-intel-gt-next.

Thanks,
Andi
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 12f1ba7ca9c1..29d9c81473cc 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -3433,10 +3433,10 @@  static inline int guc_lrc_desc_unpin(struct intel_context *ce)
 	 */
 	ret = deregister_context(ce, ce->guc_id.id);
 	if (ret) {
-		spin_lock(&ce->guc_state.lock);
+		spin_lock_irqsave(&ce->guc_state.lock, flags);
 		set_context_registered(ce);
 		clr_context_destroyed(ce);
-		spin_unlock(&ce->guc_state.lock);
+		spin_unlock_irqrestore(&ce->guc_state.lock, flags);
 		/*
 		 * As gt-pm is awake at function entry, intel_wakeref_put_async merely decrements
 		 * the wakeref immediately but per function spec usage call this after unlock.