diff mbox series

drm/i915/gt: Disarm breadcrumbs if engines are already idle

Message ID 20240423165505.465734-2-janusz.krzysztofik@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series drm/i915/gt: Disarm breadcrumbs if engines are already idle | expand

Commit Message

Janusz Krzysztofik April 23, 2024, 4:23 p.m. UTC
From: Chris Wilson <chris@chris-wilson.co.uk>

The breadcrumbs use a GT wakeref for guarding the interrupt, but are
disarmed during release of the engine wakeref. This leaves a hole where
we may attach a breadcrumb just as the engine is parking (after it has
parked its breadcrumbs), execute the irq worker with some signalers still
attached, but never be woken again.

That issue manifests itself in CI with IGT runner timeouts while tests
are waiting indefinitely for release of all GT wakerefs.

<6> [209.151778] i915: Running live_engine_pm_selftests/live_engine_busy_stats
<7> [209.231628] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_5
<7> [209.231816] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_4
<7> [209.231944] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_3
<7> [209.232056] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_2
<7> [209.232166] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling DC_off
<7> [209.232270] i915 0000:00:02.0: [drm:skl_enable_dc6 [i915]] Enabling DC6
<7> [209.232368] i915 0000:00:02.0: [drm:gen9_set_dc_state.part.0 [i915]] Setting DC state from 00 to 02
<4> [299.356116] [IGT] Inactivity timeout exceeded. Killing the current test with SIGQUIT.
...
<6> [299.356526] sysrq: Show State
...
<6> [299.373964] task:i915_selftest   state:D stack:11784 pid:5578  tgid:5578  ppid:873    flags:0x00004002
<6> [299.373967] Call Trace:
<6> [299.373968]  <TASK>
<6> [299.373970]  __schedule+0x3bb/0xda0
<6> [299.373974]  schedule+0x41/0x110
<6> [299.373976]  intel_wakeref_wait_for_idle+0x82/0x100 [i915]
<6> [299.374083]  ? __pfx_var_wake_function+0x10/0x10
<6> [299.374087]  live_engine_busy_stats+0x9b/0x500 [i915]
<6> [299.374173]  __i915_subtests+0xbe/0x240 [i915]
<6> [299.374277]  ? __pfx___intel_gt_live_setup+0x10/0x10 [i915]
<6> [299.374369]  ? __pfx___intel_gt_live_teardown+0x10/0x10 [i915]
<6> [299.374456]  intel_engine_live_selftests+0x1c/0x30 [i915]
<6> [299.374547]  __run_selftests+0xbb/0x190 [i915]
<6> [299.374635]  i915_live_selftests+0x4b/0x90 [i915]
<6> [299.374717]  i915_pci_probe+0x10d/0x210 [i915]

At the end of the interrupt worker, if there are no more engines awake,
disarm the breadcrumb and go to sleep.

Fixes: 9d5612ca165a ("drm/i915/gt: Defer enabling the breadcrumb interrupt to after submission")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/10026
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: <stable@vger.kernel.org> # v5.12+
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

Comments

Nirmoy Das April 26, 2024, 4:13 p.m. UTC | #1
On 4/23/2024 6:23 PM, Janusz Krzysztofik wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
>
> The breadcrumbs use a GT wakeref for guarding the interrupt, but are
> disarmed during release of the engine wakeref. This leaves a hole where
> we may attach a breadcrumb just as the engine is parking (after it has
> parked its breadcrumbs), execute the irq worker with some signalers still
> attached, but never be woken again.
>
> That issue manifests itself in CI with IGT runner timeouts while tests
> are waiting indefinitely for release of all GT wakerefs.
>
> <6> [209.151778] i915: Running live_engine_pm_selftests/live_engine_busy_stats
> <7> [209.231628] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_5
> <7> [209.231816] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_4
> <7> [209.231944] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_3
> <7> [209.232056] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_2
> <7> [209.232166] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling DC_off
> <7> [209.232270] i915 0000:00:02.0: [drm:skl_enable_dc6 [i915]] Enabling DC6
> <7> [209.232368] i915 0000:00:02.0: [drm:gen9_set_dc_state.part.0 [i915]] Setting DC state from 00 to 02
> <4> [299.356116] [IGT] Inactivity timeout exceeded. Killing the current test with SIGQUIT.
> ...
> <6> [299.356526] sysrq: Show State
> ...
> <6> [299.373964] task:i915_selftest   state:D stack:11784 pid:5578  tgid:5578  ppid:873    flags:0x00004002
> <6> [299.373967] Call Trace:
> <6> [299.373968]  <TASK>
> <6> [299.373970]  __schedule+0x3bb/0xda0
> <6> [299.373974]  schedule+0x41/0x110
> <6> [299.373976]  intel_wakeref_wait_for_idle+0x82/0x100 [i915]
> <6> [299.374083]  ? __pfx_var_wake_function+0x10/0x10
> <6> [299.374087]  live_engine_busy_stats+0x9b/0x500 [i915]
> <6> [299.374173]  __i915_subtests+0xbe/0x240 [i915]
> <6> [299.374277]  ? __pfx___intel_gt_live_setup+0x10/0x10 [i915]
> <6> [299.374369]  ? __pfx___intel_gt_live_teardown+0x10/0x10 [i915]
> <6> [299.374456]  intel_engine_live_selftests+0x1c/0x30 [i915]
> <6> [299.374547]  __run_selftests+0xbb/0x190 [i915]
> <6> [299.374635]  i915_live_selftests+0x4b/0x90 [i915]
> <6> [299.374717]  i915_pci_probe+0x10d/0x210 [i915]
>
> At the end of the interrupt worker, if there are no more engines awake,
> disarm the breadcrumb and go to sleep.
>
> Fixes: 9d5612ca165a ("drm/i915/gt: Defer enabling the breadcrumb interrupt to after submission")
> Closes: https://gitlab.freedesktop.org/drm/intel/issues/10026
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Andrzej Hajda <andrzej.hajda@intel.com>
> Cc: <stable@vger.kernel.org> # v5.12+
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>


Acked-by: Nirmoy Das <nirmoy.das@intel.com>

I will let others/Andrzej r-b this as I am not very familiar with the code.


Thanks,

Nirmoy

> ---
>   drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 15 +++++++--------
>   1 file changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> index d650beb8ed22f..20b9b04ec1e0b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> @@ -263,8 +263,13 @@ static void signal_irq_work(struct irq_work *work)
>   		i915_request_put(rq);
>   	}
>   
> +	/* Lazy irq enabling after HW submission */
>   	if (!READ_ONCE(b->irq_armed) && !list_empty(&b->signalers))
>   		intel_breadcrumbs_arm_irq(b);
> +
> +	/* And confirm that we still want irqs enabled before we yield */
> +	if (READ_ONCE(b->irq_armed) && !atomic_read(&b->active))
> +		intel_breadcrumbs_disarm_irq(b);
>   }
>   
>   struct intel_breadcrumbs *
> @@ -315,13 +320,7 @@ void __intel_breadcrumbs_park(struct intel_breadcrumbs *b)
>   		return;
>   
>   	/* Kick the work once more to drain the signalers, and disarm the irq */
> -	irq_work_sync(&b->irq_work);
> -	while (READ_ONCE(b->irq_armed) && !atomic_read(&b->active)) {
> -		local_irq_disable();
> -		signal_irq_work(&b->irq_work);
> -		local_irq_enable();
> -		cond_resched();
> -	}
> +	irq_work_queue(&b->irq_work);
>   }
>   
>   void intel_breadcrumbs_free(struct kref *kref)
> @@ -404,7 +403,7 @@ static void insert_breadcrumb(struct i915_request *rq)
>   	 * the request as it may have completed and raised the interrupt as
>   	 * we were attaching it into the lists.
>   	 */
> -	if (!b->irq_armed || __i915_request_is_complete(rq))
> +	if (!READ_ONCE(b->irq_armed) || __i915_request_is_complete(rq))
>   		irq_work_queue(&b->irq_work);
>   }
>
Janusz Krzysztofik April 29, 2024, 9:20 a.m. UTC | #2
Hi Andrzej,

On Friday, 26 April 2024 18:13:02 CEST Nirmoy Das wrote:
> 
> On 4/23/2024 6:23 PM, Janusz Krzysztofik wrote:
> > From: Chris Wilson <chris@chris-wilson.co.uk>
> >
> > The breadcrumbs use a GT wakeref for guarding the interrupt, but are
> > disarmed during release of the engine wakeref. This leaves a hole where
> > we may attach a breadcrumb just as the engine is parking (after it has
> > parked its breadcrumbs), execute the irq worker with some signalers still
> > attached, but never be woken again.
> >
> > That issue manifests itself in CI with IGT runner timeouts while tests
> > are waiting indefinitely for release of all GT wakerefs.
> >
> > <6> [209.151778] i915: Running live_engine_pm_selftests/live_engine_busy_stats
> > <7> [209.231628] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_5
> > <7> [209.231816] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_4
> > <7> [209.231944] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_3
> > <7> [209.232056] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_2
> > <7> [209.232166] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling DC_off
> > <7> [209.232270] i915 0000:00:02.0: [drm:skl_enable_dc6 [i915]] Enabling DC6
> > <7> [209.232368] i915 0000:00:02.0: [drm:gen9_set_dc_state.part.0 [i915]] Setting DC state from 00 to 02
> > <4> [299.356116] [IGT] Inactivity timeout exceeded. Killing the current test with SIGQUIT.
> > ...
> > <6> [299.356526] sysrq: Show State
> > ...
> > <6> [299.373964] task:i915_selftest   state:D stack:11784 pid:5578  tgid:5578  ppid:873    flags:0x00004002
> > <6> [299.373967] Call Trace:
> > <6> [299.373968]  <TASK>
> > <6> [299.373970]  __schedule+0x3bb/0xda0
> > <6> [299.373974]  schedule+0x41/0x110
> > <6> [299.373976]  intel_wakeref_wait_for_idle+0x82/0x100 [i915]
> > <6> [299.374083]  ? __pfx_var_wake_function+0x10/0x10
> > <6> [299.374087]  live_engine_busy_stats+0x9b/0x500 [i915]
> > <6> [299.374173]  __i915_subtests+0xbe/0x240 [i915]
> > <6> [299.374277]  ? __pfx___intel_gt_live_setup+0x10/0x10 [i915]
> > <6> [299.374369]  ? __pfx___intel_gt_live_teardown+0x10/0x10 [i915]
> > <6> [299.374456]  intel_engine_live_selftests+0x1c/0x30 [i915]
> > <6> [299.374547]  __run_selftests+0xbb/0x190 [i915]
> > <6> [299.374635]  i915_live_selftests+0x4b/0x90 [i915]
> > <6> [299.374717]  i915_pci_probe+0x10d/0x210 [i915]
> >
> > At the end of the interrupt worker, if there are no more engines awake,
> > disarm the breadcrumb and go to sleep.
> >
> > Fixes: 9d5612ca165a ("drm/i915/gt: Defer enabling the breadcrumb interrupt to after submission")
> > Closes: https://gitlab.freedesktop.org/drm/intel/issues/10026
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Andrzej Hajda <andrzej.hajda@intel.com>
> > Cc: <stable@vger.kernel.org> # v5.12+
> > Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> 
> 
> Acked-by: Nirmoy Das <nirmoy.das@intel.com>
> 
> I will let others/Andrzej r-b this as I am not very familiar with the code.

This patch should be familiar to you, could you please take a look?

Thanks,
Janusz

> 
> 
> Thanks,
> 
> Nirmoy
> 
> > ---
> >   drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 15 +++++++--------
> >   1 file changed, 7 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> > index d650beb8ed22f..20b9b04ec1e0b 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> > @@ -263,8 +263,13 @@ static void signal_irq_work(struct irq_work *work)
> >   		i915_request_put(rq);
> >   	}
> >   
> > +	/* Lazy irq enabling after HW submission */
> >   	if (!READ_ONCE(b->irq_armed) && !list_empty(&b->signalers))
> >   		intel_breadcrumbs_arm_irq(b);
> > +
> > +	/* And confirm that we still want irqs enabled before we yield */
> > +	if (READ_ONCE(b->irq_armed) && !atomic_read(&b->active))
> > +		intel_breadcrumbs_disarm_irq(b);
> >   }
> >   
> >   struct intel_breadcrumbs *
> > @@ -315,13 +320,7 @@ void __intel_breadcrumbs_park(struct intel_breadcrumbs *b)
> >   		return;
> >   
> >   	/* Kick the work once more to drain the signalers, and disarm the irq */
> > -	irq_work_sync(&b->irq_work);
> > -	while (READ_ONCE(b->irq_armed) && !atomic_read(&b->active)) {
> > -		local_irq_disable();
> > -		signal_irq_work(&b->irq_work);
> > -		local_irq_enable();
> > -		cond_resched();
> > -	}
> > +	irq_work_queue(&b->irq_work);
> >   }
> >   
> >   void intel_breadcrumbs_free(struct kref *kref)
> > @@ -404,7 +403,7 @@ static void insert_breadcrumb(struct i915_request *rq)
> >   	 * the request as it may have completed and raised the interrupt as
> >   	 * we were attaching it into the lists.
> >   	 */
> > -	if (!b->irq_armed || __i915_request_is_complete(rq))
> > +	if (!READ_ONCE(b->irq_armed) || __i915_request_is_complete(rq))
> >   		irq_work_queue(&b->irq_work);
> >   }
> >   
>
Andrzej Hajda May 9, 2024, 12:41 p.m. UTC | #3
On 23.04.2024 18:23, Janusz Krzysztofik wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> The breadcrumbs use a GT wakeref for guarding the interrupt, but are
> disarmed during release of the engine wakeref. This leaves a hole where
> we may attach a breadcrumb just as the engine is parking (after it has
> parked its breadcrumbs), execute the irq worker with some signalers still
> attached, but never be woken again.
> 
> That issue manifests itself in CI with IGT runner timeouts while tests
> are waiting indefinitely for release of all GT wakerefs.
> 
> <6> [209.151778] i915: Running live_engine_pm_selftests/live_engine_busy_stats
> <7> [209.231628] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_5
> <7> [209.231816] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_4
> <7> [209.231944] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_3
> <7> [209.232056] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_2
> <7> [209.232166] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling DC_off
> <7> [209.232270] i915 0000:00:02.0: [drm:skl_enable_dc6 [i915]] Enabling DC6
> <7> [209.232368] i915 0000:00:02.0: [drm:gen9_set_dc_state.part.0 [i915]] Setting DC state from 00 to 02
> <4> [299.356116] [IGT] Inactivity timeout exceeded. Killing the current test with SIGQUIT.
> ...
> <6> [299.356526] sysrq: Show State
> ...
> <6> [299.373964] task:i915_selftest   state:D stack:11784 pid:5578  tgid:5578  ppid:873    flags:0x00004002
> <6> [299.373967] Call Trace:
> <6> [299.373968]  <TASK>
> <6> [299.373970]  __schedule+0x3bb/0xda0
> <6> [299.373974]  schedule+0x41/0x110
> <6> [299.373976]  intel_wakeref_wait_for_idle+0x82/0x100 [i915]
> <6> [299.374083]  ? __pfx_var_wake_function+0x10/0x10
> <6> [299.374087]  live_engine_busy_stats+0x9b/0x500 [i915]
> <6> [299.374173]  __i915_subtests+0xbe/0x240 [i915]
> <6> [299.374277]  ? __pfx___intel_gt_live_setup+0x10/0x10 [i915]
> <6> [299.374369]  ? __pfx___intel_gt_live_teardown+0x10/0x10 [i915]
> <6> [299.374456]  intel_engine_live_selftests+0x1c/0x30 [i915]
> <6> [299.374547]  __run_selftests+0xbb/0x190 [i915]
> <6> [299.374635]  i915_live_selftests+0x4b/0x90 [i915]
> <6> [299.374717]  i915_pci_probe+0x10d/0x210 [i915]
> 
> At the end of the interrupt worker, if there are no more engines awake,
> disarm the breadcrumb and go to sleep.
> 
> Fixes: 9d5612ca165a ("drm/i915/gt: Defer enabling the breadcrumb interrupt to after submission")
> Closes: https://gitlab.freedesktop.org/drm/intel/issues/10026
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Andrzej Hajda <andrzej.hajda@intel.com>
> Cc: <stable@vger.kernel.org> # v5.12+
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>

Completely forgot this one.

Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>

Regards
Andrzej


> ---
>   drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 15 +++++++--------
>   1 file changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> index d650beb8ed22f..20b9b04ec1e0b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> @@ -263,8 +263,13 @@ static void signal_irq_work(struct irq_work *work)
>   		i915_request_put(rq);
>   	}
>   
> +	/* Lazy irq enabling after HW submission */
>   	if (!READ_ONCE(b->irq_armed) && !list_empty(&b->signalers))
>   		intel_breadcrumbs_arm_irq(b);
> +
> +	/* And confirm that we still want irqs enabled before we yield */
> +	if (READ_ONCE(b->irq_armed) && !atomic_read(&b->active))
> +		intel_breadcrumbs_disarm_irq(b);
>   }
>   
>   struct intel_breadcrumbs *
> @@ -315,13 +320,7 @@ void __intel_breadcrumbs_park(struct intel_breadcrumbs *b)
>   		return;
>   
>   	/* Kick the work once more to drain the signalers, and disarm the irq */
> -	irq_work_sync(&b->irq_work);
> -	while (READ_ONCE(b->irq_armed) && !atomic_read(&b->active)) {
> -		local_irq_disable();
> -		signal_irq_work(&b->irq_work);
> -		local_irq_enable();
> -		cond_resched();
> -	}
> +	irq_work_queue(&b->irq_work);
>   }
>   
>   void intel_breadcrumbs_free(struct kref *kref)
> @@ -404,7 +403,7 @@ static void insert_breadcrumb(struct i915_request *rq)
>   	 * the request as it may have completed and raised the interrupt as
>   	 * we were attaching it into the lists.
>   	 */
> -	if (!b->irq_armed || __i915_request_is_complete(rq))
> +	if (!READ_ONCE(b->irq_armed) || __i915_request_is_complete(rq))
>   		irq_work_queue(&b->irq_work);
>   }
>
Andi Shyti May 14, 2024, 12:46 p.m. UTC | #4
Hi Janusz,

On Tue, Apr 23, 2024 at 06:23:10PM +0200, Janusz Krzysztofik wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> The breadcrumbs use a GT wakeref for guarding the interrupt, but are
> disarmed during release of the engine wakeref. This leaves a hole where
> we may attach a breadcrumb just as the engine is parking (after it has
> parked its breadcrumbs), execute the irq worker with some signalers still
> attached, but never be woken again.
> 
> That issue manifests itself in CI with IGT runner timeouts while tests
> are waiting indefinitely for release of all GT wakerefs.
> 
> <6> [209.151778] i915: Running live_engine_pm_selftests/live_engine_busy_stats
> <7> [209.231628] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_5
> <7> [209.231816] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_4
> <7> [209.231944] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_3
> <7> [209.232056] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_2
> <7> [209.232166] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling DC_off
> <7> [209.232270] i915 0000:00:02.0: [drm:skl_enable_dc6 [i915]] Enabling DC6
> <7> [209.232368] i915 0000:00:02.0: [drm:gen9_set_dc_state.part.0 [i915]] Setting DC state from 00 to 02
> <4> [299.356116] [IGT] Inactivity timeout exceeded. Killing the current test with SIGQUIT.
> ...
> <6> [299.356526] sysrq: Show State
> ...
> <6> [299.373964] task:i915_selftest   state:D stack:11784 pid:5578  tgid:5578  ppid:873    flags:0x00004002
> <6> [299.373967] Call Trace:
> <6> [299.373968]  <TASK>
> <6> [299.373970]  __schedule+0x3bb/0xda0
> <6> [299.373974]  schedule+0x41/0x110
> <6> [299.373976]  intel_wakeref_wait_for_idle+0x82/0x100 [i915]
> <6> [299.374083]  ? __pfx_var_wake_function+0x10/0x10
> <6> [299.374087]  live_engine_busy_stats+0x9b/0x500 [i915]
> <6> [299.374173]  __i915_subtests+0xbe/0x240 [i915]
> <6> [299.374277]  ? __pfx___intel_gt_live_setup+0x10/0x10 [i915]
> <6> [299.374369]  ? __pfx___intel_gt_live_teardown+0x10/0x10 [i915]
> <6> [299.374456]  intel_engine_live_selftests+0x1c/0x30 [i915]
> <6> [299.374547]  __run_selftests+0xbb/0x190 [i915]
> <6> [299.374635]  i915_live_selftests+0x4b/0x90 [i915]
> <6> [299.374717]  i915_pci_probe+0x10d/0x210 [i915]
> 
> At the end of the interrupt worker, if there are no more engines awake,
> disarm the breadcrumb and go to sleep.
> 
> Fixes: 9d5612ca165a ("drm/i915/gt: Defer enabling the breadcrumb interrupt to after submission")
> Closes: https://gitlab.freedesktop.org/drm/intel/issues/10026
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Andrzej Hajda <andrzej.hajda@intel.com>
> Cc: <stable@vger.kernel.org> # v5.12+
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>

Reviewed and applied!

Thanks,
Andi
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index d650beb8ed22f..20b9b04ec1e0b 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -263,8 +263,13 @@  static void signal_irq_work(struct irq_work *work)
 		i915_request_put(rq);
 	}
 
+	/* Lazy irq enabling after HW submission */
 	if (!READ_ONCE(b->irq_armed) && !list_empty(&b->signalers))
 		intel_breadcrumbs_arm_irq(b);
+
+	/* And confirm that we still want irqs enabled before we yield */
+	if (READ_ONCE(b->irq_armed) && !atomic_read(&b->active))
+		intel_breadcrumbs_disarm_irq(b);
 }
 
 struct intel_breadcrumbs *
@@ -315,13 +320,7 @@  void __intel_breadcrumbs_park(struct intel_breadcrumbs *b)
 		return;
 
 	/* Kick the work once more to drain the signalers, and disarm the irq */
-	irq_work_sync(&b->irq_work);
-	while (READ_ONCE(b->irq_armed) && !atomic_read(&b->active)) {
-		local_irq_disable();
-		signal_irq_work(&b->irq_work);
-		local_irq_enable();
-		cond_resched();
-	}
+	irq_work_queue(&b->irq_work);
 }
 
 void intel_breadcrumbs_free(struct kref *kref)
@@ -404,7 +403,7 @@  static void insert_breadcrumb(struct i915_request *rq)
 	 * the request as it may have completed and raised the interrupt as
 	 * we were attaching it into the lists.
 	 */
-	if (!b->irq_armed || __i915_request_is_complete(rq))
+	if (!READ_ONCE(b->irq_armed) || __i915_request_is_complete(rq))
 		irq_work_queue(&b->irq_work);
 }