[v2,3/3] drm/panfrost: Synchronize and disable interrupts before powering off

Message ID	20231128124510.391007-4-angelogioacchino.delregno@collabora.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <dri-devel-bounces@lists.freedesktop.org> sender: kholk11) by madras.collabora.co.uk (Postfix) with ESMTPSA id 8EF8C66072F6; Tue, 28 Nov 2023 12:45:19 +0000 (GMT) From: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> To: boris.brezillon@collabora.com Subject: [PATCH v2 3/3] drm/panfrost: Synchronize and disable interrupts before powering off Date: Tue, 28 Nov 2023 13:45:10 +0100 Message-ID: <20231128124510.391007-4-angelogioacchino.delregno@collabora.com> In-Reply-To: <20231128124510.391007-1-angelogioacchino.delregno@collabora.com> References: <20231128124510.391007-1-angelogioacchino.delregno@collabora.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: list Cc: linux-kernel@vger.kernel.org, mripard@kernel.org, steven.price@arm.com, krzysztof.kozlowski@linaro.org, dri-devel@lists.freedesktop.org, tzimmermann@suse.de, m.szyprowski@samsung.com, kernel@collabora.com, angelogioacchino.delregno@collabora.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>
Series	drm/panfrost: Fix poweroff and sync IRQs for suspend \| expand [v2,0/3] drm/panfrost: Fix poweroff and sync IRQs for suspend [v2,1/3] drm/panfrost: Ignore core_mask for poweroff and disable PWRTRANS irq [v2,2/3] drm/panfrost: Add gpu_irq, mmu_irq to struct panfrost_device [v2,3/3] drm/panfrost: Synchronize and disable interrupts before powering off

AngeloGioacchino Del Regno Nov. 28, 2023, 12:45 p.m. UTC

To make sure that we don't unintentionally perform any unclocked and/or
unpowered R/W operation on GPU registers, before turning off clocks and
regulators we must make sure that no GPU, JOB or MMU ISR execution is
pending: doing that required to add a mechanism to synchronize the
interrupts on suspend.

Add functions panfrost_{gpu,job,mmu}_suspend_irq() which will perform
interrupts masking and ISR execution synchronization, and then call
those in the panfrost_device_runtime_suspend() handler in the exact
sequence of job (may require mmu!) -> mmu -> gpu.

As a side note, JOB and MMU suspend_irq functions needed some special
treatment: as their interrupt handlers will unmask interrupts, it was
necessary to add a bitmap for "is_suspending" which is used to address
the possible corner case of unintentional IRQ unmasking because of ISR
execution after a call to synchronize_irq().

Of course, unmasking the interrupts is being done as part of the reset
happening during runtime_resume(): since we're anyway resuming all of
GPU, JOB, MMU, the only additional action is to zero out the newly
introduced `is_suspending` bitmap directly in the resume handler, as
to avoid adding panfrost_{job,mmu}_resume_irq() function just for
clearing own bits, especially because it currently makes way more sense
to just zero out the bitmap.

Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
---
 drivers/gpu/drm/panfrost/panfrost_device.c |  4 ++++
 drivers/gpu/drm/panfrost/panfrost_device.h |  7 +++++++
 drivers/gpu/drm/panfrost/panfrost_gpu.c    |  7 +++++++
 drivers/gpu/drm/panfrost/panfrost_gpu.h    |  1 +
 drivers/gpu/drm/panfrost/panfrost_job.c    | 18 +++++++++++++++---
 drivers/gpu/drm/panfrost/panfrost_job.h    |  1 +
 drivers/gpu/drm/panfrost/panfrost_mmu.c    | 17 ++++++++++++++---
 drivers/gpu/drm/panfrost/panfrost_mmu.h    |  1 +
 8 files changed, 50 insertions(+), 6 deletions(-)

Boris Brezillon Nov. 28, 2023, 1:57 p.m. UTC | #1

On Tue, 28 Nov 2023 13:45:10 +0100
AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
wrote:

> To make sure that we don't unintentionally perform any unclocked and/or
> unpowered R/W operation on GPU registers, before turning off clocks and
> regulators we must make sure that no GPU, JOB or MMU ISR execution is
> pending: doing that required to add a mechanism to synchronize the
> interrupts on suspend.
> 
> Add functions panfrost_{gpu,job,mmu}_suspend_irq() which will perform
> interrupts masking and ISR execution synchronization, and then call
> those in the panfrost_device_runtime_suspend() handler in the exact
> sequence of job (may require mmu!) -> mmu -> gpu.
> 
> As a side note, JOB and MMU suspend_irq functions needed some special
> treatment: as their interrupt handlers will unmask interrupts, it was
> necessary to add a bitmap for "is_suspending" which is used to address
> the possible corner case of unintentional IRQ unmasking because of ISR
> execution after a call to synchronize_irq().
> 
> Of course, unmasking the interrupts is being done as part of the reset
> happening during runtime_resume(): since we're anyway resuming all of
> GPU, JOB, MMU, the only additional action is to zero out the newly
> introduced `is_suspending` bitmap directly in the resume handler, as
> to avoid adding panfrost_{job,mmu}_resume_irq() function just for
> clearing own bits, especially because it currently makes way more sense
> to just zero out the bitmap.
> 
> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> ---
>  drivers/gpu/drm/panfrost/panfrost_device.c |  4 ++++
>  drivers/gpu/drm/panfrost/panfrost_device.h |  7 +++++++
>  drivers/gpu/drm/panfrost/panfrost_gpu.c    |  7 +++++++
>  drivers/gpu/drm/panfrost/panfrost_gpu.h    |  1 +
>  drivers/gpu/drm/panfrost/panfrost_job.c    | 18 +++++++++++++++---
>  drivers/gpu/drm/panfrost/panfrost_job.h    |  1 +
>  drivers/gpu/drm/panfrost/panfrost_mmu.c    | 17 ++++++++++++++---
>  drivers/gpu/drm/panfrost/panfrost_mmu.h    |  1 +
>  8 files changed, 50 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c
> index c90ad5ee34e7..ed34aa55a7da 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.c
> @@ -407,6 +407,7 @@ static int panfrost_device_runtime_resume(struct device *dev)
>  {
>  	struct panfrost_device *pfdev = dev_get_drvdata(dev);
>  
> +	bitmap_zero(pfdev->is_suspending, PANFROST_COMP_BIT_MAX);
>  	panfrost_device_reset(pfdev);
>  	panfrost_devfreq_resume(pfdev);
>  
> @@ -421,6 +422,9 @@ static int panfrost_device_runtime_suspend(struct device *dev)
>  		return -EBUSY;
>  
>  	panfrost_devfreq_suspend(pfdev);
> +	panfrost_job_suspend_irq(pfdev);
> +	panfrost_mmu_suspend_irq(pfdev);
> +	panfrost_gpu_suspend_irq(pfdev);
>  	panfrost_gpu_power_off(pfdev);
>  
>  	return 0;
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
> index 54a8aad54259..29f89f2d3679 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
> @@ -25,6 +25,12 @@ struct panfrost_perfcnt;
>  #define NUM_JOB_SLOTS 3
>  #define MAX_PM_DOMAINS 5
>  
> +enum panfrost_drv_comp_bits {
> +	PANFROST_COMP_BIT_MMU,
> +	PANFROST_COMP_BIT_JOB,
> +	PANFROST_COMP_BIT_MAX
> +};
> +
>  /**
>   * enum panfrost_gpu_pm - Supported kernel power management features
>   * @GPU_PM_CLK_DIS:  Allow disabling clocks during system suspend
> @@ -109,6 +115,7 @@ struct panfrost_device {
>  
>  	struct panfrost_features features;
>  	const struct panfrost_compatible *comp;
> +	DECLARE_BITMAP(is_suspending, PANFROST_COMP_BIT_MAX);
>  
>  	spinlock_t as_lock;
>  	unsigned long as_in_use_mask;
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> index 7adc4441fa14..2bf645993ab4 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> @@ -452,6 +452,13 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>  		dev_err(pfdev->dev, "l2 power transition timeout");
>  }
>  
> +void panfrost_gpu_suspend_irq(struct panfrost_device *pfdev)
> +{
> +	gpu_write(pfdev, GPU_INT_MASK, 0);
> +	gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);

Shouldn't the synchronize_irq() guarantee that all monitored interrupts
are cleared before you return?

> +	synchronize_irq(pfdev->gpu_irq);
> +}
> +
>  int panfrost_gpu_init(struct panfrost_device *pfdev)
>  {
>  	int err;
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.h b/drivers/gpu/drm/panfrost/panfrost_gpu.h
> index 876fdad9f721..d841b86504ea 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.h
> @@ -15,6 +15,7 @@ u32 panfrost_gpu_get_latest_flush_id(struct panfrost_device *pfdev);
>  int panfrost_gpu_soft_reset(struct panfrost_device *pfdev);
>  void panfrost_gpu_power_on(struct panfrost_device *pfdev);
>  void panfrost_gpu_power_off(struct panfrost_device *pfdev);
> +void panfrost_gpu_suspend_irq(struct panfrost_device *pfdev);
>  
>  void panfrost_cycle_counter_get(struct panfrost_device *pfdev);
>  void panfrost_cycle_counter_put(struct panfrost_device *pfdev);
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index f9446e197428..e8de44cc56e2 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -413,6 +413,14 @@ void panfrost_job_enable_interrupts(struct panfrost_device *pfdev)
>  	job_write(pfdev, JOB_INT_MASK, irq_mask);
>  }
>  
> +void panfrost_job_suspend_irq(struct panfrost_device *pfdev)
> +{
> +	set_bit(PANFROST_COMP_BIT_JOB, pfdev->is_suspending);
> +
> +	job_write(pfdev, JOB_INT_MASK, 0);
> +	synchronize_irq(pfdev->js->irq);
> +}
> +
>  static void panfrost_job_handle_err(struct panfrost_device *pfdev,
>  				    struct panfrost_job *job,
>  				    unsigned int js)
> @@ -792,9 +800,13 @@ static irqreturn_t panfrost_job_irq_handler_thread(int irq, void *data)
>  	struct panfrost_device *pfdev = data;
>  
>  	panfrost_job_handle_irqs(pfdev);
> -	job_write(pfdev, JOB_INT_MASK,
> -		  GENMASK(16 + NUM_JOB_SLOTS - 1, 16) |
> -		  GENMASK(NUM_JOB_SLOTS - 1, 0));
> +
> +	/* Enable interrupts only if we're not about to get suspended */
> +	if (!test_bit(PANFROST_COMP_BIT_JOB, pfdev->is_suspending))

The irq-line is requested with IRQF_SHARED, meaning the line might be
shared between all three GPU IRQs, but also with other devices. I think
if we want to be totally safe, we need to also check this is_suspending
field in the hard irq handlers before accessing the xxx_INT_yyy
registers.

> +		job_write(pfdev, JOB_INT_MASK,
> +			  GENMASK(16 + NUM_JOB_SLOTS - 1, 16) |
> +			  GENMASK(NUM_JOB_SLOTS - 1, 0));
> +
>  	return IRQ_HANDLED;
>  }
>  
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.h b/drivers/gpu/drm/panfrost/panfrost_job.h
> index 17ff808dba07..ec581b97852b 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.h
> @@ -47,6 +47,7 @@ int panfrost_job_get_slot(struct panfrost_job *job);
>  int panfrost_job_push(struct panfrost_job *job);
>  void panfrost_job_put(struct panfrost_job *job);
>  void panfrost_job_enable_interrupts(struct panfrost_device *pfdev);
> +void panfrost_job_suspend_irq(struct panfrost_device *pfdev);
>  int panfrost_job_is_idle(struct panfrost_device *pfdev);
>  
>  #endif
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index ac4296c1e54b..6ccf0a65b8fb 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -744,9 +744,12 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data)
>  			status = mmu_read(pfdev, MMU_INT_RAWSTAT) & ~pfdev->as_faulty_mask;
>  	}
>  
> -	spin_lock(&pfdev->as_lock);
> -	mmu_write(pfdev, MMU_INT_MASK, ~pfdev->as_faulty_mask);
> -	spin_unlock(&pfdev->as_lock);
> +	/* Enable interrupts only if we're not about to get suspended */
> +	if (!test_bit(PANFROST_COMP_BIT_MMU, pfdev->is_suspending)) {
> +		spin_lock(&pfdev->as_lock);
> +		mmu_write(pfdev, MMU_INT_MASK, ~pfdev->as_faulty_mask);
> +		spin_unlock(&pfdev->as_lock);
> +	}
>  
>  	return IRQ_HANDLED;
>  };
> @@ -777,3 +780,11 @@ void panfrost_mmu_fini(struct panfrost_device *pfdev)
>  {
>  	mmu_write(pfdev, MMU_INT_MASK, 0);
>  }
> +
> +void panfrost_mmu_suspend_irq(struct panfrost_device *pfdev)
> +{
> +	set_bit(PANFROST_COMP_BIT_MMU, pfdev->is_suspending);
> +
> +	mmu_write(pfdev, MMU_INT_MASK, 0);
> +	synchronize_irq(pfdev->mmu_irq);
> +}
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.h b/drivers/gpu/drm/panfrost/panfrost_mmu.h
> index cc2a0d307feb..022a9a74a114 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.h
> @@ -14,6 +14,7 @@ void panfrost_mmu_unmap(struct panfrost_gem_mapping *mapping);
>  int panfrost_mmu_init(struct panfrost_device *pfdev);
>  void panfrost_mmu_fini(struct panfrost_device *pfdev);
>  void panfrost_mmu_reset(struct panfrost_device *pfdev);
> +void panfrost_mmu_suspend_irq(struct panfrost_device *pfdev);
>  
>  u32 panfrost_mmu_as_get(struct panfrost_device *pfdev, struct panfrost_mmu *mmu);
>  void panfrost_mmu_as_put(struct panfrost_device *pfdev, struct panfrost_mmu *mmu);

Boris Brezillon Nov. 28, 2023, 2:06 p.m. UTC | #2

On Tue, 28 Nov 2023 13:45:10 +0100
AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
wrote:

> To make sure that we don't unintentionally perform any unclocked and/or
> unpowered R/W operation on GPU registers, before turning off clocks and
> regulators we must make sure that no GPU, JOB or MMU ISR execution is
> pending: doing that required to add a mechanism to synchronize the
> interrupts on suspend.
> 
> Add functions panfrost_{gpu,job,mmu}_suspend_irq() which will perform
> interrupts masking and ISR execution synchronization, and then call
> those in the panfrost_device_runtime_suspend() handler in the exact
> sequence of job (may require mmu!) -> mmu -> gpu.
> 
> As a side note, JOB and MMU suspend_irq functions needed some special
> treatment: as their interrupt handlers will unmask interrupts, it was
> necessary to add a bitmap for "is_suspending" which is used to address
> the possible corner case of unintentional IRQ unmasking because of ISR
> execution after a call to synchronize_irq().
> 
> Of course, unmasking the interrupts is being done as part of the reset
> happening during runtime_resume(): since we're anyway resuming all of
> GPU, JOB, MMU, the only additional action is to zero out the newly
> introduced `is_suspending` bitmap directly in the resume handler, as
> to avoid adding panfrost_{job,mmu}_resume_irq() function just for
> clearing own bits, especially because it currently makes way more sense
> to just zero out the bitmap.
> 
> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> ---
>  drivers/gpu/drm/panfrost/panfrost_device.c |  4 ++++
>  drivers/gpu/drm/panfrost/panfrost_device.h |  7 +++++++
>  drivers/gpu/drm/panfrost/panfrost_gpu.c    |  7 +++++++
>  drivers/gpu/drm/panfrost/panfrost_gpu.h    |  1 +
>  drivers/gpu/drm/panfrost/panfrost_job.c    | 18 +++++++++++++++---
>  drivers/gpu/drm/panfrost/panfrost_job.h    |  1 +
>  drivers/gpu/drm/panfrost/panfrost_mmu.c    | 17 ++++++++++++++---
>  drivers/gpu/drm/panfrost/panfrost_mmu.h    |  1 +
>  8 files changed, 50 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c
> index c90ad5ee34e7..ed34aa55a7da 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.c
> @@ -407,6 +407,7 @@ static int panfrost_device_runtime_resume(struct device *dev)
>  {
>  	struct panfrost_device *pfdev = dev_get_drvdata(dev);
>  
> +	bitmap_zero(pfdev->is_suspending, PANFROST_COMP_BIT_MAX);

I would let each sub-block clear their bit in the reset path, since
that's where the IRQs are effectively unmasked.

>  	panfrost_device_reset(pfdev);
>  	panfrost_devfreq_resume(pfdev);
>  
> @@ -421,6 +422,9 @@ static int panfrost_device_runtime_suspend(struct device *dev)
>  		return -EBUSY;
>  
>  	panfrost_devfreq_suspend(pfdev);
> +	panfrost_job_suspend_irq(pfdev);
> +	panfrost_mmu_suspend_irq(pfdev);
> +	panfrost_gpu_suspend_irq(pfdev);
>  	panfrost_gpu_power_off(pfdev);
>  
>  	return 0;
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
> index 54a8aad54259..29f89f2d3679 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
> @@ -25,6 +25,12 @@ struct panfrost_perfcnt;
>  #define NUM_JOB_SLOTS 3
>  #define MAX_PM_DOMAINS 5
>  
> +enum panfrost_drv_comp_bits {
> +	PANFROST_COMP_BIT_MMU,
> +	PANFROST_COMP_BIT_JOB,
> +	PANFROST_COMP_BIT_MAX
> +};
> +
>  /**
>   * enum panfrost_gpu_pm - Supported kernel power management features
>   * @GPU_PM_CLK_DIS:  Allow disabling clocks during system suspend
> @@ -109,6 +115,7 @@ struct panfrost_device {
>  
>  	struct panfrost_features features;
>  	const struct panfrost_compatible *comp;
> +	DECLARE_BITMAP(is_suspending, PANFROST_COMP_BIT_MAX);

nit: Maybe s/is_suspending/suspended_irqs/, given the state remains
until the device is resumed.

AngeloGioacchino Del Regno Nov. 28, 2023, 3:10 p.m. UTC | #3

Il 28/11/23 15:06, Boris Brezillon ha scritto:
> On Tue, 28 Nov 2023 13:45:10 +0100
> AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> wrote:
> 
>> To make sure that we don't unintentionally perform any unclocked and/or
>> unpowered R/W operation on GPU registers, before turning off clocks and
>> regulators we must make sure that no GPU, JOB or MMU ISR execution is
>> pending: doing that required to add a mechanism to synchronize the
>> interrupts on suspend.
>>
>> Add functions panfrost_{gpu,job,mmu}_suspend_irq() which will perform
>> interrupts masking and ISR execution synchronization, and then call
>> those in the panfrost_device_runtime_suspend() handler in the exact
>> sequence of job (may require mmu!) -> mmu -> gpu.
>>
>> As a side note, JOB and MMU suspend_irq functions needed some special
>> treatment: as their interrupt handlers will unmask interrupts, it was
>> necessary to add a bitmap for "is_suspending" which is used to address
>> the possible corner case of unintentional IRQ unmasking because of ISR
>> execution after a call to synchronize_irq().
>>
>> Of course, unmasking the interrupts is being done as part of the reset
>> happening during runtime_resume(): since we're anyway resuming all of
>> GPU, JOB, MMU, the only additional action is to zero out the newly
>> introduced `is_suspending` bitmap directly in the resume handler, as
>> to avoid adding panfrost_{job,mmu}_resume_irq() function just for
>> clearing own bits, especially because it currently makes way more sense
>> to just zero out the bitmap.
>>
>> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
>> ---
>>   drivers/gpu/drm/panfrost/panfrost_device.c |  4 ++++
>>   drivers/gpu/drm/panfrost/panfrost_device.h |  7 +++++++
>>   drivers/gpu/drm/panfrost/panfrost_gpu.c    |  7 +++++++
>>   drivers/gpu/drm/panfrost/panfrost_gpu.h    |  1 +
>>   drivers/gpu/drm/panfrost/panfrost_job.c    | 18 +++++++++++++++---
>>   drivers/gpu/drm/panfrost/panfrost_job.h    |  1 +
>>   drivers/gpu/drm/panfrost/panfrost_mmu.c    | 17 ++++++++++++++---
>>   drivers/gpu/drm/panfrost/panfrost_mmu.h    |  1 +
>>   8 files changed, 50 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c
>> index c90ad5ee34e7..ed34aa55a7da 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_device.c
>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.c
>> @@ -407,6 +407,7 @@ static int panfrost_device_runtime_resume(struct device *dev)
>>   {
>>   	struct panfrost_device *pfdev = dev_get_drvdata(dev);
>>   
>> +	bitmap_zero(pfdev->is_suspending, PANFROST_COMP_BIT_MAX);
> 
> I would let each sub-block clear their bit in the reset path, since
> that's where the IRQs are effectively unmasked.
> 


Honestly I wouldn't like seeing that: the reason is that this is something that
is done *for* suspend/resume and only for that, while reset may be called out of
the suspend/resume handlers.

I find clearing the suspend bits in the HW reset path a bit confusing, especially
when it is possible to avoid doing it there...

>>   	panfrost_device_reset(pfdev);
>>   	panfrost_devfreq_resume(pfdev);
>>   
>> @@ -421,6 +422,9 @@ static int panfrost_device_runtime_suspend(struct device *dev)
>>   		return -EBUSY;
>>   
>>   	panfrost_devfreq_suspend(pfdev);
>> +	panfrost_job_suspend_irq(pfdev);
>> +	panfrost_mmu_suspend_irq(pfdev);
>> +	panfrost_gpu_suspend_irq(pfdev);
>>   	panfrost_gpu_power_off(pfdev);
>>   
>>   	return 0;
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
>> index 54a8aad54259..29f89f2d3679 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
>> @@ -25,6 +25,12 @@ struct panfrost_perfcnt;
>>   #define NUM_JOB_SLOTS 3
>>   #define MAX_PM_DOMAINS 5
>>   
>> +enum panfrost_drv_comp_bits {
>> +	PANFROST_COMP_BIT_MMU,
>> +	PANFROST_COMP_BIT_JOB,
>> +	PANFROST_COMP_BIT_MAX
>> +};
>> +
>>   /**
>>    * enum panfrost_gpu_pm - Supported kernel power management features
>>    * @GPU_PM_CLK_DIS:  Allow disabling clocks during system suspend
>> @@ -109,6 +115,7 @@ struct panfrost_device {
>>   
>>   	struct panfrost_features features;
>>   	const struct panfrost_compatible *comp;
>> +	DECLARE_BITMAP(is_suspending, PANFROST_COMP_BIT_MAX);
> 
> nit: Maybe s/is_suspending/suspended_irqs/, given the state remains
> until the device is resumed.

If we keep the `is_suspending` name, we can use this one more generically in
case we ever need to, what do you think?

Cheers,
Angelo

AngeloGioacchino Del Regno Nov. 28, 2023, 3:10 p.m. UTC | #4

Il 28/11/23 14:57, Boris Brezillon ha scritto:
> On Tue, 28 Nov 2023 13:45:10 +0100
> AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> wrote:
> 
>> To make sure that we don't unintentionally perform any unclocked and/or
>> unpowered R/W operation on GPU registers, before turning off clocks and
>> regulators we must make sure that no GPU, JOB or MMU ISR execution is
>> pending: doing that required to add a mechanism to synchronize the
>> interrupts on suspend.
>>
>> Add functions panfrost_{gpu,job,mmu}_suspend_irq() which will perform
>> interrupts masking and ISR execution synchronization, and then call
>> those in the panfrost_device_runtime_suspend() handler in the exact
>> sequence of job (may require mmu!) -> mmu -> gpu.
>>
>> As a side note, JOB and MMU suspend_irq functions needed some special
>> treatment: as their interrupt handlers will unmask interrupts, it was
>> necessary to add a bitmap for "is_suspending" which is used to address
>> the possible corner case of unintentional IRQ unmasking because of ISR
>> execution after a call to synchronize_irq().
>>
>> Of course, unmasking the interrupts is being done as part of the reset
>> happening during runtime_resume(): since we're anyway resuming all of
>> GPU, JOB, MMU, the only additional action is to zero out the newly
>> introduced `is_suspending` bitmap directly in the resume handler, as
>> to avoid adding panfrost_{job,mmu}_resume_irq() function just for
>> clearing own bits, especially because it currently makes way more sense
>> to just zero out the bitmap.
>>
>> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
>> ---
>>   drivers/gpu/drm/panfrost/panfrost_device.c |  4 ++++
>>   drivers/gpu/drm/panfrost/panfrost_device.h |  7 +++++++
>>   drivers/gpu/drm/panfrost/panfrost_gpu.c    |  7 +++++++
>>   drivers/gpu/drm/panfrost/panfrost_gpu.h    |  1 +
>>   drivers/gpu/drm/panfrost/panfrost_job.c    | 18 +++++++++++++++---
>>   drivers/gpu/drm/panfrost/panfrost_job.h    |  1 +
>>   drivers/gpu/drm/panfrost/panfrost_mmu.c    | 17 ++++++++++++++---
>>   drivers/gpu/drm/panfrost/panfrost_mmu.h    |  1 +
>>   8 files changed, 50 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c
>> index c90ad5ee34e7..ed34aa55a7da 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_device.c
>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.c
>> @@ -407,6 +407,7 @@ static int panfrost_device_runtime_resume(struct device *dev)
>>   {
>>   	struct panfrost_device *pfdev = dev_get_drvdata(dev);
>>   
>> +	bitmap_zero(pfdev->is_suspending, PANFROST_COMP_BIT_MAX);
>>   	panfrost_device_reset(pfdev);
>>   	panfrost_devfreq_resume(pfdev);
>>   
>> @@ -421,6 +422,9 @@ static int panfrost_device_runtime_suspend(struct device *dev)
>>   		return -EBUSY;
>>   
>>   	panfrost_devfreq_suspend(pfdev);
>> +	panfrost_job_suspend_irq(pfdev);
>> +	panfrost_mmu_suspend_irq(pfdev);
>> +	panfrost_gpu_suspend_irq(pfdev);
>>   	panfrost_gpu_power_off(pfdev);
>>   
>>   	return 0;
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
>> index 54a8aad54259..29f89f2d3679 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
>> @@ -25,6 +25,12 @@ struct panfrost_perfcnt;
>>   #define NUM_JOB_SLOTS 3
>>   #define MAX_PM_DOMAINS 5
>>   
>> +enum panfrost_drv_comp_bits {
>> +	PANFROST_COMP_BIT_MMU,
>> +	PANFROST_COMP_BIT_JOB,
>> +	PANFROST_COMP_BIT_MAX
>> +};
>> +
>>   /**
>>    * enum panfrost_gpu_pm - Supported kernel power management features
>>    * @GPU_PM_CLK_DIS:  Allow disabling clocks during system suspend
>> @@ -109,6 +115,7 @@ struct panfrost_device {
>>   
>>   	struct panfrost_features features;
>>   	const struct panfrost_compatible *comp;
>> +	DECLARE_BITMAP(is_suspending, PANFROST_COMP_BIT_MAX);
>>   
>>   	spinlock_t as_lock;
>>   	unsigned long as_in_use_mask;
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
>> index 7adc4441fa14..2bf645993ab4 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
>> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
>> @@ -452,6 +452,13 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>>   		dev_err(pfdev->dev, "l2 power transition timeout");
>>   }
>>   
>> +void panfrost_gpu_suspend_irq(struct panfrost_device *pfdev)
>> +{
>> +	gpu_write(pfdev, GPU_INT_MASK, 0);
>> +	gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
> 
> Shouldn't the synchronize_irq() guarantee that all monitored interrupts
> are cleared before you return?
> 

Yeah, right - even though we're writing GPU_INT_STAT in CLEAR, there's no reason
why INT_STAT would "contain less bits" than the ones that we *want to* clear.

Effectively, I can remove that INT_CLEAR write, as it makes little sense - thanks
for catching that!

>> +	synchronize_irq(pfdev->gpu_irq);
>> +}
>> +
>>   int panfrost_gpu_init(struct panfrost_device *pfdev)
>>   {
>>   	int err;
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.h b/drivers/gpu/drm/panfrost/panfrost_gpu.h
>> index 876fdad9f721..d841b86504ea 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.h
>> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.h
>> @@ -15,6 +15,7 @@ u32 panfrost_gpu_get_latest_flush_id(struct panfrost_device *pfdev);
>>   int panfrost_gpu_soft_reset(struct panfrost_device *pfdev);
>>   void panfrost_gpu_power_on(struct panfrost_device *pfdev);
>>   void panfrost_gpu_power_off(struct panfrost_device *pfdev);
>> +void panfrost_gpu_suspend_irq(struct panfrost_device *pfdev);
>>   
>>   void panfrost_cycle_counter_get(struct panfrost_device *pfdev);
>>   void panfrost_cycle_counter_put(struct panfrost_device *pfdev);
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>> index f9446e197428..e8de44cc56e2 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>> @@ -413,6 +413,14 @@ void panfrost_job_enable_interrupts(struct panfrost_device *pfdev)
>>   	job_write(pfdev, JOB_INT_MASK, irq_mask);
>>   }
>>   
>> +void panfrost_job_suspend_irq(struct panfrost_device *pfdev)
>> +{
>> +	set_bit(PANFROST_COMP_BIT_JOB, pfdev->is_suspending);
>> +
>> +	job_write(pfdev, JOB_INT_MASK, 0);
>> +	synchronize_irq(pfdev->js->irq);
>> +}
>> +
>>   static void panfrost_job_handle_err(struct panfrost_device *pfdev,
>>   				    struct panfrost_job *job,
>>   				    unsigned int js)
>> @@ -792,9 +800,13 @@ static irqreturn_t panfrost_job_irq_handler_thread(int irq, void *data)
>>   	struct panfrost_device *pfdev = data;
>>   
>>   	panfrost_job_handle_irqs(pfdev);
>> -	job_write(pfdev, JOB_INT_MASK,
>> -		  GENMASK(16 + NUM_JOB_SLOTS - 1, 16) |
>> -		  GENMASK(NUM_JOB_SLOTS - 1, 0));
>> +
>> +	/* Enable interrupts only if we're not about to get suspended */
>> +	if (!test_bit(PANFROST_COMP_BIT_JOB, pfdev->is_suspending))
> 
> The irq-line is requested with IRQF_SHARED, meaning the line might be
> shared between all three GPU IRQs, but also with other devices. I think
> if we want to be totally safe, we need to also check this is_suspending
> field in the hard irq handlers before accessing the xxx_INT_yyy
> registers.
> 

This would mean that we would have to force canceling jobs in the suspend
handler, but if the IRQ never fired, would we still be able to find the
right bits flipped in JOB_INT_RAWSTAT?

 From what I understand, are you suggesting to call, in job_suspend_irq()
something like

void panfrost_job_suspend_irq(struct panfrost_device *pfdev)
{
         u32 status;

	set_bit(PANFROST_COMP_BIT_JOB, pfdev->is_suspending);

	job_write(pfdev, JOB_INT_MASK, 0);
	synchronize_irq(pfdev->js->irq);

	status = job_read(pfdev, JOB_INT_STAT);
	if (status)
		panfrost_job_irq_handler_thread(pfdev->js->irq, (void*)pfdev);
}

and then while still retaining the check in the IRQ thread handler, also
check it in the hardirq handler like

static irqreturn_t panfrost_job_irq_handler(int irq, void *data)
{
	struct panfrost_device *pfdev = data;
	u32 status;

	if (!test_bit(PANFROST_COMP_BIT_JOB, pfdev->is_suspending))
		return IRQ_NONE;

	status = job_read(pfdev, JOB_INT_STAT);
	if (!status)
	        return IRQ_NONE;

	job_write(pfdev, JOB_INT_MASK, 0);
	return IRQ_WAKE_THREAD;
}

(rinse and repeat for panfrost_mmu)

..or am I misunderstanding you?

Cheers,
Angelo

Boris Brezillon Nov. 28, 2023, 3:38 p.m. UTC | #5

On Tue, 28 Nov 2023 16:10:43 +0100
AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
wrote:

> Il 28/11/23 15:06, Boris Brezillon ha scritto:
> > On Tue, 28 Nov 2023 13:45:10 +0100
> > AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> > wrote:
> >   
> >> To make sure that we don't unintentionally perform any unclocked and/or
> >> unpowered R/W operation on GPU registers, before turning off clocks and
> >> regulators we must make sure that no GPU, JOB or MMU ISR execution is
> >> pending: doing that required to add a mechanism to synchronize the
> >> interrupts on suspend.
> >>
> >> Add functions panfrost_{gpu,job,mmu}_suspend_irq() which will perform
> >> interrupts masking and ISR execution synchronization, and then call
> >> those in the panfrost_device_runtime_suspend() handler in the exact
> >> sequence of job (may require mmu!) -> mmu -> gpu.
> >>
> >> As a side note, JOB and MMU suspend_irq functions needed some special
> >> treatment: as their interrupt handlers will unmask interrupts, it was
> >> necessary to add a bitmap for "is_suspending" which is used to address
> >> the possible corner case of unintentional IRQ unmasking because of ISR
> >> execution after a call to synchronize_irq().
> >>
> >> Of course, unmasking the interrupts is being done as part of the reset
> >> happening during runtime_resume(): since we're anyway resuming all of
> >> GPU, JOB, MMU, the only additional action is to zero out the newly
> >> introduced `is_suspending` bitmap directly in the resume handler, as
> >> to avoid adding panfrost_{job,mmu}_resume_irq() function just for
> >> clearing own bits, especially because it currently makes way more sense
> >> to just zero out the bitmap.
> >>
> >> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> >> ---
> >>   drivers/gpu/drm/panfrost/panfrost_device.c |  4 ++++
> >>   drivers/gpu/drm/panfrost/panfrost_device.h |  7 +++++++
> >>   drivers/gpu/drm/panfrost/panfrost_gpu.c    |  7 +++++++
> >>   drivers/gpu/drm/panfrost/panfrost_gpu.h    |  1 +
> >>   drivers/gpu/drm/panfrost/panfrost_job.c    | 18 +++++++++++++++---
> >>   drivers/gpu/drm/panfrost/panfrost_job.h    |  1 +
> >>   drivers/gpu/drm/panfrost/panfrost_mmu.c    | 17 ++++++++++++++---
> >>   drivers/gpu/drm/panfrost/panfrost_mmu.h    |  1 +
> >>   8 files changed, 50 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c
> >> index c90ad5ee34e7..ed34aa55a7da 100644
> >> --- a/drivers/gpu/drm/panfrost/panfrost_device.c
> >> +++ b/drivers/gpu/drm/panfrost/panfrost_device.c
> >> @@ -407,6 +407,7 @@ static int panfrost_device_runtime_resume(struct device *dev)
> >>   {
> >>   	struct panfrost_device *pfdev = dev_get_drvdata(dev);
> >>   
> >> +	bitmap_zero(pfdev->is_suspending, PANFROST_COMP_BIT_MAX);  
> > 
> > I would let each sub-block clear their bit in the reset path, since
> > that's where the IRQs are effectively unmasked.
> >   
> 
> 
> Honestly I wouldn't like seeing that: the reason is that this is something that
> is done *for* suspend/resume and only for that, while reset may be called out of
> the suspend/resume handlers.
> 
> I find clearing the suspend bits in the HW reset path a bit confusing, especially
> when it is possible to avoid doing it there...

Well, I do think it's preferable to keep the irq_is_no_longer_suspended
state update where the interrupt is effectively unmasked. Note that
when you do a reset, the IRQ is silently suspended just after the
reset happens, because the xxx_INT_MASKs are restored to their default
value, so I do consider that clearing this bit in the reset path makes
sense.

> 
> >>   	panfrost_device_reset(pfdev);
> >>   	panfrost_devfreq_resume(pfdev);
> >>   
> >> @@ -421,6 +422,9 @@ static int panfrost_device_runtime_suspend(struct device *dev)
> >>   		return -EBUSY;
> >>   
> >>   	panfrost_devfreq_suspend(pfdev);
> >> +	panfrost_job_suspend_irq(pfdev);
> >> +	panfrost_mmu_suspend_irq(pfdev);
> >> +	panfrost_gpu_suspend_irq(pfdev);
> >>   	panfrost_gpu_power_off(pfdev);
> >>   
> >>   	return 0;
> >> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
> >> index 54a8aad54259..29f89f2d3679 100644
> >> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
> >> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
> >> @@ -25,6 +25,12 @@ struct panfrost_perfcnt;
> >>   #define NUM_JOB_SLOTS 3
> >>   #define MAX_PM_DOMAINS 5
> >>   
> >> +enum panfrost_drv_comp_bits {
> >> +	PANFROST_COMP_BIT_MMU,
> >> +	PANFROST_COMP_BIT_JOB,
> >> +	PANFROST_COMP_BIT_MAX
> >> +};
> >> +
> >>   /**
> >>    * enum panfrost_gpu_pm - Supported kernel power management features
> >>    * @GPU_PM_CLK_DIS:  Allow disabling clocks during system suspend
> >> @@ -109,6 +115,7 @@ struct panfrost_device {
> >>   
> >>   	struct panfrost_features features;
> >>   	const struct panfrost_compatible *comp;
> >> +	DECLARE_BITMAP(is_suspending, PANFROST_COMP_BIT_MAX);  
> > 
> > nit: Maybe s/is_suspending/suspended_irqs/, given the state remains
> > until the device is resumed.  
> 
> If we keep the `is_suspending` name, we can use this one more generically in
> case we ever need to, what do you think?

I'm lost. Why would we want to reserve a name for something we don't
know about? My comment was mostly relating to the fact this bitmap
doesn't reflect the is_suspending state, but rather is_suspended,
because it remains set until the device is resumed. And we actually want
it to reflect the is_suspended state, so we can catch interrupts that
are not for us without reading regs in the hard irq handler, when the
GPU is suspended.

AngeloGioacchino Del Regno Nov. 28, 2023, 3:42 p.m. UTC | #6

Il 28/11/23 16:38, Boris Brezillon ha scritto:
> On Tue, 28 Nov 2023 16:10:43 +0100
> AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> wrote:
> 
>> Il 28/11/23 15:06, Boris Brezillon ha scritto:
>>> On Tue, 28 Nov 2023 13:45:10 +0100
>>> AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
>>> wrote:
>>>    
>>>> To make sure that we don't unintentionally perform any unclocked and/or
>>>> unpowered R/W operation on GPU registers, before turning off clocks and
>>>> regulators we must make sure that no GPU, JOB or MMU ISR execution is
>>>> pending: doing that required to add a mechanism to synchronize the
>>>> interrupts on suspend.
>>>>
>>>> Add functions panfrost_{gpu,job,mmu}_suspend_irq() which will perform
>>>> interrupts masking and ISR execution synchronization, and then call
>>>> those in the panfrost_device_runtime_suspend() handler in the exact
>>>> sequence of job (may require mmu!) -> mmu -> gpu.
>>>>
>>>> As a side note, JOB and MMU suspend_irq functions needed some special
>>>> treatment: as their interrupt handlers will unmask interrupts, it was
>>>> necessary to add a bitmap for "is_suspending" which is used to address
>>>> the possible corner case of unintentional IRQ unmasking because of ISR
>>>> execution after a call to synchronize_irq().
>>>>
>>>> Of course, unmasking the interrupts is being done as part of the reset
>>>> happening during runtime_resume(): since we're anyway resuming all of
>>>> GPU, JOB, MMU, the only additional action is to zero out the newly
>>>> introduced `is_suspending` bitmap directly in the resume handler, as
>>>> to avoid adding panfrost_{job,mmu}_resume_irq() function just for
>>>> clearing own bits, especially because it currently makes way more sense
>>>> to just zero out the bitmap.
>>>>
>>>> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
>>>> ---
>>>>    drivers/gpu/drm/panfrost/panfrost_device.c |  4 ++++
>>>>    drivers/gpu/drm/panfrost/panfrost_device.h |  7 +++++++
>>>>    drivers/gpu/drm/panfrost/panfrost_gpu.c    |  7 +++++++
>>>>    drivers/gpu/drm/panfrost/panfrost_gpu.h    |  1 +
>>>>    drivers/gpu/drm/panfrost/panfrost_job.c    | 18 +++++++++++++++---
>>>>    drivers/gpu/drm/panfrost/panfrost_job.h    |  1 +
>>>>    drivers/gpu/drm/panfrost/panfrost_mmu.c    | 17 ++++++++++++++---
>>>>    drivers/gpu/drm/panfrost/panfrost_mmu.h    |  1 +
>>>>    8 files changed, 50 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c
>>>> index c90ad5ee34e7..ed34aa55a7da 100644
>>>> --- a/drivers/gpu/drm/panfrost/panfrost_device.c
>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.c
>>>> @@ -407,6 +407,7 @@ static int panfrost_device_runtime_resume(struct device *dev)
>>>>    {
>>>>    	struct panfrost_device *pfdev = dev_get_drvdata(dev);
>>>>    
>>>> +	bitmap_zero(pfdev->is_suspending, PANFROST_COMP_BIT_MAX);
>>>
>>> I would let each sub-block clear their bit in the reset path, since
>>> that's where the IRQs are effectively unmasked.
>>>    
>>
>>
>> Honestly I wouldn't like seeing that: the reason is that this is something that
>> is done *for* suspend/resume and only for that, while reset may be called out of
>> the suspend/resume handlers.
>>
>> I find clearing the suspend bits in the HW reset path a bit confusing, especially
>> when it is possible to avoid doing it there...
> 
> Well, I do think it's preferable to keep the irq_is_no_longer_suspended
> state update where the interrupt is effectively unmasked. Note that
> when you do a reset, the IRQ is silently suspended just after the
> reset happens, because the xxx_INT_MASKs are restored to their default
> value, so I do consider that clearing this bit in the reset path makes
> sense.
> 

Okay then, I can move it, no problem.

>>
>>>>    	panfrost_device_reset(pfdev);
>>>>    	panfrost_devfreq_resume(pfdev);
>>>>    
>>>> @@ -421,6 +422,9 @@ static int panfrost_device_runtime_suspend(struct device *dev)
>>>>    		return -EBUSY;
>>>>    
>>>>    	panfrost_devfreq_suspend(pfdev);
>>>> +	panfrost_job_suspend_irq(pfdev);
>>>> +	panfrost_mmu_suspend_irq(pfdev);
>>>> +	panfrost_gpu_suspend_irq(pfdev);
>>>>    	panfrost_gpu_power_off(pfdev);
>>>>    
>>>>    	return 0;
>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
>>>> index 54a8aad54259..29f89f2d3679 100644
>>>> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
>>>> @@ -25,6 +25,12 @@ struct panfrost_perfcnt;
>>>>    #define NUM_JOB_SLOTS 3
>>>>    #define MAX_PM_DOMAINS 5
>>>>    
>>>> +enum panfrost_drv_comp_bits {
>>>> +	PANFROST_COMP_BIT_MMU,
>>>> +	PANFROST_COMP_BIT_JOB,
>>>> +	PANFROST_COMP_BIT_MAX
>>>> +};
>>>> +
>>>>    /**
>>>>     * enum panfrost_gpu_pm - Supported kernel power management features
>>>>     * @GPU_PM_CLK_DIS:  Allow disabling clocks during system suspend
>>>> @@ -109,6 +115,7 @@ struct panfrost_device {
>>>>    
>>>>    	struct panfrost_features features;
>>>>    	const struct panfrost_compatible *comp;
>>>> +	DECLARE_BITMAP(is_suspending, PANFROST_COMP_BIT_MAX);
>>>
>>> nit: Maybe s/is_suspending/suspended_irqs/, given the state remains
>>> until the device is resumed.
>>
>> If we keep the `is_suspending` name, we can use this one more generically in
>> case we ever need to, what do you think?
> 
> I'm lost. Why would we want to reserve a name for something we don't
> know about? My comment was mostly relating to the fact this bitmap
> doesn't reflect the is_suspending state, but rather is_suspended,
> because it remains set until the device is resumed. And we actually want
> it to reflect the is_suspended state, so we can catch interrupts that
> are not for us without reading regs in the hard irq handler, when the
> GPU is suspended.

`is_suspended` (fun story: that's the first name I gave it) looks good to me,
the doubt I raised was about calling it `suspended_irqs` instead, as I would
prefer to keep names "more generic", but that's just personal preference at
this point anyway.

Cheers,
Angelo

Boris Brezillon Nov. 28, 2023, 3:53 p.m. UTC | #7

On Tue, 28 Nov 2023 16:10:45 +0100
AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
wrote:

> >>   static void panfrost_job_handle_err(struct panfrost_device *pfdev,
> >>   				    struct panfrost_job *job,
> >>   				    unsigned int js)
> >> @@ -792,9 +800,13 @@ static irqreturn_t panfrost_job_irq_handler_thread(int irq, void *data)
> >>   	struct panfrost_device *pfdev = data;
> >>   
> >>   	panfrost_job_handle_irqs(pfdev);
> >> -	job_write(pfdev, JOB_INT_MASK,
> >> -		  GENMASK(16 + NUM_JOB_SLOTS - 1, 16) |
> >> -		  GENMASK(NUM_JOB_SLOTS - 1, 0));
> >> +
> >> +	/* Enable interrupts only if we're not about to get suspended */
> >> +	if (!test_bit(PANFROST_COMP_BIT_JOB, pfdev->is_suspending))  
> > 
> > The irq-line is requested with IRQF_SHARED, meaning the line might be
> > shared between all three GPU IRQs, but also with other devices. I think
> > if we want to be totally safe, we need to also check this is_suspending
> > field in the hard irq handlers before accessing the xxx_INT_yyy
> > registers.
> >   
> 
> This would mean that we would have to force canceling jobs in the suspend
> handler, but if the IRQ never fired, would we still be able to find the
> right bits flipped in JOB_INT_RAWSTAT?

There should be no jobs left if we enter suspend. If there is, that's a
bug we should fix, but I'm digressing.

> 
>  From what I understand, are you suggesting to call, in job_suspend_irq()
> something like
> 
> void panfrost_job_suspend_irq(struct panfrost_device *pfdev)
> {
>          u32 status;
> 
> 	set_bit(PANFROST_COMP_BIT_JOB, pfdev->is_suspending);
> 
> 	job_write(pfdev, JOB_INT_MASK, 0);
> 	synchronize_irq(pfdev->js->irq);
> 
> 	status = job_read(pfdev, JOB_INT_STAT);

I guess you meant _RAWSTAT. _STAT should always be zero after we've
written 0 to _INT_MASK.

> 	if (status)
> 		panfrost_job_irq_handler_thread(pfdev->js->irq, (void*)pfdev);

Nope, we don't need to read the STAT reg and forcibly call the threaded
handler if it's != 0. The synchronize_irq() call should do exactly that
(make sure all pending interrupts are processed before returning), and
our previous job_write(pfdev, JOB_INT_MASK, 0) guarantees that no new
interrupts will kick in after that point.

> }
> 
> and then while still retaining the check in the IRQ thread handler, also
> check it in the hardirq handler like
> 
> static irqreturn_t panfrost_job_irq_handler(int irq, void *data)
> {
> 	struct panfrost_device *pfdev = data;
> 	u32 status;
> 
> 	if (!test_bit(PANFROST_COMP_BIT_JOB, pfdev->is_suspending))
> 		return IRQ_NONE;

Yes, that's the extra check I was talking about, and that's also the
very reason I'm suggesting to call this field suspended_irqs instead of
is_suspending. Ultimately, each bit in this bitmap encodes the status
of a specific IRQ, not the transition from active-to-suspended,
otherwise we'd be clearing the bit at the end of
panfrost_job_suspend_irq(), right after the synchronize_irq(). But if
we were doing that, our hard IRQ handler could be called because other
devices raised an interrupt on the very same IRQ line while we are
suspended, and we'd be doing an invalid GPU reg read while the
clks/power-domains are off.

> 
> 	status = job_read(pfdev, JOB_INT_STAT);
> 	if (!status)
> 	        return IRQ_NONE;
> 
> 	job_write(pfdev, JOB_INT_MASK, 0);
> 	return IRQ_WAKE_THREAD;
> }
> 
> (rinse and repeat for panfrost_mmu)
> 
> ..or am I misunderstanding you?
> 
> Cheers,
> Angelo
> 
>

Boris Brezillon Nov. 28, 2023, 3:58 p.m. UTC | #8

On Tue, 28 Nov 2023 16:42:25 +0100
AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
wrote:

> >>  
> >>>>    	panfrost_device_reset(pfdev);
> >>>>    	panfrost_devfreq_resume(pfdev);
> >>>>    
> >>>> @@ -421,6 +422,9 @@ static int panfrost_device_runtime_suspend(struct device *dev)
> >>>>    		return -EBUSY;
> >>>>    
> >>>>    	panfrost_devfreq_suspend(pfdev);
> >>>> +	panfrost_job_suspend_irq(pfdev);
> >>>> +	panfrost_mmu_suspend_irq(pfdev);
> >>>> +	panfrost_gpu_suspend_irq(pfdev);
> >>>>    	panfrost_gpu_power_off(pfdev);
> >>>>    
> >>>>    	return 0;
> >>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
> >>>> index 54a8aad54259..29f89f2d3679 100644
> >>>> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
> >>>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
> >>>> @@ -25,6 +25,12 @@ struct panfrost_perfcnt;
> >>>>    #define NUM_JOB_SLOTS 3
> >>>>    #define MAX_PM_DOMAINS 5
> >>>>    
> >>>> +enum panfrost_drv_comp_bits {
> >>>> +	PANFROST_COMP_BIT_MMU,
> >>>> +	PANFROST_COMP_BIT_JOB,
> >>>> +	PANFROST_COMP_BIT_MAX
> >>>> +};
> >>>> +
> >>>>    /**
> >>>>     * enum panfrost_gpu_pm - Supported kernel power management features
> >>>>     * @GPU_PM_CLK_DIS:  Allow disabling clocks during system suspend
> >>>> @@ -109,6 +115,7 @@ struct panfrost_device {
> >>>>    
> >>>>    	struct panfrost_features features;
> >>>>    	const struct panfrost_compatible *comp;
> >>>> +	DECLARE_BITMAP(is_suspending, PANFROST_COMP_BIT_MAX);  
> >>>
> >>> nit: Maybe s/is_suspending/suspended_irqs/, given the state remains
> >>> until the device is resumed.  
> >>
> >> If we keep the `is_suspending` name, we can use this one more generically in
> >> case we ever need to, what do you think?  
> > 
> > I'm lost. Why would we want to reserve a name for something we don't
> > know about? My comment was mostly relating to the fact this bitmap
> > doesn't reflect the is_suspending state, but rather is_suspended,
> > because it remains set until the device is resumed. And we actually want
> > it to reflect the is_suspended state, so we can catch interrupts that
> > are not for us without reading regs in the hard irq handler, when the
> > GPU is suspended.  
> 
> `is_suspended` (fun story: that's the first name I gave it) looks good to me,
> the doubt I raised was about calling it `suspended_irqs` instead, as I would
> prefer to keep names "more generic", but that's just personal preference at
> this point anyway.

Ah, sure, is_suspended is fine.

AngeloGioacchino Del Regno Nov. 28, 2023, 4:10 p.m. UTC | #9

Il 28/11/23 16:53, Boris Brezillon ha scritto:
> On Tue, 28 Nov 2023 16:10:45 +0100
> AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> wrote:
> 
>>>>    static void panfrost_job_handle_err(struct panfrost_device *pfdev,
>>>>    				    struct panfrost_job *job,
>>>>    				    unsigned int js)
>>>> @@ -792,9 +800,13 @@ static irqreturn_t panfrost_job_irq_handler_thread(int irq, void *data)
>>>>    	struct panfrost_device *pfdev = data;
>>>>    
>>>>    	panfrost_job_handle_irqs(pfdev);
>>>> -	job_write(pfdev, JOB_INT_MASK,
>>>> -		  GENMASK(16 + NUM_JOB_SLOTS - 1, 16) |
>>>> -		  GENMASK(NUM_JOB_SLOTS - 1, 0));
>>>> +
>>>> +	/* Enable interrupts only if we're not about to get suspended */
>>>> +	if (!test_bit(PANFROST_COMP_BIT_JOB, pfdev->is_suspending))
>>>
>>> The irq-line is requested with IRQF_SHARED, meaning the line might be
>>> shared between all three GPU IRQs, but also with other devices. I think
>>> if we want to be totally safe, we need to also check this is_suspending
>>> field in the hard irq handlers before accessing the xxx_INT_yyy
>>> registers.
>>>    
>>
>> This would mean that we would have to force canceling jobs in the suspend
>> handler, but if the IRQ never fired, would we still be able to find the
>> right bits flipped in JOB_INT_RAWSTAT?
> 
> There should be no jobs left if we enter suspend. If there is, that's a
> bug we should fix, but I'm digressing.
> 
>>
>>   From what I understand, are you suggesting to call, in job_suspend_irq()
>> something like
>>
>> void panfrost_job_suspend_irq(struct panfrost_device *pfdev)
>> {
>>           u32 status;
>>
>> 	set_bit(PANFROST_COMP_BIT_JOB, pfdev->is_suspending);
>>
>> 	job_write(pfdev, JOB_INT_MASK, 0);
>> 	synchronize_irq(pfdev->js->irq);
>>
>> 	status = job_read(pfdev, JOB_INT_STAT);
> 
> I guess you meant _RAWSTAT. _STAT should always be zero after we've
> written 0 to _INT_MASK.
> 

Whoops! Yes, as I wrote up there, I meant _RAWSTAT, sorry! :-)

>> 	if (status)
>> 		panfrost_job_irq_handler_thread(pfdev->js->irq, (void*)pfdev);
> 
> Nope, we don't need to read the STAT reg and forcibly call the threaded
> handler if it's != 0. The synchronize_irq() call should do exactly that
> (make sure all pending interrupts are processed before returning), and
> our previous job_write(pfdev, JOB_INT_MASK, 0) guarantees that no new
> interrupts will kick in after that point.
> 

Unless we synchronize_irq() *before* masking all interrupts (which would be
wrong, as some interrupt could still fire after execution of the ISR), we get
*either of* two scenarios:

  - COMP_BIT_JOB is not set, softirq thread unmasks some interrupts by
    writing to JOB_INT_MASK; or
  - COMP_BIT_JOB is set, hardirq handler returns IRQ_NONE, the threaded
    interrupt handler doesn't get executed, jobs are not canceled.

So if we don't forbicly call the threaded handler if RAWSTAT != 0 in there,
and if the extra check is present in the hardirq handler, and if the hardirq
handler wasn't executed already before our synchronize_irq() call (so: if the
hardirq execution has to be done to synchronize irqs), we are not guaranteeing
that jobs cancellation/dequeuing/removal/whatever-handling is done before
entering suspend.

That, unless the suggestion was to call panfrost_job_handle_irqs() instead of
the handler thread like that (because reading it back, it makes sense to do so).

Cheers!

>> }
>>
>> and then while still retaining the check in the IRQ thread handler, also
>> check it in the hardirq handler like
>>
>> static irqreturn_t panfrost_job_irq_handler(int irq, void *data)
>> {
>> 	struct panfrost_device *pfdev = data;
>> 	u32 status;
>>
>> 	if (!test_bit(PANFROST_COMP_BIT_JOB, pfdev->is_suspending))
>> 		return IRQ_NONE;
> 
> Yes, that's the extra check I was talking about, and that's also the
> very reason I'm suggesting to call this field suspended_irqs instead of
> is_suspending. Ultimately, each bit in this bitmap encodes the status
> of a specific IRQ, not the transition from active-to-suspended,
> otherwise we'd be clearing the bit at the end of
> panfrost_job_suspend_irq(), right after the synchronize_irq(). But if
> we were doing that, our hard IRQ handler could be called because other
> devices raised an interrupt on the very same IRQ line while we are
> suspended, and we'd be doing an invalid GPU reg read while the
> clks/power-domains are off.
> 
>>
>> 	status = job_read(pfdev, JOB_INT_STAT);
>> 	if (!status)
>> 	        return IRQ_NONE;
>>
>> 	job_write(pfdev, JOB_INT_MASK, 0);
>> 	return IRQ_WAKE_THREAD;
>> }
>>
>> (rinse and repeat for panfrost_mmu)
>>
>> ..or am I misunderstanding you?
>>
>> Cheers,
>> Angelo
>>
>>
>

Boris Brezillon Nov. 28, 2023, 4:41 p.m. UTC | #10

On Tue, 28 Nov 2023 17:10:17 +0100
AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
wrote:

> >> 	if (status)
> >> 		panfrost_job_irq_handler_thread(pfdev->js->irq, (void*)pfdev);  
> > 
> > Nope, we don't need to read the STAT reg and forcibly call the threaded
> > handler if it's != 0. The synchronize_irq() call should do exactly that
> > (make sure all pending interrupts are processed before returning), and
> > our previous job_write(pfdev, JOB_INT_MASK, 0) guarantees that no new
> > interrupts will kick in after that point.
> >   
> 
> Unless we synchronize_irq() *before* masking all interrupts (which would be
> wrong, as some interrupt could still fire after execution of the ISR), we get
> *either of* two scenarios:
> 
>   - COMP_BIT_JOB is not set, softirq thread unmasks some interrupts by
>     writing to JOB_INT_MASK; or
>   - COMP_BIT_JOB is set, hardirq handler returns IRQ_NONE, the threaded
>     interrupt handler doesn't get executed, jobs are not canceled.
> 
> So if we don't forbicly call the threaded handler if RAWSTAT != 0 in there,
> and if the extra check is present in the hardirq handler, and if the hardirq
> handler wasn't executed already before our synchronize_irq() call (so: if the
> hardirq execution has to be done to synchronize irqs), we are not guaranteeing
> that jobs cancellation/dequeuing/removal/whatever-handling is done before
> entering suspend.

Except the job event processing should have happened before we reached
that point. panfrost_xxx_suspend_irq() are just here to make sure

- we're done processing pending IRQs that we started processing before
  the _INT_MASK update happened
- we ignore new ones, if any

If we end up with unprocessed JOB/MMU irqs we care about when we're
here, this should be fixed by:

1. Making sure the paths updating the MMU AS are retaining a runtime PM
  ref (pm_runtime_get_sync()) before doing their stuff, and releasing
  it (pm_runtime_put()) when they are done

2. Making sure we retain a runtime PM ref while we have jobs queued to
   the various JM queues

3. Making sure we acquire a runtime PM ref when we are about to push a
   job to one of the JM queue

For #2 and #3, we retain one runtime PM ref per active job, just before
queuing it [1], and release the ref when the job is completed [2][3].
We're not supposed to receive interrupts if we have no active jobs, and
if we do, we can safely ignore them, because there's not much we would
do with those anyway.

For #1, we retain the runtime PM ref when flushing TLBs of an
active AS, and when destroying an active MMU context. The last
operation that requires touching GPU regs is panfrost_mmu_enable(),
which is called from panfrost_mmu_as_get(), which is turn is called
from panfrost_job_hw_submit() after this function has acquired a
runtime PM ref. All MMU updates are synchronous, and the interrupts
that might result from an AS are caused by GPU jobs. Meaning that any
MMU interrupt remaining when we're in the suspend path can safely be
ignored.

> 
> That, unless the suggestion was to call panfrost_job_handle_irqs() instead of
> the handler thread like that (because reading it back, it makes sense to do so).

Nope, the suggestion was to keep things unchanged in
panfrost_job_suspend_irq(), and just add the extra is_suspended check
in panfrost_job_irq_handler().

[1]https://elixir.bootlin.com/linux/v6.7-rc3/source/drivers/gpu/drm/panfrost/panfrost_job.c#L207
[2]https://elixir.bootlin.com/linux/v6.7-rc3/source/drivers/gpu/drm/panfrost/panfrost_job.c#L462
[3]https://elixir.bootlin.com/linux/v6.7-rc3/source/drivers/gpu/drm/panfrost/panfrost_job.c#L481
[4]https://elixir.bootlin.com/linux/v6.7-rc3/source/drivers/gpu/drm/panfrost/panfrost_mmu.c#L279
[5]https://elixir.bootlin.com/linux/v6.7-rc3/source/drivers/gpu/drm/panfrost/panfrost_mmu.c#L555

[v2,3/3] drm/panfrost: Synchronize and disable interrupts before powering off

Commit Message

Comments

Patch