diff mbox series

[v4,01/11] ASoC: SOF: core: add 'no_wq' probe and remove callbacks

Message ID 20230830153652.217855-2-maarten.lankhorst@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series sound: Use -EPROBE_DEFER instead of i915 module loading. | expand

Commit Message

Maarten Lankhorst Aug. 30, 2023, 3:36 p.m. UTC
From: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>

The existing DSP probe may be handled in a workqueue to allow for
extra time, typically for the i915 request_module and HDAudio codec
handling.

With the upcoming changes for i915/Xe driver relying on the
-EPROBE_DEFER mechanism, we need to have a first pass of the probe
which cannot be pushed to a workqueue. Introduce 2 new optional
callbacks.

Note that instead of probe_no_wq/probe we could have use a more
self-explanatory naming such as probe/probe_wq_allowed, but that would
have been a very intrusive change.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
---
 sound/soc/sof/core.c     | 11 +++++++++--
 sound/soc/sof/ops.h      | 16 ++++++++++++++++
 sound/soc/sof/sof-priv.h |  2 ++
 3 files changed, 27 insertions(+), 2 deletions(-)

Comments

Maarten Lankhorst Aug. 31, 2023, 10:44 a.m. UTC | #1
Hey,

Den 2023-08-30 kl. 19:13, skrev Pierre-Louis Bossart:
>> +static inline int snd_sof_remove_no_wq(struct snd_sof_dev *sdev)
>> +{
>> +	if (sof_ops(sdev)->remove_no_wq)
>> +		return sof_ops(sdev)->remove_no_wq(sdev);
>> +
>> +	return 0;
>> +}
>>   	/* probe/remove/shutdown */
>> +	int (*probe_no_wq)(struct snd_sof_dev *sof_dev); /* optional */
>> +	int (*remove_no_wq)(struct snd_sof_dev *sof_dev); /* optional */
> My initial PR didn't have this remove_no_wq() callback.
>
> For symmetry it could be useful if it is meant to undo what the
> probe_no_wq() did, but conceptually the first thing we do in the remove
> is make sure that workqueue is either not scheduled or we wait until it
> completes.
>
> int snd_sof_device_remove(struct device *dev)
> {
> 	struct snd_sof_dev *sdev = dev_get_drvdata(dev);
> 	struct snd_sof_pdata *pdata = sdev->pdata;
> 	int ret;
>
> 	if (IS_ENABLED(CONFIG_SND_SOC_SOF_PROBE_WORK_QUEUE))
> 		cancel_work_sync(&sdev->probe_work);

That is exactly what I was using it for later on.

I had to have a counter to hda_init() in patch 10/11.

Cheers,
Maarten
Kai Vehmanen Sept. 1, 2023, 12:15 p.m. UTC | #2
Hi,

On Wed, 30 Aug 2023, Maarten Lankhorst wrote:

> With the upcoming changes for i915/Xe driver relying on the
> -EPROBE_DEFER mechanism, we need to have a first pass of the probe
> which cannot be pushed to a workqueue. Introduce 2 new optional
> callbacks.
[...]
> diff --git a/sound/soc/sof/core.c b/sound/soc/sof/core.c
> index 30db685cc5f4b..54c384a5d6140 100644
> --- a/sound/soc/sof/core.c
> +++ b/sound/soc/sof/core.c
> @@ -327,8 +327,6 @@ static int sof_probe_continue(struct snd_sof_dev *sdev)
>  dsp_err:
>  	snd_sof_remove(sdev);
>  probe_err:
> -	sof_ops_free(sdev);
> -

this seems a bit out-of-place in this patch. It seems a valid change,
but not really related to this patch, right?

We seem to have a related fix waiting to be sent to alsa-devel, by
Peter:
"ASoC: SOF: core: Only call sof_ops_free() on remove if the probe wa"
https://github.com/thesofproject/linux/pull/4515

... not yet in Mark's tree.

Otherwise patch looks good to me.

Br, Kai
Kai Vehmanen Sept. 1, 2023, 12:22 p.m. UTC | #3
Hi,

On Wed, 30 Aug 2023, Pierre-Louis Bossart wrote:

> >  	/* probe/remove/shutdown */
> > +	int (*probe_no_wq)(struct snd_sof_dev *sof_dev); /* optional */
> > +	int (*remove_no_wq)(struct snd_sof_dev *sof_dev); /* optional */
> 
> My initial PR didn't have this remove_no_wq() callback.
> 
> For symmetry it could be useful if it is meant to undo what the
> probe_no_wq() did, but conceptually the first thing we do in the remove
> is make sure that workqueue is either not scheduled or we wait until it
> completes.

I think that's exactly what the patch ends up with, remove_no_wq releases 
resources acquired in probe_no_wq, i.e. they are symmetrical.
 
> int snd_sof_device_remove(struct device *dev)
> {
> 	struct snd_sof_dev *sdev = dev_get_drvdata(dev);
> 	struct snd_sof_pdata *pdata = sdev->pdata;
> 	int ret;
> 
> 	if (IS_ENABLED(CONFIG_SND_SOC_SOF_PROBE_WORK_QUEUE))
> 		cancel_work_sync(&sdev->probe_work);

So this seems ok as well:
 - if wq is used, at remove, we first wait wq to be finished 
   and only then proceed with removal
 - the remove_no_wq is run only when the first step is completed

Br, Kai
Peter Ujfalusi Sept. 1, 2023, 12:44 p.m. UTC | #4
On 01/09/2023 15:15, Kai Vehmanen wrote:
> Hi,
> 
> On Wed, 30 Aug 2023, Maarten Lankhorst wrote:
> 
>> With the upcoming changes for i915/Xe driver relying on the
>> -EPROBE_DEFER mechanism, we need to have a first pass of the probe
>> which cannot be pushed to a workqueue. Introduce 2 new optional
>> callbacks.
> [...]
>> diff --git a/sound/soc/sof/core.c b/sound/soc/sof/core.c
>> index 30db685cc5f4b..54c384a5d6140 100644
>> --- a/sound/soc/sof/core.c
>> +++ b/sound/soc/sof/core.c
>> @@ -327,8 +327,6 @@ static int sof_probe_continue(struct snd_sof_dev *sdev)
>>  dsp_err:
>>  	snd_sof_remove(sdev);
>>  probe_err:
>> -	sof_ops_free(sdev);
>> -
> 
> this seems a bit out-of-place in this patch. It seems a valid change,
> but not really related to this patch, right?

The ops needs to be preserved even if the wq fails since the patch wants
to call snd_sof_remove_no_wq() unconditionally on remove.

> We seem to have a related fix waiting to be sent to alsa-devel, by
> Peter:
> "ASoC: SOF: core: Only call sof_ops_free() on remove if the probe wa"
> https://github.com/thesofproject/linux/pull/4515

I guess we can revert that in sof-dev, if this is the preferred way?

> ... not yet in Mark's tree.
> 
> Otherwise patch looks good to me.

I would have not created the snd_sof_remove_no_wq() as it makes not much
functional sense.
It might be even better if the remove in the wq would do the
hda_codec_i915_exit() as the module will remain in there until the user
removes it.

> 
> Br, Kai
Pierre-Louis Bossart Sept. 5, 2023, 12:37 p.m. UTC | #5
On 9/1/23 08:44, Péter Ujfalusi wrote:
> 
> 
> On 01/09/2023 15:15, Kai Vehmanen wrote:
>> Hi,
>>
>> On Wed, 30 Aug 2023, Maarten Lankhorst wrote:
>>
>>> With the upcoming changes for i915/Xe driver relying on the
>>> -EPROBE_DEFER mechanism, we need to have a first pass of the probe
>>> which cannot be pushed to a workqueue. Introduce 2 new optional
>>> callbacks.
>> [...]
>>> diff --git a/sound/soc/sof/core.c b/sound/soc/sof/core.c
>>> index 30db685cc5f4b..54c384a5d6140 100644
>>> --- a/sound/soc/sof/core.c
>>> +++ b/sound/soc/sof/core.c
>>> @@ -327,8 +327,6 @@ static int sof_probe_continue(struct snd_sof_dev *sdev)
>>>  dsp_err:
>>>  	snd_sof_remove(sdev);
>>>  probe_err:
>>> -	sof_ops_free(sdev);
>>> -
>>
>> this seems a bit out-of-place in this patch. It seems a valid change,
>> but not really related to this patch, right?
> 
> The ops needs to be preserved even if the wq fails since the patch wants
> to call snd_sof_remove_no_wq() unconditionally on remove.
> 
>> We seem to have a related fix waiting to be sent to alsa-devel, by
>> Peter:
>> "ASoC: SOF: core: Only call sof_ops_free() on remove if the probe wa"
>> https://github.com/thesofproject/linux/pull/4515
> 
> I guess we can revert that in sof-dev, if this is the preferred way?
> 
>> ... not yet in Mark's tree.
>>
>> Otherwise patch looks good to me.
> 
> I would have not created the snd_sof_remove_no_wq() as it makes not much
> functional sense.
> It might be even better if the remove in the wq would do the
> hda_codec_i915_exit() as the module will remain in there until the user
> removes it.

I think find all this very confusing, because there is no workqueue used
in the remove steps. The workqueue is only used ONCE during the probe.
Pierre-Louis Bossart Sept. 7, 2023, 5:29 p.m. UTC | #6
On 9/5/23 08:37, Pierre-Louis Bossart wrote:
> 
> 
> On 9/1/23 08:44, Péter Ujfalusi wrote:
>>
>>
>> On 01/09/2023 15:15, Kai Vehmanen wrote:
>>> Hi,
>>>
>>> On Wed, 30 Aug 2023, Maarten Lankhorst wrote:
>>>
>>>> With the upcoming changes for i915/Xe driver relying on the
>>>> -EPROBE_DEFER mechanism, we need to have a first pass of the probe
>>>> which cannot be pushed to a workqueue. Introduce 2 new optional
>>>> callbacks.
>>> [...]
>>>> diff --git a/sound/soc/sof/core.c b/sound/soc/sof/core.c
>>>> index 30db685cc5f4b..54c384a5d6140 100644
>>>> --- a/sound/soc/sof/core.c
>>>> +++ b/sound/soc/sof/core.c
>>>> @@ -327,8 +327,6 @@ static int sof_probe_continue(struct snd_sof_dev *sdev)
>>>>  dsp_err:
>>>>  	snd_sof_remove(sdev);
>>>>  probe_err:
>>>> -	sof_ops_free(sdev);
>>>> -
>>>
>>> this seems a bit out-of-place in this patch. It seems a valid change,
>>> but not really related to this patch, right?
>>
>> The ops needs to be preserved even if the wq fails since the patch wants
>> to call snd_sof_remove_no_wq() unconditionally on remove.
>>
>>> We seem to have a related fix waiting to be sent to alsa-devel, by
>>> Peter:
>>> "ASoC: SOF: core: Only call sof_ops_free() on remove if the probe wa"
>>> https://github.com/thesofproject/linux/pull/4515
>>
>> I guess we can revert that in sof-dev, if this is the preferred way?
>>
>>> ... not yet in Mark's tree.
>>>
>>> Otherwise patch looks good to me.
>>
>> I would have not created the snd_sof_remove_no_wq() as it makes not much
>> functional sense.
>> It might be even better if the remove in the wq would do the
>> hda_codec_i915_exit() as the module will remain in there until the user
>> removes it.
> 
> I think find all this very confusing, because there is no workqueue used
> in the remove steps. The workqueue is only used ONCE during the probe.

Maybe we should just remove any references to workqueues, and have

probe_start (cannot run in a wq)
probe (may run in a wq)
remove (cannot run in a wq, needs to call cancel_work_sync() if the
probe runs in a wq)
remove_last (cannot run in a wq, releases all resources acquired in
probe_start)

Or something similar that shows the symmetry between steps and when the
wq is allowed.
Peter Ujfalusi Sept. 11, 2023, 6:51 a.m. UTC | #7
On 07/09/2023 20:29, Pierre-Louis Bossart wrote:
>> I think find all this very confusing, because there is no workqueue used
>> in the remove steps. The workqueue is only used ONCE during the probe.
> 
> Maybe we should just remove any references to workqueues, and have
> 
> probe_start (cannot run in a wq)
> probe (may run in a wq)
> remove (cannot run in a wq, needs to call cancel_work_sync() if the
> probe runs in a wq)
> remove_last (cannot run in a wq, releases all resources acquired in
> probe_start)
> 
> Or something similar that shows the symmetry between steps and when the
> wq is allowed.

What we have atm:
snd_sof_probe - might be called from wq
snd_sof_remove - might be called from wq (cleans up the snd_sof_probe
		 step)

We want a callbacks for hardware/device probing, right, split the
snd_sof_probe (and remove) to be able to support a sane level of
deferred probing support.

With that in mind:
snd_sof_device_probe - Not called from wq (to handle deferred probing)
snd_sof_probe - might be called from wq

snd_sof_remove - might be called from wq (cleans up the snd_sof_probe
		 step)
snd_sof_device_remove - Not called from wq (to up the
			snd_sof_device_probe step)

Naming option: s/device/hardware

However, I think the snd_sof_device_remove itself is redundant and we
might not need it at all as in case we have wq and there is a failure in
there we do want to release resources as much as possible. The module
will be kept loaded (no deferred handling in wq) and that might block
PM, other devices to behave correctly. Iow, if the wq has failure we
should do a cleanup to the best effort to reach a level like the driver
is not even loaded.

Doable? I thin it is.
Pierre-Louis Bossart Sept. 12, 2023, 12:25 a.m. UTC | #8
> What we have atm:
> snd_sof_probe - might be called from wq
> snd_sof_remove - might be called from wq (cleans up the snd_sof_probe
> 		 step)

I don't think it's correct, snd_sof_remove cannot be called from a wq.
The device core knows nothing about workqueues.

> We want a callbacks for hardware/device probing, right, split the
> snd_sof_probe (and remove) to be able to support a sane level of
> deferred probing support.
> 
> With that in mind:
> snd_sof_device_probe - Not called from wq (to handle deferred probing)
> snd_sof_probe - might be called from wq
> 
> snd_sof_remove - might be called from wq (cleans up the snd_sof_probe
> 		 step)
> snd_sof_device_remove - Not called from wq (to up the
> 			snd_sof_device_probe step)
> 
> Naming option: s/device/hardware

I like the 'device' hint since it's directly related to the device (or
subsystem) callbacks.

> However, I think the snd_sof_device_remove itself is redundant and we
> might not need it at all as in case we have wq and there is a failure in
> there we do want to release resources as much as possible. The module
> will be kept loaded (no deferred handling in wq) and that might block
> PM, other devices to behave correctly. Iow, if the wq has failure we
> should do a cleanup to the best effort to reach a level like the driver
> is not even loaded.

If we have a failure in a workqueue used for probe, then we have to
clean-up everything since nothing in the device core will do so for us.
Peter Ujfalusi Sept. 12, 2023, 6:10 a.m. UTC | #9
On 12/09/2023 03:25, Pierre-Louis Bossart wrote:
> 
> 
>> What we have atm:
>> snd_sof_probe - might be called from wq
>> snd_sof_remove - might be called from wq (cleans up the snd_sof_probe
>> 		 step)
> 
> I don't think it's correct, snd_sof_remove cannot be called from a wq.
> The device core knows nothing about workqueues.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/sound/soc/sof/core.c#n328

it is called on the error path of sof_probe_continue(), which can be run
in a workque.

>> We want a callbacks for hardware/device probing, right, split the
>> snd_sof_probe (and remove) to be able to support a sane level of
>> deferred probing support.
>>
>> With that in mind:
>> snd_sof_device_probe - Not called from wq (to handle deferred probing)
>> snd_sof_probe - might be called from wq
>>
>> snd_sof_remove - might be called from wq (cleans up the snd_sof_probe
>> 		 step)
>> snd_sof_device_remove - Not called from wq (to up the
>> 			snd_sof_device_probe step)
>>
>> Naming option: s/device/hardware
> 
> I like the 'device' hint since it's directly related to the device (or
> subsystem) callbacks.
> 
>> However, I think the snd_sof_device_remove itself is redundant and we
>> might not need it at all as in case we have wq and there is a failure in
>> there we do want to release resources as much as possible. The module
>> will be kept loaded (no deferred handling in wq) and that might block
>> PM, other devices to behave correctly. Iow, if the wq has failure we
>> should do a cleanup to the best effort to reach a level like the driver
>> is not even loaded.
> 
> If we have a failure in a workqueue used for probe, then we have to
> clean-up everything since nothing in the device core will do so for us.

Yes, this makes the snd_sof_device_remove() redundant or at least the
definition of it is no longer a mirror of snd_sof_device_probe():

snd_sof_device_remove - might be called from wq (cleans up the
			snd_sof_device_probe step)

Any failure in sof_probe_continue() should execute the
snd_sof_device_remove(), snd_sof_remove() is only involved after the
snd_sof_probe() have returned with success.

I think this makes actually makes sense and it is well defined.
On module remove we need to take into account the case when we have
failed in wq similarly as we do currently (the resources have been freed
up already).
Pierre-Louis Bossart Sept. 12, 2023, 2:02 p.m. UTC | #10
On 9/12/23 02:10, Péter Ujfalusi wrote:
> 
> 
> On 12/09/2023 03:25, Pierre-Louis Bossart wrote:
>>
>>
>>> What we have atm:
>>> snd_sof_probe - might be called from wq
>>> snd_sof_remove - might be called from wq (cleans up the snd_sof_probe
>>> 		 step)
>>
>> I don't think it's correct, snd_sof_remove cannot be called from a wq.
>> The device core knows nothing about workqueues.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/sound/soc/sof/core.c#n328
> 
> it is called on the error path of sof_probe_continue(), which can be run
> in a workque.
> 
>>> We want a callbacks for hardware/device probing, right, split the
>>> snd_sof_probe (and remove) to be able to support a sane level of
>>> deferred probing support.
>>>
>>> With that in mind:
>>> snd_sof_device_probe - Not called from wq (to handle deferred probing)
>>> snd_sof_probe - might be called from wq
>>>
>>> snd_sof_remove - might be called from wq (cleans up the snd_sof_probe
>>> 		 step)
>>> snd_sof_device_remove - Not called from wq (to up the
>>> 			snd_sof_device_probe step)
>>>
>>> Naming option: s/device/hardware
>>
>> I like the 'device' hint since it's directly related to the device (or
>> subsystem) callbacks.
>>
>>> However, I think the snd_sof_device_remove itself is redundant and we
>>> might not need it at all as in case we have wq and there is a failure in
>>> there we do want to release resources as much as possible. The module
>>> will be kept loaded (no deferred handling in wq) and that might block
>>> PM, other devices to behave correctly. Iow, if the wq has failure we
>>> should do a cleanup to the best effort to reach a level like the driver
>>> is not even loaded.
>>
>> If we have a failure in a workqueue used for probe, then we have to
>> clean-up everything since nothing in the device core will do so for us.
> 
> Yes, this makes the snd_sof_device_remove() redundant or at least the
> definition of it is no longer a mirror of snd_sof_device_probe():
> 
> snd_sof_device_remove - might be called from wq (cleans up the
> 			snd_sof_device_probe step)
> 
> Any failure in sof_probe_continue() should execute the
> snd_sof_device_remove(), snd_sof_remove() is only involved after the
> snd_sof_probe() have returned with success.
> 
> I think this makes actually makes sense and it is well defined.
> On module remove we need to take into account the case when we have
> failed in wq similarly as we do currently (the resources have been freed
> up already).

Agree, I stand corrected, thanks Peter.
diff mbox series

Patch

diff --git a/sound/soc/sof/core.c b/sound/soc/sof/core.c
index 30db685cc5f4b..54c384a5d6140 100644
--- a/sound/soc/sof/core.c
+++ b/sound/soc/sof/core.c
@@ -327,8 +327,6 @@  static int sof_probe_continue(struct snd_sof_dev *sdev)
 dsp_err:
 	snd_sof_remove(sdev);
 probe_err:
-	sof_ops_free(sdev);
-
 	/* all resources freed, update state to match */
 	sof_set_fw_state(sdev, SOF_FW_BOOT_NOT_STARTED);
 	sdev->first_boot = true;
@@ -436,6 +434,14 @@  int snd_sof_device_probe(struct device *dev, struct snd_sof_pdata *plat_data)
 
 	sof_set_fw_state(sdev, SOF_FW_BOOT_NOT_STARTED);
 
+	/*
+	 * first pass of probe which isn't allowed to run in a work-queue,
+	 * typically to rely on -EPROBE_DEFER dependencies
+	 */
+	ret = snd_sof_probe_no_wq(sdev);
+	if (ret < 0)
+		return ret;
+
 	if (IS_ENABLED(CONFIG_SND_SOC_SOF_PROBE_WORK_QUEUE)) {
 		INIT_WORK(&sdev->probe_work, sof_probe_work);
 		schedule_work(&sdev->probe_work);
@@ -487,6 +493,7 @@  int snd_sof_device_remove(struct device *dev)
 		snd_sof_free_debug(sdev);
 		snd_sof_remove(sdev);
 	}
+	snd_sof_remove_no_wq(sdev);
 
 	sof_ops_free(sdev);
 
diff --git a/sound/soc/sof/ops.h b/sound/soc/sof/ops.h
index 9ab7b9be765bc..2a03b152e9313 100644
--- a/sound/soc/sof/ops.h
+++ b/sound/soc/sof/ops.h
@@ -38,6 +38,14 @@  static inline void sof_ops_free(struct snd_sof_dev *sdev)
 /* Mandatory operations are verified during probing */
 
 /* init */
+static inline int snd_sof_probe_no_wq(struct snd_sof_dev *sdev)
+{
+	if (sof_ops(sdev)->probe_no_wq)
+		return sof_ops(sdev)->probe_no_wq(sdev);
+
+	return 0;
+}
+
 static inline int snd_sof_probe(struct snd_sof_dev *sdev)
 {
 	return sof_ops(sdev)->probe(sdev);
@@ -51,6 +59,14 @@  static inline int snd_sof_remove(struct snd_sof_dev *sdev)
 	return 0;
 }
 
+static inline int snd_sof_remove_no_wq(struct snd_sof_dev *sdev)
+{
+	if (sof_ops(sdev)->remove_no_wq)
+		return sof_ops(sdev)->remove_no_wq(sdev);
+
+	return 0;
+}
+
 static inline int snd_sof_shutdown(struct snd_sof_dev *sdev)
 {
 	if (sof_ops(sdev)->shutdown)
diff --git a/sound/soc/sof/sof-priv.h b/sound/soc/sof/sof-priv.h
index d4f6702e93dcb..addaef282ee92 100644
--- a/sound/soc/sof/sof-priv.h
+++ b/sound/soc/sof/sof-priv.h
@@ -165,6 +165,8 @@  struct sof_firmware {
 struct snd_sof_dsp_ops {
 
 	/* probe/remove/shutdown */
+	int (*probe_no_wq)(struct snd_sof_dev *sof_dev); /* optional */
+	int (*remove_no_wq)(struct snd_sof_dev *sof_dev); /* optional */
 	int (*probe)(struct snd_sof_dev *sof_dev); /* mandatory */
 	int (*remove)(struct snd_sof_dev *sof_dev); /* optional */
 	int (*shutdown)(struct snd_sof_dev *sof_dev); /* optional */