[v1] ALSA: hda: fix snd_hda_bus_reset when single_cmd is not supported

Message ID 20240821014238.338864-1-soxiebing@163.com (mailing list archive)
State New, archived

Commit Message

soxiebing Aug. 21, 2024, 1:42 a.m. UTC
From: songxiebing <songxiebing@kylinos.cn>

When an azx_get_response timeout occurs, make sure a bus reset can
still be attempted when fallback_to_single_cmd is not supported.

Signed-off-by: songxiebing <songxiebing@kylinos.cn>
---
 sound/pci/hda/hda_controller.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Takashi Iwai Aug. 21, 2024, 7:11 a.m. UTC | #1
On Wed, 21 Aug 2024 03:42:38 +0200,
soxiebing wrote:
> 
> From: songxiebing <songxiebing@kylinos.cn>
> 
> When an azx_get_desponse timeout occurs, ensure that bus_reset
> can be used when fallback_to_single_cmd is not supported.
> 
> Signed-off-by: songxiebing <songxiebing@kylinos.cn>

Why do you need to change?  Does it fix any real problem you faced?


thanks,

Takashi
soxiebing Aug. 26, 2024, 9:19 a.m. UTC | #2
>On Wed, 21 Aug 2024 03:42:38 +0200,
>soxiebing wrote:
>> 
>> From: songxiebing <songxiebing@kylinos.cn>
>> 
>> When an azx_get_desponse timeout occurs, ensure that bus_reset
>> can be used when fallback_to_single_cmd is not supported.
>> 
>> Signed-off-by: songxiebing <songxiebing@kylinos.cn>
>
>Why do you need to change?  Does it fix any real problem you faced?

Thanks for the reply. I have been testing over the past few days, but
the problem still exists even with the bus reset.

The problem I encountered is that hda_call_codec_resume hit the
120-second timeout (CONFIG_DPM_WATCHDOG is defined) when doing S4,
after an azx_get_response timeout occurred. It is a low-probability event.

To avoid exceeding 120s, can I change the count value to 3 in
hda_set_power_state?
Takashi Iwai Aug. 26, 2024, 10:01 a.m. UTC | #3
On Mon, 26 Aug 2024 11:19:58 +0200,
soxiebing wrote:
> 
> >On Wed, 21 Aug 2024 03:42:38 +0200,
> >soxiebing wrote:
> >> 
> >> From: songxiebing <songxiebing@kylinos.cn>
> >> 
> >> When an azx_get_desponse timeout occurs, ensure that bus_reset
> >> can be used when fallback_to_single_cmd is not supported.
> >> 
> >> Signed-off-by: songxiebing <songxiebing@kylinos.cn>
> >
> >Why do you need to change?  Does it fix any real problem you faced?
> 
> Thanks for reply, i am testing all these days, but the problem is
> still exists even if using bus reset.
>
> The problem i encountered is that hda_call_codec_resume returned 
> timeout of 120 seconds(defined CONFIG_DPM_WATCHDOG)) when doing s4, 
> azx_get_response timeout occured, it is a low probability event.
> 
> To avoid exceeding 120s, can i change the count value to 3 in 
> hda_set_power_state ?

So the change you suggested isn't for any real "fix" but to allow the
possible workaround with single_cmd to be applicable somehow in a
different form.  Then we should rather try debugging the original
issue, instead of changing it.

Does the response timeout happen *during* the S4 suspend, or during S4
resume, or after S4 resume?

Note that the behavior you changed applies only when the single_cmd
option is set explicitly, so it's more or less the designed behavior,
and I don't think it's good to change it blindly.


Takashi
soxiebing Aug. 27, 2024, 3:30 a.m. UTC | #4
>On Mon, 26 Aug 2024 11:19:58 +0200,
>soxiebing wrote:
>> 
>> >On Wed, 21 Aug 2024 03:42:38 +0200,
>> >soxiebing wrote:
>> >> 
>> >> From: songxiebing <songxiebing@kylinos.cn>
>> >> 
>> >> When an azx_get_desponse timeout occurs, ensure that bus_reset
>> >> can be used when fallback_to_single_cmd is not supported.
>> >> 
>> >> Signed-off-by: songxiebing <songxiebing@kylinos.cn>
>> >
>> >Why do you need to change?  Does it fix any real problem you faced?
>> 
>> Thanks for reply, i am testing all these days, but the problem is
>> still exists even if using bus reset.
>>
>> The problem i encountered is that hda_call_codec_resume returned 
>> timeout of 120 seconds(defined CONFIG_DPM_WATCHDOG)) when doing s4, 
>> azx_get_response timeout occured, it is a low probability event.
>> 
>> To avoid exceeding 120s, can i change the count value to 3 in 
>> hda_set_power_state ?
>
>So the change you suggested isn't for any real "fix" but to allow the
>possible workaround with single_cmd to be applicable somehow in a
>different form.  Then we should rather try debugging the original
>issue, instead of change it.
>
>Does the response timeout happen *during* the S4 suspend, or during S4
>resume, or after S4 resume?
>
>To be noted, the behavior you changed is only for the single_cmd
>option is set explicitly, so it's more or less the designed behavior,
>and I don't think it's good to change blindly.

The timeout happens during S4 resume.

In the hda_set_power_state function, the count allows at most 10 tries.
That feels like a lot of retries; can we change it to 3 to avoid
exceeding 120s?

thanks,

soxiebing
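
The arithmetic behind the retry-count question above can be sketched as a simple upper bound. This is only an illustration: the function name and all three inputs are hypothetical, and neither the number of commands issued per retry nor the per-command timeout is stated in this thread.

```c
#include <stdint.h>

/*
 * Hypothetical upper bound, in milliseconds, on the time one power-state
 * transition can spend waiting for codec responses that never arrive.
 * All three inputs are assumptions supplied by the caller; none of them
 * is read from the driver.
 */
static uint64_t worst_case_wait_ms(unsigned int retries,
                                   unsigned int cmds_per_retry,
                                   unsigned int timeout_ms_per_cmd)
{
    return (uint64_t)retries * cmds_per_retry * timeout_ms_per_cmd;
}
```

With assumed values of 12 timed-out commands per retry and 1000 ms per command, 10 retries give 120000 ms and 3 retries give 36000 ms. The bound scales linearly with the retry count, so lowering the count shortens the stall without addressing why the codec stops responding.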
Takashi Iwai Aug. 27, 2024, 5:53 a.m. UTC | #5
On Tue, 27 Aug 2024 05:30:43 +0200,
soxiebing wrote:
> 
> >On Mon, 26 Aug 2024 11:19:58 +0200,
> >soxiebing wrote:
> >> 
> >> >On Wed, 21 Aug 2024 03:42:38 +0200,
> >> >soxiebing wrote:
> >> >> 
> >> >> From: songxiebing <songxiebing@kylinos.cn>
> >> >> 
> >> >> When an azx_get_desponse timeout occurs, ensure that bus_reset
> >> >> can be used when fallback_to_single_cmd is not supported.
> >> >> 
> >> >> Signed-off-by: songxiebing <songxiebing@kylinos.cn>
> >> >
> >> >Why do you need to change?  Does it fix any real problem you faced?
> >> 
> >> Thanks for reply, i am testing all these days, but the problem is
> >> still exists even if using bus reset.
> >>
> >> The problem i encountered is that hda_call_codec_resume returned 
> >> timeout of 120 seconds(defined CONFIG_DPM_WATCHDOG)) when doing s4, 
> >> azx_get_response timeout occured, it is a low probability event.
> >> 
> >> To avoid exceeding 120s, can i change the count value to 3 in 
> >> hda_set_power_state ?
> >
> >So the change you suggested isn't for any real "fix" but to allow the
> >possible workaround with single_cmd to be applicable somehow in a
> >different form.  Then we should rather try debugging the original
> >issue, instead of change it.
> >
> >Does the response timeout happen *during* the S4 suspend, or during S4
> >resume, or after S4 resume?
> >
> >To be noted, the behavior you changed is only for the single_cmd
> >option is set explicitly, so it's more or less the designed behavior,
> >and I don't think it's good to change blindly.
> 
> The timeout happen during S4 resume. 
> 
> In hda_set_power_state function, the count value setting at most 10 times,
> i feel like there are a lot of retries, can we change it to 3 times, to avoid
> exceeding 120s ?

No, no, if such a long timeout happens, something is already wrong.
Changing the retry count is no real fix at all.

You need to figure out first which access or action causes the
problem.


Takashi
Patch

diff --git a/sound/pci/hda/hda_controller.c b/sound/pci/hda/hda_controller.c
index 5d86e5a9c814..5ea198d67bda 100644
--- a/sound/pci/hda/hda_controller.c
+++ b/sound/pci/hda/hda_controller.c
@@ -809,10 +809,6 @@  static int azx_rirb_get_response(struct hdac_bus *bus, unsigned int addr,
 		return -EIO;
 	}
 
-	/* no fallback mechanism? */
-	if (!chip->fallback_to_single_cmd)
-		return -EIO;
-
 	/* a fatal communication error; need either to reset or to fallback
 	 * to the single_cmd mode
 	 */
@@ -824,6 +820,10 @@  static int azx_rirb_get_response(struct hdac_bus *bus, unsigned int addr,
 		return -EAGAIN; /* give a chance to retry */
 	}
 
+	/* no fallback mechanism? */
+	if (!chip->fallback_to_single_cmd)
+		return -EIO;
+
 	dev_err(chip->card->dev,
 		"azx_get_response timeout, switching to single_cmd mode: last cmd=0x%08x\n",
 		bus->last_cmd[addr]);
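
The effect of moving the fallback_to_single_cmd check can be sketched with a small standalone model. This is not the driver code: struct chip_model, try_bus_reset() and the two recover_*() functions are hypothetical names, and the condition guarding the reset branch is an assumption, since the diff context shows only its "return -EAGAIN" line.

```c
#include <errno.h>
#include <stdbool.h>

/* Simplified stand-in for the driver state; not the kernel's structs. */
struct chip_model {
    bool fallback_to_single_cmd; /* flag tested in the patch */
    bool allow_bus_reset;        /* assumed: a bus reset may be tried */
    bool response_reset;         /* assumed: a reset was already requested */
    bool single_cmd;             /* set once the fallback is taken */
};

/* Assumed shape of the reset branch that ends in "return -EAGAIN". */
static bool try_bus_reset(struct chip_model *c)
{
    if (c->allow_bus_reset && !c->response_reset) {
        c->response_reset = true;
        return true;
    }
    return false;
}

/* Order before the patch: the fallback check gates everything below it. */
static int recover_before_patch(struct chip_model *c)
{
    if (!c->fallback_to_single_cmd)
        return -EIO;
    if (try_bus_reset(c))
        return -EAGAIN; /* give a chance to retry */
    c->single_cmd = true;
    c->response_reset = false;
    return -EIO;
}

/* Order after the patch: the reset is tried before the fallback check. */
static int recover_after_patch(struct chip_model *c)
{
    if (try_bus_reset(c))
        return -EAGAIN; /* give a chance to retry */
    if (!c->fallback_to_single_cmd)
        return -EIO;
    c->single_cmd = true;
    c->response_reset = false;
    return -EIO;
}
```

In this model, a chip without the single_cmd fallback fails immediately with -EIO under the old order, while under the new order it first gets one -EAGAIN retry through the bus reset. Chips with the fallback enabled behave the same either way.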