Message ID | 20240821014238.338864-1-soxiebing@163.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v1] ALSA: hda: fix snd_hda_bus_reset when single_cmd is not supported |
On Wed, 21 Aug 2024 03:42:38 +0200,
soxiebing wrote:
>
> From: songxiebing <songxiebing@kylinos.cn>
>
> When an azx_get_response timeout occurs, ensure that bus_reset
> can be used when fallback_to_single_cmd is not supported.
>
> Signed-off-by: songxiebing <songxiebing@kylinos.cn>

Why do you need this change? Does it fix any real problem you faced?


thanks,

Takashi

>Why do you need this change? Does it fix any real problem you faced?

Thanks for the reply. I have been testing these past few days, but the
problem still exists even when using bus reset.

The problem I encountered is that hda_call_codec_resume ran into the
120-second timeout (with CONFIG_DPM_WATCHDOG defined) when doing S4;
an azx_get_response timeout occurred. It is a low-probability event.

To avoid exceeding 120s, can I change the count value to 3 in
hda_set_power_state?

On Mon, 26 Aug 2024 11:19:58 +0200,
soxiebing wrote:
>
> Thanks for the reply. I have been testing these past few days, but the
> problem still exists even when using bus reset.
>
> The problem I encountered is that hda_call_codec_resume ran into the
> 120-second timeout (with CONFIG_DPM_WATCHDOG defined) when doing S4;
> an azx_get_response timeout occurred. It is a low-probability event.
>
> To avoid exceeding 120s, can I change the count value to 3 in
> hda_set_power_state?

So the change you suggested isn't any real "fix" but a way to allow the
possible workaround with single_cmd to be applicable somehow in a
different form. Then we should rather try debugging the original
issue instead of changing it.

Does the response timeout happen *during* the S4 suspend, or during S4
resume, or after S4 resume?

Note that the behavior you changed applies only when the single_cmd
option is set explicitly, so it's more or less the designed behavior,
and I don't think it's good to change it blindly.


Takashi

>So the change you suggested isn't any real "fix" but a way to allow the
>possible workaround with single_cmd to be applicable somehow in a
>different form. Then we should rather try debugging the original
>issue instead of changing it.
>
>Does the response timeout happen *during* the S4 suspend, or during S4
>resume, or after S4 resume?
>
>Note that the behavior you changed applies only when the single_cmd
>option is set explicitly, so it's more or less the designed behavior,
>and I don't think it's good to change it blindly.

The timeout happens during S4 resume.

In the hda_set_power_state function, the count value allows at most 10
attempts. That feels like a lot of retries; can we change it to 3 to
avoid exceeding 120s?

thanks,
soxiebing

On Tue, 27 Aug 2024 05:30:43 +0200,
soxiebing wrote:
>
> The timeout happens during S4 resume.
>
> In the hda_set_power_state function, the count value allows at most 10
> attempts. That feels like a lot of retries; can we change it to 3 to
> avoid exceeding 120s?

No, no, if such a long timeout happens, something is already wrong.
Extending it is no real fix at all.

You need to figure out which access or action causes the problem
first.


Takashi

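The retry question in this exchange comes down to simple arithmetic. The following is a minimal sketch in C, assuming each attempt can block for a fixed worst-case time. The names `per_try_ms`, `worst_case_ms()`, `fits_watchdog()`, and `check_retry_budget()` are hypothetical and are not kernel APIs, and the 12-second per-attempt figure is purely illustrative. Only the 120-second watchdog limit and the retry counts 10 and 3 come from the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* 120 s: the DPM watchdog timeout mentioned in the thread. */
#define DPM_WATCHDOG_MS 120000UL

/* Worst case: every one of `retries` attempts blocks for per_try_ms. */
unsigned long worst_case_ms(unsigned int retries, unsigned long per_try_ms)
{
	return (unsigned long)retries * per_try_ms;
}

/* True if the worst case stays strictly below the watchdog limit. */
bool fits_watchdog(unsigned int retries, unsigned long per_try_ms)
{
	return worst_case_ms(retries, per_try_ms) < DPM_WATCHDOG_MS;
}

/* Illustration only: 12000 ms per attempt is an assumed value. */
void check_retry_budget(void)
{
	assert(!fits_watchdog(10, 12000UL)); /* 10 attempts reach the limit */
	assert(fits_watchdog(3, 12000UL));   /* 3 attempts stay below it */
}
```

As the reply above points out, a smaller count only shortens the stall; it does not explain why the codec stops responding during S4 resume.
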
diff --git a/sound/pci/hda/hda_controller.c b/sound/pci/hda/hda_controller.c
index 5d86e5a9c814..5ea198d67bda 100644
--- a/sound/pci/hda/hda_controller.c
+++ b/sound/pci/hda/hda_controller.c
@@ -809,10 +809,6 @@ static int azx_rirb_get_response(struct hdac_bus *bus, unsigned int addr,
 		return -EIO;
 	}
 
-	/* no fallback mechanism? */
-	if (!chip->fallback_to_single_cmd)
-		return -EIO;
-
 	/* a fatal communication error; need either to reset or to fallback
 	 * to the single_cmd mode
 	 */
@@ -824,6 +820,10 @@ static int azx_rirb_get_response(struct hdac_bus *bus, unsigned int addr,
 		return -EAGAIN; /* give a chance to retry */
 	}
 
+	/* no fallback mechanism? */
+	if (!chip->fallback_to_single_cmd)
+		return -EIO;
+
 	dev_err(chip->card->dev,
 		"azx_get_response timeout, switching to single_cmd mode: last cmd=0x%08x\n",
 		bus->last_cmd[addr]);
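
The effect of moving the check can be modelled outside the kernel. The following is a minimal sketch in C, not the driver code: `can_reset` is a hypothetical stand-in for the bus-reset condition, which lies outside the diff context, and the enum and function names are invented for illustration. Only the ordering of the two checks and the `-EIO` / `-EAGAIN` outcomes are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Outcomes of the timeout path touched by the patch. */
enum timeout_action {
	ACT_FAIL_EIO,     /* give up with -EIO */
	ACT_RESET_EAGAIN, /* take the reset branch, return -EAGAIN to retry */
	ACT_SINGLE_CMD    /* switch to single_cmd mode */
};

/* Before the patch: the fallback check comes first and gates the reset. */
enum timeout_action decide_before(bool fallback_to_single_cmd, bool can_reset)
{
	if (!fallback_to_single_cmd)
		return ACT_FAIL_EIO;
	if (can_reset)
		return ACT_RESET_EAGAIN;
	return ACT_SINGLE_CMD;
}

/* After the patch: the reset branch is tried before the fallback check. */
enum timeout_action decide_after(bool fallback_to_single_cmd, bool can_reset)
{
	if (can_reset)
		return ACT_RESET_EAGAIN;
	if (!fallback_to_single_cmd)
		return ACT_FAIL_EIO;
	return ACT_SINGLE_CMD;
}

/* The two orderings differ only when fallback is off and a reset is possible. */
void check_orderings(void)
{
	assert(decide_before(false, true) == ACT_FAIL_EIO);
	assert(decide_after(false, true) == ACT_RESET_EAGAIN);
	assert(decide_before(true, true) == decide_after(true, true));
	assert(decide_before(true, false) == decide_after(true, false));
	assert(decide_before(false, false) == decide_after(false, false));
}
```

In this model the patch changes exactly one case: with `fallback_to_single_cmd` unset and a reset available, the old order returns `-EIO` immediately while the new order attempts the reset first.
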