Message ID | 20220328091411.31488-1-mkumard@nvidia.com (mailing list archive) |
---|---|
State | Superseded |
Series | ALSA: hda: Avoid unsol event during RPM suspending |
On Mon, 28 Mar 2022 11:14:11 +0200,
Mohan Kumar wrote:
>
> There is a corner case with unsol event handling during codec runtime
> suspending state. When the codec runtime suspend call initiated, the
> codec->in_pm atomic variable would be 0, currently the codec runtime
> suspend function calls snd_hdac_enter_pm() which will just increments
> the codec->in_pm atomic variable. Consider unsol event happened just
> after this step and before snd_hdac_leave_pm() in the codec runtime
> suspend function. The snd_hdac_power_up_pm() in the unsol event
> flow in hdmi_present_sense_via_verbs() function would just increment
> the codec->in_pm atomic variable without calling pm_runtime_get_sync
> function.
>
> As codec runtime suspend flow is already in progress and in parallel
> unsol event is also accessing the codec verbs, as soon as codec
> suspend flow completes and clocks are switched off before completing
> the unsol event handling as both functions doesn't wait for each other.
> This will result in below errors
>
> [ 589.428020] tegra-hda 3510000.hda: azx_get_response timeout, switching
> to polling mode: last cmd=0x505f2f57
> [ 589.428344] tegra-hda 3510000.hda: spurious response 0x80000074:0x5,
> last cmd=0x505f2f57
> [ 589.428547] tegra-hda 3510000.hda: spurious response 0x80000065:0x5,
> last cmd=0x505f2f57
>
> To avoid this, the unsol event flow should not perform any codec verb
> related operations during RPM_SUSPENDING state.
>
> Signed-off-by: Mohan Kumar <mkumard@nvidia.com>

Thanks, that's a hairy problem...

The logic sounds good, but can we check the PM state before calling
snd_hda_power_up_pm()?


Takashi
On 3/28/2022 3:12 PM, Takashi Iwai wrote:
> On Mon, 28 Mar 2022 11:14:11 +0200,
> Mohan Kumar wrote:
>> [...]
>> To avoid this, the unsol event flow should not perform any codec verb
>> related operations during RPM_SUSPENDING state.
>>
>> Signed-off-by: Mohan Kumar <mkumard@nvidia.com>
> Thanks, that's a hairy problem...
>
> The logic sounds good, but can we check the PM state before calling
> snd_hda_power_up_pm()?

If I am not wrong, the exposed PM APIs report only RPM_ACTIVE or
RPM_SUSPENDED status. I don't see anything that provides info on
RPM_SUSPENDING. We need to know exactly this state to fix this
issue.

>
> Takashi
On Mon, 28 Mar 2022 12:19:03 +0200,
Mohan Kumar D wrote:
>
> On 3/28/2022 3:12 PM, Takashi Iwai wrote:
> > [...]
> > Thanks, that's a hairy problem...
> >
> > The logic sounds good, but can we check the PM state before calling
> > snd_hda_power_up_pm()?
>
> If am not wrong, PM apis exposed either provide RPM_ACTIVE or
> RPM_SUSPENDED status. Don't see anything which provides info on
> RPM_SUSPENDING. We might need to exactly know this state to fix this
> issue.

Well, maybe my question wasn't clear.  What I meant was that your
change below

>  	ret = snd_hda_power_up_pm(codec);
> -	if (ret < 0 && pm_runtime_suspended(hda_codec_dev(codec)))
> +	if ((ret < 0 && pm_runtime_suspended(dev)) ||
> +		(dev->power.runtime_status == RPM_SUSPENDING))
>  		goto out;

can be rather like:

> +	if (dev->power.runtime_status == RPM_SUSPENDING)
> +		return;
>  	ret = snd_hda_power_up_pm(codec);
>  	if (ret < 0 && pm_runtime_suspended(hda_codec_dev(codec)))

so that it skips unneeded power up/down calls.

Basically the state is set at drivers/base/power/runtime.c
rpm_suspend() just before calling the device's runtime_suspend
callback.  So the state is supposed to be same before and after
snd_hda_power_up_pm() in that case.


thanks,

Takashi
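The two orderings discussed in this reply can be compared with a small userspace model. This is only a sketch under stated assumptions: the `model_`-prefixed types, the enum values and the counters are hypothetical stand-ins rather than kernel APIs, and the real hdmi_present_sense_via_verbs() additionally handles ELD updates and locking.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's runtime PM states. */
enum model_rpm_status {
	MODEL_RPM_ACTIVE,
	MODEL_RPM_SUSPENDING,
	MODEL_RPM_SUSPENDED
};

/* Hypothetical device model: only counts what each variant does. */
struct model_dev {
	enum model_rpm_status status;
	int power_calls;	/* modeled power up/down calls */
	int verbs_sent;		/* modeled codec verb accesses */
};

/* Variant from the first patch: power up, check the state, power down. */
static void model_sense_check_after(struct model_dev *d)
{
	d->power_calls++;			/* modeled power up */
	if (d->status != MODEL_RPM_SUSPENDING)
		d->verbs_sent++;		/* modeled pin sense verb */
	d->power_calls++;			/* modeled power down at "out:" */
}

/* Suggested variant: check the state first and return early. */
static void model_sense_check_before(struct model_dev *d)
{
	if (d->status == MODEL_RPM_SUSPENDING)
		return;				/* no power calls, no verbs */
	d->power_calls++;			/* modeled power up */
	d->verbs_sent++;			/* modeled pin sense verb */
	d->power_calls++;			/* modeled power down */
}
```

In this model both variants send no verbs while the device is suspending, but only the early check also avoids the paired power up/down calls, which is the saving described above.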
Hi Mohan,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tiwai-sound/for-next]
[also build test ERROR on v5.17 next-20220328]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/Mohan-Kumar/ALSA-hda-Avoid-unsol-event-during-RPM-suspending/20220328-171517
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git for-next
config: alpha-buildonly-randconfig-r001-20220327 (https://download.01.org/0day-ci/archive/20220328/202203282137.YsDmfIAj-lkp@intel.com/config)
compiler: alpha-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/80c4e21f5e97cd4b779806fa5da5bb7392e2874f
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Mohan-Kumar/ALSA-hda-Avoid-unsol-event-during-RPM-suspending/20220328-171517
        git checkout 80c4e21f5e97cd4b779806fa5da5bb7392e2874f
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=alpha SHELL=/bin/bash sound/pci/hda/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   sound/pci/hda/patch_hdmi.c: In function 'hdmi_present_sense_via_verbs':
>> sound/pci/hda/patch_hdmi.c:1644:28: error: 'struct dev_pm_info' has no member named 'runtime_status'
    1644 |                 (dev->power.runtime_status == RPM_SUSPENDING))
         |                            ^


vim +1644 sound/pci/hda/patch_hdmi.c

  1620
  1621  /* update ELD and jack state via HD-audio verbs */
  1622  static void hdmi_present_sense_via_verbs(struct hdmi_spec_per_pin *per_pin,
  1623                                           int repoll)
  1624  {
  1625          struct hda_codec *codec = per_pin->codec;
  1626          struct hdmi_spec *spec = codec->spec;
  1627          struct hdmi_eld *eld = &spec->temp_eld;
  1628          struct device *dev = hda_codec_dev(codec);
  1629          hda_nid_t pin_nid = per_pin->pin_nid;
  1630          int dev_id = per_pin->dev_id;
  1631          /*
  1632           * Always execute a GetPinSense verb here, even when called from
  1633           * hdmi_intrinsic_event; for some NVIDIA HW, the unsolicited
  1634           * response's PD bit is not the real PD value, but indicates that
  1635           * the real PD value changed. An older version of the HD-audio
  1636           * specification worked this way. Hence, we just ignore the data in
  1637           * the unsolicited response to avoid custom WARs.
  1638           */
  1639          int present;
  1640          int ret;
  1641
  1642          ret = snd_hda_power_up_pm(codec);
  1643          if ((ret < 0 && pm_runtime_suspended(dev)) ||
> 1644                  (dev->power.runtime_status == RPM_SUSPENDING))
  1645                  goto out;
  1646
  1647          present = snd_hda_jack_pin_sense(codec, pin_nid, dev_id);
  1648
  1649          mutex_lock(&per_pin->lock);
  1650          eld->monitor_present = !!(present & AC_PINSENSE_PRESENCE);
  1651          if (eld->monitor_present)
  1652                  eld->eld_valid = !!(present & AC_PINSENSE_ELDV);
  1653          else
  1654                  eld->eld_valid = false;
  1655
  1656          codec_dbg(codec,
  1657                    "HDMI status: Codec=%d NID=0x%x Presence_Detect=%d ELD_Valid=%d\n",
  1658                    codec->addr, pin_nid, eld->monitor_present, eld->eld_valid);
  1659
  1660          if (eld->eld_valid) {
  1661                  if (spec->ops.pin_get_eld(codec, pin_nid, dev_id,
  1662                                            eld->eld_buffer, &eld->eld_size) < 0)
  1663                          eld->eld_valid = false;
  1664          }
  1665
  1666          update_eld(codec, per_pin, eld, repoll);
  1667          mutex_unlock(&per_pin->lock);
  1668  out:
  1669          snd_hda_power_down_pm(codec);
  1670  }
  1671
On 3/28/2022 4:27 PM, Takashi Iwai wrote:
> On Mon, 28 Mar 2022 12:19:03 +0200,
> Mohan Kumar D wrote:
>> [...]
>> If am not wrong, PM apis exposed either provide RPM_ACTIVE or
>> RPM_SUSPENDED status. Don't see anything which provides info on
>> RPM_SUSPENDING. We might need to exactly know this state to fix this
>> issue.
> Well, maybe my question wasn't clear.  What I meant was that your
> change below
>
>>  	ret = snd_hda_power_up_pm(codec);
>> -	if (ret < 0 && pm_runtime_suspended(hda_codec_dev(codec)))
>> +	if ((ret < 0 && pm_runtime_suspended(dev)) ||
>> +		(dev->power.runtime_status == RPM_SUSPENDING))
>>  		goto out;
> can be rather like:
>
>> +	if (dev->power.runtime_status == RPM_SUSPENDING)
>> +		return;
>>  	ret = snd_hda_power_up_pm(codec);
>>  	if (ret < 0 && pm_runtime_suspended(hda_codec_dev(codec)))
> so that it skips unneeded power up/down calls.
>
> Basically the state is set at drivers/base/power/runtime.c
> rpm_suspend() just before calling the device's runtime_suspend
> callback.  So the state is supposed to be same before and after
> snd_hda_power_up_pm() in that case.

Thanks! That makes sense. I will push the updated patch after testing
with the latest suggestion.

>
> thanks,
>
> Takashi
Hi Mohan,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tiwai-sound/for-next]
[also build test ERROR on v5.17 next-20220328]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/Mohan-Kumar/ALSA-hda-Avoid-unsol-event-during-RPM-suspending/20220328-171517
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git for-next
config: s390-randconfig-c005-20220327 (https://download.01.org/0day-ci/archive/20220329/202203290053.emhsIdTK-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project 0f6d9501cf49ce02937099350d08f20c4af86f3d)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install s390 cross compiling tool for clang build
        # apt-get install binutils-s390x-linux-gnu
        # https://github.com/intel-lab-lkp/linux/commit/80c4e21f5e97cd4b779806fa5da5bb7392e2874f
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Mohan-Kumar/ALSA-hda-Avoid-unsol-event-during-RPM-suspending/20220328-171517
        git checkout 80c4e21f5e97cd4b779806fa5da5bb7392e2874f
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=s390 SHELL=/bin/bash sound/pci/hda/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from sound/pci/hda/patch_hdmi.c:21:
   In file included from include/linux/pci.h:39:
   In file included from include/linux/io.h:13:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:464:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __raw_readb(PCI_IOBASE + addr);
                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:477:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
                                                             ^
   include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16'
   #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
                                                        ^
   In file included from sound/pci/hda/patch_hdmi.c:21:
   In file included from include/linux/pci.h:39:
   In file included from include/linux/io.h:13:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32'
   #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
                                                        ^
   In file included from sound/pci/hda/patch_hdmi.c:21:
   In file included from include/linux/pci.h:39:
   In file included from include/linux/io.h:13:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:501:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writeb(value, PCI_IOBASE + addr);
                               ~~~~~~~~~~ ^
   include/asm-generic/io.h:511:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:521:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:609:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsb(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:617:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsw(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:625:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsl(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:634:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesb(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
   include/asm-generic/io.h:643:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesw(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
   include/asm-generic/io.h:652:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesl(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
>> sound/pci/hda/patch_hdmi.c:1644:15: error: no member named 'runtime_status' in 'struct dev_pm_info'
                   (dev->power.runtime_status == RPM_SUSPENDING))
                    ~~~~~~~~~~ ^
   12 warnings and 1 error generated.

vim +1644 sound/pci/hda/patch_hdmi.c

  1620
  1621  /* update ELD and jack state via HD-audio verbs */
  1622  static void hdmi_present_sense_via_verbs(struct hdmi_spec_per_pin *per_pin,
  1623                                           int repoll)
  1624  {
  1625          struct hda_codec *codec = per_pin->codec;
  1626          struct hdmi_spec *spec = codec->spec;
  1627          struct hdmi_eld *eld = &spec->temp_eld;
  1628          struct device *dev = hda_codec_dev(codec);
  1629          hda_nid_t pin_nid = per_pin->pin_nid;
  1630          int dev_id = per_pin->dev_id;
  1631          /*
  1632           * Always execute a GetPinSense verb here, even when called from
  1633           * hdmi_intrinsic_event; for some NVIDIA HW, the unsolicited
  1634           * response's PD bit is not the real PD value, but indicates that
  1635           * the real PD value changed. An older version of the HD-audio
  1636           * specification worked this way. Hence, we just ignore the data in
  1637           * the unsolicited response to avoid custom WARs.
  1638           */
  1639          int present;
  1640          int ret;
  1641
  1642          ret = snd_hda_power_up_pm(codec);
  1643          if ((ret < 0 && pm_runtime_suspended(dev)) ||
> 1644                  (dev->power.runtime_status == RPM_SUSPENDING))
  1645                  goto out;
  1646
  1647          present = snd_hda_jack_pin_sense(codec, pin_nid, dev_id);
  1648
  1649          mutex_lock(&per_pin->lock);
  1650          eld->monitor_present = !!(present & AC_PINSENSE_PRESENCE);
  1651          if (eld->monitor_present)
  1652                  eld->eld_valid = !!(present & AC_PINSENSE_ELDV);
  1653          else
  1654                  eld->eld_valid = false;
  1655
  1656          codec_dbg(codec,
  1657                    "HDMI status: Codec=%d NID=0x%x Presence_Detect=%d ELD_Valid=%d\n",
  1658                    codec->addr, pin_nid, eld->monitor_present, eld->eld_valid);
  1659
  1660          if (eld->eld_valid) {
  1661                  if (spec->ops.pin_get_eld(codec, pin_nid, dev_id,
  1662                                            eld->eld_buffer, &eld->eld_size) < 0)
  1663                          eld->eld_valid = false;
  1664          }
  1665
  1666          update_eld(codec, per_pin, eld, repoll);
  1667          mutex_unlock(&per_pin->lock);
  1668  out:
  1669          snd_hda_power_down_pm(codec);
  1670  }
  1671
On Mon, 28 Mar 2022 15:51:17 +0200,
Mohan Kumar D wrote:
>
> On 3/28/2022 4:27 PM, Takashi Iwai wrote:
> > [...]
> > so that it skips unneeded power up/down calls.
> >
> > Basically the state is set at drivers/base/power/runtime.c
> > rpm_suspend() just before calling the device's runtime_suspend
> > callback.  So the state is supposed to be same before and after
> > snd_hda_power_up_pm() in that case.
> Thanks!, Make sense, will push the updated patch after testing with
> latest suggestion.

Thanks.  Also don't forget to cover the case the test bot complained
about: the reference to power.runtime_status needs #ifdef CONFIG_PM.


Takashi
On 3/28/2022 9:45 PM, Takashi Iwai wrote:
> On Mon, 28 Mar 2022 15:51:17 +0200,
> Mohan Kumar D wrote:
>> [...]
>> Thanks!, Make sense, will push the updated patch after testing with
>> latest suggestion.
> Thanks.  Also don't forget to cover a case the test bot complained:
> the reference to power.runtime_status needs #ifdef CONFIG_PM.

Sure, I will take care of that in the next patch update.

>
> Takashi
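The build failures reported by the test robot come from configurations where the status member is not part of the struct, which is why the check needs a CONFIG_PM guard. The pattern can be sketched in plain userspace C as follows. Everything here is an assumption made for illustration: `guard_pm_info`, `guard_rpm_status` and `guard_is_suspending()` are hypothetical names and do not reproduce the kernel's struct dev_pm_info or the form the updated patch takes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of a struct whose layout depends on a config option. */
enum guard_rpm_status { GUARD_RPM_ACTIVE, GUARD_RPM_SUSPENDING };

struct guard_pm_info {
#ifdef CONFIG_PM
	enum guard_rpm_status runtime_status;	/* only exists with CONFIG_PM */
#endif
	int unused;	/* keeps the struct non-empty in both configurations */
};

/*
 * Hypothetical helper: touches runtime_status only when the member exists,
 * so the code compiles whether or not CONFIG_PM is defined.
 */
static bool guard_is_suspending(const struct guard_pm_info *pm)
{
#ifdef CONFIG_PM
	return pm->runtime_status == GUARD_RPM_SUSPENDING;
#else
	(void)pm;	/* without runtime PM there is no suspending state */
	return false;
#endif
}
```

Building the sketch with and without `-DCONFIG_PM` exercises both branches; an unguarded member access would fail in the second case, in the same way as the robot's alpha and s390 randconfig builds.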
diff --git a/sound/pci/hda/patch_hdmi.c b/sound/pci/hda/patch_hdmi.c
index c85ed7bc121e..67870c8d84a5 100644
--- a/sound/pci/hda/patch_hdmi.c
+++ b/sound/pci/hda/patch_hdmi.c
@@ -1625,6 +1625,7 @@ static void hdmi_present_sense_via_verbs(struct hdmi_spec_per_pin *per_pin,
 	struct hda_codec *codec = per_pin->codec;
 	struct hdmi_spec *spec = codec->spec;
 	struct hdmi_eld *eld = &spec->temp_eld;
+	struct device *dev = hda_codec_dev(codec);
 	hda_nid_t pin_nid = per_pin->pin_nid;
 	int dev_id = per_pin->dev_id;
 	/*
@@ -1639,7 +1640,8 @@ static void hdmi_present_sense_via_verbs(struct hdmi_spec_per_pin *per_pin,
 	int ret;
 
 	ret = snd_hda_power_up_pm(codec);
-	if (ret < 0 && pm_runtime_suspended(hda_codec_dev(codec)))
+	if ((ret < 0 && pm_runtime_suspended(dev)) ||
+		(dev->power.runtime_status == RPM_SUSPENDING))
 		goto out;
 
 	present = snd_hda_jack_pin_sense(codec, pin_nid, dev_id);
There is a corner case with unsol event handling while the codec is in
the runtime suspending state. When the codec runtime suspend call is
initiated, the codec->in_pm atomic variable is 0. The codec runtime
suspend function then calls snd_hdac_enter_pm(), which just increments
codec->in_pm. Consider an unsol event that arrives just after this step
and before snd_hdac_leave_pm() in the codec runtime suspend function.
The snd_hdac_power_up_pm() call in the unsol event flow, in
hdmi_present_sense_via_verbs(), would just increment codec->in_pm
without calling pm_runtime_get_sync().

Because the codec runtime suspend flow is already in progress while the
unsol event handler accesses codec verbs in parallel, and neither
function waits for the other, the suspend flow can complete and switch
off the clocks before the unsol event handling finishes. This results
in the errors below:

[ 589.428020] tegra-hda 3510000.hda: azx_get_response timeout, switching to polling mode: last cmd=0x505f2f57
[ 589.428344] tegra-hda 3510000.hda: spurious response 0x80000074:0x5, last cmd=0x505f2f57
[ 589.428547] tegra-hda 3510000.hda: spurious response 0x80000065:0x5, last cmd=0x505f2f57

To avoid this, the unsol event flow should not perform any codec verb
related operations during the RPM_SUSPENDING state.

Signed-off-by: Mohan Kumar <mkumard@nvidia.com>
---
 sound/pci/hda/patch_hdmi.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
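The counter behaviour described in the patch text can be traced with a single-threaded userspace model. This is a sketch only, built from the description above: the `model_` names are hypothetical, a plain int stands in for the atomic codec->in_pm, and a counter stands in for pm_runtime_get_sync() calls. It does not reproduce the kernel implementation or real concurrency.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; not the kernel's real types. */
struct model_codec {
	int in_pm;		/* stands in for the atomic codec->in_pm */
	int get_sync_calls;	/* counts modeled pm_runtime_get_sync() calls */
};

/* Increment the counter unless it is zero; report whether it was incremented. */
static bool model_inc_not_zero(int *v)
{
	if (*v == 0)
		return false;
	(*v)++;
	return true;
}

/* Modeled suspend entry: the runtime suspend path bumps in_pm. */
static void model_enter_pm(struct model_codec *c)
{
	c->in_pm++;
}

/* Modeled power-up: a resume is requested only when in_pm was zero. */
static void model_power_up_pm(struct model_codec *c)
{
	assert(c->in_pm >= 0);
	if (!model_inc_not_zero(&c->in_pm))
		c->get_sync_calls++;
}

/* Returns how many resume requests one unsol event triggers in this model. */
static int model_unsol_event(bool suspend_in_progress)
{
	struct model_codec c = { 0, 0 };

	if (suspend_in_progress)
		model_enter_pm(&c);	/* suspend callback already running */
	model_power_up_pm(&c);		/* unsol event handler powers up */
	return c.get_sync_calls;
}
```

In this model an unsol event that arrives while suspend is in progress requests no resume at all, so nothing keeps the codec powered while the handler goes on to issue verbs, which matches the failure sequence described in the patch text.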