Message ID | 20200221100739.3883842-1-perex@perex.cz (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ASoC: SOF - topology - do not change the link trigger order for pre-1.4 firmware | expand |
Hi, +Ranjani who did the link reorder patch On Fri, 21 Feb 2020, Jaroslav Kysela wrote: > This patch is for SOF v1.3 firmware. The DSP firmware will crash (DSP oops) > without this patch. The 1.4.1 firmare has this issue fixed. > > The ABI version is used for the comparison, because the firmware version > for the firmware files before 1.4.2 was not set properly (git hash was > used). build fails when this is applied on broonie/for-next. You need an additional --- a/sound/soc/sof/topology.c +++ b/sound/soc/sof/topology.c @@ -3108,6 +3108,7 @@ static int sof_link_load(struct snd_soc_component *scomp, int index, struct snd_soc_dai_link *link, struct snd_soc_tplg_link_config *cfg) { + struct snd_sof_dev *sdev = snd_soc_component_get_drvdata(scomp); struct snd_soc_tplg_private *private = &cfg->priv; > - /* set trigger order */ > - link->trigger[0] = SND_SOC_DPCM_TRIGGER_POST; > - link->trigger[1] = SND_SOC_DPCM_TRIGGER_POST; > + /* this causes DSP panic on firmware v1.3 */ > + if (v->abi_version > SOF_ABI_VER(3, 7, 0)) { > + /* set trigger order */ > + link->trigger[0] = SND_SOC_DPCM_TRIGGER_POST; > + link->trigger[1] = SND_SOC_DPCM_TRIGGER_POST; My results with older firmwares and this patch are a bit mixed. When I apply this patch and boot with v1.3 FW on a CFL platform (ABI 3.7.0, version 1:1:0-5dd9a), I get a DSP panic at stream stop with this patch, but _without_ it, playback is fine. :P I tested both v1.3.1 and v1.3, and I get a DSP panic at stream stop with your patch (ABI 3:7:0 on both of these so trigger order is not changed). With v1.4 and all newer, streaming works as expected. The original problem was sensitive to timing, so apparently there is still some variation how this triggers on different platforms. With 1.4, 1.4.1 and 1.4.2 now out, primary solution is just to upgrade the firmware. If this fix helps with some real-life case to cope with an old firmware, we should probably still consider this. Ranjani, does the above make sense? Br, Kai
Dne 21. 02. 20 v 15:44 Kai Vehmanen napsal(a): > Hi, > > +Ranjani who did the link reorder patch > > On Fri, 21 Feb 2020, Jaroslav Kysela wrote: > >> This patch is for SOF v1.3 firmware. The DSP firmware will crash (DSP oops) >> without this patch. The 1.4.1 firmare has this issue fixed. >> >> The ABI version is used for the comparison, because the firmware version >> for the firmware files before 1.4.2 was not set properly (git hash was >> used). > > build fails when this is applied on broonie/for-next. You need an > additional > > --- a/sound/soc/sof/topology.c > +++ b/sound/soc/sof/topology.c > @@ -3108,6 +3108,7 @@ static int sof_link_load(struct snd_soc_component > *scomp, int index, > struct snd_soc_dai_link *link, > struct snd_soc_tplg_link_config *cfg) > { > + struct snd_sof_dev *sdev = snd_soc_component_get_drvdata(scomp); > struct snd_soc_tplg_private *private = &cfg->priv; > > >> - /* set trigger order */ >> - link->trigger[0] = SND_SOC_DPCM_TRIGGER_POST; >> - link->trigger[1] = SND_SOC_DPCM_TRIGGER_POST; >> + /* this causes DSP panic on firmware v1.3 */ >> + if (v->abi_version > SOF_ABI_VER(3, 7, 0)) { >> + /* set trigger order */ >> + link->trigger[0] = SND_SOC_DPCM_TRIGGER_POST; >> + link->trigger[1] = SND_SOC_DPCM_TRIGGER_POST; > > My results with older firmwares and this patch are a bit mixed. When I > apply this patch and boot with v1.3 FW on a CFL platform (ABI 3.7.0, > version 1:1:0-5dd9a), I get a DSP panic at stream stop with this patch, > but _without_ it, playback is fine. :P > > I tested both v1.3.1 and v1.3, and I get a DSP panic at stream stop with > your patch (ABI 3:7:0 on both of these so trigger order is not changed). > With v1.4 and all newer, streaming works as expected. Ok, then I wonder how the pre-5.5 kernel worked (because it's just revert of 5eee2b3f60065a2530d13f28e771be48b989eb4c for older firmware versions which should be tested with the old code). > The original problem was sensitive to timing, so apparently there is still > some variation how this triggers on different platforms. With 1.4, 1.4.1 > and 1.4.2 now out, primary solution is just to upgrade the firmware. > > If this fix helps with some real-life case to cope with an old firmware, > we should probably still consider this. Ranjani, does the above make > sense? Ok, it's really weird that we cannot determine the firmware/driver combination which cause the DSP lock. I would propose to block the older firmware load <1.4 (or 1.4.2 which has the correct firmware version!) then (at least with a big warning or module option which is off by default) for the newer kernels. Jaroslav
> Ok, it's really weird that we cannot determine the firmware/driver > combination which cause the DSP lock. I would propose to block the older > firmware load <1.4 (or 1.4.2 which has the correct firmware version!) > then (at least with a big warning or module option which is off by > default) for the newer kernels. The driver has no information on firmware version until the FW_READY IPC, so we can't block the load. And with the firmware handling happening in a work queue, blocking will not result in a failed probe. IIRC those timing issues are only for the HDaudio link DMA, I don't think we had issues with DMIC or SSP. A warning urging people to update the firmware is probably easier to implement.
Dne 21. 02. 20 v 20:23 Pierre-Louis Bossart napsal(a): > >> Ok, it's really weird that we cannot determine the firmware/driver >> combination which cause the DSP lock. I would propose to block the older >> firmware load <1.4 (or 1.4.2 which has the correct firmware version!) >> then (at least with a big warning or module option which is off by >> default) for the newer kernels. > > The driver has no information on firmware version until the FW_READY > IPC, so we can't block the load. And with the firmware handling > happening in a work queue, blocking will not result in a failed probe. > IIRC those timing issues are only for the HDaudio link DMA, I don't > think we had issues with DMIC or SSP. > A warning urging people to update the firmware is probably easier to > implement. > It makes sense. At least a hint that something may be wrong. I believe that it might help to identify issues. Jaroslav
On Fri, Feb 21, 2020 at 10:41 AM Jaroslav Kysela <perex@perex.cz> wrote: > Dne 21. 02. 20 v 15:44 Kai Vehmanen napsal(a): > > Hi, > > > > +Ranjani who did the link reorder patch > > > > On Fri, 21 Feb 2020, Jaroslav Kysela wrote: > > > >> This patch is for SOF v1.3 firmware. The DSP firmware will crash (DSP > oops) > >> without this patch. The 1.4.1 firmare has this issue fixed. > >> > >> The ABI version is used for the comparison, because the firmware version > >> for the firmware files before 1.4.2 was not set properly (git hash was > >> used). > > > > build fails when this is applied on broonie/for-next. You need an > > additional > > > > --- a/sound/soc/sof/topology.c > > +++ b/sound/soc/sof/topology.c > > @@ -3108,6 +3108,7 @@ static int sof_link_load(struct snd_soc_component > > *scomp, int index, > > struct snd_soc_dai_link *link, > > struct snd_soc_tplg_link_config *cfg) > > { > > + struct snd_sof_dev *sdev = snd_soc_component_get_drvdata(scomp); > > struct snd_soc_tplg_private *private = &cfg->priv; > > > > > >> - /* set trigger order */ > >> - link->trigger[0] = SND_SOC_DPCM_TRIGGER_POST; > >> - link->trigger[1] = SND_SOC_DPCM_TRIGGER_POST; > >> + /* this causes DSP panic on firmware v1.3 */ > >> + if (v->abi_version > SOF_ABI_VER(3, 7, 0)) { > >> + /* set trigger order */ > >> + link->trigger[0] = SND_SOC_DPCM_TRIGGER_POST; > >> + link->trigger[1] = SND_SOC_DPCM_TRIGGER_POST; > > > > My results with older firmwares and this patch are a bit mixed. When I > > apply this patch and boot with v1.3 FW on a CFL platform (ABI 3.7.0, > > version 1:1:0-5dd9a), I get a DSP panic at stream stop with this patch, > > but _without_ it, playback is fine. :P > > > > I tested both v1.3.1 and v1.3, and I get a DSP panic at stream stop with > > your patch (ABI 3:7:0 on both of these so trigger order is not changed). > > With v1.4 and all newer, streaming works as expected. > > Ok, then I wonder how the pre-5.5 kernel worked (because it's just revert > of > 5eee2b3f60065a2530d13f28e771be48b989eb4c for older firmware versions which > should be tested with the old code). > Hi Jaorslav/Kai, The patch that switch the trigger order in SOF was preceded by another one which mirrored the actions in the trigger callback based on the trigger order. My guess would be that to avoid the DSP panic for the 1.3 FW, you'll likely need to revert the below as well. acbf27746ecfa96b290b54cc7f05273482ea128a ASoC: pcm: update FE/BE trigger order based on the command Thanks, Ranjani
Hi Jaroslav, I love your patch! Yet something to improve: [auto build test ERROR on asoc/for-next] [also build test ERROR on v5.6-rc2 next-20200221] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system. BTW, we also suggest to use '--base' option to specify the base tree in git format-patch, please see https://stackoverflow.com/a/37406982] url: https://github.com/0day-ci/linux/commits/Jaroslav-Kysela/ASoC-SOF-topology-do-not-change-the-link-trigger-order-for-pre-1-4-firmware/20200222-155003 base: https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next config: c6x-allyesconfig (attached as .config) compiler: c6x-elf-gcc (GCC) 7.5.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree GCC_VERSION=7.5.0 make.cross ARCH=c6x If you fix the issue, kindly add following tag Reported-by: kbuild test robot <lkp@intel.com> All errors (new ones prefixed by >>): sound/soc/sof/topology.c: In function 'sof_link_load': >> sound/soc/sof/topology.c:3114:34: error: 'sdev' undeclared (first use in this function); did you mean 'cdev'? struct sof_ipc_fw_version *v = &sdev->fw_ready.version; ^~~~ cdev sound/soc/sof/topology.c:3114:34: note: each undeclared identifier is reported only once for each function it appears in vim +3114 sound/soc/sof/topology.c 3105 3106 /* DAI link - used for any driver specific init */ 3107 static int sof_link_load(struct snd_soc_component *scomp, int index, 3108 struct snd_soc_dai_link *link, 3109 struct snd_soc_tplg_link_config *cfg) 3110 { 3111 struct snd_soc_tplg_private *private = &cfg->priv; 3112 struct sof_ipc_dai_config config; 3113 struct snd_soc_tplg_hw_config *hw_config; > 3114 struct sof_ipc_fw_version *v = &sdev->fw_ready.version; 3115 int num_hw_configs; 3116 int ret; 3117 int i = 0; 3118 3119 if (!link->platforms) { 3120 dev_err(scomp->dev, "error: no platforms\n"); 3121 return -EINVAL; 3122 } 3123 link->platforms->name = dev_name(scomp->dev); 3124 3125 /* 3126 * Set nonatomic property for FE dai links as their trigger action 3127 * involves IPC's. 3128 */ 3129 if (!link->no_pcm) { 3130 link->nonatomic = true; 3131 3132 /* this causes DSP panic on firmware v1.3 */ 3133 if (v->abi_version > SOF_ABI_VER(3, 7, 0)) { 3134 /* set trigger order */ 3135 link->trigger[0] = SND_SOC_DPCM_TRIGGER_POST; 3136 link->trigger[1] = SND_SOC_DPCM_TRIGGER_POST; 3137 } 3138 3139 /* nothing more to do for FE dai links */ 3140 return 0; 3141 } 3142 3143 /* check we have some tokens - we need at least DAI type */ 3144 if (le32_to_cpu(private->size) == 0) { 3145 dev_err(scomp->dev, "error: expected tokens for DAI, none found\n"); 3146 return -EINVAL; 3147 } 3148 3149 /* Send BE DAI link configurations to DSP */ 3150 memset(&config, 0, sizeof(config)); 3151 3152 /* get any common DAI tokens */ 3153 ret = sof_parse_tokens(scomp, &config, dai_link_tokens, 3154 ARRAY_SIZE(dai_link_tokens), private->array, 3155 le32_to_cpu(private->size)); 3156 if (ret != 0) { 3157 dev_err(scomp->dev, "error: parse link tokens failed %d\n", 3158 le32_to_cpu(private->size)); 3159 return ret; 3160 } 3161 3162 /* 3163 * DAI links are expected to have at least 1 hw_config. 3164 * But some older topologies might have no hw_config for HDA dai links. 3165 */ 3166 num_hw_configs = le32_to_cpu(cfg->num_hw_configs); 3167 if (!num_hw_configs) { 3168 if (config.type != SOF_DAI_INTEL_HDA) { 3169 dev_err(scomp->dev, "error: unexpected DAI config count %d!\n", 3170 le32_to_cpu(cfg->num_hw_configs)); 3171 return -EINVAL; 3172 } 3173 } else { 3174 dev_dbg(scomp->dev, "tplg: %d hw_configs found, default id: %d!\n", 3175 cfg->num_hw_configs, le32_to_cpu(cfg->default_hw_config_id)); 3176 3177 for (i = 0; i < num_hw_configs; i++) { 3178 if (cfg->hw_config[i].id == cfg->default_hw_config_id) 3179 break; 3180 } 3181 3182 if (i == num_hw_configs) { 3183 dev_err(scomp->dev, "error: default hw_config id: %d not found!\n", 3184 le32_to_cpu(cfg->default_hw_config_id)); 3185 return -EINVAL; 3186 } 3187 } 3188 3189 /* configure dai IPC message */ 3190 hw_config = &cfg->hw_config[i]; 3191 3192 config.hdr.cmd = SOF_IPC_GLB_DAI_MSG | SOF_IPC_DAI_CONFIG; 3193 config.format = le32_to_cpu(hw_config->fmt); 3194 3195 /* now load DAI specific data and send IPC - type comes from token */ 3196 switch (config.type) { 3197 case SOF_DAI_INTEL_SSP: 3198 ret = sof_link_ssp_load(scomp, index, link, cfg, hw_config, 3199 &config); 3200 break; 3201 case SOF_DAI_INTEL_DMIC: 3202 ret = sof_link_dmic_load(scomp, index, link, cfg, hw_config, 3203 &config); 3204 break; 3205 case SOF_DAI_INTEL_HDA: 3206 ret = sof_link_hda_load(scomp, index, link, cfg, hw_config, 3207 &config); 3208 break; 3209 case SOF_DAI_INTEL_ALH: 3210 ret = sof_link_alh_load(scomp, index, link, cfg, hw_config, 3211 &config); 3212 break; 3213 case SOF_DAI_IMX_SAI: 3214 ret = sof_link_sai_load(scomp, index, link, cfg, hw_config, 3215 &config); 3216 break; 3217 case SOF_DAI_IMX_ESAI: 3218 ret = sof_link_esai_load(scomp, index, link, cfg, hw_config, 3219 &config); 3220 break; 3221 default: 3222 dev_err(scomp->dev, "error: invalid DAI type %d\n", 3223 config.type); 3224 ret = -EINVAL; 3225 break; 3226 } 3227 if (ret < 0) 3228 return ret; 3229 3230 return 0; 3231 } 3232 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Hi, On Fri, 21 Feb 2020, Jaroslav Kysela wrote: > Dne 21. 02. 20 v 20:23 Pierre-Louis Bossart napsal(a): > > > Ok, it's really weird that we cannot determine the firmware/driver > > > combination which cause the DSP lock. I would propose to block the older > > > firmware load <1.4 (or 1.4.2 which has the correct firmware version!) [...] > It makes sense. At least a hint that something may be wrong. I believe that it > might help to identify issues. I've continued testing today on multiple machines using the official (old) v1.3 binaries [1] we have and I cannot reproduce the DSP error you Jaroslav have seen. On all of my machines, latest sound tree with old v1.3 FW works just fine. This matches earlier reports on SOF issue #2102. I also looked back at the history of the kernel trigger order change, and it's a kernel-only change, to fix issues with certain pause-resume cases. It's not a change that was done in tandem with some specific FW side change, so I can't find a solid reason why DMA triggering order should be changed for old FW versions. One FW patch that was done at a time (and referred in the discussions) is: dai: prevent dai_config while in active state https://github.com/thesofproject/sof/commit/c623e9246325dbee615a5cad0c8e4b0c29976056 .. but this is not changing the logic, just avoiding a DSP crash by returning an error (but IPC and use-case will still fail). So although I cannot explain why Jaroslav you see the crash on the old v1.3 firmware on the Lenovo device, I would still recommend to leave current kernel code as is and not add any warnings. To summarize my rationale: - we have known error in SOF driver logic, which was fixed in 5.5, and now backported to 5.4 - if above driver error was hit, very old FW versions would end up with DSP crash, instead returning a proper error - for many systems, new 5.5 kernel and old 1.3 FW works ok with no notable issues - we have at least one system, where new kernel and old FW does not work -> on these machines, upgrade to v1.4.2 firmware helps Unless we get more reports, I'd lean towards not adding any new warnings. If someone hits a similar case as Jaroslav you did, we can see this from dmesg based on fw version and DSP oops dump (and/or reported IPC error). And the recommended action is to upgrade the FW to 1.4.2. How about it? [1] https://github.com/thesofproject/sof/releases Br, Kai
Dne 27. 02. 20 v 13:45 Kai Vehmanen napsal(a): > Hi, > > On Fri, 21 Feb 2020, Jaroslav Kysela wrote: >> Dne 21. 02. 20 v 20:23 Pierre-Louis Bossart napsal(a): >>>> Ok, it's really weird that we cannot determine the firmware/driver >>>> combination which cause the DSP lock. I would propose to block the older >>>> firmware load <1.4 (or 1.4.2 which has the correct firmware version!) > [...] >> It makes sense. At least a hint that something may be wrong. I believe that it >> might help to identify issues. > > I've continued testing today on multiple machines using the official (old) > v1.3 binaries [1] we have and I cannot reproduce the DSP error you > Jaroslav have seen. On all of my machines, latest sound tree with old v1.3 > FW works just fine. This matches earlier reports on SOF issue #2102. > > I also looked back at the history of the kernel trigger order change, and > it's a kernel-only change, to fix issues with certain pause-resume cases. > It's not a change that was done in tandem with some specific FW side > change, so I can't find a solid reason why DMA triggering order should be > changed for old FW versions. One FW patch that was done at a time (and > referred in the discussions) is: > > dai: prevent dai_config while in active state > https://github.com/thesofproject/sof/commit/c623e9246325dbee615a5cad0c8e4b0c29976056 > > .. but this is not changing the logic, just avoiding a DSP crash by > returning an error (but IPC and use-case will still fail). > > So although I cannot explain why Jaroslav you see the crash on the old > v1.3 firmware on the Lenovo device, I would still recommend to leave > current kernel code as is and not add any warnings. To summarize my > rationale: > > - we have known error in SOF driver logic, which was fixed > in 5.5, and now backported to 5.4 > - if above driver error was hit, very old FW versions would > end up with DSP crash, instead returning a proper error > - for many systems, new 5.5 kernel and old 1.3 FW works ok with > no notable issues > - we have at least one system, where new kernel and old FW does > not work -> on these machines, upgrade to v1.4.2 firmware helps > > Unless we get more reports, I'd lean towards not adding any new warnings. > If someone hits a similar case as Jaroslav you did, we can see this from > dmesg based on fw version and DSP oops dump (and/or reported IPC error). > And the recommended action is to upgrade the FW to 1.4.2. > > How about it? Ok, it seems that it's really a combination of the driver code and 1.3.2 firmware. I tested 5.5 kernel with 1.3.2 again and it's fine on this platform. Let's keep this without change. Thank you for your tests. Jaroslav > > [1] https://github.com/thesofproject/sof/releases > > Br, Kai >
diff --git a/sound/soc/sof/topology.c b/sound/soc/sof/topology.c index 9f4f8868b386..58ff4766b47b 100644 --- a/sound/soc/sof/topology.c +++ b/sound/soc/sof/topology.c @@ -3111,6 +3111,7 @@ static int sof_link_load(struct snd_soc_component *scomp, int index, struct snd_soc_tplg_private *private = &cfg->priv; struct sof_ipc_dai_config config; struct snd_soc_tplg_hw_config *hw_config; + struct sof_ipc_fw_version *v = &sdev->fw_ready.version; int num_hw_configs; int ret; int i = 0; @@ -3128,9 +3129,12 @@ static int sof_link_load(struct snd_soc_component *scomp, int index, if (!link->no_pcm) { link->nonatomic = true; - /* set trigger order */ - link->trigger[0] = SND_SOC_DPCM_TRIGGER_POST; - link->trigger[1] = SND_SOC_DPCM_TRIGGER_POST; + /* this causes DSP panic on firmware v1.3 */ + if (v->abi_version > SOF_ABI_VER(3, 7, 0)) { + /* set trigger order */ + link->trigger[0] = SND_SOC_DPCM_TRIGGER_POST; + link->trigger[1] = SND_SOC_DPCM_TRIGGER_POST; + } /* nothing more to do for FE dai links */ return 0;
This patch is for SOF v1.3 firmware. The DSP firmware will crash (DSP oops) without this patch. The 1.4.1 firmare has this issue fixed. The ABI version is used for the comparison, because the firmware version for the firmware files before 1.4.2 was not set properly (git hash was used). BugLink: https://github.com/thesofproject/sof/issues/2102 Signed-off-by: Jaroslav Kysela <perex@perex.cz> Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Cc: Mark Brown <broonie@kernel.org> --- sound/soc/sof/topology.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)