Message ID | 20200702225334.32414-1-hauke@hauke-m.de (mailing list archive) |
---|---|
State | Mainlined |
Commit | fcec538ef8cca0ad0b84432235dccd9059c8e6f8 |
Headers | show |
Series | MIPS: Add missing EHB in mtc0 -> mfc0 sequence for DSPen | expand |
On 7/3/20 12:53 AM, Hauke Mehrtens wrote: > This resolves the hazard between the mtc0 in the change_c0_status() and > the mfc0 in configure_exception_vector(). Without resolving this hazard > configure_exception_vector() could read an old value and would restore > this old value again. This would revert the changes change_c0_status() > did. I checked this by printing out the read_c0_status() at the end of > per_cpu_trap_init() and the ST0_MX is not set without this patch. > > The hazard is documented in the MIPS Architecture Reference Manual Vol. > III: MIPS32/microMIPS32 Privileged Resource Architecture (MD00088), rev > 6.03 table 8.1 which includes: > > Producer | Consumer | Hazard > ----------|----------|---------------------------- > mtc0 | mfc0 | any coprocessor 0 register > > I saw this hazard on an Atheros AR9344 rev 2 SoC with a MIPS 74Kc CPU. > There the change_c0_status() function would activate the DSPen by > setting ST0_MX in the c0_status register. This was reverted and then the > system got a DSP exception when the DSP registers were saved in > save_dsp() in the first process switch. The crash looks like this: > > [ 0.089999] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear) > [ 0.097796] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear) > [ 0.107070] Kernel panic - not syncing: Unexpected DSP exception > [ 0.113470] Rebooting in 1 seconds.. > > We saw this problem in OpenWrt only on the MIPS 74Kc based Atheros SoCs, > not on the 24Kc based SoCs. We only saw it with kernel 5.4 not with > kernel 4.19, in addition we had to use GCC 8.4 or 9.X, with GCC 8.3 it > did not happen. > > In the kernel I bisected this problem to commit 9012d011660e ("compiler: > allow all arches to enable CONFIG_OPTIMIZE_INLINING"), but when this was > reverted it also happened after commit 172dcd935c34b ("MIPS: Always > allocate exception vector for MIPSr2+"). > > Commit 0b24cae4d535 ("MIPS: Add missing EHB in mtc0 -> mfc0 sequence.") > does similar changes to a different file. I am not sure if there are > more places affected by this problem. > > Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> > Cc: <stable@vger.kernel.org> > --- > arch/mips/kernel/traps.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c > index 7c32c956156a..1234ea21dd8f 100644 > --- a/arch/mips/kernel/traps.c > +++ b/arch/mips/kernel/traps.c > @@ -2169,6 +2169,7 @@ static void configure_status(void) > > change_c0_status(ST0_CU|ST0_MX|ST0_RE|ST0_FR|ST0_BEV|ST0_TS|ST0_KX|ST0_SX|ST0_UX, > status_set); > + back_to_back_c0_hazard(); > } > > unsigned int hwrena; > The MIPS InterAptiv system software user manual suggests to put this back_to_back_c0_hazard() close to the consumer. Should I move it there? Should I also add a back_to_back_c0_hazard() close to the other places which could be affected by similar problems even when I do not see them at runtime? Hauke
在 2020/7/4 上午6:31, Hauke Mehrtens 写道: > On 7/3/20 12:53 AM, Hauke Mehrtens wrote: >> This resolves the hazard between the mtc0 in the change_c0_status() and >> the mfc0 in configure_exception_vector(). Without resolving this hazard >> configure_exception_vector() could read an old value and would restore >> this old value again. This would revert the changes change_c0_status() >> did. I checked this by printing out the read_c0_status() at the end of >> per_cpu_trap_init() and the ST0_MX is not set without this patch. >> >> The hazard is documented in the MIPS Architecture Reference Manual Vol. >> III: MIPS32/microMIPS32 Privileged Resource Architecture (MD00088), rev >> 6.03 table 8.1 which includes: >> >> Producer | Consumer | Hazard >> ----------|----------|---------------------------- >> mtc0 | mfc0 | any coprocessor 0 register >> >> I saw this hazard on an Atheros AR9344 rev 2 SoC with a MIPS 74Kc CPU. >> There the change_c0_status() function would activate the DSPen by >> setting ST0_MX in the c0_status register. This was reverted and then the >> system got a DSP exception when the DSP registers were saved in >> save_dsp() in the first process switch. The crash looks like this: >> >> [ 0.089999] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear) >> [ 0.097796] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear) >> [ 0.107070] Kernel panic - not syncing: Unexpected DSP exception >> [ 0.113470] Rebooting in 1 seconds.. >> >> We saw this problem in OpenWrt only on the MIPS 74Kc based Atheros SoCs, >> not on the 24Kc based SoCs. We only saw it with kernel 5.4 not with >> kernel 4.19, in addition we had to use GCC 8.4 or 9.X, with GCC 8.3 it >> did not happen. >> >> In the kernel I bisected this problem to commit 9012d011660e ("compiler: >> allow all arches to enable CONFIG_OPTIMIZE_INLINING"), but when this was >> reverted it also happened after commit 172dcd935c34b ("MIPS: Always >> allocate exception vector for MIPSr2+"). >> >> Commit 0b24cae4d535 ("MIPS: Add missing EHB in mtc0 -> mfc0 sequence.") >> does similar changes to a different file. I am not sure if there are >> more places affected by this problem. >> >> Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> >> Cc: <stable@vger.kernel.org> >> --- >> arch/mips/kernel/traps.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c >> index 7c32c956156a..1234ea21dd8f 100644 >> --- a/arch/mips/kernel/traps.c >> +++ b/arch/mips/kernel/traps.c >> @@ -2169,6 +2169,7 @@ static void configure_status(void) >> >> change_c0_status(ST0_CU|ST0_MX|ST0_RE|ST0_FR|ST0_BEV|ST0_TS|ST0_KX|ST0_SX|ST0_UX, >> status_set); >> + back_to_back_c0_hazard(); >> } >> >> unsigned int hwrena; >> > The MIPS InterAptiv system software user manual suggests to put this > back_to_back_c0_hazard() close to the consumer. Should I move it there? I assume the suggestion was for performance concern. I'd prefer just put it after c0 write, so we can easily trace missing barriers. > Should I also add a back_to_back_c0_hazard() close to the other places > which could be affected by similar problems even when I do not see them > at runtime? That's preventing us from potential risks. Please do so. Thanks. > > Hauke > - Jiaxun
On 7/4/20 9:37 AM, Jiaxun Yang wrote: > > > 在 2020/7/4 上午6:31, Hauke Mehrtens 写道: >> On 7/3/20 12:53 AM, Hauke Mehrtens wrote: >>> This resolves the hazard between the mtc0 in the change_c0_status() and >>> the mfc0 in configure_exception_vector(). Without resolving this hazard >>> configure_exception_vector() could read an old value and would restore >>> this old value again. This would revert the changes change_c0_status() >>> did. I checked this by printing out the read_c0_status() at the end of >>> per_cpu_trap_init() and the ST0_MX is not set without this patch. >>> >>> The hazard is documented in the MIPS Architecture Reference Manual Vol. >>> III: MIPS32/microMIPS32 Privileged Resource Architecture (MD00088), rev >>> 6.03 table 8.1 which includes: >>> >>> Producer | Consumer | Hazard >>> ----------|----------|---------------------------- >>> mtc0 | mfc0 | any coprocessor 0 register >>> >>> I saw this hazard on an Atheros AR9344 rev 2 SoC with a MIPS 74Kc CPU. >>> There the change_c0_status() function would activate the DSPen by >>> setting ST0_MX in the c0_status register. This was reverted and then the >>> system got a DSP exception when the DSP registers were saved in >>> save_dsp() in the first process switch. The crash looks like this: >>> >>> [ 0.089999] Mount-cache hash table entries: 1024 (order: 0, 4096 >>> bytes, linear) >>> [ 0.097796] Mountpoint-cache hash table entries: 1024 (order: 0, >>> 4096 bytes, linear) >>> [ 0.107070] Kernel panic - not syncing: Unexpected DSP exception >>> [ 0.113470] Rebooting in 1 seconds.. >>> >>> We saw this problem in OpenWrt only on the MIPS 74Kc based Atheros SoCs, >>> not on the 24Kc based SoCs. We only saw it with kernel 5.4 not with >>> kernel 4.19, in addition we had to use GCC 8.4 or 9.X, with GCC 8.3 it >>> did not happen. >>> >>> In the kernel I bisected this problem to commit 9012d011660e ("compiler: >>> allow all arches to enable CONFIG_OPTIMIZE_INLINING"), but when this was >>> reverted it also happened after commit 172dcd935c34b ("MIPS: Always >>> allocate exception vector for MIPSr2+"). >>> >>> Commit 0b24cae4d535 ("MIPS: Add missing EHB in mtc0 -> mfc0 sequence.") >>> does similar changes to a different file. I am not sure if there are >>> more places affected by this problem. >>> >>> Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> >>> Cc: <stable@vger.kernel.org> >>> --- >>> arch/mips/kernel/traps.c | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c >>> index 7c32c956156a..1234ea21dd8f 100644 >>> --- a/arch/mips/kernel/traps.c >>> +++ b/arch/mips/kernel/traps.c >>> @@ -2169,6 +2169,7 @@ static void configure_status(void) >>> >>> change_c0_status(ST0_CU|ST0_MX|ST0_RE|ST0_FR|ST0_BEV|ST0_TS|ST0_KX|ST0_SX|ST0_UX, >>> >>> status_set); >>> + back_to_back_c0_hazard(); >>> } >>> unsigned int hwrena; >>> >> The MIPS InterAptiv system software user manual suggests to put this >> back_to_back_c0_hazard() close to the consumer. Should I move it there? > I assume the suggestion was for performance concern. > I'd prefer just put it after c0 write, so we can easily trace missing > barriers. Yes, I also think this is for performance reasons, but this is not in a hotpath. I would leave it as is now and do not plan to change this patch. >> Should I also add a back_to_back_c0_hazard() close to the other places >> which could be affected by similar problems even when I do not see them >> at runtime? > That's preventing us from potential risks. Please do so. I will send separate patches for these later. Hauke
On Fri, Jul 03, 2020 at 12:53:34AM +0200, Hauke Mehrtens wrote: > This resolves the hazard between the mtc0 in the change_c0_status() and > the mfc0 in configure_exception_vector(). Without resolving this hazard > configure_exception_vector() could read an old value and would restore > this old value again. This would revert the changes change_c0_status() > did. I checked this by printing out the read_c0_status() at the end of > per_cpu_trap_init() and the ST0_MX is not set without this patch. > > The hazard is documented in the MIPS Architecture Reference Manual Vol. > III: MIPS32/microMIPS32 Privileged Resource Architecture (MD00088), rev > 6.03 table 8.1 which includes: > > Producer | Consumer | Hazard > ----------|----------|---------------------------- > mtc0 | mfc0 | any coprocessor 0 register > > I saw this hazard on an Atheros AR9344 rev 2 SoC with a MIPS 74Kc CPU. > There the change_c0_status() function would activate the DSPen by > setting ST0_MX in the c0_status register. This was reverted and then the > system got a DSP exception when the DSP registers were saved in > save_dsp() in the first process switch. The crash looks like this: > > [ 0.089999] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear) > [ 0.097796] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear) > [ 0.107070] Kernel panic - not syncing: Unexpected DSP exception > [ 0.113470] Rebooting in 1 seconds.. > > We saw this problem in OpenWrt only on the MIPS 74Kc based Atheros SoCs, > not on the 24Kc based SoCs. We only saw it with kernel 5.4 not with > kernel 4.19, in addition we had to use GCC 8.4 or 9.X, with GCC 8.3 it > did not happen. > > In the kernel I bisected this problem to commit 9012d011660e ("compiler: > allow all arches to enable CONFIG_OPTIMIZE_INLINING"), but when this was > reverted it also happened after commit 172dcd935c34b ("MIPS: Always > allocate exception vector for MIPSr2+"). > > Commit 0b24cae4d535 ("MIPS: Add missing EHB in mtc0 -> mfc0 sequence.") > does similar changes to a different file. I am not sure if there are > more places affected by this problem. > > Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> > Cc: <stable@vger.kernel.org> > --- > arch/mips/kernel/traps.c | 1 + > 1 file changed, 1 insertion(+) applied to mips-fixes. Thomas.
diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c index 7c32c956156a..1234ea21dd8f 100644 --- a/arch/mips/kernel/traps.c +++ b/arch/mips/kernel/traps.c @@ -2169,6 +2169,7 @@ static void configure_status(void) change_c0_status(ST0_CU|ST0_MX|ST0_RE|ST0_FR|ST0_BEV|ST0_TS|ST0_KX|ST0_SX|ST0_UX, status_set); + back_to_back_c0_hazard(); } unsigned int hwrena;
This resolves the hazard between the mtc0 in the change_c0_status() and the mfc0 in configure_exception_vector(). Without resolving this hazard configure_exception_vector() could read an old value and would restore this old value again. This would revert the changes change_c0_status() did. I checked this by printing out the read_c0_status() at the end of per_cpu_trap_init() and the ST0_MX is not set without this patch. The hazard is documented in the MIPS Architecture Reference Manual Vol. III: MIPS32/microMIPS32 Privileged Resource Architecture (MD00088), rev 6.03 table 8.1 which includes: Producer | Consumer | Hazard ----------|----------|---------------------------- mtc0 | mfc0 | any coprocessor 0 register I saw this hazard on an Atheros AR9344 rev 2 SoC with a MIPS 74Kc CPU. There the change_c0_status() function would activate the DSPen by setting ST0_MX in the c0_status register. This was reverted and then the system got a DSP exception when the DSP registers were saved in save_dsp() in the first process switch. The crash looks like this: [ 0.089999] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear) [ 0.097796] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear) [ 0.107070] Kernel panic - not syncing: Unexpected DSP exception [ 0.113470] Rebooting in 1 seconds.. We saw this problem in OpenWrt only on the MIPS 74Kc based Atheros SoCs, not on the 24Kc based SoCs. We only saw it with kernel 5.4 not with kernel 4.19, in addition we had to use GCC 8.4 or 9.X, with GCC 8.3 it did not happen. In the kernel I bisected this problem to commit 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING"), but when this was reverted it also happened after commit 172dcd935c34b ("MIPS: Always allocate exception vector for MIPSr2+"). Commit 0b24cae4d535 ("MIPS: Add missing EHB in mtc0 -> mfc0 sequence.") does similar changes to a different file. I am not sure if there are more places affected by this problem. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Cc: <stable@vger.kernel.org> --- arch/mips/kernel/traps.c | 1 + 1 file changed, 1 insertion(+)