Message ID: 20200410015245.23230-1-xiexiuqi@huawei.com (mailing list archive)
State: New, archived
Series: arm64: panic on synchronous external abort in kernel context
On Fri, Apr 10, 2020 at 09:52:45AM +0800, Xie XiuQi wrote:
> We should panic even panic_on_oops is not set, when we can't recover
> from synchronous external abort in kernel context.
>
> Otherwise, there are two issues:
> 1) fallback to do_exit() in exception context, causing this core to hang up.
>    do_sea()
>      -> arm64_notify_die
>        -> die
>          -> do_exit
> 2) errors may be propagated.
>
> Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
> Cc: Xiaofei Tan <tanxiaofei@huawei.com>
> ---
>  arch/arm64/include/asm/esr.h | 12 ++++++++++++
>  arch/arm64/kernel/traps.c    |  2 ++
>  2 files changed, 14 insertions(+)
>
> diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
> index cb29253ae86b..acfc71c6d148 100644
> --- a/arch/arm64/include/asm/esr.h
> +++ b/arch/arm64/include/asm/esr.h
> @@ -326,6 +326,18 @@ static inline bool esr_is_data_abort(u32 esr)
>  	return ec == ESR_ELx_EC_DABT_LOW || ec == ESR_ELx_EC_DABT_CUR;
>  }
>
> +static inline bool esr_is_inst_abort(u32 esr)
> +{
> +	const u32 ec = ESR_ELx_EC(esr);
> +
> +	return ec == ESR_ELx_EC_IABT_LOW || ec == ESR_ELx_EC_IABT_CUR;
> +}
> +
> +static inline bool esr_is_ext_abort(u32 esr)
> +{
> +	return esr_is_data_abort(esr) || esr_is_inst_abort(esr);
> +}

A data abort or an instruction abort is not necessarily a synchronous
external abort, so this isn't right.

What exactly are you trying to catch here? If you are seeing a problem
in practice, can you please share your log from a crash?

Thanks,
Mark.

> +
>  const char *esr_get_class_string(u32 esr);
>  #endif /* __ASSEMBLY */
>
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index cf402be5c573..08f7f7688d5b 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -202,6 +202,8 @@ void die(const char *str, struct pt_regs *regs, int err)
>  		panic("Fatal exception in interrupt");
>  	if (panic_on_oops)
>  		panic("Fatal exception");
> +	if (esr_is_ext_abort(err))
> +		panic("Synchronous external abort in kernel context");
>
>  	raw_spin_unlock_irqrestore(&die_lock, flags);
>
> --
> 2.20.1
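Mark's objection is that the EC field of ESR_ELx only distinguishes *aborts*; whether an abort is an *external* one is encoded in the fault status code (DFSC/IFSC, the low six bits of ESR_ELx). A userspace sketch of the stricter check, with constants mirroring arch/arm64/include/asm/esr.h — note the real decode has further external-abort FSC encodings (e.g. on translation table walks), and `esr_is_sea()` here is a hypothetical helper written for this discussion, not an existing kernel function:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Field extractors and constants mirroring arch/arm64/include/asm/esr.h */
#define ESR_ELx_EC_SHIFT	26
#define ESR_ELx_EC_MASK		(0x3Fu << ESR_ELx_EC_SHIFT)
#define ESR_ELx_EC(esr)		(((esr) & ESR_ELx_EC_MASK) >> ESR_ELx_EC_SHIFT)

#define ESR_ELx_EC_IABT_LOW	0x20	/* instruction abort from lower EL */
#define ESR_ELx_EC_IABT_CUR	0x21	/* instruction abort from current EL */
#define ESR_ELx_EC_DABT_LOW	0x24	/* data abort from lower EL */
#define ESR_ELx_EC_DABT_CUR	0x25	/* data abort from current EL */

#define ESR_ELx_FSC		0x3Fu	/* DFSC/IFSC: fault status code */
#define ESR_ELx_FSC_EXTABT	0x10	/* synchronous external abort */

/* The check the patch implements: any data or instruction abort. */
static bool esr_is_abort(uint32_t esr)
{
	const uint32_t ec = ESR_ELx_EC(esr);

	return ec == ESR_ELx_EC_DABT_LOW || ec == ESR_ELx_EC_DABT_CUR ||
	       ec == ESR_ELx_EC_IABT_LOW || ec == ESR_ELx_EC_IABT_CUR;
}

/*
 * Only an abort whose FSC says "external" is actually a SEA; a plain
 * translation or permission fault has the same EC but a different FSC.
 */
static bool esr_is_sea(uint32_t esr)
{
	return esr_is_abort(esr) && (esr & ESR_ELx_FSC) == ESR_ELx_FSC_EXTABT;
}
```

The ESR later reported in this thread, 96000610, decodes to EC 0x25 (data abort, current EL) with FSC 0x10, so it would satisfy this stricter check, while an ordinary page fault would not.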
Hi Xie,

On 14/04/2020 11:59, Mark Rutland wrote:
> On Fri, Apr 10, 2020 at 09:52:45AM +0800, Xie XiuQi wrote:
>> We should panic even panic_on_oops is not set, when we can't recover
>> from synchronous external abort in kernel context.

Hmm, fault-from-kernel-context doesn't mean the fault affects the kernel. If the kernel is
reading or writing from user-space memory for a syscall, it's the user-space memory that is
affected. This thread can't make progress, so we kill it.
If it's a kernel thread or we were in irq context, we panic().

I don't think you really want all faults that happen as a result of a kernel access to be
fatal!

[...]

> What exactly are you trying to catch here? If you are seeing a problem
> in practice, can you please share your log from a crash?

Yes please!

I suspect you want to make memory_failure() smarter about faults that affect the kernel
text or data. If so, please do it in memory_failure() where it benefits all architectures,
and all methods of reporting errors.
(we may need a 'synchronous' hint to memory_failure(); it expects everything to be
asynchronous).

If it's not memory, we should extend the RAS handling to know what this error is, and that
it is fatal. (e.g. PE state is infected)

Thanks,

James
Hi Mark,

On 2020/4/14 18:59, Mark Rutland wrote:
> On Fri, Apr 10, 2020 at 09:52:45AM +0800, Xie XiuQi wrote:
>> We should panic even panic_on_oops is not set, when we can't recover
>> from synchronous external abort in kernel context.
>>
>> Otherwise, there are two issues:
>> 1) fallback to do_exit() in exception context, causing this core to hang up.
>>    do_sea()
>>      -> arm64_notify_die
>>        -> die
>>          -> do_exit
>> 2) errors may be propagated.
>>
>> Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
>> Cc: Xiaofei Tan <tanxiaofei@huawei.com>
>> ---
>>  arch/arm64/include/asm/esr.h | 12 ++++++++++++
>>  arch/arm64/kernel/traps.c    |  2 ++
>>  2 files changed, 14 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
>> index cb29253ae86b..acfc71c6d148 100644
>> --- a/arch/arm64/include/asm/esr.h
>> +++ b/arch/arm64/include/asm/esr.h
>> @@ -326,6 +326,18 @@ static inline bool esr_is_data_abort(u32 esr)
>>  	return ec == ESR_ELx_EC_DABT_LOW || ec == ESR_ELx_EC_DABT_CUR;
>>  }
>>
>> +static inline bool esr_is_inst_abort(u32 esr)
>> +{
>> +	const u32 ec = ESR_ELx_EC(esr);
>> +
>> +	return ec == ESR_ELx_EC_IABT_LOW || ec == ESR_ELx_EC_IABT_CUR;
>> +}
>> +
>> +static inline bool esr_is_ext_abort(u32 esr)
>> +{
>> +	return esr_is_data_abort(esr) || esr_is_inst_abort(esr);
>> +}
>
> A data abort or an instruction abort is not necessarily a synchronous
> external abort, so this isn't right.
>
> What exactly are you trying to catch here? If you are seeing a problem
> in practice, can you please share your log from a crash?

Yes, I met a problem in practice.
Tan Xiaofei reported this issue when doing RAS error injection testing:
1) panic_on_oops is not set;
2) trigger an SEA by injecting an ECC error;
3) a cpu core receives an SEA in kernel context:
   do_mem_abort
     -> arm64_notify_die
       -> die                # kernel context, call die() directly;
         -> do_exit          # kernel process context, call do_exit(SIGSEGV);
           -> do_task_dead() # call do_task_dead(), and hang up this core;

Actually, we should not call do_exit() for a kernel task, nor before returning from the
external abort. So I sent this patch intending to panic directly when receiving an SEA in
kernel context.

crash log:

NOTICE:  [TotemRasIntCpuNodeEri]:[1879L]
NOTICE:  [RasEriInterrupt]:[173L]NodeTYP2Status = 0x0
[ 387.740609] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 9
[ 387.748837] {1}[Hardware Error]: event severity: recoverable
[ 387.754470] {1}[Hardware Error]: Error 0, type: recoverable
[ 387.760103] {1}[Hardware Error]: section_type: ARM processor error
[ 387.766425] {1}[Hardware Error]: MIDR: 0x00000000481fd010
[ 387.771972] {1}[Hardware Error]: Multiprocessor Affinity Register (MPIDR): 0x0000000081080000
[ 387.780628] {1}[Hardware Error]: error affinity level: 0
[ 387.786088] {1}[Hardware Error]: running state: 0x1
[ 387.791115] {1}[Hardware Error]: Power State Coordination Interface state: 0
[ 387.798301] {1}[Hardware Error]: Error info structure 0:
[ 387.803761] {1}[Hardware Error]: num errors: 1
[ 387.808356] {1}[Hardware Error]: error_type: 0, cache error
[ 387.814160] {1}[Hardware Error]: error_info: 0x0000000024400017
[ 387.820311] {1}[Hardware Error]: transaction type: Instruction
[ 387.826461] {1}[Hardware Error]: operation type: Generic error (type cannot be determined)
[ 387.835031] {1}[Hardware Error]: cache level: 1
[ 387.839878] {1}[Hardware Error]: the error has been corrected
[ 387.845942] {1}[Hardware Error]: physical fault address: 0x00000027caf50770
[ 387.853162] Internal error: synchronous external abort: 96000610 [#1] PREEMPT SMP
[ 387.860611] Modules linked in: l1l2_inject(O) vfio_iommu_type1 vfio_pci vfio_virqfd vfio ib_ipoib ib_umad rpcrdma ib_iser libiscsi scsi_transport_iscsi hns_roce_hw_v2 crct10dif_ce ses hns3 hclge hnae3 hisi_trng_v2 rng_core hisi_zip hisi_sec2 hisi_hpre hisi_qm uacce hisi_sas_v3_hw hisi_sas_main libsas scsi_transport_sas
[ 387.888725] CPU: 0 PID: 940 Comm: kworker/0:3 Kdump: loaded Tainted: G O 5.5.0-rc4-g5993cbe #1
[ 387.898592] Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 2280-V2 CS V3.B210.01 03/12/2020
[ 387.907429] Workqueue: events error_inject [l1l2_inject]
[ 387.914098] pstate: 80c00009 (Nzcv daif +PAN +UAO)
[ 387.918867] pc : lsu_inj_ue+0x58/0x70 [l1l2_inject]
[ 387.925103] lr : error_inject+0x64/0xb0 [l1l2_inject]
[ 387.930132] sp : ffff800020af3d90
[ 387.933435] x29: ffff800020af3d90 x28: 0000000000000000
[ 387.938723] x27: ffff0027dae6e838 x26: ffff80001178bcc8
[ 387.945391] x25: 0000000000000000 x24: ffffa2c77162e090
[ 387.950680] x23: 0000000000000000 x22: ffff2027d7c33d40
[ 387.955968] x21: ffff2027d7c37a00 x20: ffff0027d679c000
[ 387.962636] x19: ffffa2c77162e088 x18: 0000000020cbf59a
[ 387.967924] x17: 000000000000000e x16: ffffa2c7b812bc98
[ 387.974592] x15: 0000000000000001 x14: 0000000000000000
[ 387.979880] x13: ffff2027cf75a780 x12: ffffa2c7b8299c18
[ 387.985168] x11: 0000000000000000 x10: 00000000000009f0
[ 387.991836] x9 : ffff800020af3d50 x8 : fefefefefefefeff
[ 387.997124] x7 : 0000000000000000 x6 : 001d4ed88e000000
[ 388.003792] x5 : 000073746e657665 x4 : 0000080110f81381
[ 388.009080] x3 : 000000000000002f x2 : 807fffffffffffff
[ 388.014369] x1 : ffffa2c77162c518 x0 : 0000000000000081
[ 388.021037] Call trace:
[ 388.023475]  lsu_inj_ue+0x58/0x70 [l1l2_inject]
[ 388.029019]  error_inject+0x64/0xb0 [l1l2_inject]
[ 388.033707]  process_one_work+0x158/0x4b8
[ 388.037699]  worker_thread+0x50/0x498
[ 388.041348]  kthread+0xfc/0x128
[ 388.044480]  ret_from_fork+0x10/0x1c
[ 388.048042] Code: b2790000 d519f780 f9800020 d5033f9f (58001001)
[ 388.054109] ---[ end trace 39d51c21b0e42ba6 ]---

core 0 hung up here.

> Thanks,
> Mark.
>
>> +
>>  const char *esr_get_class_string(u32 esr);
>>  #endif /* __ASSEMBLY */
>>
>> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
>> index cf402be5c573..08f7f7688d5b 100644
>> --- a/arch/arm64/kernel/traps.c
>> +++ b/arch/arm64/kernel/traps.c
>> @@ -202,6 +202,8 @@ void die(const char *str, struct pt_regs *regs, int err)
>>  		panic("Fatal exception in interrupt");
>>  	if (panic_on_oops)
>>  		panic("Fatal exception");
>> +	if (esr_is_ext_abort(err))
>> +		panic("Synchronous external abort in kernel context");
>>
>>  	raw_spin_unlock_irqrestore(&die_lock, flags);
>>
>> --
>> 2.20.1
Hi James,

On 2020/4/14 20:16, James Morse wrote:
> Hi Xie,
>
> On 14/04/2020 11:59, Mark Rutland wrote:
>> On Fri, Apr 10, 2020 at 09:52:45AM +0800, Xie XiuQi wrote:
>>> We should panic even panic_on_oops is not set, when we can't recover
>>> from synchronous external abort in kernel context.
>
> Hmm, fault-from-kernel-context doesn't mean the fault affects the kernel. If the kernel is
> reading or writing from user-space memory for a syscall, it's the user-space memory that is
> affected. This thread can't make progress, so we kill it.
> If it's a kernel thread or we were in irq context, we panic().
>
> I don't think you really want all faults that happen as a result of a kernel access to be
> fatal!

Yes, you're right. I just want to fix a hang during RAS error injection testing.

panic_on_oops is not set in the kernel image for testing. When receiving an SEA in kernel
context, the PE trap in do_exit(), and can't return any more.

I analyzed the source code; the call trace might look like this:
   do_mem_abort
     -> arm64_notify_die
       -> die                # kernel context, call die() directly;
         -> do_exit          # kernel process context, call do_exit(SIGSEGV);
           -> do_task_dead() # call do_task_dead(), and hang up this core;

> [...]
>
>> What exactly are you trying to catch here? If you are seeing a problem
>> in practice, can you please share your log from a crash?
>
> Yes please!
crash log:

NOTICE:  [TotemRasIntCpuNodeEri]:[1879L]
NOTICE:  [RasEriInterrupt]:[173L]NodeTYP2Status = 0x0
[ 387.740609] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 9
[ 387.748837] {1}[Hardware Error]: event severity: recoverable
[ 387.754470] {1}[Hardware Error]: Error 0, type: recoverable
[ 387.760103] {1}[Hardware Error]: section_type: ARM processor error
[ 387.766425] {1}[Hardware Error]: MIDR: 0x00000000481fd010
[ 387.771972] {1}[Hardware Error]: Multiprocessor Affinity Register (MPIDR): 0x0000000081080000
[ 387.780628] {1}[Hardware Error]: error affinity level: 0
[ 387.786088] {1}[Hardware Error]: running state: 0x1
[ 387.791115] {1}[Hardware Error]: Power State Coordination Interface state: 0
[ 387.798301] {1}[Hardware Error]: Error info structure 0:
[ 387.803761] {1}[Hardware Error]: num errors: 1
[ 387.808356] {1}[Hardware Error]: error_type: 0, cache error
[ 387.814160] {1}[Hardware Error]: error_info: 0x0000000024400017
[ 387.820311] {1}[Hardware Error]: transaction type: Instruction
[ 387.826461] {1}[Hardware Error]: operation type: Generic error (type cannot be determined)
[ 387.835031] {1}[Hardware Error]: cache level: 1
[ 387.839878] {1}[Hardware Error]: the error has been corrected
[ 387.845942] {1}[Hardware Error]: physical fault address: 0x00000027caf50770
[ 387.853162] Internal error: synchronous external abort: 96000610 [#1] PREEMPT SMP
[ 387.860611] Modules linked in: l1l2_inject(O) vfio_iommu_type1 vfio_pci vfio_virqfd vfio ib_ipoib ib_umad rpcrdma ib_iser libiscsi scsi_transport_iscsi hns_roce_hw_v2 crct10dif_ce ses hns3 hclge hnae3 hisi_trng_v2 rng_core hisi_zip hisi_sec2 hisi_hpre hisi_qm uacce hisi_sas_v3_hw hisi_sas_main libsas scsi_transport_sas
[ 387.888725] CPU: 0 PID: 940 Comm: kworker/0:3 Kdump: loaded Tainted: G O 5.5.0-rc4-g5993cbe #1
[ 387.898592] Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 2280-V2 CS V3.B210.01 03/12/2020
[ 387.907429] Workqueue: events error_inject [l1l2_inject]
[ 387.914098] pstate: 80c00009 (Nzcv daif +PAN +UAO)
[ 387.918867] pc : lsu_inj_ue+0x58/0x70 [l1l2_inject]
[ 387.925103] lr : error_inject+0x64/0xb0 [l1l2_inject]
[ 387.930132] sp : ffff800020af3d90
[ 387.933435] x29: ffff800020af3d90 x28: 0000000000000000
[ 387.938723] x27: ffff0027dae6e838 x26: ffff80001178bcc8
[ 387.945391] x25: 0000000000000000 x24: ffffa2c77162e090
[ 387.950680] x23: 0000000000000000 x22: ffff2027d7c33d40
[ 387.955968] x21: ffff2027d7c37a00 x20: ffff0027d679c000
[ 387.962636] x19: ffffa2c77162e088 x18: 0000000020cbf59a
[ 387.967924] x17: 000000000000000e x16: ffffa2c7b812bc98
[ 387.974592] x15: 0000000000000001 x14: 0000000000000000
[ 387.979880] x13: ffff2027cf75a780 x12: ffffa2c7b8299c18
[ 387.985168] x11: 0000000000000000 x10: 00000000000009f0
[ 387.991836] x9 : ffff800020af3d50 x8 : fefefefefefefeff
[ 387.997124] x7 : 0000000000000000 x6 : 001d4ed88e000000
[ 388.003792] x5 : 000073746e657665 x4 : 0000080110f81381
[ 388.009080] x3 : 000000000000002f x2 : 807fffffffffffff
[ 388.014369] x1 : ffffa2c77162c518 x0 : 0000000000000081
[ 388.021037] Call trace:
[ 388.023475]  lsu_inj_ue+0x58/0x70 [l1l2_inject]
[ 388.029019]  error_inject+0x64/0xb0 [l1l2_inject]
[ 388.033707]  process_one_work+0x158/0x4b8
[ 388.037699]  worker_thread+0x50/0x498
[ 388.041348]  kthread+0xfc/0x128
[ 388.044480]  ret_from_fork+0x10/0x1c
[ 388.048042] Code: b2790000 d519f780 f9800020 d5033f9f (58001001)
[ 388.054109] ---[ end trace 39d51c21b0e42ba6 ]---

core 0 hung up here.

> I suspect you want to make memory_failure() smarter about faults that affect the kernel
> text or data. If so, please do it in memory_failure() where it benefits all architectures,
> and all methods of reporting errors.
> (we may need a 'synchronous' hint to memory_failure(); it expects everything to be
> asynchronous).
>
> If it's not memory, we should extend the RAS handling to know what this error is, and that
> it is fatal. (e.g. PE state is infected)
>
> Thanks,
>
> James
Hi Xie,

On 14/04/2020 13:39, Xie XiuQi wrote:
> On 2020/4/14 20:16, James Morse wrote:
>> On 14/04/2020 11:59, Mark Rutland wrote:
>>> On Fri, Apr 10, 2020 at 09:52:45AM +0800, Xie XiuQi wrote:
>>>> We should panic even panic_on_oops is not set, when we can't recover
>>>> from synchronous external abort in kernel context.
>>
>> Hmm, fault-from-kernel-context doesn't mean the fault affects the kernel. If the kernel is
>> reading or writing from user-space memory for a syscall, it's the user-space memory that is
>> affected. This thread can't make progress, so we kill it.
>> If it's a kernel thread or we were in irq context, we panic().
>>
>> I don't think you really want all faults that happen as a result of a kernel access to be
>> fatal!
>
> Yes, you're right. I just want to fix a hang during RAS error injection testing.
>
> panic_on_oops is not set in the kernel image for testing. When receiving an SEA in kernel
> context, the PE trap in do_exit(), and can't return any more.

trap? gets trapped, (or gets stuck, to prevent confusion with the architecture's use of the
word 'trap'!)

> I analyzed the source code; the call trace might look like this:
> do_mem_abort
>   -> arm64_notify_die
>     -> die                # kernel context, call die() directly;
>       -> do_exit          # kernel process context, call do_exit(SIGSEGV);
>         -> do_task_dead() # call do_task_dead(), and hang up this core;

Thanks for the trace. This describes a corrected error in your I-cache, that occurred
while the kernel was executing a kernel thread. These shouldn't be fatal, because it was
corrected ... but the kernel doesn't know that because it doesn't know how to parse those
records.

There are two things wrong here:
1. it locks up while trying to kill the thread.
2. it tried to kill the thread in the first place!

For 1, does your l1l2_inject module take any spinlocks or tinker with the pre-empt counter?
I suspect this is some rarely-tested path in do_task_dead() that sleeps, but can't from
your l1l2_inject module because the pre-empt counter has been raised.

CONFIG_DEBUG_ATOMIC_SLEEP may point at the function to blame.

It may be accessing some thread data that kthreads don't have, taking a second exception
and blocking on the die_lock. LOCKDEP should catch this one.

We should fix this one first.

2 is a bit more complicated. Today, this is fatal because the arch code assumes this was
probably a memory error, and if it returns to user-space it can't avoid getting stuck in a
loop until the scheduled memory_failure() work runs. Instead it unconditionally signals
the process.

[0] fixes this up for memory errors. But in this case it will assume all the work has been
done by APEI, (or will be before we get back to user-space). APEI ignored the processor
error you fed it, because it doesn't know what they are; they are just printed out.

This is fine for corrected errors, but we are reliant on your firmware describing
uncorrected errors with a 'fatal' severity... which might be too heavy a hammer. (Ideally
that would mean 'uncontained', and the kernel should handle, or detect it can't, any other
error...)

We can discuss the right thing to do here when support for parsing these 'arm processor
errors' is posted.
(If you think I need to do something different in [0] because of this, please shout!)

> [ 387.740609] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 9
> [ 387.748837] {1}[Hardware Error]: event severity: recoverable
> [ 387.754470] {1}[Hardware Error]: Error 0, type: recoverable
> [ 387.760103] {1}[Hardware Error]: section_type: ARM processor error

et voila! Case 2. Linux doesn't handle these 'ARM processor error' things, because it
doesn't know what they are. It just prints them out.
> [ 387.766425] {1}[Hardware Error]: MIDR: 0x00000000481fd010
> [ 387.771972] {1}[Hardware Error]: Multiprocessor Affinity Register (MPIDR): 0x0000000081080000
> [ 387.780628] {1}[Hardware Error]: error affinity level: 0
> [ 387.786088] {1}[Hardware Error]: running state: 0x1
> [ 387.791115] {1}[Hardware Error]: Power State Coordination Interface state: 0
> [ 387.798301] {1}[Hardware Error]: Error info structure 0:
> [ 387.803761] {1}[Hardware Error]: num errors: 1
> [ 387.808356] {1}[Hardware Error]: error_type: 0, cache error
> [ 387.814160] {1}[Hardware Error]: error_info: 0x0000000024400017
> [ 387.820311] {1}[Hardware Error]: transaction type: Instruction
> [ 387.826461] {1}[Hardware Error]: operation type: Generic error (type cannot be determined)
> [ 387.835031] {1}[Hardware Error]: cache level: 1
> [ 387.839878] {1}[Hardware Error]: the error has been corrected

As this is corrected, I think the bug is a deadlock somewhere in do_task_dead().

> [ 387.845942] {1}[Hardware Error]: physical fault address: 0x00000027caf50770

(and your firmware gives you the physical address, excellent, the kernel can do something
with this!)

> [ 388.021037] Call trace:
> [ 388.023475]  lsu_inj_ue+0x58/0x70 [l1l2_inject]
> [ 388.029019]  error_inject+0x64/0xb0 [l1l2_inject]
> [ 388.033707]  process_one_work+0x158/0x4b8
> [ 388.037699]  worker_thread+0x50/0x498
> [ 388.041348]  kthread+0xfc/0x128
> [ 388.044480]  ret_from_fork+0x10/0x1c
> [ 388.048042] Code: b2790000 d519f780 f9800020 d5033f9f (58001001)
> [ 388.054109] ---[ end trace 39d51c21b0e42ba6 ]---
>
> core 0 hung up here.

DEBUG_ATOMIC_SLEEP and maybe LOCKDEP should help you pin down where the kernel is getting
stuck. It looks like a bug in the core code.

Thanks,

James

[0] https://lore.kernel.org/linux-acpi/20200228174817.74278-4-james.morse@arm.com/
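The debug options James suggests can be enabled with a config fragment like the one below. The symbol names are real Kconfig options (CONFIG_PROVE_LOCKING is the option that turns on lockdep); whether they catch this particular hang is, as James says, still a guess:

```kconfig
# Warn when a sleeping function is called in atomic context (e.g. with
# preemption disabled) - the suspected bug in the do_task_dead() path here.
CONFIG_DEBUG_ATOMIC_SLEEP=y

# Full lock dependency checking; selects CONFIG_LOCKDEP and would flag a
# second exception blocking on the already-held die_lock.
CONFIG_PROVE_LOCKING=y
```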
Hi James,

Sorry for the late reply.

On 2020/4/14 22:53, James Morse wrote:
> Hi Xie,
>
> On 14/04/2020 13:39, Xie XiuQi wrote:
>> On 2020/4/14 20:16, James Morse wrote:
>>> On 14/04/2020 11:59, Mark Rutland wrote:
>>>> On Fri, Apr 10, 2020 at 09:52:45AM +0800, Xie XiuQi wrote:
>>>>> We should panic even panic_on_oops is not set, when we can't recover
>>>>> from synchronous external abort in kernel context.
>>>
>>> Hmm, fault-from-kernel-context doesn't mean the fault affects the kernel. If the kernel is
>>> reading or writing from user-space memory for a syscall, it's the user-space memory that is
>>> affected. This thread can't make progress, so we kill it.
>>> If it's a kernel thread or we were in irq context, we panic().
>>>
>>> I don't think you really want all faults that happen as a result of a kernel access to be
>>> fatal!
>
>> Yes, you're right. I just want to fix a hang during RAS error injection testing.
>>
>> panic_on_oops is not set in the kernel image for testing. When receiving an SEA in kernel
>> context, the PE trap in do_exit(), and can't return any more.
>
> trap? gets trapped, (or gets stuck, to prevent confusion with the architecture's use of the
> word 'trap'!)
>
>> I analyzed the source code; the call trace might look like this:
>> do_mem_abort
>>   -> arm64_notify_die
>>     -> die                # kernel context, call die() directly;
>>       -> do_exit          # kernel process context, call do_exit(SIGSEGV);
>>         -> do_task_dead() # call do_task_dead(), and hang up this core;
>
> Thanks for the trace. This describes a corrected error in your I-cache, that occurred
> while the kernel was executing a kernel thread. These shouldn't be fatal, because it was
> corrected ... but the kernel doesn't know that because it doesn't know how to parse those
> records.
>
> There are two things wrong here:
> 1. it locks up while trying to kill the thread.
> 2. it tried to kill the thread in the first place!
>
> For 1, does your l1l2_inject module take any spinlocks or tinker with the pre-empt counter?
>
> I suspect this is some rarely-tested path in do_task_dead() that sleeps, but can't from
> your l1l2_inject module because the pre-empt counter has been raised.
>
> CONFIG_DEBUG_ATOMIC_SLEEP may point at the function to blame.
>
> It may be accessing some thread data that kthreads don't have, taking a second exception
> and blocking on the die_lock. LOCKDEP should catch this one.
>
> We should fix this one first.

I analyzed the l1l2_inject module; there is a kworker thread used to inject errors.
do_sea() tries to kill that kworker thread, so the work queued on this kworker is blocked.

The panic_on_oops option is usually set in production environments, so if someone unsets
this option for testing, they take that risk themselves. Is that right?

> 2 is a bit more complicated. Today, this is fatal because the arch code assumes this was
> probably a memory error, and if it returns to user-space it can't avoid getting stuck in a
> loop until the scheduled memory_failure() work runs. Instead it unconditionally signals
> the process.
>
> [0] fixes this up for memory errors. But in this case it will assume all the work has been
> done by APEI, (or will be before we get back to user-space). APEI ignored the processor
> error you fed it, because it doesn't know what they are; they are just printed out.
>
> This is fine for corrected errors, but we are reliant on your firmware describing
> uncorrected errors with a 'fatal' severity... which might be too heavy a hammer. (Ideally
> that would mean 'uncontained', and the kernel should handle, or detect it can't, any other
> error...)
>
> We can discuss the right thing to do here when support for parsing these 'arm processor
> errors' is posted.
> (If you think I need to do something different in [0] because of this, please shout!)

For some errors which could be recoverable or corrected, do_sea() should not kill the
process or die() unconditionally. It's better to detect this situation.
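The point above — that the handler's action should depend on severity and context rather than being unconditional — can be made concrete with a small illustrative decision table. This is a sketch for discussion only: the enum names and the policy function are invented here, not existing kernel symbols, and the real fix would live in the APEI/GHES record parsing James describes:

```c
#include <assert.h>

/* CPER-style severities, conceptually matching the GHES 'event severity'. */
enum sea_severity { SEV_CORRECTED, SEV_RECOVERABLE, SEV_FATAL };

/* Where the abort was taken. */
enum sea_context { CTX_USER, CTX_KERNEL_UACCESS, CTX_KERNEL };

enum sea_action { ACT_LOG_ONLY, ACT_KILL_THREAD, ACT_PANIC };

/*
 * Illustrative policy:
 *  - corrected errors are never fatal, regardless of context;
 *  - uncorrected errors that only affect one thread's memory (user
 *    context, or a kernel uaccess on that thread's behalf) kill the thread;
 *  - uncorrected errors in genuine kernel context (kthread/irq) panic,
 *    since falling into do_exit() there is what hung core 0 above.
 */
static enum sea_action sea_policy(enum sea_severity sev, enum sea_context ctx)
{
	if (sev == SEV_CORRECTED)
		return ACT_LOG_ONLY;
	if (sev == SEV_FATAL || ctx == CTX_KERNEL)
		return ACT_PANIC;
	return ACT_KILL_THREAD;
}
```

Under this table, the corrected I-cache error from the log would be logged and ignored instead of taking the fatal path that hung core 0.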
>
>> [ 387.740609] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 9
>> [ 387.748837] {1}[Hardware Error]: event severity: recoverable
>> [ 387.754470] {1}[Hardware Error]: Error 0, type: recoverable
>> [ 387.760103] {1}[Hardware Error]: section_type: ARM processor error
>
> et voila! Case 2. Linux doesn't handle these 'ARM processor error' things, because it
> doesn't know what they are. It just prints them out.
>
>> [ 387.766425] {1}[Hardware Error]: MIDR: 0x00000000481fd010
>> [ 387.771972] {1}[Hardware Error]: Multiprocessor Affinity Register (MPIDR): 0x0000000081080000
>> [ 387.780628] {1}[Hardware Error]: error affinity level: 0
>> [ 387.786088] {1}[Hardware Error]: running state: 0x1
>> [ 387.791115] {1}[Hardware Error]: Power State Coordination Interface state: 0
>> [ 387.798301] {1}[Hardware Error]: Error info structure 0:
>> [ 387.803761] {1}[Hardware Error]: num errors: 1
>> [ 387.808356] {1}[Hardware Error]: error_type: 0, cache error
>> [ 387.814160] {1}[Hardware Error]: error_info: 0x0000000024400017
>> [ 387.820311] {1}[Hardware Error]: transaction type: Instruction
>> [ 387.826461] {1}[Hardware Error]: operation type: Generic error (type cannot be determined)
>> [ 387.835031] {1}[Hardware Error]: cache level: 1
>> [ 387.839878] {1}[Hardware Error]: the error has been corrected
>
> As this is corrected, I think the bug is a deadlock somewhere in do_task_dead().
>
>> [ 387.845942] {1}[Hardware Error]: physical fault address: 0x00000027caf50770
>
> (and your firmware gives you the physical address, excellent, the kernel can do something
> with this!)
>
>> [ 388.021037] Call trace:
>> [ 388.023475]  lsu_inj_ue+0x58/0x70 [l1l2_inject]
>> [ 388.029019]  error_inject+0x64/0xb0 [l1l2_inject]
>> [ 388.033707]  process_one_work+0x158/0x4b8
>> [ 388.037699]  worker_thread+0x50/0x498
>> [ 388.041348]  kthread+0xfc/0x128
>> [ 388.044480]  ret_from_fork+0x10/0x1c
>> [ 388.048042] Code: b2790000 d519f780 f9800020 d5033f9f (58001001)
>> [ 388.054109] ---[ end trace 39d51c21b0e42ba6 ]---
>>
>> core 0 hung up here.
>
> DEBUG_ATOMIC_SLEEP and maybe LOCKDEP should help you pin down where the kernel is getting
> stuck. It looks like a bug in the core code.
>
> Thanks,
>
> James
>
> [0] https://lore.kernel.org/linux-acpi/20200228174817.74278-4-james.morse@arm.com/
diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index cb29253ae86b..acfc71c6d148 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -326,6 +326,18 @@ static inline bool esr_is_data_abort(u32 esr)
 	return ec == ESR_ELx_EC_DABT_LOW || ec == ESR_ELx_EC_DABT_CUR;
 }
 
+static inline bool esr_is_inst_abort(u32 esr)
+{
+	const u32 ec = ESR_ELx_EC(esr);
+
+	return ec == ESR_ELx_EC_IABT_LOW || ec == ESR_ELx_EC_IABT_CUR;
+}
+
+static inline bool esr_is_ext_abort(u32 esr)
+{
+	return esr_is_data_abort(esr) || esr_is_inst_abort(esr);
+}
+
 const char *esr_get_class_string(u32 esr);
 #endif /* __ASSEMBLY */
 
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index cf402be5c573..08f7f7688d5b 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -202,6 +202,8 @@ void die(const char *str, struct pt_regs *regs, int err)
 		panic("Fatal exception in interrupt");
 	if (panic_on_oops)
 		panic("Fatal exception");
+	if (esr_is_ext_abort(err))
+		panic("Synchronous external abort in kernel context");
 
 	raw_spin_unlock_irqrestore(&die_lock, flags);
We should panic even when panic_on_oops is not set, when we can't recover
from a synchronous external abort in kernel context.

Otherwise, there are two issues:
1) fallback to do_exit() in exception context, causing this core to hang up.
   do_sea()
     -> arm64_notify_die
       -> die
         -> do_exit
2) errors may be propagated.

Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Cc: Xiaofei Tan <tanxiaofei@huawei.com>
---
 arch/arm64/include/asm/esr.h | 12 ++++++++++++
 arch/arm64/kernel/traps.c    |  2 ++
 2 files changed, 14 insertions(+)