Message ID | 1388944697-19927-5-git-send-email-hauke@hauke-m.de (mailing list archive) |
---|---|
State | New, archived |
On Sunday 05 January 2014, Hauke Mehrtens wrote:
> Without this patch I am getting an unhandled fault exception like this
> one after "Freeing unused kernel memory":
>
> Freeing unused kernel memory: 1260K (c02c1000 - c03fc000)
> Unhandled fault: imprecise external abort (0x1c06) at 0xb6f89005
> Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000007
>
> The address which is here 0xb6f89005 changes from boot to boot, with a
> new build the changes are bigger. With kernel 3.10 I have also seen
> this fault at different places in the boot process, but starting with
> 3.11 they are always occurring after the "Freeing unused kernel memory"
> message. I never was able to completely boot to userspace without this
> handler. The abort code is constant 0x1c06. This fault just happens
> once in the boot process; I have never seen it happening twice or more.

How about narrowing down the abort handler to only ignore a single
fault after boot, and only with the abort code 0x1c06? That way you
don't risk silent data corruption in case something else goes wrong
after booting.

> This workaround was copied from the vendor code including most of the
> comments. It says they think this is caused by the CFE boot loader
> used on this device. I do not have any access to any datasheet or
> errata document to check this.

Does the SoC by chance have a PCI host controller? The only other
platforms with this kind of handler have it to catch things going wrong
with PCI. Maybe another thing to try is to turn off the PCI core
at early boot.

Arnd

On 01/05/2014 09:25 PM, Arnd Bergmann wrote:
> On Sunday 05 January 2014, Hauke Mehrtens wrote:
>> Without this patch I am getting an unhandled fault exception like this
>> one after "Freeing unused kernel memory":
>>
>> Freeing unused kernel memory: 1260K (c02c1000 - c03fc000)
>> Unhandled fault: imprecise external abort (0x1c06) at 0xb6f89005
>> Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000007
>>
>> The address which is here 0xb6f89005 changes from boot to boot, with a
>> new build the changes are bigger. With kernel 3.10 I have also seen
>> this fault at different places in the boot process, but starting with
>> 3.11 they are always occurring after the "Freeing unused kernel memory"
>> message. I never was able to completely boot to userspace without this
>> handler. The abort code is constant 0x1c06. This fault just happens
>> once in the boot process; I have never seen it happening twice or more.
>
> How about narrowing down the abort handler to only ignore a single
> fault after boot, and only with the abort code 0x1c06? That way you
> don't risk silent data corruption in case something else goes wrong
> after booting.

OK, I extended bcm5301x_abort_handler() to only ignore the fault when
the code is 0x1c06 and it is the first fault.

>> This workaround was copied from the vendor code including most of the
>> comments. It says they think this is caused by the CFE boot loader
>> used on this device. I do not have any access to any datasheet or
>> errata document to check this.
>
> Does the SoC by chance have a PCI host controller? The only other
> platforms with this kind of handler have it to catch things going wrong
> with PCI. Maybe another thing to try is to turn off the PCI core
> at early boot.

Yes, it has two PCIe host controllers. I haven't tried to initialize the
PCIe core. It could be that the boot loader did something with the PCIe
controller, but I do not think so.

Hauke
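
The narrowing discussed in this reply (ignore only the first fault, and only when the abort code is 0x1c06) might look roughly like the user-space model below. This is a sketch under assumptions, not the code from the follow-up patch: the names `BCM5301X_BOOT_ABORT_FSR`, `first_abort_seen` and `abort_handler_model()` are hypothetical, and the real handler would keep the `(addr, fsr, regs)` signature and the `pr_warn()` call from the patch.

```c
#include <stdbool.h>

/* Assumed constant: the abort code printed in the panic log (0x1c06). */
#define BCM5301X_BOOT_ABORT_FSR 0x1c06u

/* Hypothetical state: true once one matching abort has been ignored. */
static bool first_abort_seen;

/*
 * Model of the narrowed handler. Returns 0 when the fault is ignored and
 * non-zero when it should be reported, which in the kernel leads to the
 * fault display and panic.
 */
static int abort_handler_model(unsigned int fsr)
{
	/* Any other abort code is never ignored. */
	if (fsr != BCM5301X_BOOT_ABORT_FSR)
		return 1;

	/* Only the first matching abort is ignored. */
	if (first_abort_seen)
		return 1;

	first_abort_seen = true;
	return 0;
}
```

With this shape, a second 0x1c06 abort or any abort with a different code still falls through to the normal unhandled-fault path, which addresses the concern about silently masking later errors.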

diff --git a/arch/arm/mach-bcm/bcm_5301x.c b/arch/arm/mach-bcm/bcm_5301x.c
index 4b83b52..30dcfea 100644
--- a/arch/arm/mach-bcm/bcm_5301x.c
+++ b/arch/arm/mach-bcm/bcm_5301x.c
@@ -12,6 +12,28 @@
 #include <asm/hardware/cache-l2x0.h>
 
 #include <asm/mach/arch.h>
+#include <asm/signal.h>
+
+static int bcm5301x_abort_handler(unsigned long addr, unsigned int fsr,
+				  struct pt_regs *regs)
+{
+	/*
+	 * These happen for no good reason, possibly left over from the CFE
+	 * boot loader.
+	 */
+	pr_warn("External imprecise Data abort at addr=%#lx, fsr=%#x ignored.\n",
+		addr, fsr);
+
+	/* Returning non-zero causes fault display and panic */
+	return 0;
+}
+
+static void __init bcm5301x_init_early(void)
+{
+	/* Install our hook */
+	hook_fault_code(16 + 6, bcm5301x_abort_handler, SIGBUS, 0,
+			"imprecise external abort");
+}
 
 static void __init bcm5301x_timer_init(void)
 {
@@ -31,6 +53,7 @@ static const char __initconst *bcm5301x_dt_compat[] = {
 };
 
 DT_MACHINE_START(BCM5301X, "BCM5301X")
+	.init_early	= bcm5301x_init_early,
 	.init_time	= bcm5301x_timer_init,
 	.init_machine	= bcm5301x_dt_init,
 	.dt_compat	= bcm5301x_dt_compat,

Without this patch I am getting an unhandled fault exception like this
one after "Freeing unused kernel memory":

Freeing unused kernel memory: 1260K (c02c1000 - c03fc000)
Unhandled fault: imprecise external abort (0x1c06) at 0xb6f89005
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000007

The address which is here 0xb6f89005 changes from boot to boot, with a
new build the changes are bigger. With kernel 3.10 I have also seen
this fault at different places in the boot process, but starting with
3.11 they are always occurring after the "Freeing unused kernel memory"
message. I never was able to completely boot to userspace without this
handler. The abort code is constant 0x1c06. This fault just happens
once in the boot process; I have never seen it happening twice or more.

I also tried changing the CPSR.A bit to 0 in init_early, with this code
like Afzal suggested, but that did not change anything:

	asm volatile("mrs r12, cpsr\n"
		"bic r12, r12, #0x00000100\n"
		"msr cpsr_c, r12" ::: "r12", "cc", "memory");

Disabling the L2 cache by building with CONFIG_CACHE_L2X0 unset did not
help.

This workaround was copied from the vendor code including most of the
comments. It says they think this is caused by the CFE boot loader
used on this device. I do not have any access to any datasheet or
errata document to check this.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
---
 arch/arm/mach-bcm/bcm_5301x.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)
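
For readers wondering how the abort code 0x1c06 from the log relates to the index `16 + 6` passed to `hook_fault_code()`, the sketch below decodes the fault status field the way the short-descriptor FSR layout is commonly described: FS[3:0] in bits 3..0 and FS[4] in bit 10. The helper name `fsr_fs_index()` and the macro name `FSR_EXT` are assumptions made for this example; the bit positions should be checked against the ARM architecture manual and `arch/arm/mm/fault.h` before relying on them.

```c
/* Short-descriptor FSR fields (layout assumed as described above). */
#define FSR_FS3_0	0x00fu		/* FS[3:0], bits 3..0 */
#define FSR_FS4		(1u << 10)	/* FS[4], bit 10 */
#define FSR_WRITE	(1u << 11)	/* WnR: abort caused by a write */
#define FSR_EXT		(1u << 12)	/* ExT: external abort type (name assumed) */

/*
 * Fold FS[4] and FS[3:0] into a 5-bit index, which is the kind of value
 * used as the first argument of hook_fault_code().
 */
static unsigned int fsr_fs_index(unsigned int fsr)
{
	return (fsr & FSR_FS3_0) | ((fsr & FSR_FS4) >> 6);
}
```

For fsr = 0x1c06, the low nibble is 6 and bit 10 is set, so the index is 16 + 6 = 22, matching the hook installed by the patch. Bits 11 and 12 are also set in 0x1c06, but they do not take part in the index.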