Message ID | 1448279946-19975-1-git-send-email-nyushchenko@dev.rtsoft.ru (mailing list archive) |
---|---|
State | New, archived |
On Mon, Nov 23, 2015 at 02:59:06PM +0300, nyushchenko@dev.rtsoft.ru wrote:
> From: Nikita Yushchenko <nyushchenko@dev.rtsoft.ru>
>
> While running an imx6s board, I got following message in boot log:
>
> [ 0.032414] CPU1: failed to boot: -38
>
> This looked strange: imx6s is single-core and kernel perfectly knows
> that. However, for some reason it tries to initialize CPU 1?
>
> I found this to be caused by
> - CONFIG_SMP_ON_UP successfully detects that system is single core,
> - this causes is_smp() to return false,
> - this causes setup_arch() to skip smp_init_cpus() call,
> - this skips board-specific code that sets cpu_possible mask.

Right, so you should end up with the possible and present masks
containing just one CPU, which should prevent the kernel trying to
bring any secondary CPUs online.

23.11.2015 15:03, Russell King - ARM Linux wrote:
> Right, so you should end up with the possible and present masks
> containing just one CPU, which should prevent the kernel trying to
> bring any secondary CPUs online.

Kernel that is running here still tries to init CPU 1 for some reason.

Will try to check mainline (although not sure if that will be possible
on available custom hardware)

On Mon, Nov 23, 2015 at 03:06:52PM +0300, Nikita Yushchenko wrote:
> Kernel that is running here still tries to init CPU 1 for some reason.
>
> Will try to check mainline (although not sure if that will be possible
> on available custom hardware)

iMX6 is fairly well supported in mainline. The only reason to use a
custom kernel is if you want to use some feature which mainline does
not support (or support very well) such as video decode, the full IPU
facilities, GPUs or CEC (sorry, I don't have an expansive list.)

The GPU problem for the GC320/GC880/GC2000 is fairly close to being
solved in a functional (but maybe not yet performant) manner.

On Mon, Nov 23, 2015 at 12:12:16PM +0000, Russell King - ARM Linux wrote:
> iMX6 is fairly well supported in mainline. The only reason to use a
> custom kernel is if you want to use some feature which mainline does
> not support (or support very well) such as video decode, the full IPU
> facilities, GPUs or CEC (sorry, I don't have an expansive list.)
>
> The GPU problem for the GC320/GC880/GC2000 is fairly close to being
> solved in a functional (but maybe not yet performant) manner.

For reference, iMX6S in mainline behaves like this:

Calibrating delay loop (skipped), value calculated using timer frequency.. 6.00 BogoMIPS (lpj=12000)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
Initializing cgroup subsys net_cls
CPU: Testing write buffer coherency: ok
CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
Setting up static identity map for 0x100082c0 - 0x10008318
Brought up 1 CPUs
SMP: Total of 1 processors activated (6.00 BogoMIPS).
CPU: All CPU(s) started in SVC mode.

On 23/11/15 12:06, Nikita Yushchenko wrote:
> Kernel that is running here still tries to init CPU 1 for some reason.

I *guess* cpus node [1] in your dts has more than one cpu entry, could
you check please?

[1] Documentation/devicetree/bindings/arm/cpus.txt

Vladimir

> I *guess* cpus node [1] in your dts has more than one cpu entry, could
> you check please?

Indeed looks so:

# ls /proc/device-tree/cpus
#address-cells #size-cells cpu@0 cpu@1 name

But my custom device tree just includes imx6dl.dtsi

So is imx6dl.dtsi in linux-imx tree broken?..

Still, if I apply change from the patch, issue disappears, since in this
case imx_smp_init_cpus() gets called and initializes cpu_possible mask
properly.

23.11.2015 15:19, Russell King - ARM Linux wrote:
> For reference, iMX6S in mainline behaves like this:
>
> Calibrating delay loop (skipped), value calculated using timer frequency.. 6.00 BogoMIPS (lpj=12000)
> pid_max: default: 32768 minimum: 301
> Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
> Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
> Initializing cgroup subsys net_cls
> CPU: Testing write buffer coherency: ok
> CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
> Setting up static identity map for 0x100082c0 - 0x10008318
> Brought up 1 CPUs
> SMP: Total of 1 processors activated (6.00 BogoMIPS).
> CPU: All CPU(s) started in SVC mode.

Indeed, somehow booted mainline and it does not try to initialize CPU 1.

> Indeed looks so:
>
> # ls /proc/device-tree/cpus
> #address-cells #size-cells cpu@0 cpu@1 name
>
> But my custom device tree just includes imx6dl.dtsi
>
> So is imx6dl.dtsi in linux-imx tree broken?..

Just booted mainline... unlike linux-imx, it does not try to init cpu1.

However, imx6dl.dtsi from mainline also has both cpu@0 and cpu@1

So missing piece in linux-imx is elsewhere :(

On Mon, Nov 23, 2015 at 03:47:34PM +0300, Nikita Yushchenko wrote:
> Just booted mainline... unlike linux-imx, it does not try to init cpu1.
>
> However, imx6dl.dtsi from mainline also has both cpu@0 and cpu@1
>
> So missing piece in linux-imx is elsewhere :(

It works as you mentioned - and it relies upon the code you tried to
modify.

The early boot code detects that the boot CPU is not SMP capable, so
through SMP_ON_UP, it "turns off" SMP support by fixing up the code
and making is_smp() return false.

This prevents smp_init_cpus() being called, which in turn prevents
imx_smp_init_cpus() executing, which prevents the CPU possible mask
including any CPU but the boot CPU.

As only the boot CPU is possible, this prevents the SMP code trying
to bring any secondary CPUs online.

Applying your patch which removes the is_smp() check will break this logic.

> It works as you mentioned - and it relies upon the code you tried to
> modify.
>
> The early boot code detects that the boot CPU is not SMP capable, so
> through SMP_ON_UP, it "turns off" SMP support by fixing up the code
> and making is_smp() return false.
>
> This prevents smp_init_cpus() being called, which in turn prevents
> imx_smp_init_cpus() executing, which prevents the CPU possible mask
> including any CPU but the boot CPU.
>
> As only the boot CPU is possible, this prevents the SMP code trying
> to bring any secondary CPUs online.

I'm still trying to understand what is going on, and my printk()s show
that this is not entirely true.

When smp_init() is entered on mainline on imx6s, cpu_possible_mask and
cpu_present_mask both contain two cpus. These get initialized in
arm_dt_init_cpu_maps() and stay unmodified since then.

But cpu_online() returns 1 for cpu0 and 0 for cpu1 - thus it is
cpu_online() check, not possible_mask or present_mask, that prevents
cpu1 initialization attempt.

Not sure I understand logic behind this.

With the current code, resulting cpu_possible_mask depends on
CONFIG_SMP_ON_UP:
- if it is set, cpu_possible_mask contains (0 1), as initialized in
  arm_dt_init_cpu_maps()
- if it is not set, cpu_possible_mask contains (0), since
  imx_smp_init_cpus() removes 1 from there.

This does not seem to be intended difference.

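The dependence on CONFIG_SMP_ON_UP described above can be sketched as a small userspace model. This is not kernel code: every `model_` name is hypothetical, the bitmask is a toy stand-in for `struct cpumask`, and the trimming helper only imitates what the message says imx_smp_init_cpus() does (remove CPUs that do not exist from the possible mask).

```c
#include <assert.h>
#include <stdbool.h>

/* Bit n set means CPU n is in the mask (toy stand-in for struct cpumask). */
typedef unsigned int model_cpumask;

/* Stand-in for arm_dt_init_cpu_maps(): one bit per cpu@N node in the DT. */
static model_cpumask model_dt_init_cpu_maps(unsigned int dt_cpu_nodes)
{
	assert(dt_cpu_nodes < 32);	/* keep the shift well defined */
	return (1u << dt_cpu_nodes) - 1u;
}

/* Stand-in for imx_smp_init_cpus(): drop CPUs beyond the real core count. */
static model_cpumask model_platform_trim(model_cpumask possible,
					 unsigned int hw_cores)
{
	assert(hw_cores < 32);
	return possible & ((1u << hw_cores) - 1u);
}

/*
 * Model of the CONFIG_SMP branch in setup_arch(): the platform hook only
 * runs when is_smp() is true.
 */
static model_cpumask model_setup_arch(unsigned int dt_cpu_nodes,
				      unsigned int hw_cores, bool smp_on_up)
{
	model_cpumask possible = model_dt_init_cpu_maps(dt_cpu_nodes);
	/* With SMP_ON_UP, a uniprocessor boot CPU makes is_smp() false. */
	bool is_smp = !(smp_on_up && hw_cores == 1);

	if (is_smp)
		possible = model_platform_trim(possible, hw_cores);
	return possible;
}
```

For the case in this thread (two cpu nodes in the device tree, one real core), `model_setup_arch(2, 1, true)` leaves both bits set while `model_setup_arch(2, 1, false)` leaves only bit 0, which is the asymmetry pointed out above.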
> But cpu_online() returns 1 for cpu0 and 0 for cpu1 - thus it is
> cpu_online() check, not possible_mask or present_mask, that prevents
> cpu1 initialization attempt.

Sorry was too quick to type.

cpu_online(0) is true and cpu_online(1) is false.
It is natural, since cpu0 is already running.
Thus cpu_up(1) is entered!

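Why cpu_up(1) ends up being entered can be illustrated with a rough model of the loop in smp_init(). This is a simplified sketch under assumptions: the `model_` names are hypothetical, the setup_max_cpus limit is left out, and plain bool arrays replace the kernel's cpumasks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4

/* Toy CPU state; the kernel keeps these as cpumasks. */
struct model_cpus {
	bool present[MODEL_NR_CPUS];
	bool online[MODEL_NR_CPUS];
};

/*
 * Returns a bitmask of the CPUs for which cpu_up() would be attempted,
 * following the shape of the smp_init() loop: walk the present CPUs and
 * try to bring up each one that is not online yet.
 */
static unsigned int model_smp_init(const struct model_cpus *c)
{
	unsigned int attempted = 0;

	assert(c != NULL);
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (!c->present[cpu])
			continue;		/* not visited by for_each_present_cpu() */
		if (!c->online[cpu])
			attempted |= 1u << cpu;	/* cpu_up(cpu) would be called */
	}
	return attempted;
}
```

With CPUs 0 and 1 present and only CPU 0 online, the model reports an attempt for CPU 1. The online check only skips CPUs that are already running, so it cannot filter out a CPU that does not exist.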
24.11.2015 18:05, Nikita Yushchenko wrote:
> Sorry was too quick to type.
>
> cpu_online(0) is true and cpu_online(1) is false.
> It is natural, since cpu0 is already running.
> Thus cpu_up(1) is entered!

... and then code executes into __cpu_up() from arch/arm/kernel/smp.c,
and stops via

	if (!smp_ops.smp_boot_secondary)
		return -ENOSYS;

(smp_ops zeroed due to SMP_ON_UP, as far as I understand).

In linux-imx 3.14.28 based tree, there is no such check in __cpu_up,
thus boot_secondary() is called

int boot_secondary(unsigned int cpu, struct task_struct *idle)
{
	if (smp_ops.smp_boot_secondary)
		return smp_ops.smp_boot_secondary(cpu, idle);
	return -ENOSYS;
}

at this point zeroed smp_ops plays, -ENOSYS (-38) is returned, and
pr_err() in __cpu_up() prints the message that caused the entire analysis.

So conclusion is that
- behaviour of mainline and linux-imx trees is almost the same, there is
  attempt to bring up non-existing cpu 1, difference is only in where
  zeroed smp_ops is detected and if error is logged or not.

Not sure that my proposed patch was correct, it fixes imx6s case but can
have bad effect on other arm targets.

But I think that something needs to be done to make cpu masks correct in
SMP_ON_UP case.

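The two code paths compared in this message can be put side by side in a userspace sketch. It is only a model of the description above, not the real __cpu_up(): the `model_` names and the one-member ops struct are hypothetical, the `idle` task argument is dropped, and `fprintf()` stands in for pr_err().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Toy version of struct smp_operations with only the hook discussed here. */
struct model_smp_ops {
	int (*smp_boot_secondary)(unsigned int cpu);
};

/* Mirrors the boot_secondary() wrapper quoted from the linux-imx tree. */
static int model_boot_secondary(const struct model_smp_ops *ops,
				unsigned int cpu)
{
	assert(ops != NULL);
	if (ops->smp_boot_secondary)
		return ops->smp_boot_secondary(cpu);
	return -ENOSYS;
}

/* linux-imx style: call the wrapper, then log any failure. */
static int model_cpu_up_vendor(const struct model_smp_ops *ops,
			       unsigned int cpu)
{
	int ret = model_boot_secondary(ops, cpu);

	if (ret)
		fprintf(stderr, "CPU%u: failed to boot: %d\n", cpu, ret);
	return ret;
}

/* mainline style: bail out early, without logging, when no hook is set. */
static int model_cpu_up_mainline(const struct model_smp_ops *ops,
				 unsigned int cpu)
{
	assert(ops != NULL);
	if (!ops->smp_boot_secondary)
		return -ENOSYS;
	return model_boot_secondary(ops, cpu);
}
```

With a zeroed ops struct both variants return -ENOSYS for CPU 1, and only the vendor-style path prints a message. In both cases the bring-up attempt itself still happens, which matches the conclusion above.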
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 20edd34..8a14fce 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -980,16 +980,14 @@ void __init setup_arch(char **cmdline_p)
 	psci_dt_init();
 	xen_early_init();
 #ifdef CONFIG_SMP
-	if (is_smp()) {
-		if (!mdesc->smp_init || !mdesc->smp_init()) {
-			if (psci_smp_available())
-				smp_set_ops(&psci_smp_ops);
-			else if (mdesc->smp)
-				smp_set_ops(mdesc->smp);
-		}
-		smp_init_cpus();
-		smp_build_mpidr_hash();
+	if (!mdesc->smp_init || !mdesc->smp_init()) {
+		if (psci_smp_available())
+			smp_set_ops(&psci_smp_ops);
+		else if (mdesc->smp)
+			smp_set_ops(mdesc->smp);
 	}
+	smp_init_cpus();
+	smp_build_mpidr_hash();
 #endif
 
 	if (!is_smp())