
Armada XP (MV78460): BUG in netdevice.h with maxcpus=2

Message ID 20160108182537.0a68630c@xhacker (mailing list archive)
State New, archived

Commit Message

Jisheng Zhang Jan. 8, 2016, 10:25 a.m. UTC
Hi Stefan,

On Fri, 8 Jan 2016 08:50:31 +0100 Stefan Roese wrote:

> Hi Thomas,
> 
> I'm currently trying to boot latest Linux on the DB-MV784MP-GP
> Marvell board. This works just fine for the normal
> configuration. But when I try to boot with only 2 CPUs enabled
> (maxcpus=2), mounting the rootfs via NFS crashes with this
> log:
> 
> [   16.408285] ------------[ cut here ]------------
> [   16.412915] kernel BUG at include/linux/netdevice.h:507!
> [   16.418239] Internal error: Oops - BUG: 0 [#1] SMP ARM
> [   16.423390] Modules linked in:
> [   16.426464] CPU: 0 PID: 1424 Comm: systemd-udevd Not tainted 4.4.0-rc8-00045-gcbf389a #37
> [   16.434660] Hardware name: Marvell Armada 370/XP (Device Tree)
> [   16.440507] task: ee9d3100 ti: ee164000 task.ti: ee164000
> [   16.445927] PC is at mvneta_percpu_notifier+0x358/0x35c
> [   16.451167] LR is at mvneta_percpu_notifier+0x27c/0x35c
> [   16.456405] pc : [<c0310d60>]    lr : [<c0310c84>]    psr: 60010013
> [   16.456405] sp : ee165e18  ip : 00000003  fp : c077c494
> [   16.467914] r10: c077cd58  r9 : 00000002  r8 : c077ce08
> [   16.473154] r7 : eea8a520  r6 : ff7ea400  r5 : 00000004  r4 : 00000000
> [   16.479697] r3 : 00000000  r2 : 00000008  r1 : 00000004  r0 : 00000004
> [   16.486243] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> [   16.493402] Control: 10c5387d  Table: 2e16806a  DAC: 00000051
> [   16.499168] Process systemd-udevd (pid: 1424, stack limit = 0xee164220)
> [   16.505801] Stack: (0xee165e18 to 0xee166000)
> [   16.510170] 5e00:                                                       c07b89c0 eea8a500
> [   16.518373] 5e20: 00000002 ffffffdb 00000002 00000002 00000000 c0782aa0 00000000 ee82a5c0
> [   16.526575] 5e40: 00000000 c003fb44 00000000 00000000 c077c494 00000000 00000001 c0024f68
> [   16.534774] 5e60: 00000000 00000000 00000002 c0025574 ee165e7c 00000022 00000000 00000028
> [   16.542977] 5e80: 00000002 eefca010 eefca044 ee165f88 edc61240 b6f07000 00000000 c00255fc
> [   16.551178] 5ea0: c02907fc c028c520 eefca010 00000001 00000001 c028c5ac 00000000 01eada00
> [   16.559381] 5ec0: 00000001 edeada00 edc6124c c01324f0 00000000 00000000 00000800 ee9c6f40
> [   16.567582] 5ee0: c0132438 b6f07000 ee165f88 c000f7c4 ee164000 00000000 00039388 c00d6bd0
> [   16.575782] 5f00: 00000001 00000000 0002b2cc c00092f4 00001000 00001000 b6f07000 00000003
> [   16.583981] 5f20: 00000000 edc22580 00000001 c077ce2c 00000073 c00c3994 000b6f07 00000000
> [   16.592180] 5f40: beef1270 ffffffff 48400000 ee9c6f40 00000001 b6f07000 ee165f88 c000f7c4
> [   16.600385] 5f60: ee164000 c00d7414 0002b2cc c00b3a88 ee9c6f40 ee9c6f40 b6f07000 00000001
> [   16.608588] 5f80: c000f7c4 c00d7c38 00000000 00000000 48400000 00000001 b6f07000 00046c70
> [   16.616797] 5fa0: 00000004 c000f600 00000001 b6f07000 00000005 b6f07000 00000001 00000000
> [   16.625003] 5fc0: 00000001 b6f07000 00046c70 00000004 00000001 00000000 00039860 00039388
> [   16.633208] 5fe0: 00000000 beef181c b6d6bf81 b6da5066 400d0030 00000005 00180011 000022be
> [   16.641431] [<c0310d60>] (mvneta_percpu_notifier) from [<c003fb44>] (notifier_call_chain+0x44/0x84)
> [   16.650513] [<c003fb44>] (notifier_call_chain) from [<c0024f68>] (cpu_notify+0x24/0x40)
> [   16.658544] [<c0024f68>] (cpu_notify) from [<c0025574>] (_cpu_up+0x14c/0x164)
> [   16.665700] [<c0025574>] (_cpu_up) from [<c00255fc>] (cpu_up+0x70/0x94)
> [   16.672333] [<c00255fc>] (cpu_up) from [<c028c520>] (device_online+0x64/0x88)
> [   16.679490] [<c028c520>] (device_online) from [<c028c5ac>] (online_store+0x68/0x74)
> [   16.687174] [<c028c5ac>] (online_store) from [<c01324f0>] (kernfs_fop_write+0xb8/0x1b4)
> [   16.695206] [<c01324f0>] (kernfs_fop_write) from [<c00d6bd0>] (__vfs_write+0x1c/0xd8)
> [   16.703061] [<c00d6bd0>] (__vfs_write) from [<c00d7414>] (vfs_write+0x90/0x16c)
> [   16.710391] [<c00d7414>] (vfs_write) from [<c00d7c38>] (SyS_write+0x44/0x9c)
> [   16.717464] [<c00d7c38>] (SyS_write) from [<c000f600>] (ret_fast_syscall+0x0/0x3c)
> [   16.725057] Code: 3afffff5 e3a00001 e28dd00c e8bd8ff0 (e7f001f2) 
> [   16.731167] ---[ end trace 1451bd9f24838b0f ]---
> 
> 
> Again, this only happens if I boot the system with 2 CPUs online.
> Without this kernel command-line arg (i.e. with all 4 CPUs online),
> mounting the NFS rootfs works just fine.
> 
> Is this a known problem? Do you have any quick hints how to fix
> this issue?


can you please try the following patch?

Thanks

Comments

Gregory CLEMENT Jan. 8, 2016, 10:51 a.m. UTC | #1
Hi Jisheng,
 
On Fri., Jan. 08 2016, Jisheng Zhang <jszhang@marvell.com> wrote:

> Hi Stefan,
>
> On Fri, 8 Jan 2016 08:50:31 +0100 Stefan Roese wrote:
>
>> Hi Thomas,
>> 
>> I'm currently trying to boot latest Linux on the DB-MV784MP-GP
>> Marvell board. This works just fine for the normal
>> configuration. But when I try to boot with only 2 CPUs enabled
>> (maxcpus=2), mounting the rootfs via NFS crashes with this
>> log:
>> 
>> [ crash log and description snipped; quoted in full above ]
>
>
> can you please try the following patch?

This patch looks good to me. The reason for using the
for_each_present_cpu macro was to be ready for when the CPUs come
online, but that case is already handled by the CPU notifiers.
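
For reference, a paraphrased sketch of the notifier pattern being
relied on here (the function name and case bodies are illustrative,
not the exact mvneta code): a CPU hotplug notifier runs when a CPU
comes online later, so the open-coded walk only needs the CPUs that
are online right now.

	static int example_cpu_notifier(struct notifier_block *nfb,
					unsigned long action, void *hcpu)
	{
		switch (action & ~CPU_TASKS_FROZEN) {
		case CPU_ONLINE:
			/* per-CPU bringup for CPU (unsigned long)hcpu,
			 * e.g. napi_enable() on that CPU's port */
			break;
		case CPU_DOWN_PREPARE:
			/* per-CPU teardown, e.g. napi_disable() */
			break;
		}
		return NOTIFY_OK;
	}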

Once Stefan has confirmed that it fixes his issue, would you mind
sending a proper patch to the netdev mailing list?

Thanks,

Gregory


> [ quoted patch and list footer snipped; the patch is shown in full
> in the Patch section below ]
Stefan Roese Jan. 8, 2016, 10:53 a.m. UTC | #2
Hi Jisheng,

On 08.01.2016 11:25, Jisheng Zhang wrote:
> On Fri, 8 Jan 2016 08:50:31 +0100 Stefan Roese wrote:
>
>> [ original crash report snipped; quoted in full above ]
>
> can you please try the following patch?
>
> [ patch snipped; shown in full in the Patch section below ]

This patch fixes this issue. Thanks for the quick response.

Thanks,
Stefan
Russell King - ARM Linux Jan. 8, 2016, 10:57 a.m. UTC | #3
On Fri, Jan 08, 2016 at 06:25:37PM +0800, Jisheng Zhang wrote:
> diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> index ed622fa..e1242f4 100644
> --- a/drivers/net/ethernet/marvell/mvneta.c
> +++ b/drivers/net/ethernet/marvell/mvneta.c
> [ mvneta_start_dev() and mvneta_stop_dev() hunks snipped; see the
> Patch section below ]
> 
> @@ -2907,7 +2907,7 @@ static int mvneta_stop(struct net_device *dev)
>  	mvneta_stop_dev(pp);
>  	mvneta_mdio_remove(pp);
>  	unregister_cpu_notifier(&pp->cpu_notifier);
> -	for_each_present_cpu(cpu)
> +	for_each_online_cpu(cpu)
>  		smp_call_function_single(cpu, mvneta_percpu_disable, pp, true);
>  	free_percpu_irq(dev->irq, pp->ports);
>  	mvneta_cleanup_rxqs(pp);

I'm not convinced that this isn't racy - what happens if a CPU is
brought online between unregister_cpu_notifier(&pp->cpu_notifier)
and the following loop?  We'll end up calling mvneta_percpu_disable()
for the new CPU - is that harmless?

Similarly, what if the online CPUs change between mvneta_stop_dev()
and mvneta_stop(), and also what about hotplug CPU changes during
the startup path?

Secondly, is there a reason for:

	for_each_online_cpu(cpu)
		smp_call_function_single(cpu, ...)

as opposed to:

	smp_call_function(mvneta_percpu_disable, pp, true);
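
A side note on semantics, as a sketch rather than driver code:
smp_call_function() runs the function on every *other* online CPU,
while the for_each_online_cpu() loop above includes the calling CPU.
If the current CPU's port must be disabled too, on_each_cpu() is the
closer replacement:

	/* sketch: runs mvneta_percpu_disable() on every online CPU,
	 * including the calling CPU, and waits for completion */
	on_each_cpu(mvneta_percpu_disable, pp, true);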
Jisheng Zhang Jan. 8, 2016, 12:45 p.m. UTC | #4
Dear Russell,

On Fri, 8 Jan 2016 10:57:21 +0000 Russell King - ARM Linux wrote:

> On Fri, Jan 08, 2016 at 06:25:37PM +0800, Jisheng Zhang wrote:
> > [ patch snipped; shown in full in the Patch section below ]
> 
> I'm not convinced that this isn't racy - what happens if a CPU is
> brought online between unregister_cpu_notifier(&pp->cpu_notifier)
> and the following loop?  We'll end up calling mvneta_percpu_disable()
> for the new CPU - is that harmless?
> 
> Similarly, what if the online CPUs change between mvneta_stop_dev()
> and mvneta_stop(), and also what about hotplug CPU changes during
> the startup path?
> 
> Secondly, is there a reason for:
> 
> 	for_each_online_cpu(cpu)
> 		smp_call_function_single(cpu, ...)
> 
> as opposed to:
> 
> 	smp_call_function(mvneta_percpu_disable, pp, true);
> 

Yep, thanks for pointing out the race. Before sending out the patch, I
had prepared another, *wrong* (IMHO) patch that modified
smp_prepare_cpus() in arch/arm/kernel/smp.c.

Here is the background:
I can reproduce this issue on arm but failed to reproduce it on arm64.
The key is which CPUs are marked present.

Let's assume a quad-core system booted with maxcpus=2. After booting:

on arm64, the present CPUs are cpu0 and cpu1;

on arm, the present CPUs are cpu0, cpu1, cpu2 and cpu3.

The arm core code expects every platform to update the present map in
its smp_prepare_cpus(), but only two or three platforms do so.
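
A hypothetical debug helper (not existing driver code) makes the
difference visible; cpu_present_mask and cpu_online_mask are the two
masks these iterators walk:

	#include <linux/cpumask.h>
	#include <linux/printk.h>

	/* hypothetical helper: print the present and online masks */
	static void dump_cpu_masks(void)
	{
		pr_info("present: %*pbl  online: %*pbl\n",
			cpumask_pr_args(cpu_present_mask),
			cpumask_pr_args(cpu_online_mask));
	}

With maxcpus=2 it would print roughly "present: 0-3  online: 0-1" on
this arm board, but "present: 0-1  online: 0-1" on arm64.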

Getting back to the mvneta issue: during startup, mvneta_start_dev()
calls napi_enable() for each present CPU's port. If userspace then
asks to online cpu2, mvneta_percpu_notifier() will call napi_enable()
for cpu2 again, so the BUG_ON() is triggered.
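
For reference, the check being tripped is the sanity BUG_ON() in
napi_enable(); roughly, in 4.4-era include/linux/netdevice.h:

	static inline void napi_enable(struct napi_struct *n)
	{
		BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
		smp_mb__before_atomic();
		clear_bit(NAPI_STATE_SCHED, &n->state);
	}

The first napi_enable() clears NAPI_STATE_SCHED, so a second call for
the same port's napi_struct hits the BUG_ON().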

I see two solutions:

1. keep the above patch, and close the race you pointed out by
bracketing the CPU walks with get_online_cpus()/put_online_cpus()
(see the sketch below).

2. make the arm platforms' smp_prepare_cpus() update the present map,
or even patch the arm core smp_prepare_cpus().
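
A minimal sketch of option 1, assuming the rest of mvneta_stop() stays
as in the patch above:

	/* hold the hotplug lock so no CPU can come or go while we
	 * walk the online mask */
	get_online_cpus();
	for_each_online_cpu(cpu)
		smp_call_function_single(cpu, mvneta_percpu_disable,
					 pp, true);
	put_online_cpus();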

Which is the better solution? Could you please guide me?

Thanks in advance,
Jisheng
Russell King - ARM Linux Jan. 8, 2016, 1:31 p.m. UTC | #5
On Fri, Jan 08, 2016 at 08:45:23PM +0800, Jisheng Zhang wrote:
> Let's assume a quad-core system booted with maxcpus=2. After booting:
> 
> on arm64, the present CPUs are cpu0 and cpu1;
> 
> on arm, the present CPUs are cpu0, cpu1, cpu2 and cpu3.
> 
> The arm core code expects every platform to update the present map in
> its smp_prepare_cpus(), but only two or three platforms do so.

The behaviour of maxcpus= is architecture specific.  Some architectures
take notice of this to limit the number of present and possible CPUs,
others ignore it.  In any case, it limits the number of CPUs that come
online at boot.

However, that is a complete red herring, because what we're talking
about is bringing up the network interface at some point later, when
CPU hotplug may have changed the online state already: the kernel may
have booted with CPU0-3 online, but CPUs 2 and 3 may have been taken
offline before the network interface is brought up.

> Which is the better solution? Could you please guide me?

I would suggest that you need to track the state of each CPU within
your percpu specific code with appropriate locks to prevent concurrency,
and ensure that you register the CPU hotplug notifier just before
walking the currently on-line CPUs.

Also, walk the list of currently on-line CPUs (using, eg,
smp_call_function) just before unregistering the hotplug notifier.

Remember that your per-CPU bringup/takedown may be executed twice
for the same CPU - once via the hotplug notifier and once via
smp_call_function() - hence why you need locking and state tracking.
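
A hypothetical sketch of that state tracking; the napi_enabled flag
and the lock are assumed new fields on struct mvneta_pcpu_port, not
existing driver members:

	/* hypothetical: idempotent per-CPU bringup, safe to run twice
	 * (once via the notifier, once via smp_call_function()) */
	static void mvneta_percpu_enable_once(void *arg)
	{
		struct mvneta_port *pp = arg;
		struct mvneta_pcpu_port *port = this_cpu_ptr(pp->ports);

		spin_lock(&port->lock);
		if (!port->napi_enabled) {
			napi_enable(&port->napi);
			port->napi_enabled = true;
		}
		spin_unlock(&port->lock);
	}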

That fixes two of the sites.  For the other site, I wonder whether
it's possible to restructure the code so there's no need to have
three sites, since it's fairly obvious when thinking about the
CPU hotplug case, you only have notification of the CPU coming
online and the CPU going away.  It's been a while since I looked
at the mvneta code to really comment on that though.
Jisheng Zhang Jan. 8, 2016, 1:48 p.m. UTC | #6
Dear Russell,

On Fri, 8 Jan 2016 13:31:04 +0000 Russell King - ARM Linux wrote:

> On Fri, Jan 08, 2016 at 08:45:23PM +0800, Jisheng Zhang wrote:
> > [ background snipped; quoted in full above ]
> 
> The behaviour of maxcpus= is architecture specific.  Some architectures
> take notice of this to limit the number of present and possible CPUs,
> others ignore it.  In any case, it limits the number of CPUs that come
> online at boot.
> 
> However, that is a complete red herring, because what we're talking
> about is bringing up the network interface at some point later, when
> CPU hotplug may have changed the online state already: the kernel may
> have booted with CPU0-3 online, but CPUs 2 and 3 may have been taken
> offline before the network interface is brought up.

Oh yeah! On arm64, the BUG_ON can be reproduced in this case, so the
driver needs a fix. Thanks a lot.

> 
> > Which is the better solution? Could you please guide me?
> 
> [ per-CPU state tracking guidance snipped; quoted in full above ]

Got it. Thanks for the guidance, I'll try.

> [ restructuring suggestion snipped; quoted in full above ]

Thank you very much,
Jisheng

Patch

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index ed622fa..e1242f4 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -2446,7 +2446,7 @@  static void mvneta_start_dev(struct mvneta_port *pp)
 	mvneta_port_enable(pp);
 
 	/* Enable polling on the port */
-	for_each_present_cpu(cpu) {
+	for_each_online_cpu(cpu) {
 		struct mvneta_pcpu_port *port = per_cpu_ptr(pp->ports, cpu);
 
 		napi_enable(&port->napi);
@@ -2472,7 +2472,7 @@  static void mvneta_stop_dev(struct mvneta_port *pp)
 
 	phy_stop(pp->phy_dev);
 
-	for_each_present_cpu(cpu) {
+	for_each_online_cpu(cpu) {
 		struct mvneta_pcpu_port *port = per_cpu_ptr(pp->ports, cpu);
 
 		napi_disable(&port->napi);
@@ -2907,7 +2907,7 @@  static int mvneta_stop(struct net_device *dev)
 	mvneta_stop_dev(pp);
 	mvneta_mdio_remove(pp);
 	unregister_cpu_notifier(&pp->cpu_notifier);
-	for_each_present_cpu(cpu)
+	for_each_online_cpu(cpu)
 		smp_call_function_single(cpu, mvneta_percpu_disable, pp, true);
 	free_percpu_irq(dev->irq, pp->ports);
 	mvneta_cleanup_rxqs(pp);