From patchwork Fri Jan 8 13:21:09 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jisheng Zhang X-Patchwork-Id: 7985801 Return-Path: X-Original-To: patchwork-linux-arm@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 5C9D2BEEE5 for ; Fri, 8 Jan 2016 13:26:51 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 4ECA92017E for ; Fri, 8 Jan 2016 13:26:50 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.9]) (using TLSv1.2 with cipher AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3441F2014A for ; Fri, 8 Jan 2016 13:26:49 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1aHX2i-0007Ue-G0; Fri, 08 Jan 2016 13:25:24 +0000 Received: from mx0a-0016f401.pphosted.com ([67.231.148.174]) by bombadil.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1aHX2f-0005ps-21 for linux-arm-kernel@lists.infradead.org; Fri, 08 Jan 2016 13:25:22 +0000 Received: from pps.filterd (m0045849.ppops.net [127.0.0.1]) by mx0a-0016f401.pphosted.com (8.15.0.59/8.15.0.59) with SMTP id u08DOcjk017953; Fri, 8 Jan 2016 05:24:52 -0800 Received: from sc-exch01.marvell.com ([199.233.58.181]) by mx0a-0016f401.pphosted.com with ESMTP id 20a8h1grb4-1 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT); Fri, 08 Jan 2016 05:24:52 -0800 Received: from SC-EXCH01.marvell.com (10.93.176.81) by SC-EXCH01.marvell.com (10.93.176.81) with Microsoft SMTP Server (TLS) id 15.0.1104.5; Fri, 8 Jan 2016 05:24:51 -0800 Received: from maili.marvell.com (10.93.176.43) by SC-EXCH01.marvell.com (10.93.176.81) with Microsoft SMTP Server id 15.0.1104.5 via Frontend Transport; Fri, 8 Jan 2016 05:24:51 -0800 Received: from xhacker (unknown [10.37.135.134]) by maili.marvell.com (Postfix) with ESMTP id AF4B83F703F; Fri, 8 Jan 2016 05:24:50 -0800 (PST) Date: Fri, 8 Jan 2016 21:21:09 +0800 From: Jisheng Zhang To: Russell King - ARM Linux , Stefan Roese , Gregory CLEMENT Subject: Re: Armada XP (MV78460): BUG in netdevice.h with maxcpus=2 Message-ID: <20160108212109.521962d0@xhacker> In-Reply-To: <20160108210309.5a728bd7@xhacker> References: <568F6A47.6010905@gmail.com> <20160108182537.0a68630c@xhacker> <20160108105721.GG19062@n2100.arm.linux.org.uk> <20160108204523.43b4d473@xhacker> <20160108210309.5a728bd7@xhacker> X-Mailer: Claws Mail 3.13.1 (GTK+ 2.24.29; x86_64-pc-linux-gnu) MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2016-01-08_04:, , signatures=0 X-Proofpoint-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=2 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1507310008 definitions=main-1601080226 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20160108_052521_247977_B206FC94 X-CRM114-Status: GOOD ( 32.56 ) X-Spam-Score: -2.6 (--) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Thomas Petazzoni , linux-arm-kernel@lists.infradead.org Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On Fri, 8 Jan 2016 21:03:09 +0800 Jisheng Zhang wrote: > On Fri, 8 Jan 2016 20:45:23 +0800 > Jisheng Zhang wrote: > > > Dear Russell, > > > > On Fri, 8 Jan 2016 10:57:21 +0000 Russell King - ARM Linux wrote: > > > > > On Fri, Jan 08, 2016 at 06:25:37PM +0800, Jisheng Zhang wrote: > > > > diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c > > > > index ed622fa..e1242f4 100644 > > > > --- a/drivers/net/ethernet/marvell/mvneta.c > > > > +++ b/drivers/net/ethernet/marvell/mvneta.c > > > > @@ -2446,7 +2446,7 @@ static void mvneta_start_dev(struct mvneta_port *pp) > > > > mvneta_port_enable(pp); > > > > > > > > /* Enable polling on the port */ > > > > - for_each_present_cpu(cpu) { > > > > + for_each_online_cpu(cpu) { > > > > struct mvneta_pcpu_port *port = per_cpu_ptr(pp->ports, cpu); > > > > > > > > napi_enable(&port->napi); > > > > @@ -2472,7 +2472,7 @@ static void mvneta_stop_dev(struct mvneta_port *pp) > > > > > > > > phy_stop(pp->phy_dev); > > > > > > > > - for_each_present_cpu(cpu) { > > > > + for_each_online_cpu(cpu) { > > > > struct mvneta_pcpu_port *port = per_cpu_ptr(pp->ports, cpu); > > > > > > > > napi_disable(&port->napi); > > > > @@ -2907,7 +2907,7 @@ static int mvneta_stop(struct net_device *dev) > > > > mvneta_stop_dev(pp); > > > > mvneta_mdio_remove(pp); > > > > unregister_cpu_notifier(&pp->cpu_notifier); > > > > - for_each_present_cpu(cpu) > > > > + for_each_online_cpu(cpu) > > > > smp_call_function_single(cpu, mvneta_percpu_disable, pp, true); > > > > free_percpu_irq(dev->irq, pp->ports); > > > > mvneta_cleanup_rxqs(pp); > > > > > > I'm not convinced that this isn't racy - what happens if a CPU is > > > brought online between unregister_cpu_notifier(&pp->cpu_notifier) > > > and the following loop? We'll end up calling mvneta_percpu_disable() > > > for the new CPU - is that harmless? > > > > > > Similarly, what if the online CPUs change between mvneta_stop_dev() > > > and mvneta_stop(), and also what about hotplug CPU changes during > > > the startup path? > > > > > > Secondly, is there a reason for: > > > > > > for_each_online_cpu(cpu) > > > smp_call_function_single(cpu, ...) > > > > > > as opposed to: > > > > > > smp_call_function(mvneta_percpu_disable, pp, true); > > > > > > > Yep, thanks for pointing out the race. Before sending out the patch, I prepared > > another *wrong*(IMHO) patch to patch smp_prepare_cpus() in arch/arm/kernel/smp.c. > > Here is the patch. I think it could also fix the issue. oops, the version can't be compiled, sorry for noise. Here is the updated one: > > diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c > index b263613..f94c755 100644 > --- a/arch/arm/kernel/smp.c > +++ b/arch/arm/kernel/smp.c > @@ -443,15 +443,31 @@ void __init smp_prepare_cpus(unsigned int max_cpus) > */ > if (max_cpus > ncores) > max_cpus = ncores; > - if (ncores > 1 && max_cpus) { > - /* > - * Initialise the present map, which describes the set of CPUs > - * actually populated at the present time. A platform should > - * re-initialize the map in the platforms smp_prepare_cpus() > - * if present != possible (e.g. physical hotplug). > - */ > - init_cpu_present(cpu_possible_mask); > > + /* Don't bother if we're effectively UP */ > + if (max_cpus <= 1) > + return; > + > + /* > + * Initialise the present map (which describes the set of CPUs > + * actually populated at the present time) and release the > + * secondaries from the bootloader. > + * > + * Make sure we online at most (max_cpus - 1) additional CPUs. > + */ > + max_cpus--; > + for_each_possible_cpu(cpu) { > + if (max_cpus == 0) > + break; > + > + if (cpu == smp_processor_id()) > + continue; > + > + set_cpu_present(cpu, true); > + max_cpus--; > + } > + > + if (ncores > 1 && max_cpus) { > /* > * Initialise the SCU if there are more than one CPU > * and let them know where to start. > > > > > > Here is the background: > > I can reproduce this issue on arm but failed to reproduce it on arm64. The key > > is what's present cpu. > > > > let's assume a quad core system, boot with maxcpus=2, after booting. > > > > on arm64, present cpus is cpu0, cpu1 > > > > on arm, present cpus is cpu0, cpu1, cpu2 and cpu3. > > > > the arm core code requires every platform to update the present map in > > platforms' smp_prepare_cpus(), but only two or three platforms do so. > > > > Then get back to mvneta issue, during startup, mvneta_start_dev() calls > > napi_enable() for each present cpu's port. If userspace ask for online > > cpu2, mvneta_percpu_notifier() will call napi_enable() for cpu2 again, > > so BUG_ON() is triggered. > > > > I have two solutions: > > > > 1. as the above patch did, then prevent the race as pointed out by > > get_online_cpus(). > > > > 2. make arm platforms smp_prepare_cpus to update the present map or even > > patch arm core smp_prepare_cpus(). > > > > What's the better solution? Could you please guide me? > > > > Thanks in advance, > > Jisheng > > > > _______________________________________________ > > linux-arm-kernel mailing list > > linux-arm-kernel@lists.infradead.org > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index b263613..26c164b 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -432,6 +432,7 @@ void __init smp_prepare_boot_cpu(void) void __init smp_prepare_cpus(unsigned int max_cpus) { + unsigned int cpu, max; unsigned int ncores = num_possible_cpus(); init_cpu_topology(); @@ -443,15 +444,29 @@ void __init smp_prepare_cpus(unsigned int max_cpus) */ if (max_cpus > ncores) max_cpus = ncores; - if (ncores > 1 && max_cpus) { - /* - * Initialise the present map, which describes the set of CPUs - * actually populated at the present time. A platform should - * re-initialize the map in the platforms smp_prepare_cpus() - * if present != possible (e.g. physical hotplug). - */ - init_cpu_present(cpu_possible_mask); + /* Don't bother if we're effectively UP */ + if (max_cpus <= 1) + return; + + /* + * Initialise the present map (which describes the set of CPUs + * actually populated at the present time). + */ + max = max_cpus; + max--; + for_each_possible_cpu(cpu) { + if (max == 0) + break; + + if (cpu == smp_processor_id()) + continue; + + set_cpu_present(cpu, true); + max--; + } + + if (ncores > 1 && max_cpus) { /* * Initialise the SCU if there are more than one CPU * and let them know where to start.