From patchwork Fri Jan 8 13:03:09 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jisheng Zhang X-Patchwork-Id: 7985551 Return-Path: X-Original-To: patchwork-linux-arm@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 4C53F9F6FA for ; Fri, 8 Jan 2016 13:10:18 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 51B6A2014A for ; Fri, 8 Jan 2016 13:10:17 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.9]) (using TLSv1.2 with cipher AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 434242011E for ; Fri, 8 Jan 2016 13:10:16 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1aHWlH-0007I5-Q0; Fri, 08 Jan 2016 13:07:23 +0000 Received: from mx0a-0016f401.pphosted.com ([67.231.148.174]) by bombadil.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1aHWlF-0007HQ-J6 for linux-arm-kernel@lists.infradead.org; Fri, 08 Jan 2016 13:07:22 +0000 Received: from pps.filterd (m0045849.ppops.net [127.0.0.1]) by mx0a-0016f401.pphosted.com (8.15.0.59/8.15.0.59) with SMTP id u08D53XT006526; Fri, 8 Jan 2016 05:06:52 -0800 Received: from sc-exch03.marvell.com ([199.233.58.183]) by mx0a-0016f401.pphosted.com with ESMTP id 20a8h1gqe7-1 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT); Fri, 08 Jan 2016 05:06:51 -0800 Received: from SC-EXCH01.marvell.com (10.93.176.81) by SC-EXCH03.marvell.com (10.93.176.83) with Microsoft SMTP Server (TLS) id 15.0.1104.5; Fri, 8 Jan 2016 05:06:50 -0800 Received: from maili.marvell.com (10.93.176.43) by SC-EXCH01.marvell.com (10.93.176.81) with Microsoft SMTP Server id 15.0.1104.5 via Frontend Transport; Fri, 8 Jan 2016 05:06:50 -0800 Received: from xhacker (unknown [10.37.135.134]) by maili.marvell.com (Postfix) with ESMTP id 1F3F53F7041; Fri, 8 Jan 2016 05:06:49 -0800 (PST) Date: Fri, 8 Jan 2016 21:03:09 +0800 From: Jisheng Zhang To: Russell King - ARM Linux , Stefan Roese , Gregory CLEMENT Subject: Re: Armada XP (MV78460): BUG in netdevice.h with maxcpus=2 Message-ID: <20160108210309.5a728bd7@xhacker> In-Reply-To: <20160108204523.43b4d473@xhacker> References: <568F6A47.6010905@gmail.com> <20160108182537.0a68630c@xhacker> <20160108105721.GG19062@n2100.arm.linux.org.uk> <20160108204523.43b4d473@xhacker> X-Mailer: Claws Mail 3.13.1 (GTK+ 2.24.29; x86_64-pc-linux-gnu) MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2016-01-08_04:, , signatures=0 X-Proofpoint-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=2 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1507310008 definitions=main-1601080220 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20160108_050721_657393_E92AE17F X-CRM114-Status: GOOD ( 28.76 ) X-Spam-Score: -2.6 (--) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Thomas Petazzoni , linux-arm-kernel@lists.infradead.org Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On Fri, 8 Jan 2016 20:45:23 +0800 Jisheng Zhang wrote: > Dear Russell, > > On Fri, 8 Jan 2016 10:57:21 +0000 Russell King - ARM Linux wrote: > > > On Fri, Jan 08, 2016 at 06:25:37PM +0800, Jisheng Zhang wrote: > > > diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c > > > index ed622fa..e1242f4 100644 > > > --- a/drivers/net/ethernet/marvell/mvneta.c > > > +++ b/drivers/net/ethernet/marvell/mvneta.c > > > @@ -2446,7 +2446,7 @@ static void mvneta_start_dev(struct mvneta_port *pp) > > > mvneta_port_enable(pp); > > > > > > /* Enable polling on the port */ > > > - for_each_present_cpu(cpu) { > > > + for_each_online_cpu(cpu) { > > > struct mvneta_pcpu_port *port = per_cpu_ptr(pp->ports, cpu); > > > > > > napi_enable(&port->napi); > > > @@ -2472,7 +2472,7 @@ static void mvneta_stop_dev(struct mvneta_port *pp) > > > > > > phy_stop(pp->phy_dev); > > > > > > - for_each_present_cpu(cpu) { > > > + for_each_online_cpu(cpu) { > > > struct mvneta_pcpu_port *port = per_cpu_ptr(pp->ports, cpu); > > > > > > napi_disable(&port->napi); > > > @@ -2907,7 +2907,7 @@ static int mvneta_stop(struct net_device *dev) > > > mvneta_stop_dev(pp); > > > mvneta_mdio_remove(pp); > > > unregister_cpu_notifier(&pp->cpu_notifier); > > > - for_each_present_cpu(cpu) > > > + for_each_online_cpu(cpu) > > > smp_call_function_single(cpu, mvneta_percpu_disable, pp, true); > > > free_percpu_irq(dev->irq, pp->ports); > > > mvneta_cleanup_rxqs(pp); > > > > I'm not convinced that this isn't racy - what happens if a CPU is > > brought online between unregister_cpu_notifier(&pp->cpu_notifier) > > and the following loop? We'll end up calling mvneta_percpu_disable() > > for the new CPU - is that harmless? > > > > Similarly, what if the online CPUs change between mvneta_stop_dev() > > and mvneta_stop(), and also what about hotplug CPU changes during > > the startup path? > > > > Secondly, is there a reason for: > > > > for_each_online_cpu(cpu) > > smp_call_function_single(cpu, ...) > > > > as opposed to: > > > > smp_call_function(mvneta_percpu_disable, pp, true); > > > > Yep, thanks for pointing out the race. Before sending out the patch, I prepared > another *wrong*(IMHO) patch to patch smp_prepare_cpus() in arch/arm/kernel/smp.c. Here is the patch. I think it could also fix the issue. > > Here is the background: > I can reproduce this issue on arm but failed to reproduce it on arm64. The key > is what's present cpu. > > let's assume a quad core system, boot with maxcpus=2, after booting. > > on arm64, present cpus is cpu0, cpu1 > > on arm, present cpus is cpu0, cpu1, cpu2 and cpu3. > > the arm core code requires every platform to update the present map in > platforms' smp_prepare_cpus(), but only two or three platforms do so. > > Then get back to mvneta issue, during startup, mvneta_start_dev() calls > napi_enable() for each present cpu's port. If userspace ask for online > cpu2, mvneta_percpu_notifier() will call napi_enable() for cpu2 again, > so BUG_ON() is triggered. > > I have two solutions: > > 1. as the above patch did, then prevent the race as pointed out by > get_online_cpus(). > > 2. make arm platforms smp_prepare_cpus to update the present map or even > patch arm core smp_prepare_cpus(). > > What's the better solution? Could you please guide me? > > Thanks in advance, > Jisheng > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index b263613..f94c755 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -443,15 +443,31 @@ void __init smp_prepare_cpus(unsigned int max_cpus) */ if (max_cpus > ncores) max_cpus = ncores; - if (ncores > 1 && max_cpus) { - /* - * Initialise the present map, which describes the set of CPUs - * actually populated at the present time. A platform should - * re-initialize the map in the platforms smp_prepare_cpus() - * if present != possible (e.g. physical hotplug). - */ - init_cpu_present(cpu_possible_mask); + /* Don't bother if we're effectively UP */ + if (max_cpus <= 1) + return; + + /* + * Initialise the present map (which describes the set of CPUs + * actually populated at the present time) and release the + * secondaries from the bootloader. + * + * Make sure we online at most (max_cpus - 1) additional CPUs. + */ + max_cpus--; + for_each_possible_cpu(cpu) { + if (max_cpus == 0) + break; + + if (cpu == smp_processor_id()) + continue; + + set_cpu_present(cpu, true); + max_cpus--; + } + + if (ncores > 1 && max_cpus) { /* * Initialise the SCU if there are more than one CPU * and let them know where to start.