From patchwork Thu Sep 29 11:07:14 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chris Wilson <chris@chris-wilson.co.uk>
X-Patchwork-Id: 9356341
Return-Path: 
 <linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org>
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org
	[172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id
	A707D6077B for <patchwork-linux-arm@patchwork.kernel.org>;
	Thu, 29 Sep 2016 11:09:21 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 97D48298F2
	for <patchwork-linux-arm@patchwork.kernel.org>;
	Thu, 29 Sep 2016 11:09:21 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id 8B9F929954; Thu, 29 Sep 2016 11:09:21 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_MED
	autolearn=unavailable version=3.3.1
Received: from bombadil.infradead.org (bombadil.infradead.org
	[198.137.202.9])
	(using TLSv1.2 with cipher AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id C4007298F2
	for <patchwork-linux-arm@patchwork.kernel.org>;
	Thu, 29 Sep 2016 11:09:16 +0000 (UTC)
Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org)
	by bombadil.infradead.org with esmtp (Exim 4.85_2 #1 (Red Hat Linux))
	id 1bpZBs-0001VW-5c; Thu, 29 Sep 2016 11:07:48 +0000
Received: from mail.fireflyinternet.com ([109.228.58.192]
	helo=fireflyinternet.com)
	by bombadil.infradead.org with esmtps (Exim 4.85_2 #1 (Red Hat
	Linux)) id 1bpZBn-0001P9-6D
	for linux-arm-kernel@lists.infradead.org;
	Thu, 29 Sep 2016 11:07:44 +0000
X-Default-Received-SPF: pass (skip=forwardok (res=PASS))
	x-ip-name=78.156.65.138;
Received: from nuc-i3427.alporthouse.com (unverified [78.156.65.138])
	by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id
	2439424-1500050 for multiple; Thu, 29 Sep 2016 12:07:15 +0100
Received: by nuc-i3427.alporthouse.com (sSMTP sendmail emulation);
	Thu, 29 Sep 2016 12:07:14 +0100
Date: Thu, 29 Sep 2016 12:07:14 +0100
From: Chris Wilson <chris@chris-wilson.co.uk>
To: Jisheng Zhang <jszhang@marvell.com>
Subject: Re: [PATCH] mm/vmalloc: reduce the number of lazy_max_pages to
	reduce latency
Message-ID: <20160929110714.GF28107@nuc-i3427.alporthouse.com>
References: <20160929073411.3154-1-jszhang@marvell.com>
	<20160929081818.GE28107@nuc-i3427.alporthouse.com>
	<20160929162808.745c869b@xhacker>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <20160929162808.745c869b@xhacker>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 
X-CRM114-CacheID: sfid-20160929_040743_599397_F1AE11EF 
X-CRM114-Status: GOOD (  21.36  )
X-BeenThere: linux-arm-kernel@lists.infradead.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: <linux-arm-kernel.lists.infradead.org>
List-Unsubscribe: 
 <http://lists.infradead.org/mailman/options/linux-arm-kernel>,
	<mailto:linux-arm-kernel-request@lists.infradead.org?subject=unsubscribe>
List-Archive: <http://lists.infradead.org/pipermail/linux-arm-kernel/>
List-Post: <mailto:linux-arm-kernel@lists.infradead.org>
List-Help: <mailto:linux-arm-kernel-request@lists.infradead.org?subject=help>
List-Subscribe: 
 <http://lists.infradead.org/mailman/listinfo/linux-arm-kernel>,
	<mailto:linux-arm-kernel-request@lists.infradead.org?subject=subscribe>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, agnel.joel@gmail.com,
	rientjes@google.com, akpm@linux-foundation.org,
	mgorman@techsingularity.net,
	iamjoonsoo.kim@lge.com, linux-arm-kernel@lists.infradead.org
Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org>
Errors-To: 
 linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org
X-Virus-Scanned: ClamAV using ClamSMTP

On Thu, Sep 29, 2016 at 04:28:08PM +0800, Jisheng Zhang wrote:
> On Thu, 29 Sep 2016 09:18:18 +0100 Chris Wilson wrote:
> 
> > On Thu, Sep 29, 2016 at 03:34:11PM +0800, Jisheng Zhang wrote:
> > > On Marvell berlin arm64 platforms, I see the preemptoff tracer report
> > > a max 26543 us latency at __purge_vmap_area_lazy, this latency is an
> > > awfully bad for STB. And the ftrace log also shows __free_vmap_area
> > > contributes most latency now. I noticed that Joel mentioned the same
> > > issue[1] on x86 platform and gave two solutions, but it seems no patch
> > > is sent out for this purpose.
> > > 
> > > This patch adopts Joel's first solution, but I use 16MB per core
> > > rather than 8MB per core for the number of lazy_max_pages. After this
> > > patch, the preemptoff tracer reports a max 6455us latency, reduced to
> > > 1/4 of original result.  
> > 
> > My understanding is that
> > 
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 91f44e78c516..3f7c6d6969ac 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -626,7 +626,6 @@ void set_iounmap_nonlazy(void)
> >  static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end,
> >                                         int sync, int force_flush)
> >  {
> > -       static DEFINE_SPINLOCK(purge_lock);
> >         struct llist_node *valist;
> >         struct vmap_area *va;
> >         struct vmap_area *n_va;
> > @@ -637,12 +636,6 @@ static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end,
> >          * should not expect such behaviour. This just simplifies locking for
> >          * the case that isn't actually used at the moment anyway.
> >          */
> > -       if (!sync && !force_flush) {
> > -               if (!spin_trylock(&purge_lock))
> > -                       return;
> > -       } else
> > -               spin_lock(&purge_lock);
> > -
> >         if (sync)
> >                 purge_fragmented_blocks_allcpus();
> >  
> > @@ -667,7 +660,6 @@ static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end,
> >                         __free_vmap_area(va);
> >                 spin_unlock(&vmap_area_lock);
> 
> Hi Chris,
> 
> Per my test, the bottleneck now is __free_vmap_area() over the valist, the
> iteration is protected with spinlock vmap_area_lock. So the larger lazy max
> pages, the longer valist, the bigger the latency.
> 
> So besides above patch, we still need to remove vmap_are_lock or replace with
> mutex.

Or follow up with


?
-Chris

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 3f7c6d6969ac..67b5475f0b0a 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -656,8 +656,10 @@ static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end,
 
        if (nr) {
                spin_lock(&vmap_area_lock);
-               llist_for_each_entry_safe(va, n_va, valist, purge_list)
+               llist_for_each_entry_safe(va, n_va, valist, purge_list) {
                        __free_vmap_area(va);
+                       cond_resched_lock(&vmap_area_lock);
+               }
                spin_unlock(&vmap_area_lock);
        }
 }