From patchwork Thu Dec 17 20:30:53 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Doug Anderson <dianders@chromium.org>
X-Patchwork-Id: 7877261
From: Douglas Anderson <dianders@chromium.org>
To: Russell King <linux@arm.linux.org.uk>
Subject: [PATCH] ARM: dma-mapping: Just allocate one chunk at a time
Date: Thu, 17 Dec 2015 12:30:53 -0800
Message-Id: <1450384253-1067-1-git-send-email-dianders@chromium.org>
X-Mailer: git-send-email 2.6.0.rc2.230.g3dd15c0
X-BeenThere: linux-arm-kernel@lists.infradead.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: <linux-arm-kernel.lists.infradead.org>
List-Unsubscribe: 
 <http://lists.infradead.org/mailman/options/linux-arm-kernel>,
	<mailto:linux-arm-kernel-request@lists.infradead.org?subject=unsubscribe>
List-Archive: <http://lists.infradead.org/pipermail/linux-arm-kernel/>
List-Post: <mailto:linux-arm-kernel@lists.infradead.org>
List-Help: <mailto:linux-arm-kernel-request@lists.infradead.org?subject=help>
List-Subscribe: 
 <http://lists.infradead.org/mailman/listinfo/linux-arm-kernel>,
	<mailto:linux-arm-kernel-request@lists.infradead.org?subject=subscribe>
Cc: laurent.pinchart+renesas@ideasonboard.com, linux-kernel@vger.kernel.org,
	Pawel Osciak <pawel@osciak.com>, mike.looijmans@topic.nl,
	lorenx4@gmail.com,
	Dmitry Torokhov <dmitry.torokhov@gmail.com>, will.deacon@arm.com,
	Douglas Anderson <dianders@chromium.org>,
	Tomasz Figa <tfiga@chromium.org>,
	rientjes@google.com, carlo@caione.org, akpm@linux-foundation.org,
	linux-arm-kernel@lists.infradead.org,
	Marek Szyprowski <m.szyprowski@samsung.com>
Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org>
Errors-To: 
 linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org

__iommu_alloc_buffer() is expected to be called to allocate fairly
sizeable buffers.  In simple video tests I saw it trying to allocate
4,194,304 bytes.  The function tries to be efficient about this by
starting with large chunks and then falling back to smaller and
smaller chunk sizes until an allocation succeeds.

The current function is very, very slow.

One problem is the way it keeps trying and trying to allocate big
chunks.  Imagine very fragmented memory that has 4M free but no two
contiguous free pages.  Further imagine allocating 4M (1024 pages).
We'll do the following memory allocations:
- For page 1:
  - Try to allocate order 10 (no retry)
  - Try to allocate order 9 (no retry)
  - ...
  - Try to allocate order 0 (with retry, but not needed)
- For page 2:
  - Try to allocate order 9 (no retry)
  - Try to allocate order 8 (no retry)
  - ...
  - Try to allocate order 0 (with retry, but not needed)
- ...
- ...

The total number of alloc() calls for this case is:
  sum(int(math.log(i, 2)) + 1 for i in range(1, 1025))
  => 9228
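
The count above can be cross-checked by replaying the old loop under the
worst-case assumption that every attempt above order 0 fails.  This is
only a sketch, not kernel code: count_alloc_calls() is a hypothetical
helper, and count.bit_length() - 1 stands in for __fls(count).

```python
def count_alloc_calls(count):
    """Count alloc_pages() calls made by the old loop in the worst case.

    Assumption: every order > 0 attempt fails and every order-0 attempt
    succeeds, so each pass of the outer loop yields exactly one page.
    """
    calls = 0
    while count:
        order = count.bit_length() - 1  # stands in for __fls(count)
        calls += order  # failed attempts at order, order - 1, ..., 1
        calls += 1      # the order-0 fallback, which succeeds
        count -= 1
    return calls


print(count_alloc_calls(1024))  # 9228
```

Under the same assumption, allocating one page at a time needs exactly
1024 calls for the same buffer.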

The above is obviously the worst case, but given how slow alloc can be
we really want to avoid even somewhat bad cases.  I timed the old code
on a device under memory pressure and it wasn't hard to see it take
more than 24 seconds to allocate 4 megs of memory (!!).

A second problem (and maybe even more important) is that allocating big
chunks when we don't need them is just not a good idea anyway.  The
first thing we do with these big chunks is break them into smaller
chunks!  If we allocate small chunks:
- The memory manager doesn't need to work so hard to give us big chunks.
- We can save the big chunks for those that really need them and this
  code can make great use of all the small chunks sitting around.
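
To illustrate why the big chunks buy nothing here, the sketch below
models how the pages array gets filled.  It is a simplification under
stated assumptions: fill_pages() and max_order_available are
hypothetical names, page frames are modeled as plain integers, and an
allocation is assumed to succeed whenever its order does not exceed
max_order_available.

```python
def fill_pages(count, max_order_available):
    """Model the old loop: take the largest chunk that fits, then split it.

    Returns the per-page array and the list of chunk orders allocated.
    """
    pages = []
    chunks = []
    next_frame = 0
    while count:
        order = min(count.bit_length() - 1, max_order_available)
        chunks.append(order)
        # split_page() equivalent: one array entry per page in the chunk
        pages.extend(range(next_frame, next_frame + (1 << order)))
        next_frame += 1 << order
        count -= 1 << order
    return pages, chunks


big_pages, big_chunks = fill_pages(1024, 10)      # one order-10 chunk
small_pages, small_chunks = fill_pages(1024, 0)   # 1024 order-0 chunks
print(len(big_chunks), len(small_chunks))
print(len(big_pages) == len(small_pages))
```

Either way the caller ends up with one array entry per page; in this
model the only difference is how many high-order chunks were taken from
the allocator to get there.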

Let's simplify by just allocating one page at a time.  We may make more
allocation calls in total, but it works far better.  In real-world tests
that used to sometimes see a 24 second allocation call, I now see at
most 250 ms.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
 arch/arm/mm/dma-mapping.c | 38 ++++++--------------------------------
 1 file changed, 6 insertions(+), 32 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 492bf3efffab..7efeb2d4801b 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1160,39 +1160,13 @@ static struct page **__iommu_alloc_buffer(struct device *dev, size_t size,
 	gfp |= __GFP_NOWARN | __GFP_HIGHMEM;
 
 	while (count) {
-		int j, order;
-
-		for (order = __fls(count); order > 0; --order) {
-			/*
-			 * We do not want OOM killer to be invoked as long
-			 * as we can fall back to single pages, so we force
-			 * __GFP_NORETRY for orders higher than zero.
-			 */
-			pages[i] = alloc_pages(gfp | __GFP_NORETRY, order);
-			if (pages[i])
-				break;
-		}
-
-		if (!pages[i]) {
-			/*
-			 * Fall back to single page allocation.
-			 * Might invoke OOM killer as last resort.
-			 */
-			pages[i] = alloc_pages(gfp, 0);
-			if (!pages[i])
-				goto error;
-		}
-
-		if (order) {
-			split_page(pages[i], order);
-			j = 1 << order;
-			while (--j)
-				pages[i + j] = pages[i] + j;
-		}
+		pages[i] = alloc_pages(gfp, 0);
+		if (!pages[i])
+			goto error;
 
-		__dma_clear_buffer(pages[i], PAGE_SIZE << order);
-		i += 1 << order;
-		count -= 1 << order;
+		__dma_clear_buffer(pages[i], PAGE_SIZE);
+		i += 1;
+		count -= 1;
 	}
 
 	return pages;