From patchwork Fri Dec 18 22:27:02 2015
X-Patchwork-Submitter: Doug Anderson
X-Patchwork-Id: 7888871
From: Douglas Anderson
To: Russell King
Subject: [PATCH v2 2/2] ARM: dma-mapping: sort the pages after allocation
Date: Fri, 18 Dec 2015 14:27:02 -0800
Message-Id: <1450477622-30948-2-git-send-email-dianders@chromium.org>
X-Mailer: git-send-email 2.6.0.rc2.230.g3dd15c0
In-Reply-To: <1450477622-30948-1-git-send-email-dianders@chromium.org>
References: <1450477622-30948-1-git-send-email-dianders@chromium.org>
Cc: laurent.pinchart+renesas@ideasonboard.com, Pawel Osciak,
 mike.looijmans@topic.nl, linux-kernel@vger.kernel.org, Dmitry Torokhov,
 will.deacon@arm.com, Douglas Anderson, Tomasz Figa,
 penguin-kernel@i-love.sakura.ne.jp, carlo@caione.org,
 akpm@linux-foundation.org, Robin Murphy,
 linux-arm-kernel@lists.infradead.org, Marek Szyprowski

After doing the allocation, make one last-ditch effort to get contiguous
regions of pages in order to optimize TLB usage.  This is a rather
simplistic approach that could be optimized later, but it doesn't hurt
and should only have the opportunity to help.

In my testing the sort took less than 400us for a 4MB allocation.
That's much faster than the allocation itself, which took more than a
millisecond even in the fastest case (and often several hundred ms).

Signed-off-by: Douglas Anderson
---
Changes in v2:
- Sort patch new for v2 (and optional if people hate it).

 arch/arm/mm/dma-mapping.c | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 9887d432cf1f..d1b3d3e6fe47 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -23,6 +23,7 @@
 #include <linux/memblock.h>
 #include <linux/slab.h>
 #include <linux/iommu.h>
+#include <linux/sort.h>
 #include <linux/io.h>
 #include <linux/vmalloc.h>
 #include <linux/sizes.h>
@@ -1122,6 +1123,21 @@ static inline void __free_iova(struct dma_iommu_mapping *mapping,
 	spin_unlock_irqrestore(&mapping->lock, flags);
 }
 
+static int cmp_pfns(const void *a, const void *b)
+{
+	unsigned long a_pfn;
+	unsigned long b_pfn;
+
+	a_pfn = page_to_pfn(*(struct page **)a);
+	b_pfn = page_to_pfn(*(struct page **)b);
+
+	if (a_pfn < b_pfn)
+		return -1;
+	else if (a_pfn > b_pfn)
+		return 1;
+	return 0;
+}
+
 /* We'll try 2M, 1M, 64K, and finally 4K; array must end with 0! */
 static const int iommu_order_array[] = { 9, 8, 4, 0 };
 
@@ -1133,6 +1149,7 @@ static struct page **__iommu_alloc_buffer(struct device *dev, size_t size,
 	int array_size = count * sizeof(struct page *);
 	int i = 0;
 	int order_idx = 0;
+	int first_order_zero = -1;
 
 	if (array_size <= PAGE_SIZE)
 		pages = kzalloc(array_size, GFP_KERNEL);
@@ -1171,6 +1188,7 @@ static struct page **__iommu_alloc_buffer(struct device *dev, size_t size,
 		/* Drop down when we get small */
 		if (__fls(count) < order) {
 			order_idx++;
+			/* Don't update first_order_zero; no need to sort end */
 			continue;
 		}
 
@@ -1181,6 +1199,8 @@ static struct page **__iommu_alloc_buffer(struct device *dev, size_t size,
 			/* Go down a notch at first sign of pressure */
 			if (!pages[i]) {
 				order_idx++;
+				if (iommu_order_array[order_idx] == 0)
+					first_order_zero = i;
 				continue;
 			}
 		} else {
@@ -1201,6 +1221,26 @@ static struct page **__iommu_alloc_buffer(struct device *dev, size_t size,
 		count -= 1 << order;
 	}
 
+	/*
+	 * If we folded under memory pressure, make one last-ditch effort
+	 * to get contiguous pages via sorting.  Under testing this sometimes
+	 * helped get a few more contiguous pages and didn't cost much
+	 * compared to the above allocations.
+	 *
+	 * Note that we only sort the order zero pages so that we don't mess
+	 * up the higher order allocations by sticking small pages in between
+	 * them.
+	 *
+	 * If someone wanted to optimize this more, they could insert extra
+	 * (out of order) single pages in places to help keep virtual and
+	 * physical pages aligned with each other.  As it is we often get
+	 * lucky and get the needed alignment but we're not guaranteed.
+	 */
+	if (first_order_zero >= 0)
+		sort(pages + first_order_zero,
+		     (size >> PAGE_SHIFT) - first_order_zero, sizeof(*pages),
+		     cmp_pfns, NULL);
+
 	return pages;
 error:
 	while (i--)
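
To illustrate the effect of the sort in isolation, here is a minimal
userspace sketch (not part of the patch; the PFN values and helper names
are invented for illustration) that orders a scrambled set of order-0
PFNs the same way cmp_pfns() orders pages and counts how many physically
contiguous runs result.  Fewer runs means the IOMMU mapping code can
cover the buffer with fewer, larger mappings, which is the TLB benefit
described in the commit message.

/* Illustration only: builds with a normal C compiler, not kernel code. */
#include <stdio.h>
#include <stdlib.h>

static int cmp_pfn(const void *a, const void *b)
{
	unsigned long a_pfn = *(const unsigned long *)a;
	unsigned long b_pfn = *(const unsigned long *)b;

	if (a_pfn < b_pfn)
		return -1;
	if (a_pfn > b_pfn)
		return 1;
	return 0;
}

/* Count how many physically contiguous runs the array breaks into. */
static int count_runs(const unsigned long *pfns, int n)
{
	int runs = 1;
	int i;

	for (i = 1; i < n; i++)
		if (pfns[i] != pfns[i - 1] + 1)
			runs++;
	return runs;
}

int main(void)
{
	/* Hypothetical PFNs, as the page allocator might hand them back. */
	unsigned long pfns[] = { 0x1204, 0x1200, 0x1203, 0x1201, 0x1202,
				 0x3401, 0x3400 };
	int n = sizeof(pfns) / sizeof(pfns[0]);

	printf("contiguous runs before sort: %d\n", count_runs(pfns, n));
	qsort(pfns, n, sizeof(pfns[0]), cmp_pfn);
	printf("contiguous runs after sort:  %d\n", count_runs(pfns, n));
	return 0;
}

The patch itself sorts struct page pointers with the kernel's sort()
from <linux/sort.h> and compares page_to_pfn() values, but the idea is
the same: after sorting, pages that happen to be physically adjacent end
up next to each other in the array.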