From patchwork Sat Nov 10 01:38:04 2018
X-Patchwork-Submitter: "Edgecombe, Rick P"
X-Patchwork-Id: 10676799
From: Rick Edgecombe
To: akpm@linux-foundation.org, willy@infradead.org, tglx@linutronix.de,
	mingo@redhat.com, hpa@zytor.com, x86@kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	kernel-hardening@lists.openwall.com, daniel@iogearbox.net,
	jannh@google.com, keescook@chromium.org
Cc: kristen@linux.intel.com, dave.hansen@intel.com, arjan@linux.intel.com,
	Rick Edgecombe
Subject: [PATCH v9 1/4] vmalloc: Add __vmalloc_node_try_addr function
Date: Fri, 9 Nov 2018 17:38:04 -0800
Message-Id: <20181110013807.24903-2-rick.p.edgecombe@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181110013807.24903-1-rick.p.edgecombe@intel.com>
References: <20181110013807.24903-1-rick.p.edgecombe@intel.com>

Create __vmalloc_node_try_addr, a function that tries to allocate at a
specific address without triggering any lazy purge and retry. For the
randomized allocator that uses it, failing to allocate at a specific
address is a lot more common, and skipping the purge and retry lets such
an allocation fail faster when it will not fit at the requested address.
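As an illustration of the intended use (not part of this patch), here is a
minimal sketch of how a caller such as the randomized module loader from
later patches in this series might drive the new function. The helper name
try_vmalloc_random, the retry budget, and the PAGE_KERNEL protection are
assumptions made only for the example.

#include <linux/vmalloc.h>
#include <linux/random.h>
#include <linux/mm.h>

/* Arbitrary retry budget, chosen only for the sketch. */
#define RAND_ALLOC_TRIES	10

static void *try_vmalloc_random(unsigned long size,
				unsigned long range_start,
				unsigned long range_end)
{
	unsigned long span = range_end - range_start;
	unsigned long nr_slots;
	int i;

	if (PAGE_ALIGN(size) >= span)
		return NULL;

	/* Number of page-aligned offsets at which the allocation still fits. */
	nr_slots = ((span - PAGE_ALIGN(size)) >> PAGE_SHIFT) + 1;

	for (i = 0; i < RAND_ALLOC_TRIES; i++) {
		unsigned long addr = range_start +
			((get_random_long() % nr_slots) << PAGE_SHIFT);

		/* Fails fast: no lazy purge and retry if the slot is occupied. */
		void *p = __vmalloc_node_try_addr(addr, size, GFP_KERNEL,
					PAGE_KERNEL, 0, NUMA_NO_NODE,
					__builtin_return_address(0));
		if (p)
			return p;
	}

	/* No random slot was free; fall back to a normal range allocation. */
	return __vmalloc_node_range(size, 1, range_start, range_end,
				GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE,
				__builtin_return_address(0));
}

Each failed probe returns NULL without purging lazy-freed areas, which is
what keeps the per-allocation cost low in the numbers below.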
This function is used in a case where lazy free areas are unlikely, so the
purge and retry would just be extra work done on every allocation. For the
randomized module loader, the average allocation time in ns for different
numbers of modules was:

Modules    Vmalloc optimization (ns)    No vmalloc optimization (ns)
1000       1433                         1993
2000       2295                         3681
3000       4424                         7450
4000       7746                         13824
5000       12721                        21852
6000       19724                        33926
7000       27638                        47427
8000       37745                        64443

In order to support this behavior, a try_purge argument is plugged into
several of the static helpers. This also reorders the logic in
__get_vm_area_node so that it fails faster when an allocation will not fit,
which is a lot more common when trying specific addresses.

Signed-off-by: Rick Edgecombe
---
 include/linux/vmalloc.h |   3 +
 mm/vmalloc.c            | 128 +++++++++++++++++++++++++++++-----------
 2 files changed, 95 insertions(+), 36 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 398e9c95cd61..6eaa89612372 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -82,6 +82,9 @@ extern void *__vmalloc_node_range(unsigned long size, unsigned long align,
 			unsigned long start, unsigned long end, gfp_t gfp_mask,
 			pgprot_t prot, unsigned long vm_flags, int node,
 			const void *caller);
+extern void *__vmalloc_node_try_addr(unsigned long addr, unsigned long size,
+			gfp_t gfp_mask, pgprot_t prot, unsigned long vm_flags,
+			int node, const void *caller);
 #ifndef CONFIG_MMU
 extern void *__vmalloc_node_flags(unsigned long size, int node, gfp_t flags);
 static inline void *__vmalloc_node_flags_caller(unsigned long size, int node,
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 97d4b25d0373..b8b34d319c85 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -326,6 +326,9 @@ EXPORT_SYMBOL(vmalloc_to_pfn);
 #define VM_LAZY_FREE	0x02
 #define VM_VM_AREA	0x04
 
+#define VMAP_MAY_PURGE	0x2
+#define VMAP_NO_PURGE	0x1
+
 static DEFINE_SPINLOCK(vmap_area_lock);
 /* Export for kexec only */
 LIST_HEAD(vmap_area_list);
@@ -402,12 +405,12 @@ static BLOCKING_NOTIFIER_HEAD(vmap_notify_list);
 static struct vmap_area *alloc_vmap_area(unsigned long size,
 				unsigned long align,
 				unsigned long vstart, unsigned long vend,
-				int node, gfp_t gfp_mask)
+				int node, gfp_t gfp_mask, int try_purge)
 {
 	struct vmap_area *va;
 	struct rb_node *n;
 	unsigned long addr;
-	int purged = 0;
+	int purged = try_purge & VMAP_NO_PURGE;
 	struct vmap_area *first;
 
 	BUG_ON(!size);
@@ -860,7 +863,7 @@ static void *new_vmap_block(unsigned int order, gfp_t gfp_mask)
 
 	va = alloc_vmap_area(VMAP_BLOCK_SIZE, VMAP_BLOCK_SIZE,
 					VMALLOC_START, VMALLOC_END,
-					node, gfp_mask);
+					node, gfp_mask, VMAP_MAY_PURGE);
 	if (IS_ERR(va)) {
 		kfree(vb);
 		return ERR_CAST(va);
@@ -1170,8 +1173,9 @@ void *vm_map_ram(struct page **pages, unsigned int count, int node, pgprot_t pro
 		addr = (unsigned long)mem;
 	} else {
 		struct vmap_area *va;
-		va = alloc_vmap_area(size, PAGE_SIZE,
-				VMALLOC_START, VMALLOC_END, node, GFP_KERNEL);
+		va = alloc_vmap_area(size, PAGE_SIZE, VMALLOC_START,
+					VMALLOC_END, node, GFP_KERNEL,
+					VMAP_MAY_PURGE);
 		if (IS_ERR(va))
 			return NULL;
 
@@ -1372,7 +1376,8 @@ static void clear_vm_uninitialized_flag(struct vm_struct *vm)
 
 static struct vm_struct *__get_vm_area_node(unsigned long size,
 		unsigned long align, unsigned long flags, unsigned long start,
-		unsigned long end, int node, gfp_t gfp_mask, const void *caller)
+		unsigned long end, int node, gfp_t gfp_mask, int try_purge,
+		const void *caller)
 {
 	struct vmap_area *va;
 	struct vm_struct *area;
@@ -1386,16 +1391,17 @@ static struct vm_struct *__get_vm_area_node(unsigned long size,
 		align = 1ul << clamp_t(int, get_count_order_long(size),
 				PAGE_SHIFT, IOREMAP_MAX_ORDER);
 
-	area = kzalloc_node(sizeof(*area), gfp_mask & GFP_RECLAIM_MASK, node);
-	if (unlikely(!area))
-		return NULL;
-
 	if (!(flags & VM_NO_GUARD))
 		size += PAGE_SIZE;
 
-	va = alloc_vmap_area(size, align, start, end, node, gfp_mask);
-	if (IS_ERR(va)) {
-		kfree(area);
+	va = alloc_vmap_area(size, align, start, end, node, gfp_mask,
+				try_purge);
+	if (IS_ERR(va))
+		return NULL;
+
+	area = kzalloc_node(sizeof(*area), gfp_mask & GFP_RECLAIM_MASK, node);
+	if (unlikely(!area)) {
+		free_vmap_area(va);
 		return NULL;
 	}
 
@@ -1408,7 +1414,8 @@ struct vm_struct *__get_vm_area(unsigned long size, unsigned long flags,
 				unsigned long start, unsigned long end)
 {
 	return __get_vm_area_node(size, 1, flags, start, end, NUMA_NO_NODE,
-				  GFP_KERNEL, __builtin_return_address(0));
+				  GFP_KERNEL, VMAP_MAY_PURGE,
+				  __builtin_return_address(0));
 }
 EXPORT_SYMBOL_GPL(__get_vm_area);
 
@@ -1417,7 +1424,7 @@ struct vm_struct *__get_vm_area_caller(unsigned long size, unsigned long flags,
 				       const void *caller)
 {
 	return __get_vm_area_node(size, 1, flags, start, end, NUMA_NO_NODE,
-				  GFP_KERNEL, caller);
+				  GFP_KERNEL, VMAP_MAY_PURGE, caller);
 }
 
 /**
@@ -1432,7 +1439,7 @@ struct vm_struct *__get_vm_area_caller(unsigned long size, unsigned long flags,
 struct vm_struct *get_vm_area(unsigned long size, unsigned long flags)
 {
 	return __get_vm_area_node(size, 1, flags, VMALLOC_START, VMALLOC_END,
-				  NUMA_NO_NODE, GFP_KERNEL,
+				  NUMA_NO_NODE, GFP_KERNEL, VMAP_MAY_PURGE,
 				  __builtin_return_address(0));
 }
 
@@ -1440,7 +1447,8 @@ struct vm_struct *get_vm_area_caller(unsigned long size, unsigned long flags,
 				const void *caller)
 {
 	return __get_vm_area_node(size, 1, flags, VMALLOC_START, VMALLOC_END,
-				  NUMA_NO_NODE, GFP_KERNEL, caller);
+				  NUMA_NO_NODE, GFP_KERNEL, VMAP_MAY_PURGE,
+				  caller);
 }
 
 /**
@@ -1713,26 +1721,10 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 	return NULL;
 }
 
-/**
- * __vmalloc_node_range - allocate virtually contiguous memory
- * @size: allocation size
- * @align: desired alignment
- * @start: vm area range start
- * @end: vm area range end
- * @gfp_mask: flags for the page level allocator
- * @prot: protection mask for the allocated pages
- * @vm_flags: additional vm area flags (e.g. %VM_NO_GUARD)
- * @node: node to use for allocation or NUMA_NO_NODE
- * @caller: caller's return address
- *
- * Allocate enough pages to cover @size from the page level
- * allocator with @gfp_mask flags. Map them into contiguous
- * kernel virtual space, using a pagetable protection of @prot.
- */
-void *__vmalloc_node_range(unsigned long size, unsigned long align,
+static void *__vmalloc_node_range_opts(unsigned long size, unsigned long align,
 			unsigned long start, unsigned long end, gfp_t gfp_mask,
 			pgprot_t prot, unsigned long vm_flags, int node,
-			const void *caller)
+			int try_purge, const void *caller)
 {
 	struct vm_struct *area;
 	void *addr;
@@ -1743,7 +1735,8 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
 		goto fail;
 
 	area = __get_vm_area_node(size, align, VM_ALLOC | VM_UNINITIALIZED |
-				vm_flags, start, end, node, gfp_mask, caller);
+				vm_flags, start, end, node, gfp_mask,
+				try_purge, caller);
 	if (!area)
 		goto fail;
 
@@ -1768,6 +1761,69 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
 	return NULL;
 }
 
+/**
+ * __vmalloc_node_range - allocate virtually contiguous memory
+ * @size: allocation size
+ * @align: desired alignment
+ * @start: vm area range start
+ * @end: vm area range end
+ * @gfp_mask: flags for the page level allocator
+ * @prot: protection mask for the allocated pages
+ * @vm_flags: additional vm area flags (e.g. %VM_NO_GUARD)
+ * @node: node to use for allocation or NUMA_NO_NODE
+ * @caller: caller's return address
+ *
+ * Allocate enough pages to cover @size from the page level
+ * allocator with @gfp_mask flags. Map them into contiguous
+ * kernel virtual space, using a pagetable protection of @prot.
+ */
+void *__vmalloc_node_range(unsigned long size, unsigned long align,
+			unsigned long start, unsigned long end, gfp_t gfp_mask,
+			pgprot_t prot, unsigned long vm_flags, int node,
+			const void *caller)
+{
+	return __vmalloc_node_range_opts(size, align, start, end, gfp_mask,
+					prot, vm_flags, node, VMAP_MAY_PURGE,
+					caller);
+}
+
+/**
+ * __vmalloc_node_try_addr - try to allocate at a specific address
+ * @addr: address to try
+ * @size: size to try
+ * @gfp_mask: flags for the page level allocator
+ * @prot: protection mask for the allocated pages
+ * @vm_flags: additional vm area flags (e.g. %VM_NO_GUARD)
+ * @node: node to use for allocation or NUMA_NO_NODE
+ * @caller: caller's return address
+ *
+ * Try to allocate at the specific address. If it succeeds the address is
+ * returned. If it fails NULL is returned. It will not try to purge lazy
+ * free vmap areas in order to fit.
+ */
+void *__vmalloc_node_try_addr(unsigned long addr, unsigned long size,
+			gfp_t gfp_mask, pgprot_t prot, unsigned long vm_flags,
+			int node, const void *caller)
+{
+	unsigned long addr_end;
+	unsigned long vsize = PAGE_ALIGN(size);
+
+	if (!vsize || (vsize >> PAGE_SHIFT) > totalram_pages)
+		return NULL;
+
+	if (!(vm_flags & VM_NO_GUARD))
+		vsize += PAGE_SIZE;
+
+	addr_end = addr + vsize;
+
+	if (addr > addr_end)
+		return NULL;
+
+	return __vmalloc_node_range_opts(size, 1, addr, addr_end,
+			gfp_mask | __GFP_NOWARN, prot, vm_flags, node,
+			VMAP_NO_PURGE, caller);
+}
+
 /**
  * __vmalloc_node - allocate virtually contiguous memory
  * @size: allocation size