From patchwork Tue Aug 29 08:11:37 2023
X-Patchwork-Submitter: Uladzislau Rezki
X-Patchwork-Id: 13368661
From: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
To: linux-mm@kvack.org, Andrew Morton
Cc: LKML, Baoquan He, Lorenzo Stoakes, Christoph Hellwig,
    Matthew Wilcox, "Liam R. Howlett", Dave Chinner, "Paul E. McKenney",
    Joel Fernandes, Uladzislau Rezki, Oleksiy Avramchenko
Subject: [PATCH v2 4/9] mm: vmalloc: Remove global vmap_area_root rb-tree
Date: Tue, 29 Aug 2023 10:11:37 +0200
Message-Id: <20230829081142.3619-5-urezki@gmail.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20230829081142.3619-1-urezki@gmail.com>
References: <20230829081142.3619-1-urezki@gmail.com>
MIME-Version: 1.0
Store allocated objects in separate nodes. A va->va_start address is
converted into the node where the VA should be placed and reside. The
addr_to_node() function does this address-to-node conversion to
determine the node that contains a VA. This approach balances VAs
across nodes; as a result, access becomes scalable. The number of
nodes in a system is the number of CPUs divided by two, i.e. a density
factor of 1/2.

Please note:

1. As of now, allocated VAs are bound to node 0, so this patch makes no
   difference compared to the current behavior;

2. The global vmap_area_lock and vmap_area_root are removed, as there is
   no need for them anymore. The vmap_area_list is still kept but is
   _empty_; it is exported for kexec only;

3. The vmallocinfo and vread() code has to be reworked to be able to
   handle multiple nodes.
Signed-off-by: Uladzislau Rezki (Sony)
Reviewed-by: Baoquan He
Signed-off-by: Baoquan He
---
 mm/vmalloc.c | 209 +++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 161 insertions(+), 48 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index b7deacca1483..ae0368c314ff 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -728,11 +728,9 @@ EXPORT_SYMBOL(vmalloc_to_pfn);
 
 #define DEBUG_AUGMENT_LOWEST_MATCH_CHECK 0
 
-static DEFINE_SPINLOCK(vmap_area_lock);
 static DEFINE_SPINLOCK(free_vmap_area_lock);
 /* Export for kexec only */
 LIST_HEAD(vmap_area_list);
-static struct rb_root vmap_area_root = RB_ROOT;
 static bool vmap_initialized __read_mostly;
 
 static struct rb_root purge_vmap_area_root = RB_ROOT;
@@ -772,6 +770,38 @@ static struct rb_root free_vmap_area_root = RB_ROOT;
  */
 static DEFINE_PER_CPU(struct vmap_area *, ne_fit_preload_node);
 
+/*
+ * An effective vmap-node logic. Users make use of nodes instead
+ * of a global heap. It allows to balance an access and mitigate
+ * contention.
+ */
+struct rb_list {
+	struct rb_root root;
+	struct list_head head;
+	spinlock_t lock;
+};
+
+struct vmap_node {
+	/* Bookkeeping data of this node. */
+	struct rb_list busy;
+};
+
+static struct vmap_node *nodes, snode;
+static __read_mostly unsigned int nr_nodes = 1;
+static __read_mostly unsigned int node_size = 1;
+
+static inline unsigned int
+addr_to_node_id(unsigned long addr)
+{
+	return (addr / node_size) % nr_nodes;
+}
+
+static inline struct vmap_node *
+addr_to_node(unsigned long addr)
+{
+	return &nodes[addr_to_node_id(addr)];
+}
+
 static __always_inline unsigned long
 va_size(struct vmap_area *va)
 {
@@ -803,10 +833,11 @@ unsigned long vmalloc_nr_pages(void)
 }
 
 /* Look up the first VA which satisfies addr < va_end, NULL if none. */
-static struct vmap_area *find_vmap_area_exceed_addr(unsigned long addr)
+static struct vmap_area *
+find_vmap_area_exceed_addr(unsigned long addr, struct rb_root *root)
 {
 	struct vmap_area *va = NULL;
-	struct rb_node *n = vmap_area_root.rb_node;
+	struct rb_node *n = root->rb_node;
 
 	addr = (unsigned long)kasan_reset_tag((void *)addr);
 
@@ -1552,12 +1583,14 @@ __alloc_vmap_area(struct rb_root *root, struct list_head *head,
  */
 static void free_vmap_area(struct vmap_area *va)
 {
+	struct vmap_node *vn = addr_to_node(va->va_start);
+
 	/*
 	 * Remove from the busy tree/list.
 	 */
-	spin_lock(&vmap_area_lock);
-	unlink_va(va, &vmap_area_root);
-	spin_unlock(&vmap_area_lock);
+	spin_lock(&vn->busy.lock);
+	unlink_va(va, &vn->busy.root);
+	spin_unlock(&vn->busy.lock);
 
 	/*
 	 * Insert/Merge it back to the free tree/list.
@@ -1600,6 +1633,7 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 				int node, gfp_t gfp_mask,
 				unsigned long va_flags)
 {
+	struct vmap_node *vn;
 	struct vmap_area *va;
 	unsigned long freed;
 	unsigned long addr;
@@ -1645,9 +1679,11 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	va->vm = NULL;
 	va->flags = va_flags;
 
-	spin_lock(&vmap_area_lock);
-	insert_vmap_area(va, &vmap_area_root, &vmap_area_list);
-	spin_unlock(&vmap_area_lock);
+	vn = addr_to_node(va->va_start);
+
+	spin_lock(&vn->busy.lock);
+	insert_vmap_area(va, &vn->busy.root, &vn->busy.head);
+	spin_unlock(&vn->busy.lock);
 
 	BUG_ON(!IS_ALIGNED(va->va_start, align));
 	BUG_ON(va->va_start < vstart);
@@ -1871,26 +1907,61 @@ static void free_unmap_vmap_area(struct vmap_area *va)
 
 struct vmap_area *find_vmap_area(unsigned long addr)
 {
+	struct vmap_node *vn;
 	struct vmap_area *va;
+	int i, j;
+
+	/*
+	 * An addr_to_node_id(addr) converts an address to a node index
+	 * where a VA is located. If VA spans several zones and passed
+	 * addr is not the same as va->va_start, what is not common, we
+	 * may need to scan an extra nodes. See an example:
+	 *
+	 *      <--va-->
+	 * -|-----|-----|-----|-----|-
+	 *     1     2     0     1
+	 *
+	 * VA resides in node 1 whereas it spans 1 and 2. If passed
+	 * addr is within a second node we should do extra work. We
+	 * should mention that it is rare and is a corner case from
+	 * the other hand it has to be covered.
+	 */
+	i = j = addr_to_node_id(addr);
+	do {
+		vn = &nodes[i];
 
-	spin_lock(&vmap_area_lock);
-	va = __find_vmap_area(addr, &vmap_area_root);
-	spin_unlock(&vmap_area_lock);
+		spin_lock(&vn->busy.lock);
+		va = __find_vmap_area(addr, &vn->busy.root);
+		spin_unlock(&vn->busy.lock);
 
-	return va;
+		if (va)
+			return va;
+	} while ((i = (i + 1) % nr_nodes) != j);
+
+	return NULL;
 }
 
 static struct vmap_area *find_unlink_vmap_area(unsigned long addr)
 {
+	struct vmap_node *vn;
 	struct vmap_area *va;
+	int i, j;
 
-	spin_lock(&vmap_area_lock);
-	va = __find_vmap_area(addr, &vmap_area_root);
-	if (va)
-		unlink_va(va, &vmap_area_root);
-	spin_unlock(&vmap_area_lock);
+	i = j = addr_to_node_id(addr);
+	do {
+		vn = &nodes[i];
 
-	return va;
+		spin_lock(&vn->busy.lock);
+		va = __find_vmap_area(addr, &vn->busy.root);
+		if (va)
+			unlink_va(va, &vn->busy.root);
+		spin_unlock(&vn->busy.lock);
+
+		if (va)
+			return va;
+	} while ((i = (i + 1) % nr_nodes) != j);
+
+	return NULL;
 }
 
 /*** Per cpu kva allocator ***/
@@ -2092,6 +2163,7 @@ static void *new_vmap_block(unsigned int order, gfp_t gfp_mask)
 
 static void free_vmap_block(struct vmap_block *vb)
 {
+	struct vmap_node *vn;
 	struct vmap_block *tmp;
 	struct xarray *xa;
 
@@ -2099,9 +2171,10 @@ static void free_vmap_block(struct vmap_block *vb)
 	tmp = xa_erase(xa, addr_to_vb_idx(vb->va->va_start));
 	BUG_ON(tmp != vb);
 
-	spin_lock(&vmap_area_lock);
-	unlink_va(vb->va, &vmap_area_root);
-	spin_unlock(&vmap_area_lock);
+	vn = addr_to_node(vb->va->va_start);
+	spin_lock(&vn->busy.lock);
+	unlink_va(vb->va, &vn->busy.root);
+	spin_unlock(&vn->busy.lock);
 
 	free_vmap_area_noflush(vb->va);
 	kfree_rcu(vb, rcu_head);
@@ -2525,9 +2598,11 @@ static inline void setup_vmalloc_vm_locked(struct vm_struct *vm,
 static void setup_vmalloc_vm(struct vm_struct *vm, struct vmap_area *va,
 			      unsigned long flags, const void *caller)
 {
-	spin_lock(&vmap_area_lock);
+	struct vmap_node *vn = addr_to_node(va->va_start);
+
+	spin_lock(&vn->busy.lock);
 	setup_vmalloc_vm_locked(vm, va, flags, caller);
-	spin_unlock(&vmap_area_lock);
+	spin_unlock(&vn->busy.lock);
 }
 
 static void clear_vm_uninitialized_flag(struct vm_struct *vm)
@@ -3711,6 +3786,7 @@ static size_t vmap_ram_vread_iter(struct iov_iter *iter, const char *addr,
  */
 long vread_iter(struct iov_iter *iter, const char *addr, size_t count)
 {
+	struct vmap_node *vn;
 	struct vmap_area *va;
 	struct vm_struct *vm;
 	char *vaddr;
@@ -3724,8 +3800,11 @@ long vread_iter(struct iov_iter *iter, const char *addr, size_t count)
 
 	remains = count;
 
-	spin_lock(&vmap_area_lock);
-	va = find_vmap_area_exceed_addr((unsigned long)addr);
+	/* Hooked to node_0 so far. */
+	vn = addr_to_node(0);
+	spin_lock(&vn->busy.lock);
+
+	va = find_vmap_area_exceed_addr((unsigned long)addr, &vn->busy.root);
 	if (!va)
 		goto finished_zero;
 
@@ -3733,7 +3812,7 @@ long vread_iter(struct iov_iter *iter, const char *addr, size_t count)
 	if ((unsigned long)addr + remains <= va->va_start)
 		goto finished_zero;
 
-	list_for_each_entry_from(va, &vmap_area_list, list) {
+	list_for_each_entry_from(va, &vn->busy.head, list) {
 		size_t copied;
 
 		if (remains == 0)
@@ -3792,12 +3871,12 @@ long vread_iter(struct iov_iter *iter, const char *addr, size_t count)
 	}
 
 finished_zero:
-	spin_unlock(&vmap_area_lock);
+	spin_unlock(&vn->busy.lock);
 	/* zero-fill memory holes */
 	return count - remains + zero_iter(iter, remains);
 finished:
 	/* Nothing remains, or We couldn't copy/zero everything. */
-	spin_unlock(&vmap_area_lock);
+	spin_unlock(&vn->busy.lock);
 
 	return count - remains;
 }
@@ -4131,14 +4210,15 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
 	}
 
 	/* insert all vm's */
-	spin_lock(&vmap_area_lock);
 	for (area = 0; area < nr_vms; area++) {
-		insert_vmap_area(vas[area], &vmap_area_root, &vmap_area_list);
+		struct vmap_node *vn = addr_to_node(vas[area]->va_start);
 
+		spin_lock(&vn->busy.lock);
+		insert_vmap_area(vas[area], &vn->busy.root, &vn->busy.head);
 		setup_vmalloc_vm_locked(vms[area], vas[area], VM_ALLOC,
 				 pcpu_get_vm_areas);
+		spin_unlock(&vn->busy.lock);
 	}
-	spin_unlock(&vmap_area_lock);
 
 	/*
 	 * Mark allocated areas as accessible. Do it now as a best-effort
@@ -4261,25 +4341,26 @@ bool vmalloc_dump_obj(void *object)
 #ifdef CONFIG_PROC_FS
 
 static void *s_start(struct seq_file *m, loff_t *pos)
-	__acquires(&vmap_purge_lock)
-	__acquires(&vmap_area_lock)
 {
+	struct vmap_node *vn = addr_to_node(0);
+
 	mutex_lock(&vmap_purge_lock);
-	spin_lock(&vmap_area_lock);
+	spin_lock(&vn->busy.lock);
 
-	return seq_list_start(&vmap_area_list, *pos);
+	return seq_list_start(&vn->busy.head, *pos);
 }
 
 static void *s_next(struct seq_file *m, void *p, loff_t *pos)
 {
-	return seq_list_next(p, &vmap_area_list, pos);
+	struct vmap_node *vn = addr_to_node(0);
+	return seq_list_next(p, &vn->busy.head, pos);
 }
 
 static void s_stop(struct seq_file *m, void *p)
-	__releases(&vmap_area_lock)
-	__releases(&vmap_purge_lock)
 {
-	spin_unlock(&vmap_area_lock);
+	struct vmap_node *vn = addr_to_node(0);
+
+	spin_unlock(&vn->busy.lock);
 	mutex_unlock(&vmap_purge_lock);
 }
 
@@ -4322,9 +4403,11 @@ static void show_purge_info(struct seq_file *m)
 
 static int s_show(struct seq_file *m, void *p)
 {
+	struct vmap_node *vn;
 	struct vmap_area *va;
 	struct vm_struct *v;
 
+	vn = addr_to_node(0);
 	va = list_entry(p, struct vmap_area, list);
 
 	if (!va->vm) {
@@ -4375,7 +4458,7 @@ static int s_show(struct seq_file *m, void *p)
 	 * As a final step, dump "unpurged" areas.
 	 */
 final:
-	if (list_is_last(&va->list, &vmap_area_list))
+	if (list_is_last(&va->list, &vn->busy.head))
 		show_purge_info(m);
 
 	return 0;
@@ -4406,7 +4489,8 @@ static void vmap_init_free_space(void)
 {
 	unsigned long vmap_start = 1;
 	const unsigned long vmap_end = ULONG_MAX;
-	struct vmap_area *busy, *free;
+	struct vmap_area *free;
+	struct vm_struct *busy;
 
 	/*
 	 *     B     F     B     B     B     F
@@ -4414,12 +4498,12 @@ static void vmap_init_free_space(void)
 	 *  |           The KVA space           |
 	 *  |<--------------------------------->|
 	 */
-	list_for_each_entry(busy, &vmap_area_list, list) {
-		if (busy->va_start - vmap_start > 0) {
+	for (busy = vmlist; busy; busy = busy->next) {
+		if (busy->addr - vmap_start > 0) {
 			free = kmem_cache_zalloc(vmap_area_cachep, GFP_NOWAIT);
 			if (!WARN_ON_ONCE(!free)) {
 				free->va_start = vmap_start;
-				free->va_end = busy->va_start;
+				free->va_end = (unsigned long) busy->addr;
 
 				insert_vmap_area_augment(free, NULL,
 					&free_vmap_area_root,
@@ -4427,7 +4511,7 @@ static void vmap_init_free_space(void)
 			}
 		}
 
-		vmap_start = busy->va_end;
+		vmap_start = (unsigned long) busy->addr + busy->size;
 	}
 
 	if (vmap_end - vmap_start > 0) {
@@ -4443,9 +4527,31 @@ static void vmap_init_free_space(void)
 	}
 }
 
+static void vmap_init_nodes(void)
+{
+	struct vmap_node *vn;
+	int i;
+
+	nodes = &snode;
+
+	if (nr_nodes > 1) {
+		vn = kmalloc_array(nr_nodes, sizeof(*vn), GFP_NOWAIT);
+		if (vn)
+			nodes = vn;
+	}
+
+	for (i = 0; i < nr_nodes; i++) {
+		vn = &nodes[i];
+		vn->busy.root = RB_ROOT;
+		INIT_LIST_HEAD(&vn->busy.head);
+		spin_lock_init(&vn->busy.lock);
+	}
+}
+
 void __init vmalloc_init(void)
 {
 	struct vmap_area *va;
+	struct vmap_node *vn;
 	struct vm_struct *tmp;
 	int i;
 
@@ -4467,6 +4573,11 @@ void __init vmalloc_init(void)
 		xa_init(&vbq->vmap_blocks);
 	}
 
+	/*
+	 * Setup nodes before importing vmlist.
+	 */
+	vmap_init_nodes();
+
 	/* Import existing vmlist entries. */
 	for (tmp = vmlist; tmp; tmp = tmp->next) {
 		va = kmem_cache_zalloc(vmap_area_cachep, GFP_NOWAIT);
@@ -4476,7 +4587,9 @@ void __init vmalloc_init(void)
 		va->va_start = (unsigned long)tmp->addr;
 		va->va_end = va->va_start + tmp->size;
 		va->vm = tmp;
-		insert_vmap_area(va, &vmap_area_root, &vmap_area_list);
+
+		vn = addr_to_node(va->va_start);
+		insert_vmap_area(va, &vn->busy.root, &vn->busy.head);
 	}
 
 	/*