From patchwork Tue Jan 2 18:46:32 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uladzislau Rezki
X-Patchwork-Id: 13509281
From: "Uladzislau Rezki (Sony)"
To: linux-mm@kvack.org, Andrew Morton
Cc: LKML, Baoquan He, Lorenzo Stoakes, Christoph Hellwig, Matthew Wilcox,
 "Liam R . Howlett", Dave Chinner, "Paul E . McKenney", Joel Fernandes,
 Uladzislau Rezki, Oleksiy Avramchenko
Subject: [PATCH v3 10/11] mm: vmalloc: Set nr_nodes based on CPUs in a system
Date: Tue, 2 Jan 2024 19:46:32 +0100
Message-Id: <20240102184633.748113-11-urezki@gmail.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20240102184633.748113-1-urezki@gmail.com>
References: <20240102184633.748113-1-urezki@gmail.com>

The number of nodes used in the alloc/free paths is set based on
num_possible_cpus() in a system. Please note that the upper limit
is fixed and corresponds to 128 nodes.

For 32-bit or single-core systems, access to the global vmap heap
is not balanced. Such small systems do not suffer from lock
contention due to the low number of CPUs. In such a case nr_nodes
is equal to 1.

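As a rough illustration of the sizing rule described above, the
following userspace sketch mirrors the clamping logic. It is not the
kernel code: pick_nr_nodes() and MAX_NODES are hypothetical names used
only here, the CPU count and word size are passed in as plain
arguments instead of coming from num_possible_cpus() and
BITS_PER_LONG, and the fallback to a single node on allocation
failure is left out.

```c
/*
 * Userspace sketch only; pick_nr_nodes() and MAX_NODES are hypothetical
 * names for illustration and do not exist in mm/vmalloc.c.
 */
#define MAX_NODES 128u	/* fixed upper limit named in the commit message */

/*
 * possible_cpus stands in for num_possible_cpus(),
 * bits_per_long stands in for BITS_PER_LONG.
 */
static unsigned int pick_nr_nodes(unsigned int possible_cpus, int bits_per_long)
{
	unsigned int n = possible_cpus;

	/* 32-bit systems keep a single node, i.e. the global heap. */
	if (bits_per_long != 64)
		return 1;

	/* Clamp the CPU count into the range [1, MAX_NODES]. */
	if (n < 1)
		n = 1;
	if (n > MAX_NODES)
		n = MAX_NODES;

	return n;
}
```

Under these assumptions pick_nr_nodes(64, 64) gives 64 nodes,
pick_nr_nodes(256, 64) is capped at 128, and any call with
bits_per_long set to 32 gives 1.
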
Test on AMD Ryzen Threadripper 3970X 32-Core Processor:

sudo ./test_vmalloc.sh run_test_mask=7 nr_threads=64

    94.41%   0.89%  [kernel]        [k] _raw_spin_lock
    93.35%  93.07%  [kernel]        [k] native_queued_spin_lock_slowpath
    76.13%   0.28%  [kernel]        [k] __vmalloc_node_range
    72.96%   0.81%  [kernel]        [k] alloc_vmap_area
    56.94%   0.00%  [kernel]        [k] __get_vm_area_node
    41.95%   0.00%  [kernel]        [k] vmalloc
    37.15%   0.01%  [test_vmalloc]  [k] full_fit_alloc_test
    35.17%   0.00%  [kernel]        [k] ret_from_fork_asm
    35.17%   0.00%  [kernel]        [k] ret_from_fork
    35.17%   0.00%  [kernel]        [k] kthread
    35.08%   0.00%  [test_vmalloc]  [k] test_func
    34.45%   0.00%  [test_vmalloc]  [k] fix_size_alloc_test
    28.09%   0.01%  [test_vmalloc]  [k] long_busy_list_alloc_test
    23.53%   0.25%  [kernel]        [k] vfree.part.0
    21.72%   0.00%  [kernel]        [k] remove_vm_area
    20.08%   0.21%  [kernel]        [k] find_unlink_vmap_area
     2.34%   0.61%  [kernel]        [k] free_vmap_area_noflush

vs

    82.32%   0.22%  [test_vmalloc]  [k] long_busy_list_alloc_test
    63.36%   0.02%  [kernel]        [k] vmalloc
    63.34%   2.64%  [kernel]        [k] __vmalloc_node_range
    30.42%   4.46%  [kernel]        [k] vfree.part.0
    28.98%   2.51%  [kernel]        [k] __alloc_pages_bulk
    27.28%   0.19%  [kernel]        [k] __get_vm_area_node
    26.13%   1.50%  [kernel]        [k] alloc_vmap_area
    21.72%  21.67%  [kernel]        [k] clear_page_rep
    19.51%   2.43%  [kernel]        [k] _raw_spin_lock
    16.61%  16.51%  [kernel]        [k] native_queued_spin_lock_slowpath
    13.40%   2.07%  [kernel]        [k] free_unref_page
    10.62%   0.01%  [kernel]        [k] remove_vm_area
     9.02%   8.73%  [kernel]        [k] insert_vmap_area
     8.94%   0.00%  [kernel]        [k] ret_from_fork_asm
     8.94%   0.00%  [kernel]        [k] ret_from_fork
     8.94%   0.00%  [kernel]        [k] kthread
     8.29%   0.00%  [test_vmalloc]  [k] test_func
     7.81%   0.05%  [test_vmalloc]  [k] full_fit_alloc_test
     5.30%   4.73%  [kernel]        [k] purge_vmap_node
     4.47%   2.65%  [kernel]        [k] free_vmap_area_noflush

This confirms that native_queued_spin_lock_slowpath goes down to
16.51% from 93.07%.

The throughput is ~12x higher:

urezki@pc638:~$ time sudo ./test_vmalloc.sh run_test_mask=7 nr_threads=64
Run the test with following parameters: run_test_mask=7 nr_threads=64
Done.
Check the kernel ring buffer to see the summary.

real    10m51.271s
user    0m0.013s
sys     0m0.187s
urezki@pc638:~$

urezki@pc638:~$ time sudo ./test_vmalloc.sh run_test_mask=7 nr_threads=64
Run the test with following parameters: run_test_mask=7 nr_threads=64
Done.
Check the kernel ring buffer to see the summary.

real    0m51.301s
user    0m0.015s
sys     0m0.040s
urezki@pc638:~$

Signed-off-by: Uladzislau Rezki (Sony)
---
 mm/vmalloc.c | 29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 0c671cb96151..ef534c76daef 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -4879,10 +4879,27 @@ static void vmap_init_free_space(void)
 static void vmap_init_nodes(void)
 {
 	struct vmap_node *vn;
-	int i, j;
+	int i, n;
+
+#if BITS_PER_LONG == 64
+	/* A high threshold of max nodes is fixed and bound to 128. */
+	n = clamp_t(unsigned int, num_possible_cpus(), 1, 128);
+
+	if (n > 1) {
+		vn = kmalloc_array(n, sizeof(*vn), GFP_NOWAIT | __GFP_NOWARN);
+		if (vn) {
+			/* Node partition is 16 pages. */
+			vmap_zone_size = (1 << 4) * PAGE_SIZE;
+			nr_vmap_nodes = n;
+			vmap_nodes = vn;
+		} else {
+			pr_err("Failed to allocate an array. Disable a node layer\n");
+		}
+	}
+#endif
 
-	for (i = 0; i < nr_vmap_nodes; i++) {
-		vn = &vmap_nodes[i];
+	for (n = 0; n < nr_vmap_nodes; n++) {
+		vn = &vmap_nodes[n];
 		vn->busy.root = RB_ROOT;
 		INIT_LIST_HEAD(&vn->busy.head);
 		spin_lock_init(&vn->busy.lock);
@@ -4891,9 +4908,9 @@ static void vmap_init_nodes(void)
 		INIT_LIST_HEAD(&vn->lazy.head);
 		spin_lock_init(&vn->lazy.lock);
 
-		for (j = 0; j < MAX_VA_SIZE_PAGES; j++) {
-			INIT_LIST_HEAD(&vn->pool[j].head);
-			WRITE_ONCE(vn->pool[j].len, 0);
+		for (i = 0; i < MAX_VA_SIZE_PAGES; i++) {
+			INIT_LIST_HEAD(&vn->pool[i].head);
+			WRITE_ONCE(vn->pool[i].len, 0);
 		}
 
 		spin_lock_init(&vn->pool_lock);