From patchwork Sat May 16 06:47:38 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Feng Tang
X-Patchwork-Id: 11553343
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CDA1B913 for ; Sat, 16 May 2020 06:47:50 +0000 (UTC)
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B2EFC2065C for ; Sat, 16 May 2020 06:47:50 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B2EFC2065C
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix) id 736EA8E0005; Sat, 16 May 2020 02:47:49 -0400 (EDT)
Delivered-To: linux-mm-outgoing@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 40) id 698EF8E0001; Sat, 16 May 2020 02:47:49 -0400 (EDT)
X-Original-To: int-list-linux-mm@kvack.org
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042) id 4EBF58E0005; Sat, 16 May 2020 02:47:49 -0400 (EDT)
X-Original-To: linux-mm@kvack.org
X-Delivered-To: linux-mm@kvack.org
Received: from forelay.hostedemail.com (smtprelay0095.hostedemail.com [216.40.44.95]) by kanga.kvack.org (Postfix) with ESMTP id 368288E0001 for ; Sat, 16 May 2020 02:47:49 -0400 (EDT)
Received: from smtpin04.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id EA44B181AEF1D for ; Sat, 16 May 2020 06:47:48 +0000 (UTC)
X-FDA: 76821651816.04.voice40_612d76728d41f
X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,feng.tang@intel.com,,RULES_HIT:30054:30064:30070,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-64.95.201.95 62.18.0.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none
X-HE-Tag: voice40_612d76728d41f
X-Filterd-Recvd-Size: 2406
Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf17.hostedemail.com (Postfix) with ESMTP for ; Sat, 16 May 2020 06:47:47 +0000 (UTC)
IronPort-SDR: M0JIEJH3wFKUMMTOi5r23BTAoGkuu19NhXG7ihO9wIkLXJ2DE6mD7R3GuKllUcRSHKkuYZOIY4 M/5zNvJFpSEw==
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 May 2020 23:47:47 -0700
IronPort-SDR: AuRbLFYfyv25OBAoLDS8sMa16bR0wQfY6MwQH3IyzqUfN6ue3uOfSQ1todkCrYa6z1kKjk1Any 8dxTiXEododw==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.73,398,1583222400"; d="scan'208";a="464984924"
Received: from shbuild999.sh.intel.com ([10.239.146.107]) by fmsmga005.fm.intel.com with ESMTP; 15 May 2020 23:47:44 -0700
From: Feng Tang
To: Andrew Morton, Michal Hocko, Matthew Wilcox, Johannes Weiner, Mel Gorman, Kees Cook, andi.kleen@intel.com, tim.c.chen@intel.com, dave.hansen@intel.com, ying.huang@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Feng Tang
Subject: [PATCH v3 1/3] proc/meminfo: avoid open coded reading of vm_committed_as
Date: Sat, 16 May 2020 14:47:38 +0800
Message-Id: <1589611660-89854-2-git-send-email-feng.tang@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1589611660-89854-1-git-send-email-feng.tang@intel.com>
References: <1589611660-89854-1-git-send-email-feng.tang@intel.com>
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

Use the existing vm_memory_committed() helper instead of open-coding
the read of vm_committed_as; this also makes future changes easier.

Signed-off-by: Feng Tang
Acked-by: Michal Hocko
---
 fs/proc/meminfo.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 8c1f1bb..578c0b8 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -42,7 +42,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 
 	si_meminfo(&i);
 	si_swapinfo(&i);
-	committed = percpu_counter_read_positive(&vm_committed_as);
+	committed = vm_memory_committed();
 
 	cached = global_node_page_state(NR_FILE_PAGES) -
 			total_swapcache_pages() - i.bufferram;

From patchwork Sat May 16 06:47:39 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Feng Tang
X-Patchwork-Id: 11553345
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B3F6690 for ; Sat, 16 May 2020 06:47:53 +0000 (UTC)
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 81C752065C for ; Sat, 16 May 2020 06:47:53 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 81C752065C
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix) id 784C18E0006; Sat, 16 May 2020 02:47:52 -0400 (EDT)
Delivered-To: linux-mm-outgoing@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 40) id 710DC8E0001; Sat, 16 May 2020 02:47:52 -0400 (EDT)
X-Original-To: int-list-linux-mm@kvack.org
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042) id 5FF398E0006; Sat, 16 May 2020 02:47:52 -0400 (EDT)
X-Original-To: linux-mm@kvack.org
X-Delivered-To: linux-mm@kvack.org
Received: from forelay.hostedemail.com (smtprelay0206.hostedemail.com [216.40.44.206]) by kanga.kvack.org (Postfix) with
ESMTP id 4681C8E0001 for ; Sat, 16 May 2020 02:47:52 -0400 (EDT)
Received: from smtpin01.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 0AB3F4DBD for ; Sat, 16 May 2020 06:47:52 +0000 (UTC)
X-FDA: 76821651984.01.rub22_61adf39953123
X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,feng.tang@intel.com,,RULES_HIT:30046:30054:30064:30075,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-64.95.201.95 62.18.0.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none
X-HE-Tag: rub22_61adf39953123
X-Filterd-Recvd-Size: 2831
Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf17.hostedemail.com (Postfix) with ESMTP for ; Sat, 16 May 2020 06:47:51 +0000 (UTC)
IronPort-SDR: uptVPf8HdB/NF/30SEeRN9VXHPF9KsqhLzakQJA8caObrjLNk4Ky0jrS/qMOzgUXR/kHAmlfEE z2fPVD7J7fBQ==
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 May 2020 23:47:50 -0700
IronPort-SDR: MWwpcoNGIUZHitX854RSvMz2jRCeunv+opN/2PqK1ojEY/UnWuiy9zY2VIJndjE/D9Ao/7P6Kf GpYVqLS88WKQ==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.73,398,1583222400"; d="scan'208";a="464984933"
Received: from shbuild999.sh.intel.com ([10.239.146.107]) by fmsmga005.fm.intel.com with ESMTP; 15 May 2020 23:47:47 -0700
From: Feng Tang
To: Andrew Morton, Michal Hocko, Matthew Wilcox, Johannes Weiner, Mel Gorman, Kees Cook, andi.kleen@intel.com, tim.c.chen@intel.com, dave.hansen@intel.com, ying.huang@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Feng Tang
Subject: [PATCH v3 2/3] mm/util.c: make vm_memory_committed() more accurate
Date: Sat, 16 May 2020 14:47:39 +0800
Message-Id: <1589611660-89854-3-git-send-email-feng.tang@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1589611660-89854-1-git-send-email-feng.tang@intel.com>
References: <1589611660-89854-1-git-send-email-feng.tang@intel.com>
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

percpu_counter_sum_positive() will provide more accurate info.

With percpu_counter_read_positive(), the worst-case deviation can be
'batch * nr_cpus', which is currently totalram_pages/256 and will grow
when the batch is enlarged.

The time cost of percpu_counter_sum_positive() is about 800 nanoseconds
on a 2C/4T platform and 2~3 microseconds on a 2S/36C/72T server in the
normal case. In the worst case, where the vm_committed_as spinlock is
under severe contention, it costs 30~40 microseconds on the 2S/36C/72T
server, which should be fine for its only two users: /proc/meminfo and
the Hyper-V balloon driver's once-per-second status trace.

Signed-off-by: Feng Tang
---
 mm/util.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/util.c b/mm/util.c
index 988d11e..3de78e9 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -774,7 +774,7 @@ struct percpu_counter vm_committed_as ____cacheline_aligned_in_smp;
  */
 unsigned long vm_memory_committed(void)
 {
-	return percpu_counter_read_positive(&vm_committed_as);
+	return percpu_counter_sum_positive(&vm_committed_as);
 }
 EXPORT_SYMBOL_GPL(vm_memory_committed);
 

From patchwork Sat May 16 06:47:40 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Feng Tang
X-Patchwork-Id: 11553347
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4329590 for ; Sat, 16 May 2020 06:47:57 +0000 (UTC)
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 0A05620853 for ; Sat, 16 May 2020 06:47:57 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org
0A05620853
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix) id E89CD8E0007; Sat, 16 May 2020 02:47:55 -0400 (EDT)
Delivered-To: linux-mm-outgoing@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 40) id E16578E0001; Sat, 16 May 2020 02:47:55 -0400 (EDT)
X-Original-To: int-list-linux-mm@kvack.org
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042) id CDA808E0007; Sat, 16 May 2020 02:47:55 -0400 (EDT)
X-Original-To: linux-mm@kvack.org
X-Delivered-To: linux-mm@kvack.org
Received: from forelay.hostedemail.com (smtprelay0078.hostedemail.com [216.40.44.78]) by kanga.kvack.org (Postfix) with ESMTP id B6A8F8E0001 for ; Sat, 16 May 2020 02:47:55 -0400 (EDT)
Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 695F28248068 for ; Sat, 16 May 2020 06:47:55 +0000 (UTC)
X-FDA: 76821652110.29.brick72_6225b23d99e28
X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,feng.tang@intel.com,,RULES_HIT:30034:30051:30054:30064:30070,0,RBL:192.55.52.151:@intel.com:.lbl8.mailshell.net-64.95.201.95 62.18.0.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:1,LUA_SUMMARY:none
X-HE-Tag: brick72_6225b23d99e28
X-Filterd-Recvd-Size: 7015
Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf17.hostedemail.com (Postfix) with ESMTP for ; Sat, 16 May 2020 06:47:54 +0000 (UTC)
IronPort-SDR: 4MVg6eV+0C5Hc2mfyxutnEaNr2tCE2tykcMFJ7tFycBE7v9l9qfCaXSlUyP7FNhTWozKcaVlFS dKH3PFH87h4Q==
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 May 2020 23:47:54 -0700
IronPort-SDR: HHu9rIu2MAMCIdBVEDX2y+JFZyieAXcsspWNbHGNzFFnDtnSzMnYAptng4hHPYl7llMAARnoSm BYpgY62Uj5hw==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.73,398,1583222400"; d="scan'208";a="464984951"
Received: from shbuild999.sh.intel.com ([10.239.146.107]) by fmsmga005.fm.intel.com with ESMTP; 15 May 2020 23:47:51 -0700
From: Feng Tang
To: Andrew Morton, Michal Hocko, Matthew Wilcox, Johannes Weiner, Mel Gorman, Kees Cook, andi.kleen@intel.com, tim.c.chen@intel.com, dave.hansen@intel.com, ying.huang@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Feng Tang
Subject: [PATCH v3 3/3] mm: adjust vm_committed_as_batch according to vm overcommit policy
Date: Sat, 16 May 2020 14:47:40 +0800
Message-Id: <1589611660-89854-4-git-send-email-feng.tang@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1589611660-89854-1-git-send-email-feng.tang@intel.com>
References: <1589611660-89854-1-git-send-email-feng.tang@intel.com>
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

When checking a performance change for the will-it-scale scalability
mmap test [1], we found very high lock contention on the spinlock of
the percpu counter 'vm_committed_as':

    94.14%     0.35%  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
    48.21% _raw_spin_lock_irqsave;percpu_counter_add_batch;__vm_enough_memory;mmap_region;do_mmap;
    45.91% _raw_spin_lock_irqsave;percpu_counter_add_batch;__do_munmap;

This heavy lock contention is not always necessary. 'vm_committed_as'
needs to be very precise only when the strict OVERCOMMIT_NEVER policy
is set, which requires a rather small batch number for the percpu
counter.

So keep the 'batch' number unchanged for the strict OVERCOMMIT_NEVER
policy, and lift it to 64X for the OVERCOMMIT_ALWAYS and
OVERCOMMIT_GUESS policies. Also add a sysctl handler to adjust it when
the policy is reconfigured.
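
To make the batch arithmetic concrete, here is a minimal userspace
sketch of the computation; it is an illustration, not the kernel code.
The names sketch_compute_batch() and sketch_worst_case_drift() are
hypothetical and exist only for this example. The divisors 256 and 4,
the floor of max(nr_cpus * 2, 32) and the INT_MAX cap are taken from
the mm_compute_batch() change in this patch; the 'batch * nr_cpus'
drift bound is the one quoted in patch 2/3.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Same numeric values as the kernel's overcommit policy constants. */
enum overcommit_policy {
	OVERCOMMIT_GUESS  = 0,
	OVERCOMMIT_ALWAYS = 1,
	OVERCOMMIT_NEVER  = 2,
};

/*
 * Hypothetical helper: the batch (in pages) that the patched
 * mm_compute_batch() would pick for the given inputs.
 */
static int32_t sketch_compute_batch(uint64_t ram_pages, int32_t nr_cpus,
				    enum overcommit_policy policy)
{
	int32_t floor_batch;
	uint64_t divisor, memsized_batch;

	assert(nr_cpus > 0);

	/* Lower bound, independent of memory size: max(nr_cpus * 2, 32). */
	floor_batch = nr_cpus * 2 > 32 ? nr_cpus * 2 : 32;

	/* 1/256 (~0.4%) of per-CPU memory for NEVER, 1/4 (25%) otherwise. */
	divisor = (policy == OVERCOMMIT_NEVER) ? 256 : 4;
	memsized_batch = ram_pages / (uint64_t)nr_cpus / divisor;
	if (memsized_batch > INT_MAX)
		memsized_batch = INT_MAX;

	return (int32_t)memsized_batch > floor_batch ?
	       (int32_t)memsized_batch : floor_batch;
}

/*
 * Hypothetical helper: worst-case drift (in pages) of an approximate
 * read of the counter, i.e. batch * nr_cpus.
 */
static int64_t sketch_worst_case_drift(int32_t batch, int32_t nr_cpus)
{
	return (int64_t)batch * nr_cpus;
}
```

For example, assuming 4 KiB pages, a 16 GiB machine with 16 CPUs has
4194304 pages. The sketch then yields a batch of 1024 pages for
OVERCOMMIT_NEVER and 65536 pages for the other two policies, which is
the 64X lift, with a worst-case drift of totalram/256 and totalram/4
respectively.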

Benchmarking with the same testcase in [1] shows a 53% improvement on
an 8C/16T desktop, and 2097% (20X) on a 4S/72C/144T server. We tested
with the test platforms in 0day (server, desktop and laptop), and 80%+
of the platforms show improvements with that test. Whether a platform
shows an improvement depends on whether the test mmap size is bigger
than the computed batch number.

If the lift is only 16X, 1/3 of the platforms show improvements, though
it should help mmap/munmap usage generally, as Michal Hocko mentioned:

"
I believe that there are non-synthetic workloads which would benefit
from a larger batch. E.g. large in memory databases which do large
mmaps during startups from multiple threads.
"

[1] https://lore.kernel.org/lkml/20200305062138.GI5972@shao2-debian/

Signed-off-by: Feng Tang
---
 include/linux/mm.h   |  2 ++
 include/linux/mman.h |  4 ++++
 kernel/sysctl.c      |  2 +-
 mm/mm_init.c         | 18 ++++++++++++++----
 mm/util.c            | 13 +++++++++++++
 5 files changed, 34 insertions(+), 5 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5a32342..bc3722f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -205,6 +205,8 @@ extern int overcommit_ratio_handler(struct ctl_table *, int, void __user *,
 				    size_t *, loff_t *);
 extern int overcommit_kbytes_handler(struct ctl_table *, int, void __user *,
 				    size_t *, loff_t *);
+extern int overcommit_policy_handler(struct ctl_table *, int, void __user *,
+				    size_t *, loff_t *);
 
 #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
 
diff --git a/include/linux/mman.h b/include/linux/mman.h
index 4b08e9c..91c93c1 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -57,8 +57,12 @@ extern struct percpu_counter vm_committed_as;
 
 #ifdef CONFIG_SMP
 extern s32 vm_committed_as_batch;
+extern void mm_compute_batch(void);
 #else
 #define vm_committed_as_batch 0
+static inline void mm_compute_batch(void)
+{
+}
 #endif
 
 unsigned long vm_memory_committed(void);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 8a176d8..6fa552d 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1278,7 +1278,7 @@ static struct ctl_table vm_table[] = {
 		.data		= &sysctl_overcommit_memory,
 		.maxlen		= sizeof(sysctl_overcommit_memory),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
+		.proc_handler	= overcommit_policy_handler,
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= &two,
 	},
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 7da6991..b48dafd 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include
 #include "internal.h"
 
 #ifdef CONFIG_DEBUG_MEMORY_INIT
@@ -140,14 +141,23 @@ EXPORT_SYMBOL_GPL(mm_kobj);
 #ifdef CONFIG_SMP
 s32 vm_committed_as_batch = 32;
 
-static void __meminit mm_compute_batch(void)
+void mm_compute_batch(void)
 {
 	u64 memsized_batch;
 	s32 nr = num_present_cpus();
 	s32 batch = max_t(s32, nr*2, 32);
-
-	/* batch size set to 0.4% of (total memory/#cpus), or max int32 */
-	memsized_batch = min_t(u64, (totalram_pages()/nr)/256, 0x7fffffff);
+	unsigned long ram_pages = totalram_pages();
+
+	/*
+	 * For policy of OVERCOMMIT_NEVER, set batch size to 0.4%
+	 * of (total memory/#cpus), and lift it to 25% for other
+	 * policies to ease the possible lock contention for percpu_counter
+	 * vm_committed_as, while the max limit is INT_MAX
+	 */
+	if (sysctl_overcommit_memory == OVERCOMMIT_NEVER)
+		memsized_batch = min_t(u64, ram_pages/nr/256, INT_MAX);
+	else
+		memsized_batch = min_t(u64, ram_pages/nr/4, INT_MAX);
 
 	vm_committed_as_batch = max_t(s32, memsized_batch, batch);
 }
diff --git a/mm/util.c b/mm/util.c
index 3de78e9..99936d3 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -729,6 +729,19 @@ int overcommit_ratio_handler(struct ctl_table *table, int write,
 	return ret;
 }
 
+int overcommit_policy_handler(struct ctl_table *table, int write,
+			     void __user *buffer, size_t *lenp,
+			     loff_t *ppos)
+{
+	int ret;
+
+	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+	if (ret == 0 && write)
+		mm_compute_batch();
+
+	return ret;
+}
+
 int overcommit_kbytes_handler(struct ctl_table *table, int write,
 			     void __user *buffer, size_t *lenp,
 			     loff_t *ppos)