Message ID | 20230329053149.3976378-4-mcgrof@kernel.org (mailing list archive) |
---|---|
State | New |
From | Luis Chamberlain <mcgrof@kernel.org> |
To | david@redhat.com, patches@lists.linux.dev, linux-modules@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, pmladek@suse.com, petr.pavlu@suse.com, prarit@redhat.com, torvalds@linux-foundation.org, gregkh@linuxfoundation.org, rafael@kernel.org |
Cc | christophe.leroy@csgroup.eu, tglx@linutronix.de, peterz@infradead.org, song@kernel.org, rppt@kernel.org, willy@infradead.org, vbabka@suse.cz, mhocko@suse.com, dave.hansen@linux.intel.com, mcgrof@kernel.org |
Subject | [PATCH 3/7] module: avoid allocation if module is already present and ready |
Date | Tue, 28 Mar 2023 22:31:45 -0700 |
In-Reply-To | <20230329053149.3976378-1-mcgrof@kernel.org> |
Series | module: avoid userspace pressure on unwanted allocations |
diff --git a/kernel/module/main.c b/kernel/module/main.c
index 77c2e7a60f2e..145e15f19576 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -2785,7 +2785,11 @@ static int early_mod_check(struct load_info *info, int flags)
 	if (err)
 		return err;
 
-	return 0;
+	mutex_lock(&module_mutex);
+	err = module_patient_check_exists(info->mod->name);
+	mutex_unlock(&module_mutex);
+
+	return err;
 }
 
 /*
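The hunk above makes early_mod_check() look up the module name under module_mutex before anything is allocated. The effect on wasted allocations can be illustrated with a toy user-space model. This is only a sketch under stated assumptions: ToyModuleLoader, its fields and the allocation counter are hypothetical names invented for the illustration, the lock merely stands in for module_mutex, and the waiting behavior of module_patient_check_exists() is not modeled.

```python
import threading


class ToyModuleLoader:
    """User-space toy model of duplicate module loads; not kernel code."""

    def __init__(self, early_check: bool):
        self.early_check = early_check
        self.lock = threading.Lock()  # stands in for module_mutex
        self.loaded = set()           # names of modules already present
        self.allocations = 0          # simulated struct module allocations

    def _exists(self, name: str) -> bool:
        # Look the name up while holding the lock, as the hunk does.
        with self.lock:
            return name in self.loaded

    def load(self, name: str) -> bool:
        """Return True if the module was added, False if it was a duplicate."""
        if self.early_check and self._exists(name):
            return False              # rejected before any allocation
        # Simulated allocation; this demo is single-threaded, so a plain
        # increment is sufficient here.
        self.allocations += 1
        with self.lock:
            if name in self.loaded:
                return False          # late check: the allocation was wasted
            self.loaded.add(name)
            return True


if __name__ == "__main__":
    for early in (False, True):
        loader = ToyModuleLoader(early_check=early)
        for _ in range(100):
            loader.load("xfs")
        print(f"early_check={early}: allocations={loader.allocations}")
```

The late check is kept in both configurations because two loaders can still pass the early check at the same time; the early check only avoids the allocation in the common duplicate case.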
load_module() will allocate a struct module before even checking if the module is already loaded. This can create unnecessary memory pressure, since we can easily check early whether the module is already present, using the copy of the module information from userspace after we've validated it a bit.

This can only be an issue if a system is getting hammered with userspace loading modules. Note that there are two ways to load modules: one is kernel module auto-loading (request_module() calls in-kernel) and the other is modprobe calls from userspace. Auto-loading starts in-kernel and pings back to userspace to just call modprobe. We already have a way to restrict the number of concurrent kernel auto-loads in a given time, however that does not stop a system from issuing tons of system calls to load a module, nor does it prevent the races.

Userspace itself *is* supposed to check if a module is present before loading it. But we're observing situations where tons of the same module are in effect being loaded. Some of these are acknowledged as in-kernel bugs, such as the ACPI frequency modules, issues for which we already have fixes merged or are working towards, but we can also help a bit more on the modules side to avoid those dramatic situations. All that is just memory being allocated to then be thrown away.

To avoid memory pressure for such stupid cases put a stop gap for them. We now check for the module being present *before* allocation, and then again right when we are about to add it to the system.

On an 8 vCPU, 8 GiB RAM system using kdevops, testing against the Linux kernel selftests kmod test 0008 (kmod.sh -t 0008), I see savings on the *highest* side of memory consumption of up to ~84 MiB. With the new stress-ng module test I see a 145 MiB difference in max memory consumption with 100 ops.

The stress-ng module ops tests can be pretty pathological -- it is not realistic, however it was used to finally successfully reproduce issues which are only reported to happen on systems with over 400 CPUs [0] by just using 100 ops on an 8 vCPU, 8 GiB RAM system.

This can be observed and visualized below. The time it takes to run the test is also not affected.

The kmod tests 0008:

The gnuplot is set to a range from 400000 KiB (390 MiB) - 580000 KiB (566 MiB) given the tests peak around that range.

cat kmod.plot
set term dumb
set output fileout
set yrange [400000:580000]
plot filein with linespoints title "Memory usage (KiB)"

Before:
root@kmod ~ # /data/linux-next/tools/testing/selftests/kmod/kmod.sh -t 0008
root@kmod ~ # free -k -s 1 -c 40 | grep Mem | awk '{print $3}' > log-0008-before.txt
^C
root@kmod ~ # sort -n -r log-0008-before.txt | head -1
528732

So ~516.33 MiB

After:
root@kmod ~ # /data/linux-next/tools/testing/selftests/kmod/kmod.sh -t 0008
root@kmod ~ # free -k -s 1 -c 40 | grep Mem | awk '{print $3}' > log-0008-after.txt
^C
root@kmod ~ # sort -n -r log-0008-after.txt | head -1
442516

So ~432.14 MiB

That's about ~84 MiB in savings in the worst case.
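
The peak extraction and the KiB to MiB conversion above can be double-checked with a few lines of Python. This is a minimal sketch, assuming a log with one KiB sample per line as produced by the free/awk pipeline above; the helper names parse_log, peak_kib and kib_to_mib are hypothetical, and the two peak values are the ones reported above.

```python
def parse_log(text: str) -> list[int]:
    """Parse one integer KiB sample per non-empty line."""
    return [int(line) for line in text.splitlines() if line.strip()]


def peak_kib(samples: list[int]) -> int:
    """Highest sample, the same value `sort -n -r | head -1` prints."""
    return max(samples)


def kib_to_mib(kib: int) -> float:
    """Convert KiB to MiB (1 MiB = 1024 KiB)."""
    return kib / 1024


# Peak values reported above for kmod test 0008.
BEFORE_PEAK_KIB = 528732
AFTER_PEAK_KIB = 442516

if __name__ == "__main__":
    saved_kib = BEFORE_PEAK_KIB - AFTER_PEAK_KIB
    print(f"before: {kib_to_mib(BEFORE_PEAK_KIB):.2f} MiB")
    print(f"after:  {kib_to_mib(AFTER_PEAK_KIB):.2f} MiB")
    print(f"saved:  {saved_kib} KiB = {kib_to_mib(saved_kib):.2f} MiB")
```

To run it against a real log, read the file and pass its contents through parse_log() and peak_kib(), for example peak_kib(parse_log(open("log-0008-before.txt").read())).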

The graphs:

root@kmod ~ # gnuplot -e "filein='log-0008-before.txt'; fileout='graph-0008-before.txt'" kmod.plot
root@kmod ~ # gnuplot -e "filein='log-0008-after.txt'; fileout='graph-0008-after.txt'" kmod.plot

root@kmod ~ # cat graph-0008-before.txt
[ASCII graph, gnuplot dumb terminal: "Memory usage (KiB)" over samples 0-40, y axis 400000-580000. Before the change, usage stays roughly around 520000 KiB.]

root@kmod ~ # cat graph-0008-after.txt
[ASCII graph, gnuplot dumb terminal: "Memory usage (KiB)" over samples 0-40, y axis 400000-580000. After the change, usage stays roughly around 440000 KiB.]

The stress-ng module tests:

This is used to run the test to try to reproduce the vmap issues reported by David:

echo 0 > /proc/sys/vm/oom_dump_tasks
./stress-ng --module 100 --module-name xfs

Prior to this commit:
root@kmod ~ # free -k -s 1 -c 40 | grep Mem | awk '{print $3}' > baseline-stress-ng.txt
root@kmod ~ # sort -n -r baseline-stress-ng.txt | head -1
5046456

After this commit:
root@kmod ~ # free -k -s 1 -c 40 | grep Mem | awk '{print $3}' > after-stress-ng.txt
root@kmod ~ # sort -n -r after-stress-ng.txt | head -1
4896972

5046456 - 4896972
149484
149484/1024
145.98046875000000000000

So this commit, tested with stress-ng, saves about 145 MiB of memory using 100 ops from stress-ng, which reproduced the vmap issue reported.

cat kmod.plot
set term dumb
set output fileout
set yrange [4700000:5070000]
plot filein with linespoints title "Memory usage (KiB)"

root@kmod ~ # gnuplot -e "filein='baseline-stress-ng.txt'; fileout='graph-stress-ng-before.txt'" kmod-simple-stress-ng.plot
root@kmod ~ # gnuplot -e "filein='after-stress-ng.txt'; fileout='graph-stress-ng-after.txt'" kmod-simple-stress-ng.plot

root@kmod ~ # cat graph-stress-ng-before.txt
[ASCII graph, gnuplot dumb terminal: "Memory usage (KiB)" over samples 0-40, y axis 4.7e+06-5.05e+06. Before the change, usage swings across most of the range and peaks near 5.05e+06 KiB.]

root@kmod ~ # cat graph-stress-ng-after.txt
[ASCII graph, gnuplot dumb terminal: "Memory usage (KiB)" over samples 0-40, y axis 4.7e+06-5.05e+06. After the change, usage stays at or below roughly 4.9e+06 KiB.]

[0] https://lkml.kernel.org/r/20221013180518.217405-1-david@redhat.com

Reported-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 kernel/module/main.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)