From patchwork Thu Aug 24 03:42:20 2023
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13363434
From: Qi Zheng
To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru,
 vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org,
 brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu,
 steven.price@arm.com, cel@kernel.org, senozhatsky@chromium.org,
 yujie.liu@intel.com, gregkh@linuxfoundation.org, muchun.song@linux.dev
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org, Qi Zheng
Subject: [PATCH v5 01/45] mm: shrinker: add infrastructure for dynamically
 allocating shrinker
Date: Thu, 24 Aug 2023 11:42:20 +0800
Message-Id: <20230824034304.37411-2-zhengqi.arch@bytedance.com>
In-Reply-To: <20230824034304.37411-1-zhengqi.arch@bytedance.com>
References: <20230824034304.37411-1-zhengqi.arch@bytedance.com>

Currently, the shrinker instances can be divided into the following three
types:

a) global shrinker instance statically defined in the kernel, such as
   workingset_shadow_shrinker.

b) global shrinker instance statically defined in the kernel modules, such
   as mmu_shrinker in x86.

c) shrinker instance embedded in other structures.

For case a, the memory of the shrinker instance is never freed. For case b,
the memory of the shrinker instance is freed after synchronize_rcu() when
the module is unloaded. For case c, the memory of the shrinker instance is
freed along with the structure it is embedded in.

In preparation for implementing lockless slab shrink, we need to
dynamically allocate the shrinker instances in case c, so that their memory
can be freed independently by calling kfree_rcu().

So this commit adds the following new APIs for dynamically allocating a
shrinker, and adds a private_data field to struct shrinker to record and
retrieve the structure in which the shrinker was originally embedded.

1. shrinker_alloc()

Used to allocate the shrinker instance itself and related memory. It
returns a pointer to the shrinker instance on success and NULL on failure.

2. shrinker_register()

Used to register the shrinker instance, which is the same as the current
register_shrinker_prepared().

3. shrinker_free()

Used to unregister (if needed) and free the shrinker instance.

In order to simplify shrinker-related APIs and make shrinker more
independent of other kernel mechanisms, subsequent submissions will use the
above API to convert all shrinkers (including case a and b) to dynamically
allocated, and then remove all existing APIs.

This will also have another advantage mentioned by Dave Chinner:

```
The other advantage of this is that it will break all the existing
out of tree code and third party modules using the old API and will
no longer work with a kernel using lockless slab shrinkers. They need
to break (both at the source and binary levels) to stop bad things
from happening due to using unconverted shrinkers in the new setup.
```

Signed-off-by: Qi Zheng
---
 include/linux/shrinker.h |   7 +++
 mm/internal.h            |  11 +++++
 mm/shrinker.c            | 101 +++++++++++++++++++++++++++++++++++++++
 mm/shrinker_debug.c      |  17 ++++++-
 4 files changed, 134 insertions(+), 2 deletions(-)

diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index 6b5843c3b827..3f3fd9974ce5 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -70,6 +70,8 @@ struct shrinker {
 	int seeks;	/* seeks to recreate an obj */
 	unsigned flags;
 
+	void *private_data;
+
 	/* These are for internal use */
 	struct list_head list;
 #ifdef CONFIG_MEMCG
@@ -95,6 +97,11 @@ struct shrinker {
  * non-MEMCG_AWARE shrinker should not have this flag set.
  */
 #define SHRINKER_NONSLAB	(1 << 3)
+#define SHRINKER_ALLOCATED	(1 << 4)
+
+struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt, ...);
+void shrinker_register(struct shrinker *shrinker);
+void shrinker_free(struct shrinker *shrinker);
 
 extern int __printf(2, 3) prealloc_shrinker(struct shrinker *shrinker,
 					    const char *fmt, ...);
diff --git a/mm/internal.h b/mm/internal.h
index 5d4697612073..b9a116dce28e 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1162,6 +1162,9 @@ unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg,
 
 #ifdef CONFIG_SHRINKER_DEBUG
 extern int shrinker_debugfs_add(struct shrinker *shrinker);
+extern int shrinker_debugfs_name_alloc(struct shrinker *shrinker,
+				       const char *fmt, va_list ap);
+extern void shrinker_debugfs_name_free(struct shrinker *shrinker);
 extern struct dentry *shrinker_debugfs_detach(struct shrinker *shrinker,
 					      int *debugfs_id);
 extern void shrinker_debugfs_remove(struct dentry *debugfs_entry,
@@ -1171,6 +1174,14 @@ static inline int shrinker_debugfs_add(struct shrinker *shrinker)
 {
 	return 0;
 }
+static inline int shrinker_debugfs_name_alloc(struct shrinker *shrinker,
+					      const char *fmt, va_list ap)
+{
+	return 0;
+}
+static inline void shrinker_debugfs_name_free(struct shrinker *shrinker)
+{
+}
 static inline struct dentry *shrinker_debugfs_detach(struct shrinker *shrinker,
 						     int *debugfs_id)
 {
diff --git a/mm/shrinker.c b/mm/shrinker.c
index a16cd448b924..36711c5c01f9 100644
--- a/mm/shrinker.c
+++ b/mm/shrinker.c
@@ -550,6 +550,107 @@ unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg,
 	return freed;
 }
 
+struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt, ...)
+{
+	struct shrinker *shrinker;
+	unsigned int size;
+	va_list ap;
+	int err;
+
+	shrinker = kzalloc(sizeof(struct shrinker), GFP_KERNEL);
+	if (!shrinker)
+		return NULL;
+
+	va_start(ap, fmt);
+	err = shrinker_debugfs_name_alloc(shrinker, fmt, ap);
+	va_end(ap);
+	if (err)
+		goto err_name;
+
+	shrinker->flags = flags | SHRINKER_ALLOCATED;
+
+	if (flags & SHRINKER_MEMCG_AWARE) {
+		err = prealloc_memcg_shrinker(shrinker);
+		if (err == -ENOSYS)
+			shrinker->flags &= ~SHRINKER_MEMCG_AWARE;
+		else if (err == 0)
+			goto done;
+		else
+			goto err_flags;
+	}
+
+	/*
+	 * The nr_deferred is available on per memcg level for memcg aware
+	 * shrinkers, so only allocate nr_deferred in the following cases:
+	 *  - non memcg aware shrinkers
+	 *  - !CONFIG_MEMCG
+	 *  - memcg is disabled by kernel command line
+	 */
+	size = sizeof(*shrinker->nr_deferred);
+	if (flags & SHRINKER_NUMA_AWARE)
+		size *= nr_node_ids;
+
+	shrinker->nr_deferred = kzalloc(size, GFP_KERNEL);
+	if (!shrinker->nr_deferred)
+		goto err_flags;
+
+done:
+	return shrinker;
+
+err_flags:
+	shrinker_debugfs_name_free(shrinker);
+err_name:
+	kfree(shrinker);
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(shrinker_alloc);
+
+void shrinker_register(struct shrinker *shrinker)
+{
+	if (unlikely(!(shrinker->flags & SHRINKER_ALLOCATED))) {
+		pr_warn("Must use shrinker_alloc() to dynamically allocate the shrinker");
+		return;
+	}
+
+	down_write(&shrinker_rwsem);
+	list_add_tail(&shrinker->list, &shrinker_list);
+	shrinker->flags |= SHRINKER_REGISTERED;
+	shrinker_debugfs_add(shrinker);
+	up_write(&shrinker_rwsem);
+}
+EXPORT_SYMBOL_GPL(shrinker_register);
+
+void shrinker_free(struct shrinker *shrinker)
+{
+	struct dentry *debugfs_entry = NULL;
+	int debugfs_id;
+
+	if (!shrinker)
+		return;
+
+	down_write(&shrinker_rwsem);
+	if (shrinker->flags & SHRINKER_REGISTERED) {
+		list_del(&shrinker->list);
+		debugfs_entry = shrinker_debugfs_detach(shrinker, &debugfs_id);
+		shrinker->flags &= ~SHRINKER_REGISTERED;
+	} else {
+		shrinker_debugfs_name_free(shrinker);
+	}
+
+	if (shrinker->flags & SHRINKER_MEMCG_AWARE)
+		unregister_memcg_shrinker(shrinker);
+	up_write(&shrinker_rwsem);
+
+	if (debugfs_entry)
+		shrinker_debugfs_remove(debugfs_entry, debugfs_id);
+
+	kfree(shrinker->nr_deferred);
+	shrinker->nr_deferred = NULL;
+
+	kfree(shrinker);
+}
+EXPORT_SYMBOL_GPL(shrinker_free);
+
 /*
  * Add a shrinker callback to be called from the vm.
  */
diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
index e4ce509f619e..38452f539f40 100644
--- a/mm/shrinker_debug.c
+++ b/mm/shrinker_debug.c
@@ -193,6 +193,20 @@ int shrinker_debugfs_add(struct shrinker *shrinker)
 	return 0;
 }
 
+int shrinker_debugfs_name_alloc(struct shrinker *shrinker, const char *fmt,
+				va_list ap)
+{
+	shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
+
+	return shrinker->name ? 0 : -ENOMEM;
+}
+
+void shrinker_debugfs_name_free(struct shrinker *shrinker)
+{
+	kfree_const(shrinker->name);
+	shrinker->name = NULL;
+}
+
 int shrinker_debugfs_rename(struct shrinker *shrinker, const char *fmt, ...)
 {
 	struct dentry *entry;
@@ -241,8 +255,7 @@ struct dentry *shrinker_debugfs_detach(struct shrinker *shrinker,
 
 	lockdep_assert_held(&shrinker_rwsem);
 
-	kfree_const(shrinker->name);
-	shrinker->name = NULL;
+	shrinker_debugfs_name_free(shrinker);
 
 	*debugfs_id = entry ? shrinker->debugfs_id : -1;
 	shrinker->debugfs_entry = NULL;
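
As a rough illustration of how a case c user might adopt the API added by
this patch, the following is a userspace model in C, not kernel code. The
names my_cache, my_cache_init(), my_cache_exit() and the model_shrinker_*()
helpers are hypothetical. count_objects is simplified to take only the
shrinker, the SHRINKER_REGISTERED value is assumed for the model, and
model_shrinker_register() returns an int only so that the rejection path
can be observed (the shrinker_register() in this patch returns void). It
sketches the intended pattern: the owner holds a pointer instead of an
embedded struct shrinker, and the callback recovers the owner through
private_data.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define SHRINKER_REGISTERED	(1 << 0)	/* value assumed for this model */
#define SHRINKER_ALLOCATED	(1 << 4)	/* value taken from the patch */

/* Cut-down stand-in for struct shrinker; only the fields the model needs. */
struct shrinker {
	unsigned long (*count_objects)(struct shrinker *shrinker);
	unsigned int flags;
	void *private_data;
};

/* Models shrinker_alloc(): zeroed allocation marked as dynamically allocated. */
static struct shrinker *model_shrinker_alloc(unsigned int flags)
{
	struct shrinker *shrinker = calloc(1, sizeof(*shrinker));

	if (!shrinker)
		return NULL;
	shrinker->flags = flags | SHRINKER_ALLOCATED;
	return shrinker;
}

/* Models shrinker_register(): refuses instances not made by the allocator. */
static int model_shrinker_register(struct shrinker *shrinker)
{
	if (!(shrinker->flags & SHRINKER_ALLOCATED)) {
		fprintf(stderr, "must allocate with model_shrinker_alloc()\n");
		return -1;
	}
	shrinker->flags |= SHRINKER_REGISTERED;
	return 0;
}

/* Models shrinker_free(): NULL-safe, unregisters if needed, then frees. */
static void model_shrinker_free(struct shrinker *shrinker)
{
	if (!shrinker)
		return;
	shrinker->flags &= ~SHRINKER_REGISTERED;
	free(shrinker);
}

/* Hypothetical owner: previously it would embed "struct shrinker shrinker;". */
struct my_cache {
	unsigned long nr_objects;
	struct shrinker *shrinker;
};

/* The callback gets back to its owner through private_data. */
static unsigned long my_cache_count(struct shrinker *shrinker)
{
	struct my_cache *cache = shrinker->private_data;

	assert(cache != NULL);
	return cache->nr_objects;
}

static int my_cache_init(struct my_cache *cache, unsigned long nr_objects)
{
	cache->nr_objects = nr_objects;
	cache->shrinker = model_shrinker_alloc(0);
	if (!cache->shrinker)
		return -1;

	/* Fill in callbacks and the back-pointer before registering. */
	cache->shrinker->count_objects = my_cache_count;
	cache->shrinker->private_data = cache;
	return model_shrinker_register(cache->shrinker);
}

static void my_cache_exit(struct my_cache *cache)
{
	model_shrinker_free(cache->shrinker);
	cache->shrinker = NULL;
}
```

In a real conversion the owner would presumably call shrinker_alloc() with
its flags and a debugfs name format, set its callbacks and private_data,
and then call shrinker_register(); teardown would reduce to a single
shrinker_free() call, which also handles the never-registered case.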