From patchwork Mon Jul 24 09:43:10 2023
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13323643
From: Qi Zheng
To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru,
 vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org,
 brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu,
 steven.price@arm.com, cel@kernel.org, senozhatsky@chromium.org,
 yujie.liu@intel.com, gregkh@linuxfoundation.org, muchun.song@linux.dev
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org,
 kvm@vger.kernel.org, xen-devel@lists.xenproject.org,
 linux-erofs@lists.ozlabs.org, linux-f2fs-devel@lists.sourceforge.net,
 cluster-devel@redhat.com, linux-nfs@vger.kernel.org,
 linux-mtd@lists.infradead.org, rcu@vger.kernel.org,
 netdev@vger.kernel.org, dri-devel@lists.freedesktop.org,
 linux-arm-msm@vger.kernel.org, dm-devel@redhat.com,
 linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org,
 virtualization@lists.linux-foundation.org,
 linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
 linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng
Subject: [PATCH v2 03/47] mm: shrinker: add infrastructure for dynamically
 allocating shrinker
Date: Mon, 24 Jul 2023 17:43:10 +0800
Message-Id: <20230724094354.90817-4-zhengqi.arch@bytedance.com>
In-Reply-To: <20230724094354.90817-1-zhengqi.arch@bytedance.com>
References: <20230724094354.90817-1-zhengqi.arch@bytedance.com>

Currently, shrinker instances can be divided into the following three
types:

a) global shrinker instance statically defined in the kernel, such as
   workingset_shadow_shrinker.

b) global shrinker instance statically defined in a kernel module, such
   as mmu_shrinker in x86.

c) shrinker instance embedded in another structure.

For case a, the memory of the shrinker instance is never freed. For case
b, the memory of the shrinker instance is freed after synchronize_rcu()
when the module is unloaded. For case c, the memory of the shrinker
instance is freed along with the structure it is embedded in.

In preparation for implementing lockless slab shrink, we need to
dynamically allocate the shrinker instances in case c, so that their
memory can be freed independently of the enclosing structure by calling
kfree_rcu().

So this commit adds the following new APIs for dynamically allocating a
shrinker, and adds a private_data field to struct shrinker to record and
retrieve the structure the shrinker was originally embedded in.

1. shrinker_alloc()

Used to allocate the shrinker instance itself and its related memory. It
returns a pointer to the shrinker instance on success and NULL on
failure.

2. shrinker_free_non_registered()

Used to destroy a shrinker instance that has not been registered.

3. shrinker_register()

Used to register the shrinker instance, which is the same as the current
register_shrinker_prepared().

4. shrinker_unregister()

Used to unregister and free the shrinker instance.

In order to simplify the shrinker-related APIs and make shrinkers more
independent of other kernel mechanisms, subsequent patches will use the
above APIs to convert all shrinkers (including cases a and b) to
dynamically allocated ones, and then remove all the existing APIs.

This also has another advantage, mentioned by Dave Chinner:

```
The other advantage of this is that it will break all the existing
out of tree code and third party modules using the old API and will
no longer work with a kernel using lockless slab shrinkers. They need
to break (both at the source and binary levels) to stop bad things
from happening due to using unconverted shrinkers in the new setup.
```

Signed-off-by: Qi Zheng
---
 include/linux/shrinker.h |   6 +++
 mm/shrinker.c            | 113 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 119 insertions(+)

diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index 961cb84e51f5..296f5e163861 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -70,6 +70,8 @@ struct shrinker {
 	int seeks;	/* seeks to recreate an obj */
 	unsigned flags;
 
+	void *private_data;
+
 	/* These are for internal use */
 	struct list_head list;
 #ifdef CONFIG_MEMCG
@@ -98,6 +100,10 @@ struct shrinker {
 unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg,
 			  int priority);
 
+struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt, ...);
+void shrinker_free_non_registered(struct shrinker *shrinker);
+void shrinker_register(struct shrinker *shrinker);
+void shrinker_unregister(struct shrinker *shrinker);
 
 extern int __printf(2, 3) prealloc_shrinker(struct shrinker *shrinker,
 					    const char *fmt, ...);
diff --git a/mm/shrinker.c b/mm/shrinker.c
index 0a32ef42f2a7..d820e4cc5806 100644
--- a/mm/shrinker.c
+++ b/mm/shrinker.c
@@ -548,6 +548,119 @@ unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg,
 	return freed;
 }
 
+struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt, ...)
+{
+	struct shrinker *shrinker;
+	unsigned int size;
+	va_list __maybe_unused ap;
+	int err;
+
+	shrinker = kzalloc(sizeof(struct shrinker), GFP_KERNEL);
+	if (!shrinker)
+		return NULL;
+
+#ifdef CONFIG_SHRINKER_DEBUG
+	va_start(ap, fmt);
+	shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
+	va_end(ap);
+	if (!shrinker->name)
+		goto err_name;
+#endif
+	shrinker->flags = flags;
+
+	if (flags & SHRINKER_MEMCG_AWARE) {
+		err = prealloc_memcg_shrinker(shrinker);
+		if (err == -ENOSYS)
+			shrinker->flags &= ~SHRINKER_MEMCG_AWARE;
+		else if (err == 0)
+			goto done;
+		else
+			goto err_flags;
+	}
+
+	/*
+	 * The nr_deferred is available on per memcg level for memcg aware
+	 * shrinkers, so only allocate nr_deferred in the following cases:
+	 *  - non memcg aware shrinkers
+	 *  - !CONFIG_MEMCG
+	 *  - memcg is disabled by kernel command line
+	 */
+	size = sizeof(*shrinker->nr_deferred);
+	if (flags & SHRINKER_NUMA_AWARE)
+		size *= nr_node_ids;
+
+	shrinker->nr_deferred = kzalloc(size, GFP_KERNEL);
+	if (!shrinker->nr_deferred)
+		goto err_flags;
+
+done:
+	return shrinker;
+
+err_flags:
+#ifdef CONFIG_SHRINKER_DEBUG
+	kfree_const(shrinker->name);
+	shrinker->name = NULL;
+err_name:
+#endif
+	kfree(shrinker);
+	return NULL;
+}
+EXPORT_SYMBOL(shrinker_alloc);
+
+void shrinker_free_non_registered(struct shrinker *shrinker)
+{
+#ifdef CONFIG_SHRINKER_DEBUG
+	kfree_const(shrinker->name);
+	shrinker->name = NULL;
+#endif
+	if (shrinker->flags & SHRINKER_MEMCG_AWARE) {
+		down_write(&shrinker_rwsem);
+		unregister_memcg_shrinker(shrinker);
+		up_write(&shrinker_rwsem);
+	}
+
+	kfree(shrinker->nr_deferred);
+	shrinker->nr_deferred = NULL;
+
+	kfree(shrinker);
+}
+EXPORT_SYMBOL(shrinker_free_non_registered);
+
+void shrinker_register(struct shrinker *shrinker)
+{
+	down_write(&shrinker_rwsem);
+	list_add_tail(&shrinker->list, &shrinker_list);
+	shrinker->flags |= SHRINKER_REGISTERED;
+	shrinker_debugfs_add(shrinker);
+	up_write(&shrinker_rwsem);
+}
+EXPORT_SYMBOL(shrinker_register);
+
+void shrinker_unregister(struct shrinker *shrinker)
+{
+	struct dentry *debugfs_entry;
+	int debugfs_id;
+
+	if (!shrinker || !(shrinker->flags & SHRINKER_REGISTERED))
+		return;
+
+	down_write(&shrinker_rwsem);
+	list_del(&shrinker->list);
+	shrinker->flags &= ~SHRINKER_REGISTERED;
+	if (shrinker->flags & SHRINKER_MEMCG_AWARE)
+		unregister_memcg_shrinker(shrinker);
+	debugfs_entry = shrinker_debugfs_detach(shrinker, &debugfs_id);
+	up_write(&shrinker_rwsem);
+
+	shrinker_debugfs_remove(debugfs_entry, debugfs_id);
+
+	kfree(shrinker->nr_deferred);
+	shrinker->nr_deferred = NULL;
+
+	kfree(shrinker);
+}
+EXPORT_SYMBOL(shrinker_unregister);
+
 /*
  * Add a shrinker callback to be called from the vm.
  */
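
For illustration only, here is a rough userspace sketch of how a case c
user might move from an embedded shrinker to a dynamically allocated one
and use private_data in place of container_of(). It is not kernel code
and not part of this patch: the mock_* functions, struct my_cache, the
reduced struct shrinker, the singly linked list and the flag value are
all hypothetical stand-ins for the real API. Locking, nr_deferred, memcg
and debugfs handling are left out.

```c
/* Userspace model only; every name below is a hypothetical stand-in. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SHRINKER_REGISTERED (1u << 0)	/* mock value, not the kernel's */

/* Reduced mock of struct shrinker: only the fields this sketch needs. */
struct shrinker {
	unsigned long (*count_objects)(struct shrinker *s);
	unsigned int flags;
	void *private_data;	/* back-pointer to the owning structure */
	struct shrinker *next;	/* mock replacement for struct list_head */
};

static struct shrinker *shrinker_list;	/* mock global list, no locking */

/* Mock of shrinker_alloc(): zeroed allocation, NULL on failure. */
struct shrinker *mock_shrinker_alloc(unsigned int flags)
{
	struct shrinker *s = calloc(1, sizeof(*s));

	if (!s)
		return NULL;
	s->flags = flags;
	return s;
}

/* Mock of shrinker_free_non_registered(): for error paths before register. */
void mock_shrinker_free_non_registered(struct shrinker *s)
{
	assert(!s || !(s->flags & SHRINKER_REGISTERED));
	free(s);
}

/* Mock of shrinker_register(): make the shrinker visible on the list. */
void mock_shrinker_register(struct shrinker *s)
{
	s->next = shrinker_list;
	shrinker_list = s;
	s->flags |= SHRINKER_REGISTERED;
}

/* Mock of shrinker_unregister(): unlink and free; no-op if not registered. */
void mock_shrinker_unregister(struct shrinker *s)
{
	struct shrinker **p;

	if (!s || !(s->flags & SHRINKER_REGISTERED))
		return;

	for (p = &shrinker_list; *p; p = &(*p)->next) {
		if (*p == s) {
			*p = s->next;
			break;
		}
	}
	free(s);
}

/* Hypothetical owner; it used to embed "struct shrinker shrinker;". */
struct my_cache {
	unsigned long nr_objects;
	struct shrinker *shrinker;
};

unsigned long my_cache_count(struct shrinker *s)
{
	struct my_cache *c = s->private_data;	/* instead of container_of() */

	return c->nr_objects;
}

int my_cache_init(struct my_cache *c, unsigned long nr_objects)
{
	c->nr_objects = nr_objects;
	c->shrinker = mock_shrinker_alloc(0);
	if (!c->shrinker)
		return -1;

	/* Fill in callbacks and the back-pointer before registering. */
	c->shrinker->count_objects = my_cache_count;
	c->shrinker->private_data = c;
	mock_shrinker_register(c->shrinker);
	return 0;
}

void my_cache_exit(struct my_cache *c)
{
	mock_shrinker_unregister(c->shrinker);
	c->shrinker = NULL;
}
```

The intended ordering in this sketch is alloc, fill in the callbacks and
private_data, then register, so the shrinker is fully initialized before
it becomes visible on the list. If a later setup step failed before
registration, the error path would call the free_non_registered variant
rather than unregister.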