From patchwork Sun Feb 26 14:46:50 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13152424
From: Qi Zheng
To: akpm@linux-foundation.org, tkhai@ya.ru, hannes@cmpxchg.org,
 shakeelb@google.com, mhocko@kernel.org, roman.gushchin@linux.dev,
 muchun.song@linux.dev, david@redhat.com, shy828301@gmail.com
Cc: sultan@kerneltoast.com, dave@stgolabs.net,
 penguin-kernel@I-love.SAKURA.ne.jp, paulmck@kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Qi Zheng
Subject: [PATCH v3 3/8] mm: vmscan: make memcg slab shrink lockless
Date: Sun, 26 Feb 2023 22:46:50 +0800
Message-Id: <20230226144655.79778-4-zhengqi.arch@bytedance.com>
X-Mailer: git-send-email 2.24.3 (Apple Git-128)
In-Reply-To: <20230226144655.79778-1-zhengqi.arch@bytedance.com>
References: <20230226144655.79778-1-zhengqi.arch@bytedance.com>

Like global slab shrink, since commit 1cd0bd06093c ("rcu: Remove
CONFIG_SRCU"), it's time to use SRCU to protect readers who previously
held shrinker_rwsem.

We can test with the following script:

```
DIR="/root/shrinker/memcg/mnt"

do_create()
{
	mkdir /sys/fs/cgroup/memory/test
	echo 200M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
	for i in `seq 0 $1`;
	do
		mkdir /sys/fs/cgroup/memory/test/$i;
		echo $$ > /sys/fs/cgroup/memory/test/$i/cgroup.procs;
		mkdir -p $DIR/$i;
	done
}

do_mount()
{
	for i in `seq $1 $2`;
	do
		mount -t tmpfs $i $DIR/$i;
	done
}

do_touch()
{
	for i in `seq $1 $2`;
	do
		echo $$ > /sys/fs/cgroup/memory/test/$i/cgroup.procs;
		dd if=/dev/zero of=$DIR/$i/file$i bs=1M count=1 &
	done
}

do_create 2000
do_mount 0 2000
do_touch 0 1000
```

Before applying:

  46.60%  [kernel]  [k] down_read_trylock
  18.70%  [kernel]  [k] up_read
  15.44%  [kernel]  [k] shrink_slab
   4.37%  [kernel]  [k] _find_next_bit
   2.75%  [kernel]  [k] xa_load
   2.07%  [kernel]  [k] idr_find
   1.73%  [kernel]  [k] do_shrink_slab
   1.42%  [kernel]  [k] shrink_lruvec
   0.74%  [kernel]  [k] shrink_node
   0.60%  [kernel]  [k] list_lru_count_one

After applying:

  19.53%  [kernel]  [k] _find_next_bit
  14.63%  [kernel]  [k] do_shrink_slab
  14.58%  [kernel]  [k] shrink_slab
  11.83%  [kernel]  [k] shrink_lruvec
   9.33%  [kernel]  [k] __blk_flush_plug
   6.67%  [kernel]  [k] mem_cgroup_iter
   3.73%  [kernel]  [k] list_lru_count_one
   2.43%  [kernel]  [k] shrink_node
   1.96%  [kernel]  [k] super_cache_count
   1.78%  [kernel]  [k] __rcu_read_unlock
   1.38%  [kernel]  [k] __srcu_read_lock
   1.30%  [kernel]  [k] xas_descend

We can see that the readers are no longer blocked: down_read_trylock()
and up_read() no longer appear among the top symbols.
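
The share of samples spent on the shrinker_rwsem read path can be summed
directly from the two listings above. What follows is only a minimal
sketch for reading the numbers, not part of the patch: the helper name
rwsem_share and the use of awk are assumptions made for illustration, and
the listings are pasted in as here-documents rather than captured from a
live profile.

```shell
#!/bin/sh
# Sum the sample percentages attributed to the shrinker_rwsem read path
# (down_read_trylock and up_read) in a profile listing read from stdin.
# rwsem_share is a hypothetical helper name, not part of the patch.
rwsem_share() {
    awk '$NF == "down_read_trylock" || $NF == "up_read" {
             sub(/%/, "", $1)   # strip the percent sign from column 1
             total += $1
         }
         END { printf "%.2f\n", total }'
}

# Listing from "Before applying".
before=$(rwsem_share <<'EOF'
46.60% [kernel] [k] down_read_trylock
18.70% [kernel] [k] up_read
15.44% [kernel] [k] shrink_slab
4.37% [kernel] [k] _find_next_bit
2.75% [kernel] [k] xa_load
2.07% [kernel] [k] idr_find
1.73% [kernel] [k] do_shrink_slab
1.42% [kernel] [k] shrink_lruvec
0.74% [kernel] [k] shrink_node
0.60% [kernel] [k] list_lru_count_one
EOF
)

# Listing from "After applying".
after=$(rwsem_share <<'EOF'
19.53% [kernel] [k] _find_next_bit
14.63% [kernel] [k] do_shrink_slab
14.58% [kernel] [k] shrink_slab
11.83% [kernel] [k] shrink_lruvec
9.33% [kernel] [k] __blk_flush_plug
6.67% [kernel] [k] mem_cgroup_iter
3.73% [kernel] [k] list_lru_count_one
2.43% [kernel] [k] shrink_node
1.96% [kernel] [k] super_cache_count
1.78% [kernel] [k] __rcu_read_unlock
1.38% [kernel] [k] __srcu_read_lock
1.30% [kernel] [k] xas_descend
EOF
)

echo "rwsem read path: before ${before}%, after ${after}%"
```

With the listings above this reports 65.30% before and 0.00% after. The
second figure only means that neither symbol appears among the listed
entries; the listings show the top symbols, not the full profile.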

Signed-off-by: Qi Zheng
---
 mm/vmscan.c | 46 +++++++++++++++++++++++++++-------------------
 1 file changed, 27 insertions(+), 19 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 2a21a84d3db1..490764f8e085 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -57,6 +57,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -221,8 +222,21 @@ static inline int shrinker_defer_size(int nr_items)
 static struct shrinker_info *shrinker_info_protected(struct mem_cgroup *memcg,
 						     int nid)
 {
-	return rcu_dereference_protected(memcg->nodeinfo[nid]->shrinker_info,
-					 lockdep_is_held(&shrinker_rwsem));
+	return srcu_dereference_check(memcg->nodeinfo[nid]->shrinker_info,
+				      &shrinker_srcu,
+				      lockdep_is_held(&shrinker_rwsem));
+}
+
+static struct shrinker_info *shrinker_info_srcu(struct mem_cgroup *memcg,
+						     int nid)
+{
+	return srcu_dereference(memcg->nodeinfo[nid]->shrinker_info,
+				&shrinker_srcu);
+}
+
+static void free_shrinker_info_rcu(struct rcu_head *head)
+{
+	kvfree(container_of(head, struct shrinker_info, rcu));
 }
 
 static inline bool need_expand(int new_nr_max, int old_nr_max)
@@ -269,7 +283,7 @@ static int expand_one_shrinker_info(struct mem_cgroup *memcg,
 		       defer_size - old_defer_size);
 
 		rcu_assign_pointer(pn->shrinker_info, new);
-		kvfree_rcu(old, rcu);
+		call_srcu(&shrinker_srcu, &old->rcu, free_shrinker_info_rcu);
 	}
 
 	return 0;
@@ -355,15 +369,16 @@ void set_shrinker_bit(struct mem_cgroup *memcg, int nid, int shrinker_id)
 {
 	if (shrinker_id >= 0 && memcg && !mem_cgroup_is_root(memcg)) {
 		struct shrinker_info *info;
+		int srcu_idx;
 
-		rcu_read_lock();
-		info = rcu_dereference(memcg->nodeinfo[nid]->shrinker_info);
+		srcu_idx = srcu_read_lock(&shrinker_srcu);
+		info = shrinker_info_srcu(memcg, nid);
 		if (!WARN_ON_ONCE(shrinker_id >= info->map_nr_max)) {
 			/* Pairs with smp mb in shrink_slab() */
 			smp_mb__before_atomic();
 			set_bit(shrinker_id, info->map);
 		}
-		rcu_read_unlock();
+		srcu_read_unlock(&shrinker_srcu, srcu_idx);
 	}
 }
 
@@ -377,7 +392,6 @@ static int prealloc_memcg_shrinker(struct shrinker *shrinker)
 		return -ENOSYS;
 
 	down_write(&shrinker_rwsem);
-	/* This may call shrinker, so it must use down_read_trylock() */
 	id = idr_alloc(&shrinker_idr, shrinker, 0, 0, GFP_KERNEL);
 	if (id < 0)
 		goto unlock;
@@ -411,7 +425,7 @@ static long xchg_nr_deferred_memcg(int nid, struct shrinker *shrinker,
 {
 	struct shrinker_info *info;
 
-	info = shrinker_info_protected(memcg, nid);
+	info = shrinker_info_srcu(memcg, nid);
 	return atomic_long_xchg(&info->nr_deferred[shrinker->id], 0);
 }
 
@@ -420,7 +434,7 @@ static long add_nr_deferred_memcg(long nr, int nid, struct shrinker *shrinker,
 {
 	struct shrinker_info *info;
 
-	info = shrinker_info_protected(memcg, nid);
+	info = shrinker_info_srcu(memcg, nid);
 	return atomic_long_add_return(nr, &info->nr_deferred[shrinker->id]);
 }
 
@@ -898,15 +912,14 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid,
 {
 	struct shrinker_info *info;
 	unsigned long ret, freed = 0;
+	int srcu_idx;
 	int i;
 
 	if (!mem_cgroup_online(memcg))
 		return 0;
 
-	if (!down_read_trylock(&shrinker_rwsem))
-		return 0;
-
-	info = shrinker_info_protected(memcg, nid);
+	srcu_idx = srcu_read_lock(&shrinker_srcu);
+	info = shrinker_info_srcu(memcg, nid);
 	if (unlikely(!info))
 		goto unlock;
 
@@ -956,14 +969,9 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid,
 				set_shrinker_bit(memcg, nid, i);
 		}
 		freed += ret;
-
-		if (rwsem_is_contended(&shrinker_rwsem)) {
-			freed = freed ? : 1;
-			break;
-		}
 	}
 unlock:
-	up_read(&shrinker_rwsem);
+	srcu_read_unlock(&shrinker_srcu, srcu_idx);
 	return freed;
 }
 #else /* CONFIG_MEMCG */