From patchwork Thu Mar 30 19:17:53 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yosry Ahmed
X-Patchwork-Id: 13194750
Date: Thu, 30 Mar 2023 19:17:53 +0000
X-Mailer: git-send-email 2.40.0.348.gf938b09366-goog
Message-ID: <20230330191801.1967435-1-yosryahmed@google.com>
Subject: [PATCH v3 0/8] memcg: avoid flushing stats atomically where possible
From: Yosry Ahmed
To: Tejun Heo, Josef Bacik, Jens Axboe, Zefan Li, Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton, Michal Koutný
Cc: Vasily Averin, cgroups@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, bpf@vger.kernel.org, Yosry Ahmed

rstat flushing is an expensive operation that scales with the number of cpus and the number of cgroups in the
system. The purpose of this series is to minimize the contexts where we flush stats atomically.

Patches 1 and 2 are cleanups requested during reviews of prior versions of this series.

Patch 3 makes sure we never try to flush from within an irq context.

Patches 4 to 7 introduce separate variants of mem_cgroup_flush_stats() for atomic and non-atomic flushing, and make sure we only flush the stats atomically when necessary.

Patch 8 is a slightly tangential optimization that limits the work done by rstat flushing in some scenarios.

v2 -> v3:
- Collected more Acks (thanks everyone!).
- Dropped controversial patch 4 from v2.
- Improved commit logs and cover letter (Michal).

v2: https://lore.kernel.org/linux-mm/20230328221644.803272-1-yosryahmed@google.com/

v1 -> v2:
- Added more context in patch 4's commit log.
- Added atomic_read() before atomic_xchg() in patch 5 to avoid needlessly locking the cache line (Shakeel).
- Refactored patch 6: added a common helper, do_flush_stats(), for mem_cgroup_flush_stats{_atomic}() (Johannes).
- Renamed mem_cgroup_flush_stats_ratelimited() to mem_cgroup_flush_stats_atomic_ratelimited() in patch 6. It is restored in patch 7, but this maintains consistency (Johannes).
- Added line break to keep the lock section visually separated in patch 7 (Johannes).

v1: https://lore.kernel.org/lkml/20230328061638.203420-1-yosryahmed@google.com/

RFC -> v1:
- Dropped patch 1 that attempted to make the global rstat lock a non-irq lock, will follow up on that separately (Shakeel).
- Dropped stats_flush_lock entirely, replaced by an atomic (Johannes).
- Renamed cgroup_rstat_flush_irqsafe() to cgroup_rstat_flush_atomic() instead of removing it (Johannes).
- Added a patch to rename mem_cgroup_flush_stats_delayed() to mem_cgroup_flush_stats_ratelimited() (Johannes).
- Separate APIs for flushing memcg stats in atomic and non-atomic contexts instead of a boolean argument (Johannes).
- Added patches 3 & 4 to make sure we never flush from irq context (Shakeel & Johannes).

RFC: https://lore.kernel.org/lkml/20230323040037.2389095-1-yosryahmed@google.com/

Yosry Ahmed (8):
  cgroup: rename cgroup_rstat_flush_"irqsafe" to "atomic"
  memcg: rename mem_cgroup_flush_stats_"delayed" to "ratelimited"
  memcg: do not flush stats in irq context
  memcg: replace stats_flush_lock with an atomic
  memcg: sleep during flushing stats in safe contexts
  workingset: memcg: sleep when flushing stats in workingset_refault()
  vmscan: memcg: sleep when flushing stats during reclaim
  memcg: do not modify rstat tree for zero updates

 include/linux/cgroup.h     |  2 +-
 include/linux/memcontrol.h |  9 ++++-
 kernel/cgroup/rstat.c      |  4 +-
 mm/memcontrol.c            | 78 ++++++++++++++++++++++++++++++--------
 mm/workingset.c            |  5 ++-
 5 files changed, 76 insertions(+), 22 deletions(-)
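
For readers who have not followed the earlier versions, here is a rough userspace sketch of the pattern the changelog describes: an atomic flag in place of stats_flush_lock, a plain read before the exchange so that waiters do not needlessly take the cache line exclusive, and two thin entry points on top of one common helper. It uses C11 atomics rather than the kernel's atomic_t API. All names other than do_flush_stats() (stats_flush_ongoing, flushes_done, rstat_flush_stub(), flush_stats(), flush_stats_atomic()) and the helper's signature are assumptions made for illustration, not the code in the patches.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are not the kernel's definitions. */
static atomic_int stats_flush_ongoing; /* 0 = idle, 1 = a flush is running */
static int flushes_done;               /* counts flushes, for illustration */

/* Placeholder for the expensive rstat walk over cpus and cgroups. */
static void rstat_flush_stub(bool may_sleep)
{
	(void)may_sleep; /* a real flush would pick a sleepable or atomic path */
	flushes_done++;
}

/*
 * Common helper behind both entry points.
 * Returns true if this caller performed the flush, false if it skipped
 * because another flush was already in progress.
 */
static bool do_flush_stats(bool may_sleep)
{
	/*
	 * Cheap read first: if a flush is visibly in progress, bail out
	 * without issuing the exchange, which would pull the cache line
	 * in exclusive mode even though we are going to skip anyway.
	 */
	if (atomic_load(&stats_flush_ongoing) ||
	    atomic_exchange(&stats_flush_ongoing, 1))
		return false;

	rstat_flush_stub(may_sleep);

	atomic_store(&stats_flush_ongoing, 0);
	return true;
}

/* For callers that are allowed to sleep. */
static bool flush_stats(void)
{
	return do_flush_stats(true);
}

/* For callers in atomic context. */
static bool flush_stats_atomic(void)
{
	return do_flush_stats(false);
}
```

Callers never pass a boolean themselves; they pick flush_stats() or flush_stats_atomic() based on their context, which mirrors the "separate APIs instead of a boolean argument" item above. A caller that finds the flag already set simply returns and relies on the flush in progress.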