From patchwork Thu Aug 25 00:05:04 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shakeel Butt
X-Patchwork-Id: 12954102
Date: Thu, 25 Aug 2022 00:05:04 +0000
In-Reply-To: <20220825000506.239406-1-shakeelb@google.com>
Message-Id: <20220825000506.239406-2-shakeelb@google.com>
References: <20220825000506.239406-1-shakeelb@google.com>
X-Mailer: git-send-email 2.37.1.595.g718a3a8f04-goog
Subject: [PATCH v2 1/3] mm: page_counter: remove unneeded atomic ops for low/min
From: Shakeel Butt
To: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song
Cc: Michal Koutný, Eric Dumazet, Soheil Hassas Yeganeh, Feng Tang,
    Oliver Sang, Andrew Morton, lkp@lists.01.org, cgroups@vger.kernel.org,
    linux-mm@kvack.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
    Shakeel Butt

For cgroups using low or min protections, the function
propagate_protected_usage() was doing an atomic xchg() operation
unconditionally. We can optimize out this atomic operation for one specific
scenario: the workload is using the protection (i.e. min > 0) and the usage
is above the protection (i.e. usage > min). This scenario is actually very
common, where users want a part of their workload to be protected against
external reclaim.

This optimization does introduce a race when the usage is around the
protection and concurrent charges and uncharges trip it over or under the
protection. In such cases we might see a lower effective protection, but the
subsequent charge/uncharge will correct it.
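To make the optimized check concrete, here is a minimal user-space sketch
(not the kernel implementation) of the pattern applied to the min branch:
read the cached propagated value first and only pay for the atomic xchg when
it actually changes. The struct, field, and function names (counter,
min_usage, propagate_min) are invented here to loosely mirror the kernel
code; the low branch in the real patch follows the same shape.

/*
 * Hedged user-space sketch using C11 atomics; models only the check this
 * patch adds, not the kernel's page_counter.
 */
#include <stdatomic.h>
#include <stdio.h>

struct counter {
	atomic_long min_usage;          /* cached propagated protection */
	long min;                       /* configured protection */
	atomic_long children_min_usage; /* parent-side aggregate */
	struct counter *parent;
};

static long min_of(long a, long b) { return a < b ? a : b; }

static void propagate_min(struct counter *c, long usage)
{
	long protected, old_protected, delta;

	if (!c->parent)
		return;

	protected = min_of(usage, c->min);
	old_protected = atomic_load(&c->min_usage);

	/*
	 * Common steady state (usage above a stable protection): the cached
	 * value is unchanged, so skip the atomic xchg entirely. Concurrent
	 * charges/uncharges around the protection can race here, but the
	 * next propagation corrects the parent aggregate.
	 */
	if (protected == old_protected)
		return;

	old_protected = atomic_exchange(&c->min_usage, protected);
	delta = protected - old_protected;
	if (delta)
		atomic_fetch_add(&c->parent->children_min_usage, delta);
}

int main(void)
{
	struct counter parent = { .parent = NULL };
	struct counter child = { .min = 100, .parent = &parent };

	propagate_min(&child, 150); /* above protection: propagates 100 */
	propagate_min(&child, 180); /* still above: no xchg, no update */
	printf("children_min_usage = %ld\n",
	       atomic_load(&parent.children_min_usage));
	return 0;
}

Built with e.g. cc -std=c11, the second call takes the early return: the
cached value is already 100, so no xchg and no touch of the parent aggregate.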
To evaluate the impact of this optimization, on a 72 CPU machine, we ran the
following workload in a three-level cgroup hierarchy, with the top level
having min and low set up appropriately, to see if this optimization is
effective for the mentioned case.

 $ netserver -6
 # 36 instances of netperf with following params
 $ netperf -6 -H ::1 -l 60 -t TCP_SENDFILE -- -m 10K

Results (average throughput of netperf):
Without (6.0-rc1)	10482.7 Mbps
With patch		14542.5 Mbps (38.7% improvement)

With the patch, the throughput improved by 38.7%.

Signed-off-by: Shakeel Butt
Reported-by: kernel test robot
Acked-by: Soheil Hassas Yeganeh
Reviewed-by: Feng Tang
Acked-by: Roman Gushchin
Acked-by: Michal Hocko
---
Changes since v1:
- Commit message update with more detail on which scenario is getting
  optimized and possible race condition.

 mm/page_counter.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/mm/page_counter.c b/mm/page_counter.c
index eb156ff5d603..47711aa28161 100644
--- a/mm/page_counter.c
+++ b/mm/page_counter.c
@@ -17,24 +17,23 @@ static void propagate_protected_usage(struct page_counter *c,
 				      unsigned long usage)
 {
 	unsigned long protected, old_protected;
-	unsigned long low, min;
 	long delta;

 	if (!c->parent)
 		return;

-	min = READ_ONCE(c->min);
-	if (min || atomic_long_read(&c->min_usage)) {
-		protected = min(usage, min);
+	protected = min(usage, READ_ONCE(c->min));
+	old_protected = atomic_long_read(&c->min_usage);
+	if (protected != old_protected) {
 		old_protected = atomic_long_xchg(&c->min_usage, protected);
 		delta = protected - old_protected;
 		if (delta)
 			atomic_long_add(delta, &c->parent->children_min_usage);
 	}

-	low = READ_ONCE(c->low);
-	if (low || atomic_long_read(&c->low_usage)) {
-		protected = min(usage, low);
+	protected = min(usage, READ_ONCE(c->low));
+	old_protected = atomic_long_read(&c->low_usage);
+	if (protected != old_protected) {
 		old_protected = atomic_long_xchg(&c->low_usage, protected);
 		delta = protected - old_protected;
 		if (delta)