From patchwork Wed Aug 17 17:21:39 2022
From: Shakeel Butt
Date: Wed, 17 Aug 2022 17:21:39 +0000
Message-Id: <20220817172139.3141101-1-shakeelb@google.com>
Subject: [PATCH] Revert "memcg: cleanup racy sum avoidance code"
To: Michal Koutný, Johannes Weiner, Michal Hocko, Roman Gushchin,
 Muchun Song, David Hildenbrand, Yosry Ahmed, Greg Thelen
Cc: Andrew Morton, cgroups@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Shakeel Butt, stable@vger.kernel.org
X-Patchwork-Submitter: Shakeel Butt
X-Patchwork-Id: 12946270

This reverts commit 96e51ccf1af33e82f429a0d6baebba29c6448d0f.

Recently we started running the kernel with rstat infrastructure on
production traffic and began to see negative memcg stats values.
In particular, the 'sock' stat is the one we observed with a negative
value.

$ grep "sock " /mnt/memory/job/memory.stat
sock 253952
total_sock 18446744073708724224

Re-run after a couple of seconds:

$ grep "sock " /mnt/memory/job/memory.stat
sock 253952
total_sock 53248

For now we are only seeing this issue on large machines (256 CPUs) and
only with the 'sock' stat. I think the networking stack increases the
stat on one CPU and decreases it on another CPU much more often than
other subsystems do. So this negative sock value is due to the rstat
flusher flushing the stats on the CPU that has seen the decrement of
sock but missing the CPU that has the increments. A typical race
condition.

For an easy stable backport, a revert is the simplest solution. For a
long-term solution, I am thinking of two directions. The first is to
reduce the race window by optimizing the rstat flusher. The second is,
if the reader sees a negative stat value, to force a flush and restart
the stat collection. Basically a retry, but limited.

Signed-off-by: Shakeel Butt
Cc: stable@vger.kernel.org # 5.15
---
 include/linux/memcontrol.h | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 4d31ce55b1c0..6257867fbf95 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -987,19 +987,30 @@ static inline void mod_memcg_page_state(struct page *page,
 
 static inline unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
 {
-	return READ_ONCE(memcg->vmstats.state[idx]);
+	long x = READ_ONCE(memcg->vmstats.state[idx]);
+#ifdef CONFIG_SMP
+	if (x < 0)
+		x = 0;
+#endif
+	return x;
 }
 
 static inline unsigned long lruvec_page_state(struct lruvec *lruvec,
 					      enum node_stat_item idx)
 {
 	struct mem_cgroup_per_node *pn;
+	long x;
 
 	if (mem_cgroup_disabled())
 		return node_page_state(lruvec_pgdat(lruvec), idx);
 
 	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
-	return READ_ONCE(pn->lruvec_stats.state[idx]);
+	x = READ_ONCE(pn->lruvec_stats.state[idx]);
+#ifdef CONFIG_SMP
+	if (x < 0)
+		x = 0;
+#endif
+	return x;
 }
 
 static inline unsigned long lruvec_page_state_local(struct lruvec *lruvec,