From patchwork Sun Feb  5 06:58:00 2023
X-Patchwork-Submitter: Yafang Shao
X-Patchwork-Id: 13128977
From: Yafang Shao
To: tj@kernel.org, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
    kafai@fb.com, songliubraving@fb.com, yhs@fb.com,
    john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com,
    haoluo@google.com, jolsa@kernel.org, hannes@cmpxchg.org,
    mhocko@kernel.org, roman.gushchin@linux.dev, shakeelb@google.com,
    muchun.song@linux.dev, akpm@linux-foundation.org
Cc: bpf@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org,
    Yafang Shao
Subject: [PATCH bpf-next 0/5] bpf, mm: introduce cgroup.memory=nobpf
Date: Sun,  5 Feb 2023 06:58:00 +0000
Message-Id: <20230205065805.19598-1-laoar.shao@gmail.com>

The bpf memory accounting has some known problems in container
environments:

- The container memory usage is not consistent if there's a pinned bpf
  program
  After the container restarts, the leftover bpf programs are not
  charged to the new generation, so the memory usage of the container
  is not consistent. This issue can be resolved by introducing a
  selectable memcg, but we don't have an agreement on the solution yet.
  See also the discussion at https://lwn.net/Articles/905150/ .

- The leftover non-preallocated bpf map can't be limited
  The leftover bpf map will be reparented, and thus it will be limited
  by the parent rather than by the container itself. Furthermore, if
  the parent is destroyed, it will be limited by its parent's parent,
  and so on. This can also be resolved by introducing a selectable
  memcg.

- The memory dynamically allocated in a bpf prog is charged to the root
  memcg only
  Nowadays a bpf prog can dynamically allocate memory, for example via
  bpf_obj_new(), but it only allocates from the global bpf_mem_alloc
  pool, so the memory is charged to the root memcg only. That needs to
  be addressed by a new proposal.

So let's give the user an option to disable bpf memory accounting.

The idea of "cgroup.memory=nobpf" originally came from Tejun[1].

[1]. https://lwn.net/ml/linux-mm/YxjOawzlgE458ezL@slm.duckdns.org/

Yafang Shao (5):
  mm: memcontrol: add new kernel parameter cgroup.memory=nobpf
  bpf: use bpf_map_kvcalloc in bpf_local_storage
  bpf: introduce bpf_memcg_flags()
  bpf: allow to disable bpf map memory accounting
  bpf: allow to disable bpf prog memory accounting

 Documentation/admin-guide/kernel-parameters.txt |  1 +
 include/linux/bpf.h                             | 16 ++++++++++++++++
 include/linux/memcontrol.h                      | 11 +++++++++++
 kernel/bpf/bpf_local_storage.c                  |  4 ++--
 kernel/bpf/core.c                               | 13 +++++++------
 kernel/bpf/memalloc.c                           |  3 ++-
 kernel/bpf/syscall.c                            | 20 ++++++++++++++++++--
 mm/memcontrol.c                                 | 18 ++++++++++++++++++
 8 files changed, 75 insertions(+), 11 deletions(-)

Acked-by: Johannes Weiner
Acked-by: Roman Gushchin
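
For readers who want a concrete picture of the option described in this
cover letter, here is a minimal userspace sketch, not the code from the
patches themselves. It assumes that "cgroup.memory=" takes a
comma-separated option list and that, once "nobpf" is seen, a helper in
the spirit of bpf_memcg_flags() stops adding the accounting bit to the
allocation flags. The names struct memcg_opts, parse_cgroup_memory(),
sketch_bpf_memcg_flags() and SKETCH_GFP_ACCOUNT are hypothetical and
exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's accounting GFP bit; the value is arbitrary. */
#define SKETCH_GFP_ACCOUNT 0x1u

/* Hypothetical container for the options parsed from "cgroup.memory=". */
struct memcg_opts {
	bool nosocket;
	bool nokmem;
	bool nobpf;
};

/* True if the len-byte token at tok equals the NUL-terminated name. */
static bool token_is(const char *tok, size_t len, const char *name)
{
	return strlen(name) == len && memcmp(tok, name, len) == 0;
}

/*
 * Parse a comma-separated option list such as "nokmem,nobpf".
 * Empty and unknown tokens are ignored.
 */
static void parse_cgroup_memory(const char *s, struct memcg_opts *opts)
{
	assert(s != NULL && opts != NULL);

	while (*s) {
		size_t len = strcspn(s, ",");	/* length of the current token */

		if (token_is(s, len, "nosocket"))
			opts->nosocket = true;
		else if (token_is(s, len, "nokmem"))
			opts->nokmem = true;
		else if (token_is(s, len, "nobpf"))
			opts->nobpf = true;

		s += len;
		if (*s == ',')
			s++;			/* step over the separator */
	}
}

/*
 * Sketch of the gating idea: request memcg accounting for a bpf
 * allocation unless "nobpf" was given, in which case the caller's
 * flags are returned unchanged.
 */
static unsigned int sketch_bpf_memcg_flags(const struct memcg_opts *opts,
					   unsigned int flags)
{
	assert(opts != NULL);

	if (!opts->nobpf)
		return flags | SKETCH_GFP_ACCOUNT;
	return flags;
}
```

In this sketch, parsing "nokmem,nobpf" into a zero-initialized struct
memcg_opts makes sketch_bpf_memcg_flags() return its input flags
untouched, while an empty option string leaves the accounting bit set.
Routing every bpf allocation site through one helper keeps the on/off
decision in a single place.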