From: Yafang Shao
To: tj@kernel.org, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
    kafai@fb.com, songliubraving@fb.com, yhs@fb.com, john.fastabend@gmail.com,
    kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org,
    hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev,
    shakeelb@google.com, muchun.song@linux.dev, akpm@linux-foundation.org
Cc: bpf@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org,
    Yafang Shao
Subject: [PATCH bpf-next v2 0/4] bpf, mm: introduce cgroup.memory=nobpf
Date: Fri, 10 Feb 2023 15:47:30 +0000
Message-Id: <20230210154734.4416-1-laoar.shao@gmail.com>
X-Patchwork-Id: 13135956

The bpf memory accounting has some known problems in container
environments:

- The container memory usage is not consistent if there's a pinned bpf
  program

  After the container restarts, the leftover bpf programs won't be
  accounted to the new generation, so the memory usage of the container
  is not consistent. This issue can be resolved by introducing
  selectable memcg, but we don't have an agreement on the solution yet.
  See also the discussions at https://lwn.net/Articles/905150/ .

- The leftover non-preallocated bpf map can't be limited

  The leftover bpf map will be reparented, and thus it will be limited
  by the parent rather than by the container itself. Furthermore, if
  the parent is destroyed, it will be limited by its parent's parent,
  and so on. This can also be resolved by introducing selectable memcg.

- The memory dynamically allocated in a bpf prog is charged to the root
  memcg only

  Nowadays a bpf prog can dynamically allocate memory, for example via
  bpf_obj_new(), but it only allocates from the global bpf_mem_alloc
  pool, so the memory is charged to the root memcg only. That needs to
  be addressed by a new proposal.

So let's give the container user an option to disable bpf memory
accounting.

The idea of "cgroup.memory=nobpf" is originally by Tejun[1].

[1]. https://lwn.net/ml/linux-mm/YxjOawzlgE458ezL@slm.duckdns.org/

Changes, v1->v2:
- squash patches (Roman)
- commit log improvement in patch #2. (Johannes)

Yafang Shao (4):
  mm: memcontrol: add new kernel parameter cgroup.memory=nobpf
  bpf: use bpf_map_kvcalloc in bpf_local_storage
  bpf: allow to disable bpf map memory accounting
  bpf: allow to disable bpf prog memory accounting

 Documentation/admin-guide/kernel-parameters.txt |  1 +
 include/linux/bpf.h                             | 16 ++++++++++++++++
 include/linux/memcontrol.h                      | 11 +++++++++++
 kernel/bpf/bpf_local_storage.c                  |  4 ++--
 kernel/bpf/core.c                               | 13 +++++++------
 kernel/bpf/memalloc.c                           |  3 ++-
 kernel/bpf/syscall.c                            | 20 ++++++++++++++++++--
 mm/memcontrol.c                                 | 18 ++++++++++++++++++
 8 files changed, 75 insertions(+), 11 deletions(-)
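
For readers unfamiliar with how a boot option like "cgroup.memory=nobpf" could take effect, here is a minimal userspace sketch, not the code from this series. It assumes the option value is a comma-separated token list (as with the existing "nosocket" and "nokmem" values) and that disabling accounting amounts to not adding an accounting flag at allocation time. The names struct memcg_opts, parse_cgroup_memory(), bpf_alloc_flags() and the SKETCH_* constants are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for allocation flags; not the kernel's values. */
#define SKETCH_GFP_KERNEL  0x1u
#define SKETCH_GFP_ACCOUNT 0x2u

/* Hypothetical container for the parsed option state. */
struct memcg_opts {
    bool nosocket;
    bool nokmem;
    bool nobpf;
};

/*
 * Parse a comma-separated value such as "nokmem,nobpf".
 * Unknown or empty tokens are silently ignored in this sketch.
 */
void parse_cgroup_memory(const char *arg, struct memcg_opts *opts)
{
    const char *p = arg;

    while (*p) {
        size_t len = strcspn(p, ",");   /* length of the current token */

        if (len == 8 && strncmp(p, "nosocket", len) == 0)
            opts->nosocket = true;
        else if (len == 6 && strncmp(p, "nokmem", len) == 0)
            opts->nokmem = true;
        else if (len == 5 && strncmp(p, "nobpf", len) == 0)
            opts->nobpf = true;

        p += len;
        if (*p == ',')
            p++;                        /* skip the separator */
    }
}

/*
 * Return the flags a bpf allocation would use in this sketch: the
 * accounting flag is added only when bpf accounting is left enabled.
 */
unsigned int bpf_alloc_flags(const struct memcg_opts *opts, unsigned int flags)
{
    if (opts->nobpf)
        return flags;
    return flags | SKETCH_GFP_ACCOUNT;
}
```

With a zero-initialized struct memcg_opts, parsing "nokmem,nobpf" sets nokmem and nobpf, and bpf_alloc_flags() then returns its input unchanged. With no options parsed, it returns the input with SKETCH_GFP_ACCOUNT set, which corresponds to the default behaviour of charging bpf memory to the memcg.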