From patchwork Thu Jan 12 15:53:26 2023
X-Patchwork-Submitter: Yafang Shao <laoar.shao@gmail.com>
X-Patchwork-Id: 13098275
From: Yafang Shao <laoar.shao@gmail.com>
To: 42.hyeyoo@gmail.com, vbabka@suse.cz, ast@kernel.org, daniel@iogearbox.net,
    andrii@kernel.org, kafai@fb.com, songliubraving@fb.com, yhs@fb.com,
    john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com,
    haoluo@google.com, jolsa@kernel.org, tj@kernel.org, dennis@kernel.org,
    cl@linux.com, akpm@linux-foundation.org, penberg@kernel.org,
    rientjes@google.com, iamjoonsoo.kim@lge.com, roman.gushchin@linux.dev
Cc: linux-mm@kvack.org, bpf@vger.kernel.org, Yafang Shao <laoar.shao@gmail.com>
Subject: [RFC PATCH bpf-next v2 11/11] bpf: introduce bpf memory statistics
Date: Thu, 12 Jan 2023 15:53:26 +0000
Message-Id: <20230112155326.26902-12-laoar.shao@gmail.com>
In-Reply-To: <20230112155326.26902-1-laoar.shao@gmail.com>
References: <20230112155326.26902-1-laoar.shao@gmail.com>
X-Mailer: git-send-email 2.31.1
It introduces a new percpu global variable to store the bpf memory
statistic. The variable is increased at bpf memory allocation and
decreased at bpf memory freeing. A new item "BPF" is added to
/proc/meminfo to show this statistic.

Please note that some freeing is deferred, for example via kfree_rcu()
and vfree_deferred(). For these deferred cases, the in-flight memory
may not be freed immediately after it is subtracted from the bpf memory
statistic. But it won't take long to be freed, so this behavior is
acceptable.

Below is the output:

  $ grep BPF /proc/meminfo
  BPF:      358 kB

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 fs/proc/meminfo.c     |  4 +++
 include/linux/bpf.h   | 92 +++++++++++++++++++++++++++++++++++++++++++++++----
 kernel/bpf/memalloc.c | 19 ++++++++++-
 kernel/bpf/ringbuf.c  |  4 +++
 kernel/bpf/syscall.c  | 39 ++++++++++++++++++++--
 5 files changed, 149 insertions(+), 9 deletions(-)
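For reference, the new counter can also be read programmatically rather
than via grep. Below is a minimal userspace sketch; it is illustrative
only, not part of this patch, and assumes nothing beyond the "BPF:"
line format added above:

	/* Print the "BPF:" line from /proc/meminfo, if present. */
	#include <stdio.h>
	#include <string.h>

	int main(void)
	{
		char line[256];
		FILE *fp = fopen("/proc/meminfo", "r");

		if (!fp) {
			perror("fopen");
			return 1;
		}
		while (fgets(line, sizeof(line), fp)) {
			if (!strncmp(line, "BPF:", 4)) {
				fputs(line, stdout);
				break;
			}
		}
		fclose(fp);
		return 0;
	}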
diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 4409601..5b67331 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -16,6 +16,7 @@
 #ifdef CONFIG_CMA
 #include <linux/cma.h>
 #endif
+#include <linux/bpf.h>
 #include <asm/page.h>
 #include "internal.h"
 
@@ -159,6 +160,9 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 
 	arch_report_meminfo(m);
 
+	seq_printf(m, "BPF: %8lu kB\n",
+		   bpf_mem_stat_sum() >> 10);
+
 	return 0;
 }
 
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 17c218e..add307e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1869,23 +1869,71 @@ int generic_map_delete_batch(struct bpf_map *map,
 struct bpf_map *bpf_map_get_curr_or_next(u32 *id);
 struct bpf_prog *bpf_prog_get_curr_or_next(u32 *id);
 
+struct bpf_mem_stat {
+	long stat;
+};
+
+DECLARE_PER_CPU(struct bpf_mem_stat, bpfmm);
+
+static inline void bpf_mem_stat_add(size_t cnt)
+{
+	this_cpu_add(bpfmm.stat, cnt);
+}
+
+static inline void bpf_mem_stat_sub(size_t cnt)
+{
+	this_cpu_sub(bpfmm.stat, cnt);
+}
+
+static inline long bpf_mem_stat_sum(void)
+{
+	struct bpf_mem_stat *this;
+	long sum = 0;
+	int cpu;
+
+	for_each_online_cpu(cpu) {
+		this = &per_cpu(bpfmm, cpu);
+		sum += this->stat;
+	}
+
+	return sum;
+}
+
 static inline void bpf_map_kfree(const void *ptr)
 {
+	size_t sz = ksize_full(ptr);
+
+	if (sz)
+		bpf_mem_stat_sub(sz);
 	kfree(ptr);
 }
 
 static inline void bpf_map_kvfree(const void *ptr)
 {
+	size_t sz = kvsize(ptr);
+
+	if (sz)
+		bpf_mem_stat_sub(sz);
 	kvfree(ptr);
 }
 
 static inline void bpf_map_free_percpu(void __percpu *ptr)
 {
+	size_t sz = percpu_size(ptr);
+
+	if (sz)
+		bpf_mem_stat_sub(sz);
 	free_percpu(ptr);
 }
 
-#define bpf_map_kfree_rcu(ptr, rhf...) kvfree_rcu(ptr, ## rhf)
+#define bpf_map_kfree_rcu(ptr, rhf...)		\
+do {						\
+	size_t sz = kvsize(ptr);		\
+						\
+	if (sz)					\
+		bpf_mem_stat_sub(sz);		\
+	kvfree_rcu(ptr, ## rhf);		\
+} while (0)
 
 #ifdef CONFIG_MEMCG_KMEM
 void *bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
@@ -1901,26 +1949,54 @@ void __percpu *bpf_map_alloc_percpu(const struct bpf_map *map, size_t size,
 bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
 		     int node)
 {
-	return kmalloc_node(size, flags, node);
+	void *ptr;
+	size_t sz;
+
+	ptr = kmalloc_node(size, flags, node);
+	sz = ksize_full(ptr);
+	if (sz)
+		bpf_mem_stat_add(sz);
+	return ptr;
 }
 
 static inline void *
 bpf_map_kzalloc(const struct bpf_map *map, size_t size, gfp_t flags)
 {
-	return kzalloc(size, flags);
+	void *ptr;
+	size_t sz;
+
+	ptr = kzalloc(size, flags);
+	sz = ksize_full(ptr);
+	if (sz)
+		bpf_mem_stat_add(sz);
+	return ptr;
 }
 
 static inline void *
 bpf_map_kvcalloc(struct bpf_map *map, size_t n, size_t size, gfp_t flags)
 {
-	return kvcalloc(n, size, flags);
+	void *ptr;
+	size_t sz;
+
+	ptr = kvcalloc(n, size, flags);
+	sz = kvsize(ptr);
+	if (sz)
+		bpf_mem_stat_add(sz);
+	return ptr;
 }
 
 static inline void __percpu *
 bpf_map_alloc_percpu(const struct bpf_map *map, size_t size, size_t align,
		     gfp_t flags)
 {
-	return __alloc_percpu_gfp(size, align, flags);
+	void *ptr;
+	size_t sz;
+
+	ptr = __alloc_percpu_gfp(size, align, flags);
+	sz = percpu_size(ptr);
+	if (sz)
+		bpf_mem_stat_add(sz);
+	return ptr;
 }
 #endif
 
@@ -2461,6 +2537,11 @@ static inline void bpf_prog_inc_misses_counter(struct bpf_prog *prog)
 static inline void bpf_cgrp_storage_free(struct cgroup *cgroup)
 {
 }
+
+static inline long bpf_mem_stat_sum(void)
+{
+	return 0;
+}
 #endif /* CONFIG_BPF_SYSCALL */
 
 void __bpf_free_used_btfs(struct bpf_prog_aux *aux,
@@ -2886,5 +2967,4 @@ static inline bool type_is_alloc(u32 type)
 {
 	return type & MEM_ALLOC;
 }
-
 #endif /* _LINUX_BPF_H */
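The wrapper design above keeps the accounting transparent to map
implementations: as long as both allocation and free go through the
bpf_map_* helpers, the counter stays balanced automatically. A
hypothetical usage sketch, for illustration only (the struct and
function names below are invented, not from this patch):

	struct elem {
		int value;
	};

	/* Allocation is accounted via bpf_mem_stat_add() inside the helper. */
	static struct elem *example_alloc_elem(struct bpf_map *map)
	{
		return bpf_map_kzalloc(map, sizeof(struct elem), GFP_USER);
	}

	/* Freeing is accounted via bpf_mem_stat_sub() inside the helper. */
	static void example_free_elem(struct elem *e)
	{
		bpf_map_kfree(e);
	}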
diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index ebcc3dd..4e35f287 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -129,6 +129,8 @@ static void *__alloc(struct bpf_mem_cache *c, int node)
 	 * want here.
 	 */
 	gfp_t flags = GFP_NOWAIT | __GFP_NOWARN | __GFP_ACCOUNT;
+	void *ptr;
+	size_t sz;
 
 	if (c->percpu_size) {
 		void **obj = kmalloc_node(c->percpu_size, flags, node);
@@ -140,10 +142,18 @@ static void *__alloc(struct bpf_mem_cache *c, int node)
 			return NULL;
 		}
 		obj[1] = pptr;
+		sz = ksize_full(obj);
+		sz += percpu_size(pptr);
+		if (sz)
+			bpf_mem_stat_add(sz);
 		return obj;
 	}
 
-	return kmalloc_node(c->unit_size, flags, node);
+	ptr = kmalloc_node(c->unit_size, flags, node);
+	sz = ksize_full(ptr);
+	if (sz)
+		bpf_mem_stat_add(sz);
+	return ptr;
 }
 
 static struct mem_cgroup *get_memcg(const struct bpf_mem_cache *c)
@@ -215,12 +225,19 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 
 static void free_one(struct bpf_mem_cache *c, void *obj)
 {
+	size_t sz = ksize_full(obj);
+
 	if (c->percpu_size) {
+		sz += percpu_size(((void **)obj)[1]);
+		if (sz)
+			bpf_mem_stat_sub(sz);
 		free_percpu(((void **)obj)[1]);
 		kfree(obj);
 		return;
 	}
 
+	if (sz)
+		bpf_mem_stat_sub(sz);
 	kfree(obj);
 }
 
diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c
index 3264bf5..766c2f1 100644
--- a/kernel/bpf/ringbuf.c
+++ b/kernel/bpf/ringbuf.c
@@ -96,6 +96,7 @@ static void bpf_ringbuf_pages_free(struct page **pages, int nr_pages)
 {
 	int i;
 
+	bpf_mem_stat_sub(nr_pages * PAGE_SIZE);
 	for (i = 0; i < nr_pages; i++)
 		__free_page(pages[i]);
 	bpf_map_area_free(pages);
@@ -126,9 +127,12 @@ static struct page **bpf_ringbuf_pages_alloc(int nr_meta_pages,
 		pages[nr_data_pages + i] = page;
 	}
 
+	bpf_mem_stat_add(nr_pages * PAGE_SIZE);
 	return pages;
 
 err_free_pages:
+	if (nr_pages)
+		bpf_mem_stat_add(nr_pages * PAGE_SIZE);
 	bpf_ringbuf_pages_free(pages, nr_pages);
 err:
 	return NULL;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 9e266e8..6ca2ceb 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -46,6 +46,7 @@
 
 #define BPF_OBJ_FLAG_MASK	(BPF_F_RDONLY | BPF_F_WRONLY)
 
+DEFINE_PER_CPU(struct bpf_mem_stat, bpfmm);
 DEFINE_PER_CPU(int, bpf_prog_active);
 static DEFINE_IDR(prog_idr);
 static DEFINE_SPINLOCK(prog_idr_lock);
@@ -336,16 +337,34 @@ static void *__bpf_map_area_alloc(u64 size, int numa_node, bool mmapable)
 
 void *bpf_map_area_alloc(u64 size, int numa_node)
 {
-	return __bpf_map_area_alloc(size, numa_node, false);
+	size_t sz;
+	void *ptr;
+
+	ptr = __bpf_map_area_alloc(size, numa_node, false);
+	sz = kvsize(ptr);
+	if (sz)
+		bpf_mem_stat_add(sz);
+	return ptr;
 }
 
 void *bpf_map_area_mmapable_alloc(u64 size, int numa_node)
 {
-	return __bpf_map_area_alloc(size, numa_node, true);
+	size_t sz;
+	void *ptr;
+
+	ptr = __bpf_map_area_alloc(size, numa_node, true);
+	sz = kvsize(ptr);
+	if (sz)
+		bpf_mem_stat_add(sz);
+	return ptr;
 }
 
 void bpf_map_area_free(void *area)
 {
+	size_t sz = kvsize(area);
+
+	if (sz)
+		bpf_mem_stat_sub(sz);
 	kvfree(area);
 }
 
@@ -446,12 +465,16 @@ void *bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
 {
 	struct mem_cgroup *memcg, *old_memcg;
 	void *ptr;
+	size_t sz;
 
 	memcg = bpf_map_get_memcg(map);
 	old_memcg = set_active_memcg(memcg);
 	ptr = kmalloc_node(size, flags | __GFP_ACCOUNT, node);
 	set_active_memcg(old_memcg);
 	mem_cgroup_put(memcg);
+	sz = ksize_full(ptr);
+	if (sz)
+		bpf_mem_stat_add(sz);
 
 	return ptr;
 }
@@ -460,12 +483,16 @@ void *bpf_map_kzalloc(const struct bpf_map *map, size_t size, gfp_t flags)
 {
 	struct mem_cgroup *memcg, *old_memcg;
 	void *ptr;
+	size_t sz;
 
 	memcg = bpf_map_get_memcg(map);
 	old_memcg = set_active_memcg(memcg);
 	ptr = kzalloc(size, flags | __GFP_ACCOUNT);
 	set_active_memcg(old_memcg);
 	mem_cgroup_put(memcg);
+	sz = ksize_full(ptr);
+	if (sz)
+		bpf_mem_stat_add(sz);
 
 	return ptr;
 }
@@ -475,12 +502,16 @@ void *bpf_map_kvcalloc(struct bpf_map *map, size_t n, size_t size,
 {
 	struct mem_cgroup *memcg, *old_memcg;
 	void *ptr;
+	size_t sz;
 
 	memcg = bpf_map_get_memcg(map);
 	old_memcg = set_active_memcg(memcg);
 	ptr = kvcalloc(n, size, flags | __GFP_ACCOUNT);
 	set_active_memcg(old_memcg);
 	mem_cgroup_put(memcg);
+	sz = kvsize(ptr);
+	if (sz)
+		bpf_mem_stat_add(sz);
 
 	return ptr;
 }
@@ -490,12 +521,16 @@ void __percpu *bpf_map_alloc_percpu(const struct bpf_map *map, size_t size,
 {
 	struct mem_cgroup *memcg, *old_memcg;
 	void __percpu *ptr;
+	size_t sz;
 
 	memcg = bpf_map_get_memcg(map);
 	old_memcg = set_active_memcg(memcg);
 	ptr = __alloc_percpu_gfp(size, align, flags | __GFP_ACCOUNT);
 	set_active_memcg(old_memcg);
 	mem_cgroup_put(memcg);
+	sz = percpu_size(ptr);
+	if (sz)
+		bpf_mem_stat_add(sz);
 
 	return ptr;
 }
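A note on the accounting pattern used throughout this patch: updates
touch only the local CPU's counter via this_cpu_add()/this_cpu_sub(),
and bpf_mem_stat_sum() adds the counters up without synchronization.
Because an object can be allocated on one CPU and freed on another, an
individual per-CPU counter may go negative; only the total is
meaningful, and it is only approximately consistent while updates are
in flight. A minimal userspace model of this behavior, illustrative
only (all names below are invented):

	#include <stdio.h>

	#define NR_CPUS 4

	/* One signed counter per "CPU", updated without synchronization. */
	static long bpfmm_stat[NR_CPUS];

	static void mem_stat_add(int cpu, long sz) { bpfmm_stat[cpu] += sz; }
	static void mem_stat_sub(int cpu, long sz) { bpfmm_stat[cpu] -= sz; }

	/* Mirrors bpf_mem_stat_sum(): an unsynchronized sum over all CPUs. */
	static long mem_stat_sum(void)
	{
		long sum = 0;

		for (int cpu = 0; cpu < NR_CPUS; cpu++)
			sum += bpfmm_stat[cpu];
		return sum;
	}

	int main(void)
	{
		/* Allocate on CPU 0, free on CPU 2: cpu2's counter goes
		 * negative, but the total is still correct.
		 */
		mem_stat_add(0, 4096);
		mem_stat_sub(2, 4096);
		printf("cpu0=%ld cpu2=%ld sum=%ld\n",
		       bpfmm_stat[0], bpfmm_stat[2], mem_stat_sum());
		return 0;
	}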