From patchwork Sun Jul 16 12:10:43 2023
X-Patchwork-Submitter: Yafang Shao
X-Patchwork-Id: 13314795
X-Patchwork-Delegate: bpf@iogearbox.net
From: Yafang Shao
To: ast@kernel.org, daniel@iogearbox.net, john.fastabend@gmail.com,
    andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com,
    kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org,
    quentin@isovalent.com
Cc: bpf@vger.kernel.org, Yafang Shao
Subject: [RFC PATCH bpf-next 1/4] bpf: Add __bpf_iter_attach_cgroup()
Date: Sun, 16 Jul 2023 12:10:43 +0000
Message-Id: <20230716121046.17110-2-laoar.shao@gmail.com>
In-Reply-To: <20230716121046.17110-1-laoar.shao@gmail.com>
References: <20230716121046.17110-1-laoar.shao@gmail.com>
X-Patchwork-State: RFC

This is a preparation for the
followup patch. No functional change.

Signed-off-by: Yafang Shao
---
 kernel/bpf/cgroup_iter.c | 30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/kernel/bpf/cgroup_iter.c b/kernel/bpf/cgroup_iter.c
index 810378f04fbc..619c13c30e87 100644
--- a/kernel/bpf/cgroup_iter.c
+++ b/kernel/bpf/cgroup_iter.c
@@ -191,21 +191,14 @@ static const struct bpf_iter_seq_info cgroup_iter_seq_info = {
 	.seq_priv_size		= sizeof(struct cgroup_iter_priv),
 };
 
-static int bpf_iter_attach_cgroup(struct bpf_prog *prog,
-				  union bpf_iter_link_info *linfo,
-				  struct bpf_iter_aux_info *aux)
+static int __bpf_iter_attach_cgroup(struct bpf_prog *prog,
+				    union bpf_iter_link_info *linfo,
+				    struct bpf_iter_aux_info *aux)
 {
 	int fd = linfo->cgroup.cgroup_fd;
 	u64 id = linfo->cgroup.cgroup_id;
-	int order = linfo->cgroup.order;
 	struct cgroup *cgrp;
 
-	if (order != BPF_CGROUP_ITER_DESCENDANTS_PRE &&
-	    order != BPF_CGROUP_ITER_DESCENDANTS_POST &&
-	    order != BPF_CGROUP_ITER_ANCESTORS_UP &&
-	    order != BPF_CGROUP_ITER_SELF_ONLY)
-		return -EINVAL;
-
 	if (fd && id)
 		return -EINVAL;
 
@@ -220,10 +213,25 @@ static int bpf_iter_attach_cgroup(struct bpf_prog *prog,
 		return PTR_ERR(cgrp);
 
 	aux->cgroup.start = cgrp;
-	aux->cgroup.order = order;
 	return 0;
 }
 
+static int bpf_iter_attach_cgroup(struct bpf_prog *prog,
+				  union bpf_iter_link_info *linfo,
+				  struct bpf_iter_aux_info *aux)
+{
+	int order = linfo->cgroup.order;
+
+	if (order != BPF_CGROUP_ITER_DESCENDANTS_PRE &&
+	    order != BPF_CGROUP_ITER_DESCENDANTS_POST &&
+	    order != BPF_CGROUP_ITER_ANCESTORS_UP &&
+	    order != BPF_CGROUP_ITER_SELF_ONLY)
+		return -EINVAL;
+
+	aux->cgroup.order = order;
+	return __bpf_iter_attach_cgroup(prog, linfo, aux);
+}
+
 static void bpf_iter_detach_cgroup(struct bpf_iter_aux_info *aux)
 {
 	cgroup_put(aux->cgroup.start);

From patchwork Sun Jul 16 12:10:44 2023
X-Patchwork-Submitter: Yafang Shao
X-Patchwork-Id: 13314796
X-Patchwork-Delegate: bpf@iogearbox.net
From: Yafang Shao
To: ast@kernel.org, daniel@iogearbox.net, john.fastabend@gmail.com,
    andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com,
    kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org,
    quentin@isovalent.com
Cc: bpf@vger.kernel.org, Yafang Shao
Subject: [RFC PATCH bpf-next 2/4] bpf: Add cgroup_task iter
Date: Sun, 16 Jul 2023 12:10:44 +0000
Message-Id: <20230716121046.17110-3-laoar.shao@gmail.com>
In-Reply-To: <20230716121046.17110-1-laoar.shao@gmail.com>
References: <20230716121046.17110-1-laoar.shao@gmail.com>
X-Patchwork-State: RFC

This patch introduces the cgroup_task iter, which allows for efficient
iteration of tasks within a specific cgroup.
For example, with this new feature we can efficiently get the
nr_{running,blocked} of a container.

The cgroup_task iter serves as an alternative to task_iter in container
environments, because task_iter has several limitations there:

- Firstly, task_iter only supports the 'current' pidns. However, since our
  data collector operates on the host, we may need to collect information
  from multiple containers simultaneously. Using task_iter would require us
  to fork the collector for each container, which is not ideal.

- Additionally, task_iter is unable to collect task information from
  containers running in the host pidns. In our container environment, we
  have containers running in the host pidns, and we would like to collect
  task information from them as well.

- Lastly, task_iter does not support multiple-container pods. In a
  Kubernetes environment, a single pod may contain multiple containers, all
  sharing the same pidns. However, we are only interested in iterating
  tasks within the main container, which is not possible with task_iter.

To address the first issue, we could extend task_iter to support specifying
a pidns other than the current one. However, extending task_iter would not
solve the other two issues. Therefore, we believe it is preferable to
introduce the cgroup_task iter to handle these scenarios effectively.
Signed-off-by: Yafang Shao
---
 include/linux/btf_ids.h  |  14 ++++
 kernel/bpf/cgroup_iter.c | 151 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 162 insertions(+), 3 deletions(-)

diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
index 00950cc03bff..559f78de8e25 100644
--- a/include/linux/btf_ids.h
+++ b/include/linux/btf_ids.h
@@ -265,6 +265,20 @@ MAX_BTF_TRACING_TYPE,
 };
 
 extern u32 btf_tracing_ids[];
+
+#ifdef CONFIG_CGROUPS
+#define BTF_CGROUP_TYPE_xxx					\
+	BTF_CGROUP_TYPE(BTF_CGROUP_TYPE_CGROUP, cgroup)		\
+	BTF_CGROUP_TYPE(BTF_CGROUP_TYPE_TASK, task_struct)
+
+enum {
+#define BTF_CGROUP_TYPE(name, type) name,
+BTF_CGROUP_TYPE_xxx
+#undef BTF_CGROUP_TYPE
+MAX_BTF_CGROUP_TYPE,
+};
+#endif
+
 extern u32 bpf_cgroup_btf_id[];
 extern u32 bpf_local_storage_map_btf_id[];
 
diff --git a/kernel/bpf/cgroup_iter.c b/kernel/bpf/cgroup_iter.c
index 619c13c30e87..e5b82f05910b 100644
--- a/kernel/bpf/cgroup_iter.c
+++ b/kernel/bpf/cgroup_iter.c
@@ -157,7 +157,9 @@ static const struct seq_operations cgroup_iter_seq_ops = {
 	.show  = cgroup_iter_seq_show,
 };
 
-BTF_ID_LIST_GLOBAL_SINGLE(bpf_cgroup_btf_id, struct, cgroup)
+BTF_ID_LIST_GLOBAL(bpf_cgroup_btf_id, MAX_BTF_CGROUP_TYPE)
+BTF_ID(struct, cgroup)
+BTF_ID(struct, task_struct)
 
 static int cgroup_iter_seq_init(void *priv, struct bpf_iter_aux_info *aux)
 {
@@ -295,10 +297,153 @@ static struct bpf_iter_reg bpf_cgroup_reg_info = {
 	.seq_info		= &cgroup_iter_seq_info,
 };
 
+struct bpf_iter__cgroup_task {
+	__bpf_md_ptr(struct bpf_iter_meta *, meta);
+	__bpf_md_ptr(struct cgroup *, cgroup);
+	__bpf_md_ptr(struct task_struct *, task);
+};
+
+struct cgroup_task_iter_priv {
+	struct cgroup_iter_priv common;
+	struct css_task_iter it;
+	struct task_struct *task;
+};
+
+DEFINE_BPF_ITER_FUNC(cgroup_task, struct bpf_iter_meta *meta,
+		     struct cgroup *cgroup, struct task_struct *task)
+
+static int bpf_iter_attach_cgroup_task(struct bpf_prog *prog,
+				       union bpf_iter_link_info *linfo,
+				       struct bpf_iter_aux_info *aux)
+{
+	int order = linfo->cgroup.order;
+
+	if (order != BPF_CGROUP_ITER_SELF_ONLY)
+		return -EINVAL;
+
+	aux->cgroup.order = order;
+	return __bpf_iter_attach_cgroup(prog, linfo, aux);
+}
+
+static void *cgroup_task_seq_start(struct seq_file *seq, loff_t *pos)
+{
+	struct cgroup_task_iter_priv *p = seq->private;
+	struct cgroup_subsys_state *css = p->common.start_css;
+	struct css_task_iter *it = &p->it;
+	struct task_struct *task;
+
+	css_task_iter_start(css, 0, it);
+	if (*pos > 0) {
+		if (p->common.visited_all)
+			return NULL;
+		return ERR_PTR(-EOPNOTSUPP);
+	}
+
+	++*pos;
+	p->common.terminate = false;
+	p->common.visited_all = false;
+	task = css_task_iter_next(it);
+	p->task = task;
+	return task;
+}
+
+static void *cgroup_task_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+	struct cgroup_task_iter_priv *p = seq->private;
+	struct css_task_iter *it = &p->it;
+	struct task_struct *task;
+
+	++*pos;
+	if (p->common.terminate)
+		return NULL;
+
+	task = css_task_iter_next(it);
+	p->task = task;
+	return task;
+}
+
+static int __cgroup_task_seq_show(struct seq_file *seq, struct cgroup_subsys_state *css,
+				  bool in_stop)
+{
+	struct cgroup_task_iter_priv *p = seq->private;
+
+	struct bpf_iter__cgroup_task ctx;
+	struct bpf_iter_meta meta;
+	struct bpf_prog *prog;
+	int ret = 0;
+
+	ctx.meta = &meta;
+	ctx.cgroup = css ? css->cgroup : NULL;
+	ctx.task = p->task;
+	meta.seq = seq;
+	prog = bpf_iter_get_info(&meta, in_stop);
+	if (prog)
+		ret = bpf_iter_run_prog(prog, &ctx);
+	if (ret)
+		p->common.terminate = true;
+	return 0;
+}
+
+static int cgroup_task_seq_show(struct seq_file *seq, void *v)
+{
+	return __cgroup_task_seq_show(seq, (struct cgroup_subsys_state *)v, false);
+}
+
+static void cgroup_task_seq_stop(struct seq_file *seq, void *v)
+{
+	struct cgroup_task_iter_priv *p = seq->private;
+	struct css_task_iter *it = &p->it;
+
+	css_task_iter_end(it);
+	if (!v) {
+		__cgroup_task_seq_show(seq, NULL, true);
+		p->common.visited_all = true;
+	}
+}
+
+static const struct seq_operations cgroup_task_seq_ops = {
+	.start	= cgroup_task_seq_start,
+	.next	= cgroup_task_seq_next,
+	.stop	= cgroup_task_seq_stop,
+	.show	= cgroup_task_seq_show,
+};
+
+static const struct bpf_iter_seq_info cgroup_task_seq_info = {
+	.seq_ops		= &cgroup_task_seq_ops,
+	.init_seq_private	= cgroup_iter_seq_init,
+	.fini_seq_private	= cgroup_iter_seq_fini,
+	.seq_priv_size		= sizeof(struct cgroup_task_iter_priv),
+};
+
+static struct bpf_iter_reg bpf_cgroup_task_reg_info = {
+	.target			= "cgroup_task",
+	.feature		= BPF_ITER_RESCHED,
+	.attach_target		= bpf_iter_attach_cgroup_task,
+	.detach_target		= bpf_iter_detach_cgroup,
+	.show_fdinfo		= bpf_iter_cgroup_show_fdinfo,
+	.fill_link_info		= bpf_iter_cgroup_fill_link_info,
+	.ctx_arg_info_size	= 2,
+	.ctx_arg_info		= {
+		{ offsetof(struct bpf_iter__cgroup_task, cgroup),
+		  PTR_TO_BTF_ID_OR_NULL },
+		{ offsetof(struct bpf_iter__cgroup_task, task),
+		  PTR_TO_BTF_ID_OR_NULL },
+	},
+	.seq_info		= &cgroup_task_seq_info,
+};
+
 static int __init bpf_cgroup_iter_init(void)
 {
-	bpf_cgroup_reg_info.ctx_arg_info[0].btf_id = bpf_cgroup_btf_id[0];
-	return bpf_iter_reg_target(&bpf_cgroup_reg_info);
+	int ret;
+
+	bpf_cgroup_reg_info.ctx_arg_info[0].btf_id = bpf_cgroup_btf_id[BTF_CGROUP_TYPE_CGROUP];
+	ret = bpf_iter_reg_target(&bpf_cgroup_reg_info);
+	if (ret)
+		return ret;
+
+	bpf_cgroup_task_reg_info.ctx_arg_info[0].btf_id = bpf_cgroup_btf_id[BTF_CGROUP_TYPE_CGROUP];
+	bpf_cgroup_task_reg_info.ctx_arg_info[1].btf_id = bpf_cgroup_btf_id[BTF_CGROUP_TYPE_TASK];
+	return bpf_iter_reg_target(&bpf_cgroup_task_reg_info);
 }
 
 late_initcall(bpf_cgroup_iter_init);

From patchwork Sun Jul 16 12:10:45 2023
X-Patchwork-Submitter: Yafang Shao
X-Patchwork-Id: 13314797
X-Patchwork-Delegate: bpf@iogearbox.net
From: Yafang Shao
To: ast@kernel.org, daniel@iogearbox.net, john.fastabend@gmail.com,
    andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com,
    kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org,
    quentin@isovalent.com
Cc: bpf@vger.kernel.org, Yafang Shao
Subject: [RFC PATCH bpf-next 3/4] bpftool: Add support for cgroup_task
Date: Sun, 16 Jul 2023 12:10:45 +0000
Message-Id: <20230716121046.17110-4-laoar.shao@gmail.com>
In-Reply-To: <20230716121046.17110-1-laoar.shao@gmail.com>
References: <20230716121046.17110-1-laoar.shao@gmail.com>
X-Patchwork-State: RFC

We need to make corresponding changes to bpftool to support the cgroup_task
iter. The result:

  $ bpftool link show
  3: iter  prog 15  target_name cgroup_task  cgroup_id 7427  order self_only

Signed-off-by: Yafang Shao
---
 tools/bpf/bpftool/link.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c
index 65a168df63bc..efbdefdb1b18 100644
--- a/tools/bpf/bpftool/link.c
+++ b/tools/bpf/bpftool/link.c
@@ -158,7 +158,8 @@ static bool is_iter_map_target(const char *target_name)
 
 static bool is_iter_cgroup_target(const char *target_name)
 {
-	return strcmp(target_name, "cgroup") == 0;
+	return strcmp(target_name, "cgroup") == 0 ||
+	       strcmp(target_name, "cgroup_task") == 0;
 }
 
 static const char *cgroup_order_string(__u32 order)

From patchwork Sun Jul 16 12:10:46 2023
X-Patchwork-Submitter: Yafang Shao
X-Patchwork-Id: 13314798
X-Patchwork-Delegate: bpf@iogearbox.net
From: Yafang Shao
To: ast@kernel.org, daniel@iogearbox.net, john.fastabend@gmail.com,
    andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com,
    kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org,
    quentin@isovalent.com
Cc: bpf@vger.kernel.org, Yafang Shao
Subject: [RFC PATCH bpf-next 4/4] selftests/bpf: Add selftest for cgroup_task iter
Date: Sun, 16 Jul 2023 12:10:46 +0000
Message-Id: <20230716121046.17110-5-laoar.shao@gmail.com>
In-Reply-To: <20230716121046.17110-1-laoar.shao@gmail.com>
References: <20230716121046.17110-1-laoar.shao@gmail.com>
X-Patchwork-State: RFC

Add selftests for the newly introduced cgroup_task iter.
The result:

  #42/1    cgroup_task_iter/cgroup_task_iter__invalid_order:OK
  #42/2    cgroup_task_iter/cgroup_task_iter__no_task:OK
  #42/3    cgroup_task_iter/cgroup_task_iter__task_pid:OK
  #42/4    cgroup_task_iter/cgroup_task_iter__task_cnt:OK
  #42      cgroup_task_iter:OK
  Summary: 1/4 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Yafang Shao
---
 .../bpf/prog_tests/cgroup_task_iter.c         | 197 ++++++++++++++++++
 .../selftests/bpf/progs/cgroup_task_iter.c    |  39 ++++
 2 files changed, 236 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/cgroup_task_iter.c
 create mode 100644 tools/testing/selftests/bpf/progs/cgroup_task_iter.c

diff --git a/tools/testing/selftests/bpf/prog_tests/cgroup_task_iter.c b/tools/testing/selftests/bpf/prog_tests/cgroup_task_iter.c
new file mode 100644
index 000000000000..9123577524b5
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/cgroup_task_iter.c
@@ -0,0 +1,197 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2023 Yafang Shao */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "cgroup_helpers.h"
+#include "cgroup_task_iter.skel.h"
+
+#define PID_CNT (2)
+static char expected_output[128];
+
+static void read_from_cgroup_iter(struct bpf_program *prog, int cgroup_fd,
+				  int order, const char *testname)
+{
+	DECLARE_LIBBPF_OPTS(bpf_iter_attach_opts, opts);
+	union bpf_iter_link_info linfo;
+	struct bpf_link *link;
+	int len, iter_fd;
+	static char buf[128];
+	size_t left;
+	char *p;
+
+	memset(&linfo, 0, sizeof(linfo));
+	linfo.cgroup.cgroup_fd = cgroup_fd;
+	linfo.cgroup.order = order;
+	opts.link_info = &linfo;
+	opts.link_info_len = sizeof(linfo);
+
+	link = bpf_program__attach_iter(prog, &opts);
+	if (!ASSERT_OK_PTR(link, "attach_iter"))
+		return;
+
+	iter_fd = bpf_iter_create(bpf_link__fd(link));
+	if (iter_fd < 0)
+		goto free_link;
+
+	memset(buf, 0, sizeof(buf));
+	left = ARRAY_SIZE(buf);
+	p = buf;
+	while ((len = read(iter_fd, p, left)) > 0) {
+		p += len;
+		left -= len;
+	}
+
+	ASSERT_STREQ(buf, expected_output, testname);
+
+	/* read() after iter finishes should be ok. */
+	if (len == 0)
+		ASSERT_OK(read(iter_fd, buf, sizeof(buf)), "second_read");
+
+	close(iter_fd);
+free_link:
+	bpf_link__destroy(link);
+}
+
+/* Invalid walk order */
+static void test_invalid_order(struct cgroup_task_iter *skel, int fd)
+{
+	DECLARE_LIBBPF_OPTS(bpf_iter_attach_opts, opts);
+	enum bpf_cgroup_iter_order order;
+	union bpf_iter_link_info linfo;
+	struct bpf_link *link;
+
+	memset(&linfo, 0, sizeof(linfo));
+	linfo.cgroup.cgroup_fd = fd;
+	opts.link_info = &linfo;
+	opts.link_info_len = sizeof(linfo);
+
+	/* Only BPF_CGROUP_ITER_SELF_ONLY is supported */
+	for (order = 0; order <= BPF_CGROUP_ITER_ANCESTORS_UP; order++) {
+		if (order == BPF_CGROUP_ITER_SELF_ONLY)
+			continue;
+		linfo.cgroup.order = order;
+		link = bpf_program__attach_iter(skel->progs.cgroup_task_cnt, &opts);
+		ASSERT_ERR_PTR(link, "attach_task_iter");
+		ASSERT_EQ(errno, EINVAL, "error code on invalid walk order");
+	}
+}
+
+/* Iterate a cgroup without any task */
+static void test_walk_no_task(struct cgroup_task_iter *skel, int fd)
+{
+	snprintf(expected_output, sizeof(expected_output), "nr_total 0\n");
+
+	read_from_cgroup_iter(skel->progs.cgroup_task_cnt, fd,
+			      BPF_CGROUP_ITER_SELF_ONLY, "self_only");
+}
+
+/* The forked child process does nothing. */
+static void child_sleep(void)
+{
+	while (1)
+		sleep(1);
+}
+
+/* Get task pid under a cgroup */
+static void test_walk_task_pid(struct cgroup_task_iter *skel, int fd)
+{
+	int pid, status, err;
+	char pid_str[16];
+
+	pid = fork();
+	if (!ASSERT_GE(pid, 0, "fork_task"))
+		return;
+	if (pid) {
+		snprintf(pid_str, sizeof(pid_str), "%u", pid);
+		err = write_cgroup_file("cgroup_task_iter", "cgroup.procs", pid_str);
+		if (!ASSERT_EQ(err, 0, "write cgrp file"))
+			goto out;
+		snprintf(expected_output, sizeof(expected_output), "pid %u\n", pid);
+		read_from_cgroup_iter(skel->progs.cgroup_task_pid, fd,
+				      BPF_CGROUP_ITER_SELF_ONLY, "self_only");
+out:
+		kill(pid, SIGKILL);
+		waitpid(pid, &status, 0);
+	} else {
+		child_sleep();
+	}
+}
+
+/* Get task count under a cgroup */
+static void test_walk_task_cnt(struct cgroup_task_iter *skel, int fd)
+{
+	int pids[PID_CNT], pid, status, err, i;
+	char pid_str[16];
+
+	for (i = 0; i < PID_CNT; i++)
+		pids[i] = 0;
+
+	for (i = 0; i < PID_CNT; i++) {
+		pid = fork();
+		if (!ASSERT_GE(pid, 0, "fork_task"))
+			goto out;
+		if (pid) {
+			pids[i] = pid;
+			snprintf(pid_str, sizeof(pid_str), "%u", pid);
+			err = write_cgroup_file("cgroup_task_iter", "cgroup.procs", pid_str);
+			if (!ASSERT_EQ(err, 0, "write cgrp file"))
+				goto out;
+		} else {
+			child_sleep();
+		}
+	}
+
+	snprintf(expected_output, sizeof(expected_output), "nr_total %u\n", PID_CNT);
+	read_from_cgroup_iter(skel->progs.cgroup_task_cnt, fd,
+			      BPF_CGROUP_ITER_SELF_ONLY, "self_only");
+
+out:
+	for (i = 0; i < PID_CNT; i++) {
+		if (!pids[i])
+			continue;
+		kill(pids[i], SIGKILL);
+		waitpid(pids[i], &status, 0);
+	}
+}
+
+void test_cgroup_task_iter(void)
+{
+	struct cgroup_task_iter *skel = NULL;
+	int cgrp_fd;
+
+	if (setup_cgroup_environment())
+		return;
+
+	cgrp_fd = create_and_get_cgroup("cgroup_task_iter");
+	if (!ASSERT_GE(cgrp_fd, 0, "create cgrp"))
+		goto cleanup_cgrp_env;
+
+	skel = cgroup_task_iter__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "cgroup_task_iter__open_and_load"))
+		goto out;
+
+	if (test__start_subtest("cgroup_task_iter__invalid_order"))
+		test_invalid_order(skel, cgrp_fd);
+	if (test__start_subtest("cgroup_task_iter__no_task"))
+		test_walk_no_task(skel, cgrp_fd);
+	if (test__start_subtest("cgroup_task_iter__task_pid"))
+		test_walk_task_pid(skel, cgrp_fd);
+	if (test__start_subtest("cgroup_task_iter__task_cnt"))
+		test_walk_task_cnt(skel, cgrp_fd);
+
+out:
+	cgroup_task_iter__destroy(skel);
+	close(cgrp_fd);
+	remove_cgroup("cgroup_task_iter");
+cleanup_cgrp_env:
+	cleanup_cgroup_environment();
+}
diff --git a/tools/testing/selftests/bpf/progs/cgroup_task_iter.c b/tools/testing/selftests/bpf/progs/cgroup_task_iter.c
new file mode 100644
index 000000000000..b9a6d9d29d58
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/cgroup_task_iter.c
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2023 Yafang Shao */
+
+#include "bpf_iter.h"
+#include
+#include
+
+char _license[] SEC("license") = "GPL";
+
+SEC("iter/cgroup_task")
+int cgroup_task_cnt(struct bpf_iter__cgroup_task *ctx)
+{
+	struct seq_file *seq = ctx->meta->seq;
+	struct task_struct *task = ctx->task;
+	static __u32 nr_total;
+
+	if (!task) {
+		BPF_SEQ_PRINTF(seq, "nr_total %u\n", nr_total);
+		return 0;
+	}
+
+	if (ctx->meta->seq_num == 0)
+		nr_total = 0;
+	nr_total++;
+	return 0;
+}
+
+SEC("iter/cgroup_task")
+int cgroup_task_pid(struct bpf_iter__cgroup_task *ctx)
+{
+	struct seq_file *seq = ctx->meta->seq;
+	struct task_struct *task = ctx->task;
+
+	if (!task)
+		return 0;
+
+	BPF_SEQ_PRINTF(seq, "pid %u\n", task->pid);
+	return 0;
+}