From patchwork Thu Nov 3 19:10:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kumar Kartikeya Dwivedi X-Patchwork-Id: 13030878 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 59D55C4332F for ; Thu, 3 Nov 2022 19:11:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231354AbiKCTLL (ORCPT ); Thu, 3 Nov 2022 15:11:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52898 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231355AbiKCTLK (ORCPT ); Thu, 3 Nov 2022 15:11:10 -0400 Received: from mail-pf1-x444.google.com (mail-pf1-x444.google.com [IPv6:2607:f8b0:4864:20::444]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D8E751EC70 for ; Thu, 3 Nov 2022 12:11:08 -0700 (PDT) Received: by mail-pf1-x444.google.com with SMTP id q9so2516566pfg.5 for ; Thu, 03 Nov 2022 12:11:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=CEddTg6aJKfYCIpwPodramjir9MaWYVl4voA8sRfaJE=; b=K60YbPcrsB6GhSJZ2Ww4TciZk9DeQhJkU3sbGJAkWWWCJ8BmrJbCP1ASHF0NJGMpcz Woe8bxCsXMPqfOMvzIm6VxsKazmhCk+aKgfOpdTGceKJn7jmYa22UZw73/wKGbYcDVUy 7c5KouxMlb3lPbsUchEBMsvX27lDA7q49cgT/QCT17l/q0T4/ypIgQBogp2A7gqkpFMo x16TSA3QkZZwBXPXxRUCiyv4EAq7CuBCHfOyf3VSaI32MQKZW/sCxYCVpXF8pNV8WvKf XcZ3aYwLuT4O09VyugjPwiaWNqUPP0WLqJuVDz4EVbHPPyvD8ogchgjMmWPji0NQrvrt FwrA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=CEddTg6aJKfYCIpwPodramjir9MaWYVl4voA8sRfaJE=; b=v4Axw9sSDB7i8I2q60JGAHAUc6nRozGO0R10tiMZAFXtNjbZ+aeNFStTXMs+fsf05Q nL8lem5decQGa5cc/KQdZqyd1/AlOEDc98hpFY5SK3tvsliggdrc87KDRoOQFLHPivYg e5/cOkHDqWj6U0iSqrP5xx1TyCNagoujKT+w9lCPiaMnsIb6IS2IC2Qjh5FdRfJntBCS rtd+Od3MQ9JCFCMmprAlWvwnb+Batdo0RUqReUewOJfjshyk4JIeu3g4LKXsLTIXJYVk 5y1EJrMgnFSP1RvqF/GBdwTaVuwceMsNqgEny7LTcpHJ7fYYh2Ow4f1XFp6OpZwWfL7B UG+A== X-Gm-Message-State: ACrzQf3I84JLPfvLObHHpZhWHDBiw6kvBCGfyADUmzEOft97K0aWKd0l FC4+rOC83KGBMvFHSvoTlA5AkhweyzSncQ== X-Google-Smtp-Source: AMsMyM7owCtPO6EcAblnL7IgAhvwTdk4xzUVv1nrnULJv/6T8AA7y53vDGiXn/aLOD+DUSC0sU/Drw== X-Received: by 2002:a05:6a00:1d89:b0:56c:a2b:f1c2 with SMTP id z9-20020a056a001d8900b0056c0a2bf1c2mr32110692pfw.45.1667502667959; Thu, 03 Nov 2022 12:11:07 -0700 (PDT) Received: from localhost ([103.4.222.252]) by smtp.gmail.com with ESMTPSA id e6-20020a17090301c600b00186b138706fsm1064833plh.13.2022.11.03.12.11.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 03 Nov 2022 12:11:07 -0700 (PDT) From: Kumar Kartikeya Dwivedi To: bpf@vger.kernel.org Cc: Alexei Starovoitov , Andrii Nakryiko , Daniel Borkmann , Martin KaFai Lau , Dave Marchevsky , Delyan Kratunov Subject: [PATCH bpf-next v4 11/24] bpf: Recognize bpf_{spin_lock,list_head,list_node} in local kptrs Date: Fri, 4 Nov 2022 00:40:00 +0530 Message-Id: <20221103191013.1236066-12-memxor@gmail.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221103191013.1236066-1-memxor@gmail.com> References: <20221103191013.1236066-1-memxor@gmail.com> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=14022; i=memxor@gmail.com; h=from:subject; bh=DPyVw7s9wtmmTbKC2JdhFm9RPlN7XLaB+oDAkcqS5Mo=; b=owEBbQKS/ZANAwAIAUzgyIZIvxHKAcsmYgBjZBIB2LnhlvUi5n5nYBtMySczLddnBSRY4Spjb/SC D0OXHK2JAjMEAAEIAB0WIQRLvip+Buz51YI8YRFM4MiGSL8RygUCY2QSAQAKCRBM4MiGSL8Ryo5sD/ 9KYKoo2bm3yTfdXBRKHY7HtooLFMJiFqa1Q+D6Sq9ojori94g0tvP4eu6h+0HkZxFCmjogbNbqPB0s diS/b71sDph+2Dyx0BnCi8o4pyU3TfRB/MZwc0VPvBd3ECH6y3snDQovUjK9seTOCjyxstdK2UdyX5 bxdEibLN4gKIVQZMXzBeQzeu2s9ooU4a8Szf6dqI3NpyGmJGsYCnC65aQZNSIR4DGBkj8isVTNOCLW 1T7Cfc36KN9fBzuoeWy6fBRxcORDuv/Ooi9AL+DjJTk3PYlGoOgvkxCL0zkA3z9/MMfRc60vOfd2xM e0KF5HAqZZoxr3dbjjrgP03r0kyevuFjVem9ChY7/tquvcwr0LAPXatNXd1hXgaoUksyC8gwO10KV6 5igfYyFZWZZh2ly6RMDOU2RLaT4dcWMqGU+YYsQ7XOFXpBKy69dyxGVlHk4c7ZBWkDg95Mv2ntJl7g Arq+2GU0bcAUlUCRh/0icqUg+JffUeoZaWQPqZFMLNV68LFKknVKizEqhomwuMnnCa2SaOua1F4MWN wejVPkBwyZMf2ddBXJyVUTjuYxW86mcwjZaJHdtPirqtSvRRbAICaR3WKFaFZyyaHWOvxD9qeJ2w9j GoSMxtXc9OIrZcpjURJBR2H4ZdBMlFqnX/1cb/wj1h4XJysTOLFbyL5NHz2w== X-Developer-Key: i=memxor@gmail.com; a=openpgp; fpr=4BBE2A7E06ECF9D5823C61114CE0C88648BF11CA Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Allow specifying bpf_spin_lock, bpf_list_head, bpf_list_node fields in a local kptr. A bpf_list_head allows implementing map-in-map style use cases, where local kptr with bpf_list_head is linked into a list in a map value. This would require embedding a bpf_list_node, support for which is also included. Lastly, while we strictly don't require to hold a bpf_spin_lock while manipulating the bpf_list_head of a local kptr, as when have access to it, we have complete ownership of the object, the locking constraint is still kept and may be conditionally lifted in the future. Note that the specification of such types can be done just like map values, e.g.: struct bar { struct bpf_list_node node; }; struct foo { struct bpf_spin_lock lock; struct bpf_list_head head __contains(bar, node); struct bpf_list_node node; }; struct map_value { struct bpf_spin_lock lock; struct bpf_list_head head __contains(foo, node); }; To recognize such types in user BTF, we build a btf_struct_metas array of metadata items corresponding to each BTF ID. This is done once during the btf_parse stage to avoid having to do it each time during the verification process's requirement to inspect the metadata. Moreover, the computed metadata needs to be passed to some helpers in future patches which requires allocating them and storing them in the BTF that is pinned by the program itself, so that valid access can be assumed to such data during program runtime. Signed-off-by: Kumar Kartikeya Dwivedi --- include/linux/bpf.h | 7 ++ include/linux/btf.h | 35 ++++++++ kernel/bpf/btf.c | 196 +++++++++++++++++++++++++++++++++++++++---- kernel/bpf/syscall.c | 4 + 4 files changed, 224 insertions(+), 18 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index bdd3adfcbe5f..0797c467e894 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -177,6 +177,7 @@ enum btf_field_type { BPF_KPTR_REF = (1 << 3), BPF_KPTR = BPF_KPTR_UNREF | BPF_KPTR_REF, BPF_LIST_HEAD = (1 << 4), + BPF_LIST_NODE = (1 << 5), }; struct btf_field_kptr { @@ -277,6 +278,8 @@ static inline const char *btf_field_type_name(enum btf_field_type type) return "kptr"; case BPF_LIST_HEAD: return "bpf_list_head"; + case BPF_LIST_NODE: + return "bpf_list_node"; default: WARN_ON_ONCE(1); return "unknown"; @@ -295,6 +298,8 @@ static inline u32 btf_field_type_size(enum btf_field_type type) return sizeof(u64); case BPF_LIST_HEAD: return sizeof(struct bpf_list_head); + case BPF_LIST_NODE: + return sizeof(struct bpf_list_node); default: WARN_ON_ONCE(1); return 0; @@ -313,6 +318,8 @@ static inline u32 btf_field_type_align(enum btf_field_type type) return __alignof__(u64); case BPF_LIST_HEAD: return __alignof__(struct bpf_list_head); + case BPF_LIST_NODE: + return __alignof__(struct bpf_list_node); default: WARN_ON_ONCE(1); return 0; diff --git a/include/linux/btf.h b/include/linux/btf.h index d80345fa566b..a01a8da20021 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -6,6 +6,8 @@ #include #include +#include +#include #include #include @@ -78,6 +80,17 @@ struct btf_id_dtor_kfunc { u32 kfunc_btf_id; }; +struct btf_struct_meta { + u32 btf_id; + struct btf_record *record; + struct btf_field_offs *field_offs; +}; + +struct btf_struct_metas { + u32 cnt; + struct btf_struct_meta types[]; +}; + typedef void (*btf_dtor_kfunc_t)(void *); extern const struct file_operations btf_fops; @@ -408,6 +421,23 @@ static inline struct btf_param *btf_params(const struct btf_type *t) return (struct btf_param *)(t + 1); } +static inline int btf_id_cmp_func(const void *a, const void *b) +{ + const int *pa = a, *pb = b; + + return *pa - *pb; +} + +static inline bool btf_id_set_contains(const struct btf_id_set *set, u32 id) +{ + return bsearch(&id, set->ids, set->cnt, sizeof(u32), btf_id_cmp_func) != NULL; +} + +static inline void *btf_id_set8_contains(const struct btf_id_set8 *set, u32 id) +{ + return bsearch(&id, set->pairs, set->cnt, sizeof(set->pairs[0]), btf_id_cmp_func); +} + #ifdef CONFIG_BPF_SYSCALL struct bpf_prog; @@ -423,6 +453,7 @@ int register_btf_kfunc_id_set(enum bpf_prog_type prog_type, s32 btf_find_dtor_kfunc(struct btf *btf, u32 btf_id); int register_btf_id_dtor_kfuncs(const struct btf_id_dtor_kfunc *dtors, u32 add_cnt, struct module *owner); +struct btf_struct_meta *btf_find_struct_meta(const struct btf *btf, u32 btf_id); #else static inline const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id) @@ -454,6 +485,10 @@ static inline int register_btf_id_dtor_kfuncs(const struct btf_id_dtor_kfunc *dt { return 0; } +static inline struct btf_struct_meta *btf_find_struct_meta(const struct btf *btf, u32 btf_id) +{ + return NULL; +} #endif static inline bool btf_type_is_struct_ptr(struct btf *btf, const struct btf_type *t) diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 2e0ec7307f73..ffe9d5b182e6 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -237,6 +237,7 @@ struct btf { struct rcu_head rcu; struct btf_kfunc_set_tab *kfunc_set_tab; struct btf_id_dtor_kfunc_tab *dtor_kfunc_tab; + struct btf_struct_metas *struct_meta_tab; /* split BTF support */ struct btf *base_btf; @@ -1642,8 +1643,30 @@ static void btf_free_dtor_kfunc_tab(struct btf *btf) btf->dtor_kfunc_tab = NULL; } +static void btf_struct_metas_free(struct btf_struct_metas *tab) +{ + int i; + + if (!tab) + return; + for (i = 0; i < tab->cnt; i++) { + btf_record_free(tab->types[i].record); + kfree(tab->types[i].field_offs); + } + kfree(tab); +} + +static void btf_free_struct_meta_tab(struct btf *btf) +{ + struct btf_struct_metas *tab = btf->struct_meta_tab; + + btf_struct_metas_free(tab); + btf->struct_meta_tab = NULL; +} + static void btf_free(struct btf *btf) { + btf_free_struct_meta_tab(btf); btf_free_dtor_kfunc_tab(btf); btf_free_kfunc_set_tab(btf); kvfree(btf->types); @@ -3356,6 +3379,12 @@ static int btf_get_field_type(const char *name, u32 field_mask, u32 *seen_mask, goto end; } } + if (field_mask & BPF_LIST_NODE) { + if (!strcmp(name, "bpf_list_node")) { + type = BPF_LIST_NODE; + goto end; + } + } /* Only return BPF_KPTR when all other types with matchable names fail */ if (field_mask & BPF_KPTR) { type = BPF_KPTR_REF; @@ -3401,6 +3430,7 @@ static int btf_find_struct_field(const struct btf *btf, switch (field_type) { case BPF_SPIN_LOCK: case BPF_TIMER: + case BPF_LIST_NODE: ret = btf_find_struct(btf, member_type, off, sz, field_type, idx < info_cnt ? &info[idx] : &tmp); if (ret < 0) @@ -3463,6 +3493,7 @@ static int btf_find_datasec_var(const struct btf *btf, const struct btf_type *t, switch (field_type) { case BPF_SPIN_LOCK: case BPF_TIMER: + case BPF_LIST_NODE: ret = btf_find_struct(btf, var_type, off, sz, field_type, idx < info_cnt ? &info[idx] : &tmp); if (ret < 0) @@ -3670,6 +3701,8 @@ struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type if (ret < 0) goto end; break; + case BPF_LIST_NODE: + break; default: ret = -EFAULT; goto end; @@ -5140,6 +5173,118 @@ static int btf_parse_hdr(struct btf_verifier_env *env) return btf_check_sec_info(env, btf_data_size); } +static const char *local_kptr_fields[] = { + "bpf_spin_lock", + "bpf_list_head", + "bpf_list_node", +}; + +static struct btf_struct_metas * +btf_parse_struct_metas(struct bpf_verifier_log *log, struct btf *btf) +{ + union { + struct btf_id_set set; + struct { + u32 _cnt; + u32 _ids[ARRAY_SIZE(local_kptr_fields)]; + } _arr; + } lkf; + struct btf_struct_metas *tab = NULL; + int i, n, id, ret; + + memset(&lkf, 0, sizeof(lkf)); + + for (i = 0; i < ARRAY_SIZE(local_kptr_fields); i++) { + /* Try to find whether this special type exists in user BTF, and + * if so remember its ID so we can easily find it among members + * of structs that we iterate in the next loop. + */ + id = btf_find_by_name_kind(btf, local_kptr_fields[i], BTF_KIND_STRUCT); + if (id < 0) + continue; + lkf.set.ids[lkf.set.cnt++] = id; + } + + if (!lkf.set.cnt) + return NULL; + sort(&lkf.set.ids, lkf.set.cnt, sizeof(lkf.set.ids[0]), btf_id_cmp_func, NULL); + + n = btf_nr_types(btf); + for (i = 1; i < n; i++) { + const struct btf_member *member; + struct btf_field_offs *foffs; + struct btf_struct_meta *type; + struct btf_record *record; + const struct btf_type *t; + int j; + + t = btf_type_by_id(btf, i); + if (!t) { + ret = -EINVAL; + goto free; + } + if (!__btf_type_is_struct(t)) + continue; + + cond_resched(); + + for_each_member(j, t, member) { + if (btf_id_set_contains(&lkf.set, member->type)) + goto parse; + } + continue; + parse: + if (!tab) { + tab = kzalloc(offsetof(struct btf_struct_metas, types[1]), + GFP_KERNEL | __GFP_NOWARN); + if (!tab) + return ERR_PTR(-ENOMEM); + } else { + struct btf_struct_metas *new_tab; + + new_tab = krealloc(tab, offsetof(struct btf_struct_metas, types[tab->cnt + 1]), + GFP_KERNEL | __GFP_NOWARN); + if (!new_tab) { + ret = -ENOMEM; + goto free; + } + tab = new_tab; + } + type = &tab->types[tab->cnt]; + + type->btf_id = i; + record = btf_parse_fields(btf, t, BPF_SPIN_LOCK | BPF_LIST_HEAD | BPF_LIST_NODE, t->size); + if (IS_ERR_OR_NULL(record)) { + ret = PTR_ERR_OR_ZERO(record) ?: -EFAULT; + goto free; + } + foffs = btf_parse_field_offs(record); + if (WARN_ON_ONCE(IS_ERR_OR_NULL(foffs))) { + btf_record_free(record); + ret = -EFAULT; + goto free; + } + type->record = record; + type->field_offs = foffs; + tab->cnt++; + } + return tab; +free: + btf_struct_metas_free(tab); + return ERR_PTR(ret); +} + +struct btf_struct_meta *btf_find_struct_meta(const struct btf *btf, u32 btf_id) +{ + struct btf_struct_metas *tab; + + BUILD_BUG_ON(offsetof(struct btf_struct_meta, btf_id) != 0); + tab = btf->struct_meta_tab; + if (!tab) + return NULL; + return bsearch(&btf_id, tab->types, tab->cnt, sizeof(tab->types[0]), btf_id_cmp_func); +} + static int btf_check_type_tags(struct btf_verifier_env *env, struct btf *btf, int start_id) { @@ -5190,6 +5335,7 @@ static int btf_check_type_tags(struct btf_verifier_env *env, static struct btf *btf_parse(bpfptr_t btf_data, u32 btf_data_size, u32 log_level, char __user *log_ubuf, u32 log_size) { + struct btf_struct_metas *struct_meta_tab; struct btf_verifier_env *env = NULL; struct bpf_verifier_log *log; struct btf *btf = NULL; @@ -5258,15 +5404,24 @@ static struct btf *btf_parse(bpfptr_t btf_data, u32 btf_data_size, if (err) goto errout; + struct_meta_tab = btf_parse_struct_metas(log, btf); + if (IS_ERR(struct_meta_tab)) { + err = PTR_ERR(struct_meta_tab); + goto errout; + } + btf->struct_meta_tab = struct_meta_tab; + if (log->level && bpf_verifier_log_full(log)) { err = -ENOSPC; - goto errout; + goto errout_meta; } btf_verifier_env_free(env); refcount_set(&btf->refcnt, 1); return btf; +errout_meta: + btf_free_struct_meta_tab(btf); errout: btf_verifier_env_free(env); if (btf) @@ -6027,6 +6182,28 @@ int btf_struct_access(struct bpf_verifier_log *log, u32 id = reg->btf_id; int err; + while (type_is_local_kptr(reg->type)) { + struct btf_struct_meta *meta; + struct btf_record *rec; + int i; + + meta = btf_find_struct_meta(btf, id); + if (!meta) + break; + rec = meta->record; + for (i = 0; i < rec->cnt; i++) { + struct btf_field *field = &rec->fields[i]; + u32 offset = field->offset; + if (off < offset + btf_field_type_size(field->type) && offset < off + size) { + bpf_log(log, + "direct access to %s is disallowed\n", + btf_field_type_name(field->type)); + return -EACCES; + } + } + break; + } + t = btf_type_by_id(btf, id); do { err = btf_struct_walk(log, btf, t, off, size, &id, &tmp_flag); @@ -7268,23 +7445,6 @@ bool btf_is_module(const struct btf *btf) return btf->kernel_btf && strcmp(btf->name, "vmlinux") != 0; } -static int btf_id_cmp_func(const void *a, const void *b) -{ - const int *pa = a, *pb = b; - - return *pa - *pb; -} - -bool btf_id_set_contains(const struct btf_id_set *set, u32 id) -{ - return bsearch(&id, set->ids, set->cnt, sizeof(u32), btf_id_cmp_func) != NULL; -} - -static void *btf_id_set8_contains(const struct btf_id_set8 *set, u32 id) -{ - return bsearch(&id, set->pairs, set->cnt, sizeof(set->pairs[0]), btf_id_cmp_func); -} - enum { BTF_MODULE_F_LIVE = (1 << 0), }; diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index fdbae52f463f..c96039a4e57f 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -537,6 +537,7 @@ void btf_record_free(struct btf_record *rec) btf_put(rec->fields[i].kptr.btf); break; case BPF_LIST_HEAD: + case BPF_LIST_NODE: /* Nothing to release for bpf_list_head */ break; default: @@ -582,6 +583,7 @@ struct btf_record *btf_record_dup(const struct btf_record *rec) } break; case BPF_LIST_HEAD: + case BPF_LIST_NODE: /* Nothing to acquire for bpf_list_head */ break; default: @@ -648,6 +650,8 @@ void bpf_obj_free_fields(const struct btf_record *rec, void *obj) continue; bpf_list_head_free(field, field_ptr, obj + rec->spin_lock_off); break; + case BPF_LIST_NODE: + break; default: WARN_ON_ONCE(1); continue;