[bpf-next,v10,16/24] bpf: Introduce single ownership BPF linked list API

Message ID 20221118015614.2013203-17-memxor@gmail.com (mailing list archive)
State Accepted
Commit 8cab76ec634995e59a8b6346bf8b835ab7fad3a3
Delegated to: BPF
Series Allocated objects, BPF linked lists

Checks

Context Check Description
bpf/vmtest-bpf-next-VM_Test-4 fail Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-9 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-7 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-8 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-PR fail merge-conflict
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count fail Series longer than 15 patches (and no cover letter)
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit fail Errors and warnings before: 19 this patch: 25
netdev/cc_maintainers warning 11 maintainers not CCed: sdf@google.com kpsingh@kernel.org mykolal@fb.com haoluo@google.com linux-kselftest@vger.kernel.org yhs@fb.com shuah@kernel.org jolsa@kernel.org martin.lau@linux.dev song@kernel.org john.fastabend@gmail.com
netdev/build_clang fail Errors and warnings before: 5 this patch: 8
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn fail Errors and warnings before: 19 this patch: 25
netdev/checkpatch warning CHECK: Unnecessary parentheses around 'insn->src_reg == BPF_PSEUDO_CALL' CHECK: extern prototypes should be avoided in .h files WARNING: line length of 100 exceeds 80 columns WARNING: line length of 102 exceeds 80 columns WARNING: line length of 103 exceeds 80 columns WARNING: line length of 109 exceeds 80 columns WARNING: line length of 110 exceeds 80 columns WARNING: line length of 112 exceeds 80 columns WARNING: line length of 118 exceeds 80 columns WARNING: line length of 81 exceeds 80 columns WARNING: line length of 82 exceeds 80 columns WARNING: line length of 83 exceeds 80 columns WARNING: line length of 85 exceeds 80 columns WARNING: line length of 86 exceeds 80 columns WARNING: line length of 87 exceeds 80 columns WARNING: line length of 88 exceeds 80 columns WARNING: line length of 91 exceeds 80 columns WARNING: line length of 92 exceeds 80 columns WARNING: line length of 93 exceeds 80 columns WARNING: line length of 94 exceeds 80 columns WARNING: line length of 95 exceeds 80 columns WARNING: line length of 97 exceeds 80 columns WARNING: line length of 99 exceeds 80 columns WARNING: quoted string split across lines
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Kumar Kartikeya Dwivedi Nov. 18, 2022, 1:56 a.m. UTC
Add a linked list API for use in BPF programs. It expects protection
from a bpf_spin_lock in the same allocation as the bpf_list_head. For
now, only one bpf_spin_lock can be present per allocation, hence it is
assumed to be the one protecting the bpf_list_head.

The following functions are added to kick things off:

// Add node to beginning of list
void bpf_list_push_front(struct bpf_list_head *head, struct bpf_list_node *node);

// Add node to end of list
void bpf_list_push_back(struct bpf_list_head *head, struct bpf_list_node *node);

// Remove node at beginning of list and return it
struct bpf_list_node *bpf_list_pop_front(struct bpf_list_head *head);

// Remove node at end of list and return it
struct bpf_list_node *bpf_list_pop_back(struct bpf_list_head *head);

The lock protecting the bpf_list_head must be held for all of these
operations. The verifier ensures that this lock is always held and
that only the correct lock is taken for these operations. These checks
are made statically by relying on the reg->id preserved for registers
pointing into regions that contain both the bpf_spin_lock and the
objects protected by it. The comment above check_reg_allocation_locked
in this change describes the logic in detail.
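
For example (an illustrative sketch, not taken from this patch: 'lock'
and 'head' are assumed to live in the same allocation, and 'e' is an
object obtained from bpf_obj_new):

// Rejected: the protecting bpf_spin_lock is not held
bpf_list_push_front(&head, &e->node);

// Accepted: the lock in the same allocation as the head is held
bpf_spin_lock(&lock);
bpf_list_push_front(&head, &e->node);
bpf_spin_unlock(&lock);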

Note that bpf_list_push_front and bpf_list_push_back are meant to
consume the object containing the node passed to them; however, that
mechanism is intended to release the ref_obj_id only once
bpf_spin_unlock is called, not at the call itself. Nothing is done
about that in this commit; the next commit introduces the logic to
handle this case, so it has been left as is for now.

bpf_list_pop_front and bpf_list_pop_back delete the first or last item
of the list, respectively, and return a pointer to the element at the
list_node offset. The user can then use a container_of style macro to
get to the actual entry type. The verifier, however, statically knows
the actual type, so the safety properties are still preserved.
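
For example (sketch, reusing the 'lock' and 'head' above; 'struct elem'
is a hypothetical entry type embedding a bpf_list_node named 'node'):

struct bpf_list_node *n;
struct elem *e;

bpf_spin_lock(&lock);
n = bpf_list_pop_front(&head);
bpf_spin_unlock(&lock);
if (n) {
	// n points at offsetof(struct elem, node) inside the entry,
	// and the verifier knows it is really a 'struct elem'
	e = container_of(n, struct elem, node);
}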

With these additions, programs can now manage their own linked lists and
store their objects in them.
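
Putting it together, here is a minimal sketch of such a program
(illustrative only: 'struct elem' and the section layout are made up
for this example, and the __contains annotation is only added to
bpf_experimental.h later in this series):

#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include "bpf_experimental.h"

// Helper macros in the style of the BPF selftests (assumptions here,
// not part of this patch):
#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8)))
#define container_of(ptr, type, member) \
	((type *)((void *)(ptr) - __builtin_offsetof(type, member)))

struct elem {
	int data;
	struct bpf_list_node node;
};

// Globals placed in the same section share one map value, so the lock
// and the list head it protects live in the same allocation.
private(A) struct bpf_spin_lock lock;
private(A) struct bpf_list_head head __contains(elem, node);

SEC("tc")
int manage_list(void *ctx)
{
	struct bpf_list_node *n;
	struct elem *e;

	e = bpf_obj_new(typeof(*e));
	if (!e)
		return 0;
	e->data = 42;

	bpf_spin_lock(&lock);
	bpf_list_push_back(&head, &e->node);	// list takes ownership of e
	n = bpf_list_pop_front(&head);		// take it back out
	bpf_spin_unlock(&lock);

	if (!n)
		return 0;
	e = container_of(n, struct elem, node);
	bpf_obj_drop(e);			// release the acquired reference
	return 0;
}

char _license[] SEC("license") = "GPL";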

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/bpf/helpers.c                          |  55 +++-
 kernel/bpf/verifier.c                         | 275 +++++++++++++++++-
 .../testing/selftests/bpf/bpf_experimental.h  |  28 ++
 3 files changed, 349 insertions(+), 9 deletions(-)

Comments

Nathan Chancellor Nov. 21, 2022, 6:34 p.m. UTC | #1
Hi Kumar,

On Fri, Nov 18, 2022 at 07:26:06AM +0530, Kumar Kartikeya Dwivedi wrote:
> Add a linked list API for use in BPF programs. It expects protection
> from a bpf_spin_lock in the same allocation as the bpf_list_head. For
> now, only one bpf_spin_lock can be present per allocation, hence it is
> assumed to be the one protecting the bpf_list_head.
> 
> The following functions are added to kick things off:
> 
> // Add node to beginning of list
> void bpf_list_push_front(struct bpf_list_head *head, struct bpf_list_node *node);
> 
> // Add node to end of list
> void bpf_list_push_back(struct bpf_list_head *head, struct bpf_list_node *node);
> 
> // Remove node at beginning of list and return it
> struct bpf_list_node *bpf_list_pop_front(struct bpf_list_head *head);
> 
> // Remove node at end of list and return it
> struct bpf_list_node *bpf_list_pop_back(struct bpf_list_head *head);
> 
> The lock protecting the bpf_list_head must be held for all of these
> operations. The verifier ensures that this lock is always held and
> that only the correct lock is taken for these operations. These checks
> are made statically by relying on the reg->id preserved for registers
> pointing into regions that contain both the bpf_spin_lock and the
> objects protected by it. The comment above check_reg_allocation_locked
> in this change describes the logic in detail.
> 
> Note that bpf_list_push_front and bpf_list_push_back are meant to
> consume the object containing the node passed to them; however, that
> mechanism is intended to release the ref_obj_id only once
> bpf_spin_unlock is called, not at the call itself. Nothing is done
> about that in this commit; the next commit introduces the logic to
> handle this case, so it has been left as is for now.
> 
> bpf_list_pop_front and bpf_list_pop_back delete the first or last item
> of the list, respectively, and return a pointer to the element at the
> list_node offset. The user can then use a container_of style macro to
> get to the actual entry type. The verifier, however, statically knows
> the actual type, so the safety properties are still preserved.
> 
> With these additions, programs can now manage their own linked lists and
> store their objects in them.
> 
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>

Apologies if this has already been reported or discussed somewhere else;
I did a search of the mailing list and did find a sparse report that
seems to be complaining about this same issue but it does not look like
you were cc'd on that [1].

This commit is now in -next as commit 8cab76ec6349 ("bpf: Introduce
single ownership BPF linked list API"), and I just bisected a new
series of clang warnings to this change; Yonghong's recent "bpf:
Implement two type cast kfuncs" series [2] added a couple more, but
they start here. When CONFIG_DEBUG_INFO_BTF is disabled (as is the
case with allmodconfig), I see the following clang warnings at this
change:

    ../kernel/bpf/verifier.c:8340:19: error: array index 5 is past the end of the array (that has type 'u32[5]' (aka 'unsigned int[5]')) [-Werror,-Warray-bounds]
                   btf_id == special_kfunc_list[KF_bpf_list_pop_back];
                             ^                  ~~~~~~~~~~~~~~~~~~~~
    ../kernel/bpf/verifier.c:8113:1: note: array 'special_kfunc_list' declared here
    BTF_ID_LIST(special_kfunc_list)
    ^
    ../include/linux/btf_ids.h:207:27: note: expanded from macro 'BTF_ID_LIST'
    #define BTF_ID_LIST(name) static u32 __maybe_unused name[5];
                              ^
    ../kernel/bpf/verifier.c:8819:24: error: array index 5 is past the end of the array (that has type 'u32[5]' (aka 'unsigned int[5]')) [-Werror,-Warray-bounds]
                                       meta.func_id == special_kfunc_list[KF_bpf_list_pop_back]) {
                                                       ^                  ~~~~~~~~~~~~~~~~~~~~
    ../kernel/bpf/verifier.c:8113:1: note: array 'special_kfunc_list' declared here
    BTF_ID_LIST(special_kfunc_list)
    ^
    ../include/linux/btf_ids.h:207:27: note: expanded from macro 'BTF_ID_LIST'
    #define BTF_ID_LIST(name) static u32 __maybe_unused name[5];
                              ^
    2 errors generated.

At ToT -next (next-20221121), I see:

    ../kernel/bpf/verifier.c:8196:23: error: array index 6 is past the end of the array (that has type 'u32[5]' (aka 'unsigned int[5]')) [-Werror,-Warray-bounds]
            if (meta->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx])
                                 ^                  ~~~~~~~~~~~~~~~~~~~~~~~
    ../kernel/bpf/verifier.c:8174:1: note: array 'special_kfunc_list' declared here
    BTF_ID_LIST(special_kfunc_list)
    ^
    ../include/linux/btf_ids.h:207:27: note: expanded from macro 'BTF_ID_LIST'
    #define BTF_ID_LIST(name) static u32 __maybe_unused name[5];
                              ^
    ../kernel/bpf/verifier.c:8443:19: error: array index 5 is past the end of the array (that has type 'u32[5]' (aka 'unsigned int[5]')) [-Werror,-Warray-bounds]
                   btf_id == special_kfunc_list[KF_bpf_list_pop_back];
                             ^                  ~~~~~~~~~~~~~~~~~~~~
    ../kernel/bpf/verifier.c:8174:1: note: array 'special_kfunc_list' declared here
    BTF_ID_LIST(special_kfunc_list)
    ^
    ../include/linux/btf_ids.h:207:27: note: expanded from macro 'BTF_ID_LIST'
    #define BTF_ID_LIST(name) static u32 __maybe_unused name[5];
                              ^
    ../kernel/bpf/verifier.c:8686:25: error: array index 6 is past the end of the array (that has type 'u32[5]' (aka 'unsigned int[5]')) [-Werror,-Warray-bounds]
                            if (meta->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx]) {
                                                 ^                  ~~~~~~~~~~~~~~~~~~~~~~~
    ../kernel/bpf/verifier.c:8174:1: note: array 'special_kfunc_list' declared here
    BTF_ID_LIST(special_kfunc_list)
    ^
    ../include/linux/btf_ids.h:207:27: note: expanded from macro 'BTF_ID_LIST'
    #define BTF_ID_LIST(name) static u32 __maybe_unused name[5];
                              ^
    ../kernel/bpf/verifier.c:8938:24: error: array index 5 is past the end of the array (that has type 'u32[5]' (aka 'unsigned int[5]')) [-Werror,-Warray-bounds]
                                       meta.func_id == special_kfunc_list[KF_bpf_list_pop_back]) {
                                                       ^                  ~~~~~~~~~~~~~~~~~~~~
    ../kernel/bpf/verifier.c:8174:1: note: array 'special_kfunc_list' declared here
    BTF_ID_LIST(special_kfunc_list)
    ^
    ../include/linux/btf_ids.h:207:27: note: expanded from macro 'BTF_ID_LIST'
    #define BTF_ID_LIST(name) static u32 __maybe_unused name[5];
                              ^
    ../kernel/bpf/verifier.c:8946:31: error: array index 6 is past the end of the array (that has type 'u32[5]' (aka 'unsigned int[5]')) [-Werror,-Warray-bounds]
                            } else if (meta.func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx]) {
                                                       ^                  ~~~~~~~~~~~~~~~~~~~~~~~
    ../kernel/bpf/verifier.c:8174:1: note: array 'special_kfunc_list' declared here
    BTF_ID_LIST(special_kfunc_list)
    ^
    ../include/linux/btf_ids.h:207:27: note: expanded from macro 'BTF_ID_LIST'
    #define BTF_ID_LIST(name) static u32 __maybe_unused name[5];
                              ^
    ../kernel/bpf/verifier.c:8951:31: error: array index 7 is past the end of the array (that has type 'u32[5]' (aka 'unsigned int[5]')) [-Werror,-Warray-bounds]
                            } else if (meta.func_id == special_kfunc_list[KF_bpf_rdonly_cast]) {
                                                       ^                  ~~~~~~~~~~~~~~~~~~
    ../kernel/bpf/verifier.c:8174:1: note: array 'special_kfunc_list' declared here
    BTF_ID_LIST(special_kfunc_list)
    ^
    ../include/linux/btf_ids.h:207:27: note: expanded from macro 'BTF_ID_LIST'
    #define BTF_ID_LIST(name) static u32 __maybe_unused name[5];
                              ^
    ../kernel/bpf/verifier.c:15216:30: error: array index 6 is past the end of the array (that has type 'u32[5]' (aka 'unsigned int[5]')) [-Werror,-Warray-bounds]
            } else if (desc->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx] ||
                                        ^                  ~~~~~~~~~~~~~~~~~~~~~~~
    ../kernel/bpf/verifier.c:8174:1: note: array 'special_kfunc_list' declared here
    BTF_ID_LIST(special_kfunc_list)
    ^
    ../include/linux/btf_ids.h:207:27: note: expanded from macro 'BTF_ID_LIST'
    #define BTF_ID_LIST(name) static u32 __maybe_unused name[5];
                              ^
    ../kernel/bpf/verifier.c:15217:23: error: array index 7 is past the end of the array (that has type 'u32[5]' (aka 'unsigned int[5]')) [-Werror,-Warray-bounds]
                       desc->func_id == special_kfunc_list[KF_bpf_rdonly_cast]) {
                                        ^                  ~~~~~~~~~~~~~~~~~~
    ../kernel/bpf/verifier.c:8174:1: note: array 'special_kfunc_list' declared here
    BTF_ID_LIST(special_kfunc_list)
    ^
    ../include/linux/btf_ids.h:207:27: note: expanded from macro 'BTF_ID_LIST'
    #define BTF_ID_LIST(name) static u32 __maybe_unused name[5];
                              ^
    8 errors generated.

Somewhat surprisingly, I do not see these with GCC 11 (-Warray-bounds is
explicitly disabled for GCC 12) but I do not see how they could not be
legitimate, unless all this code gets optimized away when
CONFIG_DEBUG_INFO_BTF is disabled, which I could be missing. Any reason
that the fix is not something like:

diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
index c9744efd202f..71d0ce707744 100644
--- a/include/linux/btf_ids.h
+++ b/include/linux/btf_ids.h
@@ -204,7 +204,7 @@ extern struct btf_id_set8 name;
 
 #else
 
-#define BTF_ID_LIST(name) static u32 __maybe_unused name[5];
+#define BTF_ID_LIST(name) static u32 __maybe_unused name[8];
 #define BTF_ID(prefix, name)
 #define BTF_ID_FLAGS(prefix, name, ...)
 #define BTF_ID_UNUSED

I suppose that there should probably be some sort of assertion to catch
this discrepancy in the future.
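
For instance (a purely hypothetical sketch, not something btf_ids.h
provides today), a sized variant of the stub would turn an overrun into
a build error at each use site:

/* Hypothetical: callers declare how many IDs they will index */
#define BTF_ID_LIST_N(name, n) static u32 __maybe_unused name[n];

BTF_ID_LIST_N(special_kfunc_list, KF_bpf_list_pop_back + 1)

together with a BUILD_BUG_ON() (or static_assert) somewhere that trips
whenever a special_kfunc_type entry grows past the declared size.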

Cheers,
Nathan

[1]: https://lore.kernel.org/202211190612.qFDcJqqt-lkp@intel.com/
[2]: https://lore.kernel.org/20221120195421.3112414-1-yhs@fb.com/

# bad: [e4cd8d3ff7f9efeb97330e5e9b99eeb2a68f5cf9] Add linux-next specific files for 20221121
# good: [894909f95aa1473f49f767dcd5750ba152b85e13] Merge tag 'x86_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect start 'e4cd8d3ff7f9efeb97330e5e9b99eeb2a68f5cf9' '894909f95aa1473f49f767dcd5750ba152b85e13'
# bad: [42c3120ba7547d18e7788707e67ff01dc220f09b] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
git bisect bad 42c3120ba7547d18e7788707e67ff01dc220f09b
# good: [ceda86b061181705d751fce5971da7da32aa9afc] Merge branch 'master' of https://github.com/Paragon-Software-Group/linux-ntfs3.git
git bisect good ceda86b061181705d751fce5971da7da32aa9afc
# good: [c609d739947894d7370eae4cf04eb2c49e910bcf] Merge tag 'wireless-next-2022-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
git bisect good c609d739947894d7370eae4cf04eb2c49e910bcf
# good: [4b39aeeaec47f5cea7f6ab76fa2da428e4bf5c7e] Merge branch 'master' of git://linuxtv.org/media_tree.git
git bisect good 4b39aeeaec47f5cea7f6ab76fa2da428e4bf5c7e
# good: [4c071062e639f74d95a229b5efc22cb1c9ed3d49] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
git bisect good 4c071062e639f74d95a229b5efc22cb1c9ed3d49
# bad: [8d72bd74ae946456865574a0390979af5aa74bcb] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git
git bisect bad 8d72bd74ae946456865574a0390979af5aa74bcb
# bad: [e181d3f143f7957a73c8365829249d8084602606] bpf: Disallow bpf_obj_new_impl call when bpf_mem_alloc_init fails
git bisect bad e181d3f143f7957a73c8365829249d8084602606
# good: [f73e601aafb2ad9f2b2012b969f86f4a41141a7d] bpf: Populate field_offs for inner_map_meta
git bisect good f73e601aafb2ad9f2b2012b969f86f4a41141a7d
# bad: [64069c72b4b8e44f6876249cc8f2e2ee4d209a93] selftests/bpf: Add __contains macro to bpf_experimental.h
git bisect bad 64069c72b4b8e44f6876249cc8f2e2ee4d209a93
# good: [00b85860feb809852af9a88cb4ca8766d7dff6a3] bpf: Rewrite kfunc argument handling
git bisect good 00b85860feb809852af9a88cb4ca8766d7dff6a3
# good: [df57f38a0d081f05ec48ea5aa7ca0564918ed915] bpf: Permit NULL checking pointer with non-zero fixed offset
git bisect good df57f38a0d081f05ec48ea5aa7ca0564918ed915
# bad: [534e86bc6c66e1e0c798a1c0a6a680bb231c08db] bpf: Add 'release on unlock' logic for bpf_list_push_{front,back}
git bisect bad 534e86bc6c66e1e0c798a1c0a6a680bb231c08db
# bad: [8cab76ec634995e59a8b6346bf8b835ab7fad3a3] bpf: Introduce single ownership BPF linked list API
git bisect bad 8cab76ec634995e59a8b6346bf8b835ab7fad3a3
# first bad commit: [8cab76ec634995e59a8b6346bf8b835ab7fad3a3] bpf: Introduce single ownership BPF linked list API

Patch

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 71d803ca0c1d..212e791d7452 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1780,6 +1780,50 @@  void bpf_obj_drop_impl(void *p__alloc, void *meta__ign)
 	bpf_mem_free(&bpf_global_ma, p);
 }
 
+static void __bpf_list_add(struct bpf_list_node *node, struct bpf_list_head *head, bool tail)
+{
+	struct list_head *n = (void *)node, *h = (void *)head;
+
+	if (unlikely(!h->next))
+		INIT_LIST_HEAD(h);
+	if (unlikely(!n->next))
+		INIT_LIST_HEAD(n);
+	tail ? list_add_tail(n, h) : list_add(n, h);
+}
+
+void bpf_list_push_front(struct bpf_list_head *head, struct bpf_list_node *node)
+{
+	return __bpf_list_add(node, head, false);
+}
+
+void bpf_list_push_back(struct bpf_list_head *head, struct bpf_list_node *node)
+{
+	return __bpf_list_add(node, head, true);
+}
+
+static struct bpf_list_node *__bpf_list_del(struct bpf_list_head *head, bool tail)
+{
+	struct list_head *n, *h = (void *)head;
+
+	if (unlikely(!h->next))
+		INIT_LIST_HEAD(h);
+	if (list_empty(h))
+		return NULL;
+	n = tail ? h->prev : h->next;
+	list_del_init(n);
+	return (struct bpf_list_node *)n;
+}
+
+struct bpf_list_node *bpf_list_pop_front(struct bpf_list_head *head)
+{
+	return __bpf_list_del(head, false);
+}
+
+struct bpf_list_node *bpf_list_pop_back(struct bpf_list_head *head)
+{
+	return __bpf_list_del(head, true);
+}
+
 __diag_pop();
 
 BTF_SET8_START(generic_btf_ids)
@@ -1788,6 +1832,10 @@  BTF_ID_FLAGS(func, crash_kexec, KF_DESTRUCTIVE)
 #endif
 BTF_ID_FLAGS(func, bpf_obj_new_impl, KF_ACQUIRE | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_obj_drop_impl, KF_RELEASE)
+BTF_ID_FLAGS(func, bpf_list_push_front)
+BTF_ID_FLAGS(func, bpf_list_push_back)
+BTF_ID_FLAGS(func, bpf_list_pop_front, KF_ACQUIRE | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_list_pop_back, KF_ACQUIRE | KF_RET_NULL)
 BTF_SET8_END(generic_btf_ids)
 
 static const struct btf_kfunc_id_set generic_kfunc_set = {
@@ -1797,7 +1845,12 @@  static const struct btf_kfunc_id_set generic_kfunc_set = {
 
 static int __init kfunc_init(void)
 {
-	return register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &generic_kfunc_set);
+	int ret;
+
+	ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &generic_kfunc_set);
+	if (ret)
+		return ret;
+	return register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &generic_kfunc_set);
 }
 
 late_initcall(kfunc_init);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 84798773b592..00d3122086c2 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7883,6 +7883,9 @@  struct bpf_kfunc_call_arg_meta {
 		struct btf *btf;
 		u32 btf_id;
 	} arg_obj_drop;
+	struct {
+		struct btf_field *field;
+	} arg_list_head;
 };
 
 static bool is_kfunc_acquire(struct bpf_kfunc_call_arg_meta *meta)
@@ -7987,13 +7990,17 @@  static bool is_kfunc_arg_scalar_with_name(const struct btf *btf,
 
 enum {
 	KF_ARG_DYNPTR_ID,
+	KF_ARG_LIST_HEAD_ID,
+	KF_ARG_LIST_NODE_ID,
 };
 
 BTF_ID_LIST(kf_arg_btf_ids)
 BTF_ID(struct, bpf_dynptr_kern)
+BTF_ID(struct, bpf_list_head)
+BTF_ID(struct, bpf_list_node)
 
-static bool is_kfunc_arg_dynptr(const struct btf *btf,
-				const struct btf_param *arg)
+static bool __is_kfunc_ptr_arg_type(const struct btf *btf,
+				    const struct btf_param *arg, int type)
 {
 	const struct btf_type *t;
 	u32 res_id;
@@ -8006,7 +8013,22 @@  static bool is_kfunc_arg_dynptr(const struct btf *btf,
 	t = btf_type_skip_modifiers(btf, t->type, &res_id);
 	if (!t)
 		return false;
-	return btf_types_are_same(btf, res_id, btf_vmlinux, kf_arg_btf_ids[KF_ARG_DYNPTR_ID]);
+	return btf_types_are_same(btf, res_id, btf_vmlinux, kf_arg_btf_ids[type]);
+}
+
+static bool is_kfunc_arg_dynptr(const struct btf *btf, const struct btf_param *arg)
+{
+	return __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_DYNPTR_ID);
+}
+
+static bool is_kfunc_arg_list_head(const struct btf *btf, const struct btf_param *arg)
+{
+	return __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_LIST_HEAD_ID);
+}
+
+static bool is_kfunc_arg_list_node(const struct btf *btf, const struct btf_param *arg)
+{
+	return __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_LIST_NODE_ID);
 }
 
 /* Returns true if struct is composed of scalars, 4 levels of nesting allowed */
@@ -8063,6 +8085,8 @@  enum kfunc_ptr_arg_type {
 	KF_ARG_PTR_TO_ALLOC_BTF_ID,  /* Allocated object */
 	KF_ARG_PTR_TO_KPTR,	     /* PTR_TO_KPTR but type specific */
 	KF_ARG_PTR_TO_DYNPTR,
+	KF_ARG_PTR_TO_LIST_HEAD,
+	KF_ARG_PTR_TO_LIST_NODE,
 	KF_ARG_PTR_TO_BTF_ID,	     /* Also covers reg2btf_ids conversions */
 	KF_ARG_PTR_TO_MEM,
 	KF_ARG_PTR_TO_MEM_SIZE,	     /* Size derived from next argument, skip it */
@@ -8071,16 +8095,28 @@  enum kfunc_ptr_arg_type {
 enum special_kfunc_type {
 	KF_bpf_obj_new_impl,
 	KF_bpf_obj_drop_impl,
+	KF_bpf_list_push_front,
+	KF_bpf_list_push_back,
+	KF_bpf_list_pop_front,
+	KF_bpf_list_pop_back,
 };
 
 BTF_SET_START(special_kfunc_set)
 BTF_ID(func, bpf_obj_new_impl)
 BTF_ID(func, bpf_obj_drop_impl)
+BTF_ID(func, bpf_list_push_front)
+BTF_ID(func, bpf_list_push_back)
+BTF_ID(func, bpf_list_pop_front)
+BTF_ID(func, bpf_list_pop_back)
 BTF_SET_END(special_kfunc_set)
 
 BTF_ID_LIST(special_kfunc_list)
 BTF_ID(func, bpf_obj_new_impl)
 BTF_ID(func, bpf_obj_drop_impl)
+BTF_ID(func, bpf_list_push_front)
+BTF_ID(func, bpf_list_push_back)
+BTF_ID(func, bpf_list_pop_front)
+BTF_ID(func, bpf_list_pop_back)
 
 static enum kfunc_ptr_arg_type
 get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
@@ -8123,6 +8159,12 @@  get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
 	if (is_kfunc_arg_dynptr(meta->btf, &args[argno]))
 		return KF_ARG_PTR_TO_DYNPTR;
 
+	if (is_kfunc_arg_list_head(meta->btf, &args[argno]))
+		return KF_ARG_PTR_TO_LIST_HEAD;
+
+	if (is_kfunc_arg_list_node(meta->btf, &args[argno]))
+		return KF_ARG_PTR_TO_LIST_NODE;
+
 	if ((base_type(reg->type) == PTR_TO_BTF_ID || reg2btf_ids[base_type(reg->type)])) {
 		if (!btf_type_is_struct(ref_t)) {
 			verbose(env, "kernel function %s args#%d pointer type %s %s is not supported\n",
@@ -8218,6 +8260,182 @@  static int process_kf_arg_ptr_to_kptr_strong(struct bpf_verifier_env *env,
 	return 0;
 }
 
+/* Implementation details:
+ *
+ * Each register points to some region of memory, which we define as an
+ * allocation. Each allocation may embed a bpf_spin_lock which protects any
+ * special BPF objects (bpf_list_head, bpf_rb_root, etc.) part of the same
+ * allocation. The lock and the data it protects are colocated in the same
+ * memory region.
+ *
+ * Hence, everytime a register holds a pointer value pointing to such
+ * allocation, the verifier preserves a unique reg->id for it.
+ *
+ * The verifier remembers the lock 'ptr' and the lock 'id' whenever
+ * bpf_spin_lock is called.
+ *
+ * To enable this, lock state in the verifier captures two values:
+ *	active_lock.ptr = Register's type specific pointer
+ *	active_lock.id  = A unique ID for each register pointer value
+ *
+ * Currently, PTR_TO_MAP_VALUE and PTR_TO_BTF_ID | MEM_ALLOC are the two
+ * supported register types.
+ *
+ * The active_lock.ptr in case of map values is the reg->map_ptr, and in case of
+ * allocated objects is the reg->btf pointer.
+ *
+ * The active_lock.id is non-unique for maps supporting direct_value_addr, as we
+ * can establish the provenance of the map value statically for each distinct
+ * lookup into such maps. They always contain a single map value hence unique
+ * IDs for each pseudo load pessimizes the algorithm and rejects valid programs.
+ *
+ * So, in case of global variables, they use array maps with max_entries = 1,
+ * hence their active_lock.ptr becomes map_ptr and id = 0 (since they all point
+ * into the same map value as max_entries is 1, as described above).
+ *
+ * In case of inner map lookups, the inner map pointer has same map_ptr as the
+ * outer map pointer (in verifier context), but each lookup into an inner map
+ * assigns a fresh reg->id to the lookup, so while lookups into distinct inner
+ * maps from the same outer map share the same map_ptr as active_lock.ptr, they
+ * will get different reg->id assigned to each lookup, hence different
+ * active_lock.id.
+ *
+ * In case of allocated objects, active_lock.ptr is the reg->btf, and the
+ * reg->id is a unique ID preserved after the NULL pointer check on the pointer
+ * returned from bpf_obj_new. Each allocation receives a new reg->id.
+ */
+static int check_reg_allocation_locked(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
+{
+	void *ptr;
+	u32 id;
+
+	switch ((int)reg->type) {
+	case PTR_TO_MAP_VALUE:
+		ptr = reg->map_ptr;
+		break;
+	case PTR_TO_BTF_ID | MEM_ALLOC:
+		ptr = reg->btf;
+		break;
+	default:
+		verbose(env, "verifier internal error: unknown reg type for lock check\n");
+		return -EFAULT;
+	}
+	id = reg->id;
+
+	if (!env->cur_state->active_lock.ptr)
+		return -EINVAL;
+	if (env->cur_state->active_lock.ptr != ptr ||
+	    env->cur_state->active_lock.id != id) {
+		verbose(env, "held lock and object are not in the same allocation\n");
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static bool is_bpf_list_api_kfunc(u32 btf_id)
+{
+	return btf_id == special_kfunc_list[KF_bpf_list_push_front] ||
+	       btf_id == special_kfunc_list[KF_bpf_list_push_back] ||
+	       btf_id == special_kfunc_list[KF_bpf_list_pop_front] ||
+	       btf_id == special_kfunc_list[KF_bpf_list_pop_back];
+}
+
+static int process_kf_arg_ptr_to_list_head(struct bpf_verifier_env *env,
+					   struct bpf_reg_state *reg, u32 regno,
+					   struct bpf_kfunc_call_arg_meta *meta)
+{
+	struct btf_field *field;
+	struct btf_record *rec;
+	u32 list_head_off;
+
+	if (meta->btf != btf_vmlinux || !is_bpf_list_api_kfunc(meta->func_id)) {
+		verbose(env, "verifier internal error: bpf_list_head argument for unknown kfunc\n");
+		return -EFAULT;
+	}
+
+	if (!tnum_is_const(reg->var_off)) {
+		verbose(env,
+			"R%d doesn't have constant offset. bpf_list_head has to be at the constant offset\n",
+			regno);
+		return -EINVAL;
+	}
+
+	rec = reg_btf_record(reg);
+	list_head_off = reg->off + reg->var_off.value;
+	field = btf_record_find(rec, list_head_off, BPF_LIST_HEAD);
+	if (!field) {
+		verbose(env, "bpf_list_head not found at offset=%u\n", list_head_off);
+		return -EINVAL;
+	}
+
+	/* All functions require bpf_list_head to be protected using a bpf_spin_lock */
+	if (check_reg_allocation_locked(env, reg)) {
+		verbose(env, "bpf_spin_lock at off=%d must be held for bpf_list_head\n",
+			rec->spin_lock_off);
+		return -EINVAL;
+	}
+
+	if (meta->arg_list_head.field) {
+		verbose(env, "verifier internal error: repeating bpf_list_head arg\n");
+		return -EFAULT;
+	}
+	meta->arg_list_head.field = field;
+	return 0;
+}
+
+static int process_kf_arg_ptr_to_list_node(struct bpf_verifier_env *env,
+					   struct bpf_reg_state *reg, u32 regno,
+					   struct bpf_kfunc_call_arg_meta *meta)
+{
+	const struct btf_type *et, *t;
+	struct btf_field *field;
+	struct btf_record *rec;
+	u32 list_node_off;
+
+	if (meta->btf != btf_vmlinux ||
+	    (meta->func_id != special_kfunc_list[KF_bpf_list_push_front] &&
+	     meta->func_id != special_kfunc_list[KF_bpf_list_push_back])) {
+		verbose(env, "verifier internal error: bpf_list_node argument for unknown kfunc\n");
+		return -EFAULT;
+	}
+
+	if (!tnum_is_const(reg->var_off)) {
+		verbose(env,
+			"R%d doesn't have constant offset. bpf_list_node has to be at the constant offset\n",
+			regno);
+		return -EINVAL;
+	}
+
+	rec = reg_btf_record(reg);
+	list_node_off = reg->off + reg->var_off.value;
+	field = btf_record_find(rec, list_node_off, BPF_LIST_NODE);
+	if (!field || field->offset != list_node_off) {
+		verbose(env, "bpf_list_node not found at offset=%u\n", list_node_off);
+		return -EINVAL;
+	}
+
+	field = meta->arg_list_head.field;
+
+	et = btf_type_by_id(field->list_head.btf, field->list_head.value_btf_id);
+	t = btf_type_by_id(reg->btf, reg->btf_id);
+	if (!btf_struct_ids_match(&env->log, reg->btf, reg->btf_id, 0, field->list_head.btf,
+				  field->list_head.value_btf_id, true)) {
+		verbose(env, "operation on bpf_list_head expects arg#1 bpf_list_node at offset=%d "
+			"in struct %s, but arg is at offset=%d in struct %s\n",
+			field->list_head.node_offset, btf_name_by_offset(field->list_head.btf, et->name_off),
+			list_node_off, btf_name_by_offset(reg->btf, t->name_off));
+		return -EINVAL;
+	}
+
+	if (list_node_off != field->list_head.node_offset) {
+		verbose(env, "arg#1 offset=%d, but expected bpf_list_node at offset=%d in struct %s\n",
+			list_node_off, field->list_head.node_offset,
+			btf_name_by_offset(field->list_head.btf, et->name_off));
+		return -EINVAL;
+	}
+	return 0;
+}
+
 static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_arg_meta *meta)
 {
 	const char *func_name = meta->func_name, *ref_tname;
@@ -8336,6 +8554,8 @@  static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
 			break;
 		case KF_ARG_PTR_TO_KPTR:
 		case KF_ARG_PTR_TO_DYNPTR:
+		case KF_ARG_PTR_TO_LIST_HEAD:
+		case KF_ARG_PTR_TO_LIST_NODE:
 		case KF_ARG_PTR_TO_MEM:
 		case KF_ARG_PTR_TO_MEM_SIZE:
 			/* Trusted by default */
@@ -8400,6 +8620,33 @@  static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
 				return -EINVAL;
 			}
 			break;
+		case KF_ARG_PTR_TO_LIST_HEAD:
+			if (reg->type != PTR_TO_MAP_VALUE &&
+			    reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) {
+				verbose(env, "arg#%d expected pointer to map value or allocated object\n", i);
+				return -EINVAL;
+			}
+			if (reg->type == (PTR_TO_BTF_ID | MEM_ALLOC) && !reg->ref_obj_id) {
+				verbose(env, "allocated object must be referenced\n");
+				return -EINVAL;
+			}
+			ret = process_kf_arg_ptr_to_list_head(env, reg, regno, meta);
+			if (ret < 0)
+				return ret;
+			break;
+		case KF_ARG_PTR_TO_LIST_NODE:
+			if (reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) {
+				verbose(env, "arg#%d expected pointer to allocated object\n", i);
+				return -EINVAL;
+			}
+			if (!reg->ref_obj_id) {
+				verbose(env, "allocated object must be referenced\n");
+				return -EINVAL;
+			}
+			ret = process_kf_arg_ptr_to_list_node(env, reg, regno, meta);
+			if (ret < 0)
+				return ret;
+			break;
 		case KF_ARG_PTR_TO_BTF_ID:
 			/* Only base_type is checked, further checks are done here */
 			if (reg->type != PTR_TO_BTF_ID &&
@@ -8568,6 +8815,15 @@  static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 				env->insn_aux_data[insn_idx].kptr_struct_meta =
 					btf_find_struct_meta(meta.arg_obj_drop.btf,
 							     meta.arg_obj_drop.btf_id);
+			} else if (meta.func_id == special_kfunc_list[KF_bpf_list_pop_front] ||
+				   meta.func_id == special_kfunc_list[KF_bpf_list_pop_back]) {
+				struct btf_field *field = meta.arg_list_head.field;
+
+				mark_reg_known_zero(env, regs, BPF_REG_0);
+				regs[BPF_REG_0].type = PTR_TO_BTF_ID | MEM_ALLOC;
+				regs[BPF_REG_0].btf = field->list_head.btf;
+				regs[BPF_REG_0].btf_id = field->list_head.value_btf_id;
+				regs[BPF_REG_0].off = field->list_head.node_offset;
 			} else {
 				verbose(env, "kernel function %s unhandled dynamic return type\n",
 					meta.func_name);
@@ -13264,11 +13520,14 @@  static int do_check(struct bpf_verifier_env *env)
 					return -EINVAL;
 				}
 
-				if (env->cur_state->active_lock.ptr &&
-				    (insn->src_reg == BPF_PSEUDO_CALL ||
-				     insn->imm != BPF_FUNC_spin_unlock)) {
-					verbose(env, "function calls are not allowed while holding a lock\n");
-					return -EINVAL;
+				if (env->cur_state->active_lock.ptr) {
+					if ((insn->src_reg == BPF_REG_0 && insn->imm != BPF_FUNC_spin_unlock) ||
+					    (insn->src_reg == BPF_PSEUDO_CALL) ||
+					    (insn->src_reg == BPF_PSEUDO_KFUNC_CALL &&
+					     (insn->off != 0 || !is_bpf_list_api_kfunc(insn->imm)))) {
+						verbose(env, "function calls are not allowed while holding a lock\n");
+						return -EINVAL;
+					}
 				}
 				if (insn->src_reg == BPF_PSEUDO_CALL)
 					err = check_func_call(env, insn, &env->insn_idx);
diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h
index 8473395a11af..d6b143275e82 100644
--- a/tools/testing/selftests/bpf/bpf_experimental.h
+++ b/tools/testing/selftests/bpf/bpf_experimental.h
@@ -35,4 +35,32 @@  extern void bpf_obj_drop_impl(void *kptr, void *meta) __ksym;
 /* Convenience macro to wrap over bpf_obj_drop_impl */
 #define bpf_obj_drop(kptr) bpf_obj_drop_impl(kptr, NULL)
 
+/* Description
+ *	Add a new entry to the beginning of the BPF linked list.
+ * Returns
+ *	Void.
+ */
+extern void bpf_list_push_front(struct bpf_list_head *head, struct bpf_list_node *node) __ksym;
+
+/* Description
+ *	Add a new entry to the end of the BPF linked list.
+ * Returns
+ *	Void.
+ */
+extern void bpf_list_push_back(struct bpf_list_head *head, struct bpf_list_node *node) __ksym;
+
+/* Description
+ *	Remove the entry at the beginning of the BPF linked list.
+ * Returns
+ *	Pointer to bpf_list_node of deleted entry, or NULL if list is empty.
+ */
+extern struct bpf_list_node *bpf_list_pop_front(struct bpf_list_head *head) __ksym;
+
+/* Description
+ *	Remove the entry at the end of the BPF linked list.
+ * Returns
+ *	Pointer to bpf_list_node of deleted entry, or NULL if list is empty.
+ */
+extern struct bpf_list_node *bpf_list_pop_back(struct bpf_list_head *head) __ksym;
+
 #endif