diff mbox series

[bpf-next,v5,10/25] bpf: Allow locking bpf_spin_lock global variables

Message ID 20221107230950.7117-11-memxor@gmail.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Headers show
Series Local kptrs, BPF linked lists | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-PR pending PR summary
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count fail Series longer than 15 patches (and no cover letter)
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 67 this patch: 66
netdev/cc_maintainers warning 8 maintainers not CCed: sdf@google.com kpsingh@kernel.org haoluo@google.com yhs@fb.com jolsa@kernel.org martin.lau@linux.dev song@kernel.org john.fastabend@gmail.com
netdev/build_clang success Errors and warnings before: 19 this patch: 17
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 67 this patch: 66
netdev/checkpatch warning WARNING: line length of 81 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-2 fail Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-1 pending Logs for ${{ matrix.test }} on ${{ matrix.arch }} with ${{ matrix.toolchain }}
bpf/vmtest-bpf-next-VM_Test-3 fail Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-4 fail Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-5 fail Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-6 fail Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 fail Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-8 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-9 success Logs for set-matrix

Commit Message

Kumar Kartikeya Dwivedi Nov. 7, 2022, 11:09 p.m. UTC
Global variables reside in maps accessible using direct_value_addr
callbacks, so giving each load instruction's rewrite a unique reg->id
disallows us from holding locks which are global.

The reason for preserving reg->id as a unique value for registers that
may point to spin lock is that two separate lookups are treated as two
separate memory regions, and any possible aliasing is ignored for the
purposes of spin lock correctness.

This is not great especially for the global variable case, which are
served from maps that have max_entries == 1, i.e. they always lead to
map values pointing into the same map value.

So refactor the active_spin_lock into a 'active_lock' structure which
represents the lock identity, and instead of the reg->id, remember two
fields, a pointer and the reg->id. The pointer will store reg->map_ptr
or reg->btf. It's only necessary to distinguish for the id == 0 case of
global variables, but always setting the pointer to a non-NULL value and
using the pointer to check whether the lock is held simplifies code in
the verifier.

This is generic enough to allow it for global variables, map lookups,
and local kptr registers at the same time.

Note that while whether a lock is held can be answered by just comparing
active_lock.ptr to NULL, to determine whether the register is pointing
to the same held lock requires comparing _both_ ptr and id.

Finally, as a result of this refactoring, pseudo load instructions are
not given a unique reg->id, as they are doing lookup for the same map
value (max_entries is never greater than 1).

Essentially, we consider that the tuple of (ptr, id) will always be
unique for any kind of argument to bpf_spin_{lock,unlock}.

Note that this can be extended in the future to also remember offset
used for locking, so that we can introduce multiple bpf_spin_lock fields
in the same allocation.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/linux/bpf_verifier.h |  5 ++++-
 kernel/bpf/verifier.c        | 41 ++++++++++++++++++++++++------------
 2 files changed, 32 insertions(+), 14 deletions(-)

Comments

Andrii Nakryiko Nov. 8, 2022, 11:37 p.m. UTC | #1
On Mon, Nov 7, 2022 at 3:10 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>
> Global variables reside in maps accessible using direct_value_addr
> callbacks, so giving each load instruction's rewrite a unique reg->id
> disallows us from holding locks which are global.
>
> The reason for preserving reg->id as a unique value for registers that
> may point to spin lock is that two separate lookups are treated as two
> separate memory regions, and any possible aliasing is ignored for the
> purposes of spin lock correctness.
>
> This is not great especially for the global variable case, which are
> served from maps that have max_entries == 1, i.e. they always lead to
> map values pointing into the same map value.
>
> So refactor the active_spin_lock into a 'active_lock' structure which
> represents the lock identity, and instead of the reg->id, remember two
> fields, a pointer and the reg->id. The pointer will store reg->map_ptr
> or reg->btf. It's only necessary to distinguish for the id == 0 case of
> global variables, but always setting the pointer to a non-NULL value and
> using the pointer to check whether the lock is held simplifies code in
> the verifier.
>
> This is generic enough to allow it for global variables, map lookups,
> and local kptr registers at the same time.
>
> Note that while whether a lock is held can be answered by just comparing
> active_lock.ptr to NULL, to determine whether the register is pointing
> to the same held lock requires comparing _both_ ptr and id.
>
> Finally, as a result of this refactoring, pseudo load instructions are
> not given a unique reg->id, as they are doing lookup for the same map
> value (max_entries is never greater than 1).
>
> Essentially, we consider that the tuple of (ptr, id) will always be
> unique for any kind of argument to bpf_spin_{lock,unlock}.
>
> Note that this can be extended in the future to also remember offset
> used for locking, so that we can introduce multiple bpf_spin_lock fields
> in the same allocation.
>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  include/linux/bpf_verifier.h |  5 ++++-
>  kernel/bpf/verifier.c        | 41 ++++++++++++++++++++++++------------
>  2 files changed, 32 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 1a32baa78ce2..70cccac62a15 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -323,7 +323,10 @@ struct bpf_verifier_state {
>         u32 branches;
>         u32 insn_idx;
>         u32 curframe;
> -       u32 active_spin_lock;
> +       struct {
> +               void *ptr;

document that this could be either struct bpf_map or struct btf
pointer, at least?

> +               u32 id;
> +       } active_lock;
>         bool speculative;
>
>         /* first and last insn idx of this verifier state */

[...]
Kumar Kartikeya Dwivedi Nov. 9, 2022, 12:03 a.m. UTC | #2
On Wed, Nov 09, 2022 at 05:07:44AM IST, Andrii Nakryiko wrote:
> On Mon, Nov 7, 2022 at 3:10 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
> >
> > Global variables reside in maps accessible using direct_value_addr
> > callbacks, so giving each load instruction's rewrite a unique reg->id
> > disallows us from holding locks which are global.
> >
> > The reason for preserving reg->id as a unique value for registers that
> > may point to spin lock is that two separate lookups are treated as two
> > separate memory regions, and any possible aliasing is ignored for the
> > purposes of spin lock correctness.
> >
> > This is not great especially for the global variable case, which are
> > served from maps that have max_entries == 1, i.e. they always lead to
> > map values pointing into the same map value.
> >
> > So refactor the active_spin_lock into a 'active_lock' structure which
> > represents the lock identity, and instead of the reg->id, remember two
> > fields, a pointer and the reg->id. The pointer will store reg->map_ptr
> > or reg->btf. It's only necessary to distinguish for the id == 0 case of
> > global variables, but always setting the pointer to a non-NULL value and
> > using the pointer to check whether the lock is held simplifies code in
> > the verifier.
> >
> > This is generic enough to allow it for global variables, map lookups,
> > and local kptr registers at the same time.
> >
> > Note that while whether a lock is held can be answered by just comparing
> > active_lock.ptr to NULL, to determine whether the register is pointing
> > to the same held lock requires comparing _both_ ptr and id.
> >
> > Finally, as a result of this refactoring, pseudo load instructions are
> > not given a unique reg->id, as they are doing lookup for the same map
> > value (max_entries is never greater than 1).
> >
> > Essentially, we consider that the tuple of (ptr, id) will always be
> > unique for any kind of argument to bpf_spin_{lock,unlock}.
> >
> > Note that this can be extended in the future to also remember offset
> > used for locking, so that we can introduce multiple bpf_spin_lock fields
> > in the same allocation.
> >
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > ---
> >  include/linux/bpf_verifier.h |  5 ++++-
> >  kernel/bpf/verifier.c        | 41 ++++++++++++++++++++++++------------
> >  2 files changed, 32 insertions(+), 14 deletions(-)
> >
> > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > index 1a32baa78ce2..70cccac62a15 100644
> > --- a/include/linux/bpf_verifier.h
> > +++ b/include/linux/bpf_verifier.h
> > @@ -323,7 +323,10 @@ struct bpf_verifier_state {
> >         u32 branches;
> >         u32 insn_idx;
> >         u32 curframe;
> > -       u32 active_spin_lock;
> > +       struct {
> > +               void *ptr;
>
> document that this could be either struct bpf_map or struct btf
> pointer, at least?
>
Ack, I'll add a comment.

Though it's not really meant to be used (i.e. turned back into a pointer to
them), it's just an 'identity' pointer.
diff mbox series

Patch

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 1a32baa78ce2..70cccac62a15 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -323,7 +323,10 @@  struct bpf_verifier_state {
 	u32 branches;
 	u32 insn_idx;
 	u32 curframe;
-	u32 active_spin_lock;
+	struct {
+		void *ptr;
+		u32 id;
+	} active_lock;
 	bool speculative;
 
 	/* first and last insn idx of this verifier state */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f1170e9db699..281a6a04a0d8 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1210,7 +1210,8 @@  static int copy_verifier_state(struct bpf_verifier_state *dst_state,
 	}
 	dst_state->speculative = src->speculative;
 	dst_state->curframe = src->curframe;
-	dst_state->active_spin_lock = src->active_spin_lock;
+	dst_state->active_lock.ptr = src->active_lock.ptr;
+	dst_state->active_lock.id = src->active_lock.id;
 	dst_state->branches = src->branches;
 	dst_state->parent = src->parent;
 	dst_state->first_insn_idx = src->first_insn_idx;
@@ -5582,7 +5583,7 @@  int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state
  * Since only one bpf_spin_lock is allowed the checks are simpler than
  * reg_is_refcounted() logic. The verifier needs to remember only
  * one spin_lock instead of array of acquired_refs.
- * cur_state->active_spin_lock remembers which map value element got locked
+ * cur_state->active_lock remembers which map value element got locked
  * and clears it after bpf_spin_unlock.
  */
 static int process_spin_lock(struct bpf_verifier_env *env, int regno,
@@ -5631,22 +5632,35 @@  static int process_spin_lock(struct bpf_verifier_env *env, int regno,
 		return -EINVAL;
 	}
 	if (is_lock) {
-		if (cur->active_spin_lock) {
+		if (cur->active_lock.ptr) {
 			verbose(env,
 				"Locking two bpf_spin_locks are not allowed\n");
 			return -EINVAL;
 		}
-		cur->active_spin_lock = reg->id;
+		if (map)
+			cur->active_lock.ptr = map;
+		else
+			cur->active_lock.ptr = btf;
+		cur->active_lock.id = reg->id;
 	} else {
-		if (!cur->active_spin_lock) {
+		void *ptr;
+
+		if (map)
+			ptr = map;
+		else
+			ptr = btf;
+
+		if (!cur->active_lock.ptr) {
 			verbose(env, "bpf_spin_unlock without taking a lock\n");
 			return -EINVAL;
 		}
-		if (cur->active_spin_lock != reg->id) {
+		if (cur->active_lock.ptr != ptr ||
+		    cur->active_lock.id != reg->id) {
 			verbose(env, "bpf_spin_unlock of different lock\n");
 			return -EINVAL;
 		}
-		cur->active_spin_lock = 0;
+		cur->active_lock.ptr = NULL;
+		cur->active_lock.id = 0;
 	}
 	return 0;
 }
@@ -10573,8 +10587,8 @@  static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn)
 	    insn->src_reg == BPF_PSEUDO_MAP_IDX_VALUE) {
 		dst_reg->type = PTR_TO_MAP_VALUE;
 		dst_reg->off = aux->map_off;
-		if (btf_record_has_field(map->record, BPF_SPIN_LOCK))
-			dst_reg->id = ++env->id_gen;
+		WARN_ON_ONCE(map->max_entries != 1);
+		/* We want reg->id to be same (0) as map_value is not distinct */
 	} else if (insn->src_reg == BPF_PSEUDO_MAP_FD ||
 		   insn->src_reg == BPF_PSEUDO_MAP_IDX) {
 		dst_reg->type = CONST_PTR_TO_MAP;
@@ -10652,7 +10666,7 @@  static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn)
 		return err;
 	}
 
-	if (env->cur_state->active_spin_lock) {
+	if (env->cur_state->active_lock.ptr) {
 		verbose(env, "BPF_LD_[ABS|IND] cannot be used inside bpf_spin_lock-ed region\n");
 		return -EINVAL;
 	}
@@ -11918,7 +11932,8 @@  static bool states_equal(struct bpf_verifier_env *env,
 	if (old->speculative && !cur->speculative)
 		return false;
 
-	if (old->active_spin_lock != cur->active_spin_lock)
+	if (old->active_lock.ptr != cur->active_lock.ptr ||
+	    old->active_lock.id != cur->active_lock.id)
 		return false;
 
 	/* for states to be equal callsites have to be the same
@@ -12563,7 +12578,7 @@  static int do_check(struct bpf_verifier_env *env)
 					return -EINVAL;
 				}
 
-				if (env->cur_state->active_spin_lock &&
+				if (env->cur_state->active_lock.ptr &&
 				    (insn->src_reg == BPF_PSEUDO_CALL ||
 				     insn->imm != BPF_FUNC_spin_unlock)) {
 					verbose(env, "function calls are not allowed while holding a lock\n");
@@ -12600,7 +12615,7 @@  static int do_check(struct bpf_verifier_env *env)
 					return -EINVAL;
 				}
 
-				if (env->cur_state->active_spin_lock) {
+				if (env->cur_state->active_lock.ptr) {
 					verbose(env, "bpf_spin_unlock is missing\n");
 					return -EINVAL;
 				}