diff mbox series

[bpf,v1,1/2] bpf: Fix state pruning check for PTR_TO_MAP_VALUE

Message ID 20221111202719.982118-2-memxor@gmail.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Headers show
Series Fix map value pruning check | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 33 this patch: 33
netdev/cc_maintainers fail 1 blamed authors not CCed: peterz@infradead.org; 9 maintainers not CCed: sdf@google.com kpsingh@kernel.org haoluo@google.com peterz@infradead.org yhs@fb.com jolsa@kernel.org martin.lau@linux.dev song@kernel.org john.fastabend@gmail.com
netdev/build_clang success Errors and warnings before: 9 this patch: 9
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 31 this patch: 31
netdev/checkpatch warning WARNING: line length of 93 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-PR fail PR summary
bpf/vmtest-bpf-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-VM_Test-7 success Logs for llvm-toolchain
bpf/vmtest-bpf-VM_Test-8 success Logs for set-matrix
bpf/vmtest-bpf-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-VM_Test-3 success Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-VM_Test-5 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-VM_Test-6 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-4 success Logs for build for s390x with gcc
bpf/vmtest-bpf-VM_Test-12 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-13 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-22 fail Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-23 fail Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-27 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-28 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-32 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-33 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-37 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-38 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-9 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-10 success Logs for test_maps on aarch64 with llvm-16
bpf/vmtest-bpf-VM_Test-14 fail Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-15 fail Logs for test_progs on aarch64 with llvm-16
bpf/vmtest-bpf-VM_Test-17 fail Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-18 fail Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-19 fail Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-20 fail Logs for test_progs_no_alu32 on aarch64 with llvm-16
bpf/vmtest-bpf-VM_Test-24 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-25 success Logs for test_progs_no_alu32_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-VM_Test-29 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-30 success Logs for test_progs_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-VM_Test-34 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-35 success Logs for test_verifier on aarch64 with llvm-16
bpf/vmtest-bpf-VM_Test-36 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-VM_Test-16 fail Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-VM_Test-21 fail Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-VM_Test-26 success Logs for test_progs_no_alu32_parallel on s390x with gcc
bpf/vmtest-bpf-VM_Test-31 success Logs for test_progs_parallel on s390x with gcc
bpf/vmtest-bpf-VM_Test-11 success Logs for test_maps on s390x with gcc

Commit Message

Kumar Kartikeya Dwivedi Nov. 11, 2022, 8:27 p.m. UTC
Currently, the verifier preserves reg->id for PTR_TO_MAP_VALUE when it
points to a map value that contains a bpf_spin_lock element (see the
logic in reg_may_point_to_spin_lock and how it is used to skip setting
reg->id to 0 in mark_ptr_or_null_reg). This gives a unique lock ID for
each critical section begun by a bpf_spin_lock helper call.

The same reg->id is matched with env->active_spin_lock during unlock to
determine whether bpf_spin_unlock is called for the same bpf_spin_lock
object.

However, regsafe takes a different approach to safety checks currently.
The comparison of reg->id was explicitly skipped in the commit being
fixed with the reasoning that the reg->id value should have no bearing
on the safety of the program if the old state was verified to be safe.

This however is demonstrably not true (with a selftest having the
verbose working test case in a later commit), with the following pseudo
code:

	r0 = bpf_map_lookup_elem(&map, ...); // id=1
	r6 = r0;
	r0 = bpf_map_lookup_elem(&map, ...); // id=2
	r7 = r0;

	bpf_spin_lock(r1=r6);
	if (cond) // unknown scalar, hence verifier cannot predict branch
		r6 = r7;
	p:
	bpf_spin_unlock(r1=r7);

In the first exploration path, we would want the verifier to skip
over the r6 = r7 assignment so that it reaches BPF_EXIT and the
state branches counter drops to 0 and it becomes a safe verified
state.

The branch target 'p' acts a pruning point, hence states will be
compared. If the old state was verified without assignment, it has
r6 with id=1, but the new state will have r6 with id=2. The other
parts of register, stack, and reference state and any other verifier
state compared in states_equal remain unaffected by the assignment.

Now, when the memcmp fails for r6, the verifier drops to the switch case
and simply memcmp until the id member, and requires the var_off to be
more permissive in the current state. Once establishing this fact, it
returns true and search is pruned.

Essentially, we end up calling unlock for a bpf_spin_lock that was never
locked whenever the condition is true at runtime.

To fix this, also include id in the memcmp comparison. Since ref_obj_id
is never set for PTR_TO_MAP_VALUE, change the offsetof to be until that
member.

Note that by default the reg->id in case of PTR_TO_MAP_VALUE should be 0
(without PTR_MAYBE_NULL), so it should only really impact cases where a
bpf_spin_lock is present in the map element.

Fixes: d83525ca62cf ("bpf: introduce bpf_spin_lock")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/bpf/verifier.c | 33 +++++++++++++++++++++++++++------
 1 file changed, 27 insertions(+), 6 deletions(-)

Comments

Stanislav Fomichev Nov. 13, 2022, 1:58 a.m. UTC | #1
On 11/12, Kumar Kartikeya Dwivedi wrote:
> Currently, the verifier preserves reg->id for PTR_TO_MAP_VALUE when it
> points to a map value that contains a bpf_spin_lock element (see the
> logic in reg_may_point_to_spin_lock and how it is used to skip setting
> reg->id to 0 in mark_ptr_or_null_reg). This gives a unique lock ID for
> each critical section begun by a bpf_spin_lock helper call.

> The same reg->id is matched with env->active_spin_lock during unlock to
> determine whether bpf_spin_unlock is called for the same bpf_spin_lock
> object.

> However, regsafe takes a different approach to safety checks currently.
> The comparison of reg->id was explicitly skipped in the commit being
> fixed with the reasoning that the reg->id value should have no bearing
> on the safety of the program if the old state was verified to be safe.

> This however is demonstrably not true (with a selftest having the
> verbose working test case in a later commit), with the following pseudo
> code:

> 	r0 = bpf_map_lookup_elem(&map, ...); // id=1
> 	r6 = r0;
> 	r0 = bpf_map_lookup_elem(&map, ...); // id=2
> 	r7 = r0;

> 	bpf_spin_lock(r1=r6);
> 	if (cond) // unknown scalar, hence verifier cannot predict branch
> 		r6 = r7;
> 	p:
> 	bpf_spin_unlock(r1=r7);

> In the first exploration path, we would want the verifier to skip
> over the r6 = r7 assignment so that it reaches BPF_EXIT and the
> state branches counter drops to 0 and it becomes a safe verified
> state.

> The branch target 'p' acts a pruning point, hence states will be
> compared. If the old state was verified without assignment, it has
> r6 with id=1, but the new state will have r6 with id=2. The other
> parts of register, stack, and reference state and any other verifier
> state compared in states_equal remain unaffected by the assignment.

> Now, when the memcmp fails for r6, the verifier drops to the switch case
> and simply memcmp until the id member, and requires the var_off to be
> more permissive in the current state. Once establishing this fact, it
> returns true and search is pruned.

> Essentially, we end up calling unlock for a bpf_spin_lock that was never
> locked whenever the condition is true at runtime.

> To fix this, also include id in the memcmp comparison. Since ref_obj_id
> is never set for PTR_TO_MAP_VALUE, change the offsetof to be until that
> member.

> Note that by default the reg->id in case of PTR_TO_MAP_VALUE should be 0
> (without PTR_MAYBE_NULL), so it should only really impact cases where a
> bpf_spin_lock is present in the map element.

> Fixes: d83525ca62cf ("bpf: introduce bpf_spin_lock")
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>

Acked-by: Stanislav Fomichev <sdf@google.com>

Sounds convincing. Also run the selftest to make sure it fails w/o this
patch.

> ---
>   kernel/bpf/verifier.c | 33 +++++++++++++++++++++++++++------
>   1 file changed, 27 insertions(+), 6 deletions(-)

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 264b3dc714cc..7e6bac344d37 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -11559,13 +11559,34 @@ static bool regsafe(struct bpf_verifier_env  
> *env, struct bpf_reg_state *rold,

>   		/* If the new min/max/var_off satisfy the old ones and
>   		 * everything else matches, we are OK.
> -		 * 'id' is not compared, since it's only used for maps with
> -		 * bpf_spin_lock inside map element and in such cases if
> -		 * the rest of the prog is valid for one map element then
> -		 * it's valid for all map elements regardless of the key
> -		 * used in bpf_map_lookup()
> +		 *
> +		 * 'id' must also be compared, since it's used for maps with
> +		 * bpf_spin_lock inside map element and in such cases if the
> +		 * rest of the prog is valid for one map element with a specific
> +		 * id, then the id in the current state must match that of the
> +		 * old state so that any operations on this reg in the rest of
> +		 * the program work correctly.
> +		 *
> +		 * One example is a program doing the following:
> +		 *	r0 = bpf_map_lookup_elem(&map, ...); // id=1
> +		 *	r6 = r0;
> +		 *	r0 = bpf_map_lookup_elem(&map, ...); // id=2
> +		 *	r7 = r0;
> +		 *
> +		 *	bpf_spin_lock(r1=r6);
> +		 *	if (cond)
> +		 *		r6 = r7;
> +		 * p:
> +		 *	bpf_spin_unlock(r1=r6);
> +		 *
> +		 * The label 'p' is a pruning point, hence states for that
> +		 * insn_idx will be compared. If we don't compare the id, the
> +		 * program will pass as the r6 and r7 are otherwise identical
> +		 * during the second pass that compares the already verified
> +		 * state with the one coming from the path having the additional
> +		 * r6 = r7 assignment.
>   		 */
> -		return memcmp(rold, rcur, offsetof(struct bpf_reg_state, id)) == 0 &&
> +		return memcmp(rold, rcur, offsetof(struct bpf_reg_state, ref_obj_id))  
> == 0 &&
>   		       range_within(rold, rcur) &&
>   		       tnum_in(rold->var_off, rcur->var_off);
>   	case PTR_TO_PACKET_META:
> --
> 2.38.1
Andrii Nakryiko Nov. 23, 2022, 9:01 p.m. UTC | #2
On Fri, Nov 11, 2022 at 12:27 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> Currently, the verifier preserves reg->id for PTR_TO_MAP_VALUE when it
> points to a map value that contains a bpf_spin_lock element (see the
> logic in reg_may_point_to_spin_lock and how it is used to skip setting
> reg->id to 0 in mark_ptr_or_null_reg). This gives a unique lock ID for
> each critical section begun by a bpf_spin_lock helper call.
>
> The same reg->id is matched with env->active_spin_lock during unlock to
> determine whether bpf_spin_unlock is called for the same bpf_spin_lock
> object.
>
> However, regsafe takes a different approach to safety checks currently.
> The comparison of reg->id was explicitly skipped in the commit being
> fixed with the reasoning that the reg->id value should have no bearing
> on the safety of the program if the old state was verified to be safe.
>
> This however is demonstrably not true (with a selftest having the
> verbose working test case in a later commit), with the following pseudo
> code:
>
>         r0 = bpf_map_lookup_elem(&map, ...); // id=1
>         r6 = r0;
>         r0 = bpf_map_lookup_elem(&map, ...); // id=2
>         r7 = r0;
>
>         bpf_spin_lock(r1=r6);
>         if (cond) // unknown scalar, hence verifier cannot predict branch
>                 r6 = r7;
>         p:
>         bpf_spin_unlock(r1=r7);
>
> In the first exploration path, we would want the verifier to skip
> over the r6 = r7 assignment so that it reaches BPF_EXIT and the
> state branches counter drops to 0 and it becomes a safe verified
> state.
>
> The branch target 'p' acts a pruning point, hence states will be
> compared. If the old state was verified without assignment, it has
> r6 with id=1, but the new state will have r6 with id=2. The other
> parts of register, stack, and reference state and any other verifier
> state compared in states_equal remain unaffected by the assignment.
>
> Now, when the memcmp fails for r6, the verifier drops to the switch case
> and simply memcmp until the id member, and requires the var_off to be
> more permissive in the current state. Once establishing this fact, it
> returns true and search is pruned.
>
> Essentially, we end up calling unlock for a bpf_spin_lock that was never
> locked whenever the condition is true at runtime.
>
> To fix this, also include id in the memcmp comparison. Since ref_obj_id
> is never set for PTR_TO_MAP_VALUE, change the offsetof to be until that
> member.
>
> Note that by default the reg->id in case of PTR_TO_MAP_VALUE should be 0
> (without PTR_MAYBE_NULL), so it should only really impact cases where a
> bpf_spin_lock is present in the map element.
>
> Fixes: d83525ca62cf ("bpf: introduce bpf_spin_lock")
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  kernel/bpf/verifier.c | 33 +++++++++++++++++++++++++++------
>  1 file changed, 27 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 264b3dc714cc..7e6bac344d37 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -11559,13 +11559,34 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
>
>                 /* If the new min/max/var_off satisfy the old ones and
>                  * everything else matches, we are OK.
> -                * 'id' is not compared, since it's only used for maps with
> -                * bpf_spin_lock inside map element and in such cases if
> -                * the rest of the prog is valid for one map element then
> -                * it's valid for all map elements regardless of the key
> -                * used in bpf_map_lookup()
> +                *
> +                * 'id' must also be compared, since it's used for maps with
> +                * bpf_spin_lock inside map element and in such cases if the
> +                * rest of the prog is valid for one map element with a specific
> +                * id, then the id in the current state must match that of the
> +                * old state so that any operations on this reg in the rest of
> +                * the program work correctly.
> +                *
> +                * One example is a program doing the following:
> +                *      r0 = bpf_map_lookup_elem(&map, ...); // id=1
> +                *      r6 = r0;
> +                *      r0 = bpf_map_lookup_elem(&map, ...); // id=2
> +                *      r7 = r0;
> +                *
> +                *      bpf_spin_lock(r1=r6);
> +                *      if (cond)
> +                *              r6 = r7;
> +                * p:
> +                *      bpf_spin_unlock(r1=r6);
> +                *
> +                * The label 'p' is a pruning point, hence states for that
> +                * insn_idx will be compared. If we don't compare the id, the
> +                * program will pass as the r6 and r7 are otherwise identical
> +                * during the second pass that compares the already verified
> +                * state with the one coming from the path having the additional
> +                * r6 = r7 assignment.
>                  */
> -               return memcmp(rold, rcur, offsetof(struct bpf_reg_state, id)) == 0 &&
> +               return memcmp(rold, rcur, offsetof(struct bpf_reg_state, ref_obj_id)) == 0 &&

I don't think it's right to check ids exactly. I do think that this
check is missing check_ids(), though, but its unrelated to the problem
you are trying to solve.

As for the problem with spin_lock above. Again, there is nothing wrong
about doing the whole id remapping idea, it just establishes
equivalence between registers/slots without requiring absolute values
of IDs to be exact, rather it finds a consistent and correct
permutation.

I think the fix is what Ed suggested. Where we currently check

old->active_lock.ptr != cur->active_lock.ptr || old->active_lock.id !=
cur->active_lock.id

at least for IDs we should do the comparison with check_ids().


But another thing which I'm not sure about and jumped on me when I
looked at the code is that we reset ID map for each function frame,
not preserving the mapping across all frames. It might be sufficient,
but seems iffy. Also, I think we should take idmap into account when
doing refsafe(), which would require "global" idmap across all frames.

Thoughts?


>                        range_within(rold, rcur) &&
>                        tnum_in(rold->var_off, rcur->var_off);
>         case PTR_TO_PACKET_META:
> --
> 2.38.1
>
diff mbox series

Patch

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 264b3dc714cc..7e6bac344d37 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -11559,13 +11559,34 @@  static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
 
 		/* If the new min/max/var_off satisfy the old ones and
 		 * everything else matches, we are OK.
-		 * 'id' is not compared, since it's only used for maps with
-		 * bpf_spin_lock inside map element and in such cases if
-		 * the rest of the prog is valid for one map element then
-		 * it's valid for all map elements regardless of the key
-		 * used in bpf_map_lookup()
+		 *
+		 * 'id' must also be compared, since it's used for maps with
+		 * bpf_spin_lock inside map element and in such cases if the
+		 * rest of the prog is valid for one map element with a specific
+		 * id, then the id in the current state must match that of the
+		 * old state so that any operations on this reg in the rest of
+		 * the program work correctly.
+		 *
+		 * One example is a program doing the following:
+		 *	r0 = bpf_map_lookup_elem(&map, ...); // id=1
+		 *	r6 = r0;
+		 *	r0 = bpf_map_lookup_elem(&map, ...); // id=2
+		 *	r7 = r0;
+		 *
+		 *	bpf_spin_lock(r1=r6);
+		 *	if (cond)
+		 *		r6 = r7;
+		 * p:
+		 *	bpf_spin_unlock(r1=r6);
+		 *
+		 * The label 'p' is a pruning point, hence states for that
+		 * insn_idx will be compared. If we don't compare the id, the
+		 * program will pass as the r6 and r7 are otherwise identical
+		 * during the second pass that compares the already verified
+		 * state with the one coming from the path having the additional
+		 * r6 = r7 assignment.
 		 */
-		return memcmp(rold, rcur, offsetof(struct bpf_reg_state, id)) == 0 &&
+		return memcmp(rold, rcur, offsetof(struct bpf_reg_state, ref_obj_id)) == 0 &&
 		       range_within(rold, rcur) &&
 		       tnum_in(rold->var_off, rcur->var_off);
 	case PTR_TO_PACKET_META: