diff mbox series

[bpf] bpf: fix potential race in tail call compatibility check

Message ID 20211021183951.169905-1-toke@redhat.com (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series [bpf] bpf: fix potential race in tail call compatibility check | expand

Checks

Context Check Description
netdev/cover_letter success Single patches do not need cover letters
netdev/fixes_present success Fixes tag present in non-next series
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for bpf
netdev/subject_prefix success Link
netdev/cc_maintainers warning 9 maintainers not CCed: john.fastabend@gmail.com hawk@kernel.org yhs@fb.com songliubraving@fb.com kafai@fb.com kpsingh@kernel.org davem@davemloft.net andrii@kernel.org kuba@kernel.org
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 11702 this patch: 11702
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Fixes tag looks correct
netdev/checkpatch warning CHECK: Prefer using the BIT macro
netdev/build_allmodconfig_warn success Errors and warnings before: 11331 this patch: 11331
netdev/header_inline success No static functions without inline keyword in header files
bpf/vmtest-bpf-PR success PR summary
bpf/vmtest-bpf success VM_Test

Commit Message

Toke Høiland-Jørgensen Oct. 21, 2021, 6:39 p.m. UTC
Lorenzo noticed that the code testing for program type compatibility of
tail call maps is potentially racy in that two threads could encounter a
map with an unset type simultaneously and both return true even though they
are inserting incompatible programs.

The race window is quite small, but artificially enlarging it by adding a
usleep_range() inside the check in bpf_prog_array_compatible() makes it
trivial to trigger from userspace with a program that does, essentially:

        map_fd = bpf_create_map(BPF_MAP_TYPE_PROG_ARRAY, 4, 4, 2, 0);
        pid = fork();
        if (pid) {
                key = 0;
                value = xdp_fd;
        } else {
                key = 1;
                value = tc_fd;
        }
        err = bpf_map_update_elem(map_fd, &key, &value, 0);

While the race window is small, it has potentially serious ramifications in
that triggering it would allow a BPF program to tail call to a program of a
different type. So let's get rid of it by changing to an atomic update of
the array map aux->type. To do this, move the aux->jited boolean to be
encoded in the top-most bit of the aux->type field, so we can cmpxchg() the
whole thing in one go. The commit in the Fixes tag is the last commit that
touches the code in question.

Fixes: 3324b584b6f6 ("ebpf: misc core cleanup")
Reported-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
We noticed this while reworking the code in question to also deal with XDP
multi-buffer compatibility. We figured we'd send this fix separately, but
we'd like to base some code on it in bpf-next. What's the right procedure
here, should we wait until bpf gets merged into bpf-next, or can we include
this patch in the multibuf series as well?

-Toke

 include/linux/bpf.h  |  9 +++++++--
 kernel/bpf/core.c    | 23 ++++++++++++++++-------
 kernel/bpf/syscall.c |  4 ++--
 3 files changed, 25 insertions(+), 11 deletions(-)

Comments

Alexei Starovoitov Oct. 22, 2021, 12:48 a.m. UTC | #1
On Thu, Oct 21, 2021 at 11:40 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> +       map_type = READ_ONCE(array->aux->type);
> +       if (!map_type) {
> +               /* There's no owner yet where we could check for compatibility.
> +                * Do an atomic swap to prevent racing with another invocation
> +                * of this branch (via simultaneous map_update syscalls).
>                  */
> -               array->aux->type  = fp->type;
> -               array->aux->jited = fp->jited;
> +               if (cmpxchg(&array->aux->type, 0, prog_type))
> +                       return false;

Other fields might be used in the compatibility check in the future.
This hack is too fragile.
Just use a spin_lock.
Toke Høiland-Jørgensen Oct. 22, 2021, 10:54 a.m. UTC | #2
Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Thu, Oct 21, 2021 at 11:40 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> +       map_type = READ_ONCE(array->aux->type);
>> +       if (!map_type) {
>> +               /* There's no owner yet where we could check for compatibility.
>> +                * Do an atomic swap to prevent racing with another invocation
>> +                * of this branch (via simultaneous map_update syscalls).
>>                  */
>> -               array->aux->type  = fp->type;
>> -               array->aux->jited = fp->jited;
>> +               if (cmpxchg(&array->aux->type, 0, prog_type))
>> +                       return false;
>
> Other fields might be used in the compatibility check in the future.
> This hack is too fragile.
> Just use a spin_lock.

Well, yeah, we're adding another field for xdp_mb. I was just going to
eat more bits of the 'type' field, but OK, can switch to a spinlock
instead :)

-Toke
diff mbox series

Patch

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 020a7d5bf470..48cc42063a86 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -923,14 +923,19 @@  struct bpf_prog_aux {
 	};
 };
 
+#define BPF_MAP_JITED_FLAG (1 << 31)
+
 struct bpf_array_aux {
 	/* 'Ownership' of prog array is claimed by the first program that
 	 * is going to use this map or by the first program which FD is
 	 * stored in the map to make sure that all callers and callees have
 	 * the same prog type and JITed flag.
+	 *
+	 * We store the type as a u32 and encode the jited state in the
+	 * most-significant bit (BPF_MAP_JITED_FLAG). This allows setting the
+	 * type atomically without locking.
 	 */
-	enum bpf_prog_type type;
-	bool jited;
+	u32 type;
 	/* Programs with direct jumps into programs part of this array. */
 	struct list_head poke_progs;
 	struct bpf_map *map;
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index c1e7eb3f1876..2811e7723886 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1823,20 +1823,29 @@  static unsigned int __bpf_prog_ret0_warn(const void *ctx,
 bool bpf_prog_array_compatible(struct bpf_array *array,
 			       const struct bpf_prog *fp)
 {
+	u32 map_type, prog_type = fp->type;
+
 	if (fp->kprobe_override)
 		return false;
 
-	if (!array->aux->type) {
-		/* There's no owner yet where we could check for
-		 * compatibility.
+	/* Encode the jited status in the top-most bit of the aux->type field so
+	 * we have a single value we can atomically swap in below
+	 */
+	if (fp->jited)
+		prog_type |= BPF_MAP_JITED_FLAG;
+
+	map_type = READ_ONCE(array->aux->type);
+	if (!map_type) {
+		/* There's no owner yet where we could check for compatibility.
+		 * Do an atomic swap to prevent racing with another invocation
+		 * of this branch (via simultaneous map_update syscalls).
 		 */
-		array->aux->type  = fp->type;
-		array->aux->jited = fp->jited;
+		if (cmpxchg(&array->aux->type, 0, prog_type))
+			return false;
 		return true;
 	}
 
-	return array->aux->type  == fp->type &&
-	       array->aux->jited == fp->jited;
+	return map_type == prog_type;
 }
 
 static int bpf_check_tail_call(const struct bpf_prog *fp)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 4e50c0bfdb7d..45485ebdfb2b 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -543,8 +543,8 @@  static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp)
 
 	if (map->map_type == BPF_MAP_TYPE_PROG_ARRAY) {
 		array = container_of(map, struct bpf_array, map);
-		type  = array->aux->type;
-		jited = array->aux->jited;
+		type  = array->aux->type & ~BPF_MAP_JITED_FLAG;
+		jited = !!(array->aux->type & BPF_MAP_JITED_FLAG);
 	}
 
 	seq_printf(m,