Message ID | 20230815174712.660956-4-thinker.li@gmail.com
---|---
State | Superseded
Delegated to: | BPF
Series | Sleepable BPF programs on cgroup {get,set}sockopt
On 8/15/23 10:47 AM, thinker.li@gmail.com wrote:
> From: Kui-Feng Lee <thinker.li@gmail.com>
>
> Since the buffer pointed by ctx->user_optval is in user space, BPF programs
> in kernel space should not access it directly. They should use
> bpf_copy_from_user() and bpf_copy_to_user() to move data between user and
> kernel space.
>
> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
> ---
>  kernel/bpf/cgroup.c   | 16 +++++++++--
>  kernel/bpf/verifier.c | 66 +++++++++++++++++++++----------------------
>  2 files changed, 47 insertions(+), 35 deletions(-)
>
> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> index b977768a28e5..425094e071ba 100644
> --- a/kernel/bpf/cgroup.c
> +++ b/kernel/bpf/cgroup.c
> @@ -2494,12 +2494,24 @@ static bool cg_sockopt_is_valid_access(int off, int size,
> 	case offsetof(struct bpf_sockopt, optval):
> 		if (size != sizeof(__u64))
> 			return false;
> -		info->reg_type = PTR_TO_PACKET;
> +		if (prog->aux->sleepable)
> +			/* Prohibit access to the memory pointed by optval
> +			 * in sleepable programs.
> +			 */
> +			info->reg_type = PTR_TO_PACKET | MEM_USER;

Is MEM_USER needed to call bpf_copy_from_user()?

Also, from looking at patch 4, the optval could be changed from user memory
to kernel memory during runtime. Is it useful to check MEM_USER during the
verifier load time?

How about just return false here to disallow sleepable prog to use ->optval
and ->optval_end. Enforce sleepable prog to stay with the bpf_dynptr_read/write
API and avoid needing the optval + len > optval_end check in the sleepable
bpf prog. WDYT?

Regarding ->optlen, do you think the sleepable prog can stay with the
bpf_dynptr_size() and bpf_dynptr_adjust() such that no need to expose optlen
to the sleepable prog also.

> +		else
> +			info->reg_type = PTR_TO_PACKET;
> 		break;
> 	case offsetof(struct bpf_sockopt, optval_end):
> 		if (size != sizeof(__u64))
> 			return false;
> -		info->reg_type = PTR_TO_PACKET_END;
> +		if (prog->aux->sleepable)
> +			/* Prohibit access to the memory pointed by
> +			 * optval_end in sleepable programs.
> +			 */
> +			info->reg_type = PTR_TO_PACKET_END | MEM_USER;
> +		else
> +			info->reg_type = PTR_TO_PACKET_END;
> 		break;
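For readers following along, a minimal sketch of the dynptr-only flow suggested above is shown here. It is not taken from the series: bpf_sockopt_dynptr_from() is a hypothetical kfunc standing in for whatever this series exposes to obtain a dynptr over the sockopt buffer, and the sleepable "cgroup/getsockopt.s" section name is likewise an assumption; bpf_dynptr_read() and bpf_dynptr_size() are existing helper/kfunc interfaces.

```c
/* Illustrative sketch only; bpf_sockopt_dynptr_from() and the sleepable
 * "cgroup/getsockopt.s" attach point are assumptions for this example.
 */
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>

extern __u32 bpf_dynptr_size(const struct bpf_dynptr *p) __ksym;
extern int bpf_sockopt_dynptr_from(struct bpf_sockopt *sopt,
				   struct bpf_dynptr *ptr) __ksym; /* assumed */

SEC("cgroup/getsockopt.s")
int getsockopt_sleepable(struct bpf_sockopt *ctx)
{
	struct bpf_dynptr optval;
	__u8 buf[4] = {};

	if (bpf_sockopt_dynptr_from(ctx, &optval))	/* assumed kfunc */
		return 1;				/* let the value pass through */

	/* The length comes from the dynptr, so optlen need not be exposed. */
	if (bpf_dynptr_size(&optval) < sizeof(buf))
		return 1;

	/* No optval/optval_end pointer math and no direct user-memory loads. */
	if (bpf_dynptr_read(buf, sizeof(buf), &optval, 0, 0))
		return 1;

	return 1;
}

char _license[] SEC("license") = "GPL";
```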
On Tue, Aug 15, 2023 at 10:47:10AM -0700, thinker.li@gmail.com wrote:
> From: Kui-Feng Lee <thinker.li@gmail.com>
>
> Since the buffer pointed by ctx->user_optval is in user space, BPF programs
> in kernel space should not access it directly. They should use
> bpf_copy_from_user() and bpf_copy_to_user() to move data between user and
> kernel space.
>
> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
> ---
>  kernel/bpf/cgroup.c   | 16 +++++++++--
>  kernel/bpf/verifier.c | 66 +++++++++++++++++++++----------------------
>  2 files changed, 47 insertions(+), 35 deletions(-)
>
> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> index b977768a28e5..425094e071ba 100644
> --- a/kernel/bpf/cgroup.c
> +++ b/kernel/bpf/cgroup.c
> @@ -2494,12 +2494,24 @@ static bool cg_sockopt_is_valid_access(int off, int size,
> 	case offsetof(struct bpf_sockopt, optval):
> 		if (size != sizeof(__u64))
> 			return false;
> -		info->reg_type = PTR_TO_PACKET;
> +		if (prog->aux->sleepable)
> +			/* Prohibit access to the memory pointed by optval
> +			 * in sleepable programs.
> +			 */
> +			info->reg_type = PTR_TO_PACKET | MEM_USER;
> +		else
> +			info->reg_type = PTR_TO_PACKET;
> 		break;
> 	case offsetof(struct bpf_sockopt, optval_end):
> 		if (size != sizeof(__u64))
> 			return false;
> -		info->reg_type = PTR_TO_PACKET_END;
> +		if (prog->aux->sleepable)
> +			/* Prohibit access to the memory pointed by
> +			 * optval_end in sleepable programs.
> +			 */
> +			info->reg_type = PTR_TO_PACKET_END | MEM_USER;

This doesn't look correct.
Packet memory and user memory are non-overlapping address spaces.
There cannot be a packet memory that is also a user memory.
On 8/16/23 17:55, Martin KaFai Lau wrote:
> On 8/15/23 10:47 AM, thinker.li@gmail.com wrote:
>> From: Kui-Feng Lee <thinker.li@gmail.com>
>>
>> Since the buffer pointed by ctx->user_optval is in user space, BPF programs
>> in kernel space should not access it directly. They should use
>> bpf_copy_from_user() and bpf_copy_to_user() to move data between user and
>> kernel space.
>>
>> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
>> ---
>>  kernel/bpf/cgroup.c   | 16 +++++++++--
>>  kernel/bpf/verifier.c | 66 +++++++++++++++++++++----------------------
>>  2 files changed, 47 insertions(+), 35 deletions(-)
>>
>> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
>> index b977768a28e5..425094e071ba 100644
>> --- a/kernel/bpf/cgroup.c
>> +++ b/kernel/bpf/cgroup.c
>> @@ -2494,12 +2494,24 @@ static bool cg_sockopt_is_valid_access(int off, int size,
>> 	case offsetof(struct bpf_sockopt, optval):
>> 		if (size != sizeof(__u64))
>> 			return false;
>> -		info->reg_type = PTR_TO_PACKET;
>> +		if (prog->aux->sleepable)
>> +			/* Prohibit access to the memory pointed by optval
>> +			 * in sleepable programs.
>> +			 */
>> +			info->reg_type = PTR_TO_PACKET | MEM_USER;
>
> Is MEM_USER needed to call bpf_copy_from_user()?
>
> Also, from looking at patch 4, the optval could be changed from user
> memory to kernel memory during runtime. Is it useful to check MEM_USER
> during the verifier load time?

It has been checked. optval & optval_end are exported and can be used to
compute the size of the user space buffer. However, they cannot be used to
read the content of the user space buffer. To be specific,
__check_mem_access() will fail due to having MEM_USER in reg->type.
However, it is implicit. I will make it explicit if necessary.

>
> How about just return false here to disallow sleepable prog to use
> ->optval and ->optval_end. Enforce sleepable prog to stay with the
> bpf_dynptr_read/write API and avoid needing the optval + len >
> optval_end check in the sleepable bpf prog. WDYT?

In that case, we would need to export another variable to get the size of the
buffer pointed by optval. Instead, I would like to have a new context type,
separate from struct bpf_sockopt, for the sleepable programs. With the new
type, we can remove optval & optval_end completely from the view of the
sleepable ones. They will get errors for accessing optval & optval_end as
early as compile time.

>
> Regarding ->optlen, do you think the sleepable prog can stay with the
> bpf_dynptr_size() and bpf_dynptr_adjust() such that no need to expose
> optlen to the sleepable prog also.
>
>> +		else
>> +			info->reg_type = PTR_TO_PACKET;
>> 		break;
>> 	case offsetof(struct bpf_sockopt, optval_end):
>> 		if (size != sizeof(__u64))
>> 			return false;
>> -		info->reg_type = PTR_TO_PACKET_END;
>> +		if (prog->aux->sleepable)
>> +			/* Prohibit access to the memory pointed by
>> +			 * optval_end in sleepable programs.
>> +			 */
>> +			info->reg_type = PTR_TO_PACKET_END | MEM_USER;
>> +		else
>> +			info->reg_type = PTR_TO_PACKET_END;
>> 		break;
>
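As a point of comparison, a minimal sketch of the access pattern the current patch implies for a sleepable program is shown below: optval is never dereferenced directly, and the data is moved with bpf_copy_from_user(). The sleepable "cgroup/getsockopt.s" section name is an assumption, ctx->optlen is used for the length to keep the example simple (the series also allows deriving the size from optval/optval_end), and bpf_copy_to_user() from the commit message is a helper proposed by this series rather than an existing one, so it is not used here.

```c
/* Minimal sketch, not taken from the series; see the assumptions above. */
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>

SEC("cgroup/getsockopt.s")
int getsockopt_copy(struct bpf_sockopt *ctx)
{
	__u8 buf[16] = {};
	__u32 len = ctx->optlen;

	if (len > sizeof(buf))
		len = sizeof(buf);

	/* Direct loads through ctx->optval would be rejected because the
	 * register carries MEM_USER; copy the bytes into kernel-side stack
	 * memory instead.
	 */
	if (bpf_copy_from_user(buf, len, ctx->optval))
		return 1;

	return 1;
}

char _license[] SEC("license") = "GPL";
```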
On 8/16/23 18:17, Alexei Starovoitov wrote:
> On Tue, Aug 15, 2023 at 10:47:10AM -0700, thinker.li@gmail.com wrote:
>> From: Kui-Feng Lee <thinker.li@gmail.com>
>>
>> Since the buffer pointed by ctx->user_optval is in user space, BPF programs
>> in kernel space should not access it directly. They should use
>> bpf_copy_from_user() and bpf_copy_to_user() to move data between user and
>> kernel space.
>>
>> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
>> ---
>>  kernel/bpf/cgroup.c   | 16 +++++++++--
>>  kernel/bpf/verifier.c | 66 +++++++++++++++++++++----------------------
>>  2 files changed, 47 insertions(+), 35 deletions(-)
>>
>> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
>> index b977768a28e5..425094e071ba 100644
>> --- a/kernel/bpf/cgroup.c
>> +++ b/kernel/bpf/cgroup.c
>> @@ -2494,12 +2494,24 @@ static bool cg_sockopt_is_valid_access(int off, int size,
>> 	case offsetof(struct bpf_sockopt, optval):
>> 		if (size != sizeof(__u64))
>> 			return false;
>> -		info->reg_type = PTR_TO_PACKET;
>> +		if (prog->aux->sleepable)
>> +			/* Prohibit access to the memory pointed by optval
>> +			 * in sleepable programs.
>> +			 */
>> +			info->reg_type = PTR_TO_PACKET | MEM_USER;
>> +		else
>> +			info->reg_type = PTR_TO_PACKET;
>> 		break;
>> 	case offsetof(struct bpf_sockopt, optval_end):
>> 		if (size != sizeof(__u64))
>> 			return false;
>> -		info->reg_type = PTR_TO_PACKET_END;
>> +		if (prog->aux->sleepable)
>> +			/* Prohibit access to the memory pointed by
>> +			 * optval_end in sleepable programs.
>> +			 */
>> +			info->reg_type = PTR_TO_PACKET_END | MEM_USER;
>
> This doesn't look correct.
> Packet memory and user memory are non-overlapping address spaces.
> There cannot be a packet memory that is also a user memory.

Got it! I will find a new type to replace this one.
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index b977768a28e5..425094e071ba 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -2494,12 +2494,24 @@ static bool cg_sockopt_is_valid_access(int off, int size,
 	case offsetof(struct bpf_sockopt, optval):
 		if (size != sizeof(__u64))
 			return false;
-		info->reg_type = PTR_TO_PACKET;
+		if (prog->aux->sleepable)
+			/* Prohibit access to the memory pointed by optval
+			 * in sleepable programs.
+			 */
+			info->reg_type = PTR_TO_PACKET | MEM_USER;
+		else
+			info->reg_type = PTR_TO_PACKET;
 		break;
 	case offsetof(struct bpf_sockopt, optval_end):
 		if (size != sizeof(__u64))
 			return false;
-		info->reg_type = PTR_TO_PACKET_END;
+		if (prog->aux->sleepable)
+			/* Prohibit access to the memory pointed by
+			 * optval_end in sleepable programs.
+			 */
+			info->reg_type = PTR_TO_PACKET_END | MEM_USER;
+		else
+			info->reg_type = PTR_TO_PACKET_END;
 		break;
 	case offsetof(struct bpf_sockopt, retval):
 		if (size != size_default)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 61be432b9420..936a171ea976 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13373,7 +13373,7 @@ static void find_good_pkt_pointers(struct bpf_verifier_state *vstate,
 	 * dst_reg->off is known < MAX_PACKET_OFF, therefore it fits in a u16.
 	 */
 	bpf_for_each_reg_in_vstate(vstate, state, reg, ({
-		if (reg->type == type && reg->id == dst_reg->id)
+		if (base_type(reg->type) == type && reg->id == dst_reg->id)
 			/* keep the maximum range already checked */
 			reg->range = max(reg->range, new_range);
 	}));
@@ -13926,84 +13926,84 @@ static bool try_match_pkt_pointers(const struct bpf_insn *insn,
 	switch (BPF_OP(insn->code)) {
 	case BPF_JGT:
-		if ((dst_reg->type == PTR_TO_PACKET &&
-		     src_reg->type == PTR_TO_PACKET_END) ||
-		    (dst_reg->type == PTR_TO_PACKET_META &&
+		if ((base_type(dst_reg->type) == PTR_TO_PACKET &&
+		     base_type(src_reg->type) == PTR_TO_PACKET_END) ||
+		    (base_type(dst_reg->type) == PTR_TO_PACKET_META &&
 		     reg_is_init_pkt_pointer(src_reg, PTR_TO_PACKET))) {
 			/* pkt_data' > pkt_end, pkt_meta' > pkt_data */
 			find_good_pkt_pointers(this_branch, dst_reg,
-					       dst_reg->type, false);
+					       base_type(dst_reg->type), false);
 			mark_pkt_end(other_branch, insn->dst_reg, true);
-		} else if ((dst_reg->type == PTR_TO_PACKET_END &&
-			    src_reg->type == PTR_TO_PACKET) ||
+		} else if ((base_type(dst_reg->type) == PTR_TO_PACKET_END &&
+			    base_type(src_reg->type) == PTR_TO_PACKET) ||
 			   (reg_is_init_pkt_pointer(dst_reg, PTR_TO_PACKET) &&
-			    src_reg->type == PTR_TO_PACKET_META)) {
+			    base_type(src_reg->type) == PTR_TO_PACKET_META)) {
 			/* pkt_end > pkt_data', pkt_data > pkt_meta' */
 			find_good_pkt_pointers(other_branch, src_reg,
-					       src_reg->type, true);
+					       base_type(src_reg->type), true);
 			mark_pkt_end(this_branch, insn->src_reg, false);
 		} else {
 			return false;
 		}
 		break;
 	case BPF_JLT:
-		if ((dst_reg->type == PTR_TO_PACKET &&
-		     src_reg->type == PTR_TO_PACKET_END) ||
-		    (dst_reg->type == PTR_TO_PACKET_META &&
+		if ((base_type(dst_reg->type) == PTR_TO_PACKET &&
+		     base_type(src_reg->type) == PTR_TO_PACKET_END) ||
+		    (base_type(dst_reg->type) == PTR_TO_PACKET_META &&
 		     reg_is_init_pkt_pointer(src_reg, PTR_TO_PACKET))) {
 			/* pkt_data' < pkt_end, pkt_meta' < pkt_data */
 			find_good_pkt_pointers(other_branch, dst_reg,
-					       dst_reg->type, true);
+					       base_type(dst_reg->type), true);
 			mark_pkt_end(this_branch, insn->dst_reg, false);
-		} else if ((dst_reg->type == PTR_TO_PACKET_END &&
-			    src_reg->type == PTR_TO_PACKET) ||
+		} else if ((base_type(dst_reg->type) == PTR_TO_PACKET_END &&
+			    base_type(src_reg->type) == PTR_TO_PACKET) ||
 			   (reg_is_init_pkt_pointer(dst_reg, PTR_TO_PACKET) &&
-			    src_reg->type == PTR_TO_PACKET_META)) {
+			    base_type(src_reg->type) == PTR_TO_PACKET_META)) {
 			/* pkt_end < pkt_data', pkt_data > pkt_meta' */
 			find_good_pkt_pointers(this_branch, src_reg,
-					       src_reg->type, false);
+					       base_type(src_reg->type), false);
 			mark_pkt_end(other_branch, insn->src_reg, true);
 		} else {
 			return false;
 		}
 		break;
 	case BPF_JGE:
-		if ((dst_reg->type == PTR_TO_PACKET &&
-		     src_reg->type == PTR_TO_PACKET_END) ||
-		    (dst_reg->type == PTR_TO_PACKET_META &&
+		if ((base_type(dst_reg->type) == PTR_TO_PACKET &&
+		     base_type(src_reg->type) == PTR_TO_PACKET_END) ||
+		    (base_type(dst_reg->type) == PTR_TO_PACKET_META &&
 		     reg_is_init_pkt_pointer(src_reg, PTR_TO_PACKET))) {
 			/* pkt_data' >= pkt_end, pkt_meta' >= pkt_data */
 			find_good_pkt_pointers(this_branch, dst_reg,
-					       dst_reg->type, true);
+					       base_type(dst_reg->type), true);
 			mark_pkt_end(other_branch, insn->dst_reg, false);
-		} else if ((dst_reg->type == PTR_TO_PACKET_END &&
-			    src_reg->type == PTR_TO_PACKET) ||
+		} else if ((base_type(dst_reg->type) == PTR_TO_PACKET_END &&
+			    base_type(src_reg->type) == PTR_TO_PACKET) ||
 			   (reg_is_init_pkt_pointer(dst_reg, PTR_TO_PACKET) &&
-			    src_reg->type == PTR_TO_PACKET_META)) {
+			    base_type(src_reg->type) == PTR_TO_PACKET_META)) {
 			/* pkt_end >= pkt_data', pkt_data >= pkt_meta' */
 			find_good_pkt_pointers(other_branch, src_reg,
-					       src_reg->type, false);
+					       base_type(src_reg->type), false);
 			mark_pkt_end(this_branch, insn->src_reg, true);
 		} else {
 			return false;
 		}
 		break;
 	case BPF_JLE:
-		if ((dst_reg->type == PTR_TO_PACKET &&
-		     src_reg->type == PTR_TO_PACKET_END) ||
-		    (dst_reg->type == PTR_TO_PACKET_META &&
+		if ((base_type(dst_reg->type) == PTR_TO_PACKET &&
+		     base_type(src_reg->type) == PTR_TO_PACKET_END) ||
+		    (base_type(dst_reg->type) == PTR_TO_PACKET_META &&
 		     reg_is_init_pkt_pointer(src_reg, PTR_TO_PACKET))) {
 			/* pkt_data' <= pkt_end, pkt_meta' <= pkt_data */
 			find_good_pkt_pointers(other_branch, dst_reg,
-					       dst_reg->type, false);
+					       base_type(dst_reg->type), false);
 			mark_pkt_end(this_branch, insn->dst_reg, true);
-		} else if ((dst_reg->type == PTR_TO_PACKET_END &&
-			    src_reg->type == PTR_TO_PACKET) ||
+		} else if ((base_type(dst_reg->type) == PTR_TO_PACKET_END &&
+			    base_type(src_reg->type) == PTR_TO_PACKET) ||
 			   (reg_is_init_pkt_pointer(dst_reg, PTR_TO_PACKET) &&
-			    src_reg->type == PTR_TO_PACKET_META)) {
+			    base_type(src_reg->type) == PTR_TO_PACKET_META)) {
 			/* pkt_end <= pkt_data', pkt_data <= pkt_meta' */
 			find_good_pkt_pointers(this_branch, src_reg,
-					       src_reg->type, true);
+					       base_type(src_reg->type), true);
 			mark_pkt_end(other_branch, insn->src_reg, false);
 		} else {
 			return false;
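The verifier.c half of the patch exists because a register typed PTR_TO_PACKET | MEM_USER no longer compares equal to plain PTR_TO_PACKET, so the pointer-type checks in try_match_pkt_pointers() and find_good_pkt_pointers() have to mask the flag bits off with base_type() first. The standalone snippet below only illustrates that masking; BPF_BASE_TYPE_BITS == 8 mirrors include/linux/bpf.h, while the numeric enum values and the exact MEM_USER bit position are assumptions made so the example compiles on its own.

```c
/* Standalone illustration of base_type() masking; constants are assumed,
 * only the 8-bit base-type layout follows include/linux/bpf.h.
 */
#include <assert.h>
#include <stdint.h>

#define BPF_BASE_TYPE_BITS	8
#define BPF_BASE_TYPE_MASK	((1u << BPF_BASE_TYPE_BITS) - 1)
#define MEM_USER		(1u << (BPF_BASE_TYPE_BITS + 3))	/* assumed bit */

enum { PTR_TO_PACKET = 8, PTR_TO_PACKET_END = 10 };		/* illustrative values */

static uint32_t base_type(uint32_t type)
{
	return type & BPF_BASE_TYPE_MASK;
}

int main(void)
{
	uint32_t reg_type = PTR_TO_PACKET | MEM_USER;

	/* A plain equality check stops matching once a flag bit is set... */
	assert(reg_type != PTR_TO_PACKET);
	/* ...which is why the packet-pointer checks must mask the flags off. */
	assert(base_type(reg_type) == PTR_TO_PACKET);
	return 0;
}
```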