Message ID | 20221015092448.117563-1-shaozhengchao@huawei.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Headers | show |
Series | [bpf-next] bpf: fix issue that packet only contains l2 is dropped | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Clearly marked for bpf-next |
netdev/fixes_present | success | Fixes tag not required for -next series |
netdev/subject_prefix | success | Link |
netdev/cover_letter | success | Single patches do not need cover letters |
netdev/patch_count | success | Link |
netdev/header_inline | success | No static functions without inline keyword in header files |
netdev/build_32bit | success | Errors and warnings before: 35 this patch: 35 |
netdev/cc_maintainers | success | CCed 18 of 18 maintainers |
netdev/build_clang | success | Errors and warnings before: 5 this patch: 5 |
netdev/module_param | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Signed-off-by tag matches author and committer |
netdev/check_selftest | success | No net selftest shell script |
netdev/verify_fixes | success | Fixes tag looks correct |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 35 this patch: 35 |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 18 lines checked |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/source_inline | success | Was 0 now: 0 |
bpf/vmtest-bpf-next-VM_Test-4 | success | Logs for llvm-toolchain |
bpf/vmtest-bpf-next-VM_Test-5 | success | Logs for set-matrix |
bpf/vmtest-bpf-next-VM_Test-2 | success | Logs for build for x86_64 with gcc |
bpf/vmtest-bpf-next-VM_Test-3 | success | Logs for build for x86_64 with llvm-16 |
bpf/vmtest-bpf-next-VM_Test-1 | success | Logs for build for s390x with gcc |
bpf/vmtest-bpf-next-VM_Test-16 | success | Logs for test_verifier on x86_64 with gcc |
bpf/vmtest-bpf-next-VM_Test-17 | success | Logs for test_verifier on x86_64 with llvm-16 |
bpf/vmtest-bpf-next-VM_Test-7 | success | Logs for test_maps on x86_64 with gcc |
bpf/vmtest-bpf-next-VM_Test-8 | success | Logs for test_maps on x86_64 with llvm-16 |
bpf/vmtest-bpf-next-VM_Test-10 | success | Logs for test_progs on x86_64 with gcc |
bpf/vmtest-bpf-next-VM_Test-11 | success | Logs for test_progs on x86_64 with llvm-16 |
bpf/vmtest-bpf-next-VM_Test-13 | success | Logs for test_progs_no_alu32 on x86_64 with gcc |
bpf/vmtest-bpf-next-VM_Test-14 | fail | Logs for test_progs_no_alu32 on x86_64 with llvm-16 |
bpf/vmtest-bpf-next-VM_Test-6 | fail | Logs for test_maps on s390x with gcc |
bpf/vmtest-bpf-next-VM_Test-15 | success | Logs for test_verifier on s390x with gcc |
bpf/vmtest-bpf-next-VM_Test-9 | fail | Logs for test_progs on s390x with gcc |
bpf/vmtest-bpf-next-PR | fail | PR summary |
bpf/vmtest-bpf-next-VM_Test-12 | success | Logs for test_progs_no_alu32 on s390x with gcc |
On Sat, Oct 15, 2022 at 2:16 AM Zhengchao Shao <shaozhengchao@huawei.com> wrote: > > As [0] see, bpf_prog_test_run_skb() should allow user space to forward > 14-bytes packet via BPF_PROG_RUN instead of dropping packet directly. > So fix it. > > 0: https://github.com/cilium/ebpf/commit/a38fb6b5a46ab3b5639ea4d421232a10013596c0 > > Fixes: fd1894224407 ("bpf: Don't redirect packets with invalid pkt_len") > Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> > --- > net/bpf/test_run.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c > index 13d578ce2a09..aa1b49f19ca3 100644 > --- a/net/bpf/test_run.c > +++ b/net/bpf/test_run.c > @@ -979,9 +979,6 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb) > { > struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb; > > - if (!skb->len) > - return -EINVAL; > - > if (!__skb) > return 0; > > @@ -1102,6 +1099,9 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr, > if (IS_ERR(data)) > return PTR_ERR(data); > > + if (size == ETH_HLEN) > + is_l2 = true; > + Don't think this will work? That is_l2 is there to expose proper l2/l3 skb for specific hooks; we can't suddenly start exposing l2 headers to the hooks that don't expect it. Does it make sense to start with a small reproducer that triggers the issue first? We can have a couple of cases for len=0/ETH_HLEN-1/ETH_HLEN+1 and trigger them from the bpf program that redirects to different devices (to trigger dev_is_mac_header_xmit). > ctx = bpf_ctx_init(kattr, sizeof(struct __sk_buff)); > if (IS_ERR(ctx)) { > kfree(data); > -- > 2.17.1 >
On 2022/10/18 0:36, Stanislav Fomichev wrote: > On Sat, Oct 15, 2022 at 2:16 AM Zhengchao Shao <shaozhengchao@huawei.com> wrote: >> >> As [0] see, bpf_prog_test_run_skb() should allow user space to forward >> 14-bytes packet via BPF_PROG_RUN instead of dropping packet directly. >> So fix it. >> >> 0: https://github.com/cilium/ebpf/commit/a38fb6b5a46ab3b5639ea4d421232a10013596c0 >> >> Fixes: fd1894224407 ("bpf: Don't redirect packets with invalid pkt_len") >> Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> >> --- >> net/bpf/test_run.c | 6 +++--- >> 1 file changed, 3 insertions(+), 3 deletions(-) >> >> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c >> index 13d578ce2a09..aa1b49f19ca3 100644 >> --- a/net/bpf/test_run.c >> +++ b/net/bpf/test_run.c >> @@ -979,9 +979,6 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb) >> { >> struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb; >> >> - if (!skb->len) >> - return -EINVAL; >> - >> if (!__skb) >> return 0; >> >> @@ -1102,6 +1099,9 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr, >> if (IS_ERR(data)) >> return PTR_ERR(data); >> >> + if (size == ETH_HLEN) >> + is_l2 = true; >> + > > Don't think this will work? That is_l2 is there to expose proper l2/l3 > skb for specific hooks; we can't suddenly start exposing l2 headers to > the hooks that don't expect it. > Does it make sense to start with a small reproducer that triggers the > issue first? We can have a couple of cases for > len=0/ETH_HLEN-1/ETH_HLEN+1 and trigger them from the bpf program that > redirects to different devices (to trigger dev_is_mac_header_xmit). > > Hi Stanislav: Thank you for your review. Is_l2 is the flag of a specific hook. Therefore, do you mean that if skb->len is equal to 0, just add the length back? > > > >> ctx = bpf_ctx_init(kattr, sizeof(struct __sk_buff)); >> if (IS_ERR(ctx)) { >> kfree(data); >> -- >> 2.17.1 >>
On Wed, Oct 19, 2022 at 6:47 PM shaozhengchao <shaozhengchao@huawei.com> wrote: > > > > On 2022/10/18 0:36, Stanislav Fomichev wrote: > > On Sat, Oct 15, 2022 at 2:16 AM Zhengchao Shao <shaozhengchao@huawei.com> wrote: > >> > >> As [0] see, bpf_prog_test_run_skb() should allow user space to forward > >> 14-bytes packet via BPF_PROG_RUN instead of dropping packet directly. > >> So fix it. > >> > >> 0: https://github.com/cilium/ebpf/commit/a38fb6b5a46ab3b5639ea4d421232a10013596c0 > >> > >> Fixes: fd1894224407 ("bpf: Don't redirect packets with invalid pkt_len") > >> Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> > >> --- > >> net/bpf/test_run.c | 6 +++--- > >> 1 file changed, 3 insertions(+), 3 deletions(-) > >> > >> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c > >> index 13d578ce2a09..aa1b49f19ca3 100644 > >> --- a/net/bpf/test_run.c > >> +++ b/net/bpf/test_run.c > >> @@ -979,9 +979,6 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb) > >> { > >> struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb; > >> > >> - if (!skb->len) > >> - return -EINVAL; > >> - > >> if (!__skb) > >> return 0; > >> > >> @@ -1102,6 +1099,9 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr, > >> if (IS_ERR(data)) > >> return PTR_ERR(data); > >> > >> + if (size == ETH_HLEN) > >> + is_l2 = true; > >> + > > > > Don't think this will work? That is_l2 is there to expose proper l2/l3 > > skb for specific hooks; we can't suddenly start exposing l2 headers to > > the hooks that don't expect it. > > Does it make sense to start with a small reproducer that triggers the > > issue first? We can have a couple of cases for > > len=0/ETH_HLEN-1/ETH_HLEN+1 and trigger them from the bpf program that > > redirects to different devices (to trigger dev_is_mac_header_xmit). > > > > > Hi Stanislav: > Thank you for your review. Is_l2 is the flag of a specific > hook. Therefore, do you mean that if skb->len is equal to 0, just > add the length back? Not sure I understand your question. All I'm saying is - you can't flip that flag arbitrarily. This flag depends on the attach point that you're running the prog against. Some attach points expect packets with l2, some expect packets without l2. What about starting with a small reproducer? Does it make sense to create a small selftest that adds net namespace + fq_codel + bpf_prog_test run and do redirect ingress/egress with len 0/1...tcphdr? Because I'm not sure I 100% understand whether it's only len=0 that's problematic or some other combination as well? > > > > > > > >> ctx = bpf_ctx_init(kattr, sizeof(struct __sk_buff)); > >> if (IS_ERR(ctx)) { > >> kfree(data); > >> -- > >> 2.17.1 > >>
On 2022/10/21 1:45, Stanislav Fomichev wrote: > On Wed, Oct 19, 2022 at 6:47 PM shaozhengchao <shaozhengchao@huawei.com> wrote: >> >> >> >> On 2022/10/18 0:36, Stanislav Fomichev wrote: >>> On Sat, Oct 15, 2022 at 2:16 AM Zhengchao Shao <shaozhengchao@huawei.com> wrote: >>>> >>>> As [0] see, bpf_prog_test_run_skb() should allow user space to forward >>>> 14-bytes packet via BPF_PROG_RUN instead of dropping packet directly. >>>> So fix it. >>>> >>>> 0: https://github.com/cilium/ebpf/commit/a38fb6b5a46ab3b5639ea4d421232a10013596c0 >>>> >>>> Fixes: fd1894224407 ("bpf: Don't redirect packets with invalid pkt_len") >>>> Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> >>>> --- >>>> net/bpf/test_run.c | 6 +++--- >>>> 1 file changed, 3 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c >>>> index 13d578ce2a09..aa1b49f19ca3 100644 >>>> --- a/net/bpf/test_run.c >>>> +++ b/net/bpf/test_run.c >>>> @@ -979,9 +979,6 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb) >>>> { >>>> struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb; >>>> >>>> - if (!skb->len) >>>> - return -EINVAL; >>>> - >>>> if (!__skb) >>>> return 0; >>>> >>>> @@ -1102,6 +1099,9 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr, >>>> if (IS_ERR(data)) >>>> return PTR_ERR(data); >>>> >>>> + if (size == ETH_HLEN) >>>> + is_l2 = true; >>>> + >>> >>> Don't think this will work? That is_l2 is there to expose proper l2/l3 >>> skb for specific hooks; we can't suddenly start exposing l2 headers to >>> the hooks that don't expect it. >>> Does it make sense to start with a small reproducer that triggers the >>> issue first? We can have a couple of cases for >>> len=0/ETH_HLEN-1/ETH_HLEN+1 and trigger them from the bpf program that >>> redirects to different devices (to trigger dev_is_mac_header_xmit). >>> >>> >> Hi Stanislav: >> Thank you for your review. Is_l2 is the flag of a specific >> hook. Therefore, do you mean that if skb->len is equal to 0, just >> add the length back? > > Not sure I understand your question. All I'm saying is - you can't > flip that flag arbitrarily. This flag depends on the attach point that > you're running the prog against. Some attach points expect packets > with l2, some expect packets without l2. > > What about starting with a small reproducer? Does it make sense to > create a small selftest that adds net namespace + fq_codel + > bpf_prog_test run and do redirect ingress/egress with len > 0/1...tcphdr? Because I'm not sure I 100% understand whether it's only > len=0 that's problematic or some other combination as well? > yes, only skb->len = 0 will cause null-ptr-deref issue. The following is the process of triggering the problem: enqueue a skb: fq_codel_enqueue() ... idx = fq_codel_classify() --->if idx != 0 flow = &q->flows[idx]; flow_queue_add(flow, skb); --->add skb to flow[idex] q->backlogs[idx] += qdisc_pkt_len(skb); --->backlogs = 0 ... fq_codel_drop() --->set sch->limit = 0, always drop packets ... idx = i --->becuase backlogs in every flows is 0, so idx = 0 ... flow = &q->flows[idx]; --->get idx=0 flow ... dequeue_head() skb = flow->head; --->flow->head = NULL flow->head = skb->next; --->cause null-ptr-deref So, if skb->len !=0,fq_codel_drop() could get the correct idx, and then skb!=NULL, it will be OK. Maybe, I will fix it in fq_codel. But, as I know, skb->len = 0 is just invalid packet. I prefer to add the length back, like bellow: if (is_l2 || !skb->len) __skb_push(skb, hh_len); is it OK? >>> >>> >>> >>>> ctx = bpf_ctx_init(kattr, sizeof(struct __sk_buff)); >>>> if (IS_ERR(ctx)) { >>>> kfree(data); >>>> -- >>>> 2.17.1 >>>>
On Fri, Oct 21, 2022 at 12:25 AM shaozhengchao <shaozhengchao@huawei.com> wrote: > > > > On 2022/10/21 1:45, Stanislav Fomichev wrote: > > On Wed, Oct 19, 2022 at 6:47 PM shaozhengchao <shaozhengchao@huawei.com> wrote: > >> > >> > >> > >> On 2022/10/18 0:36, Stanislav Fomichev wrote: > >>> On Sat, Oct 15, 2022 at 2:16 AM Zhengchao Shao <shaozhengchao@huawei.com> wrote: > >>>> > >>>> As [0] see, bpf_prog_test_run_skb() should allow user space to forward > >>>> 14-bytes packet via BPF_PROG_RUN instead of dropping packet directly. > >>>> So fix it. > >>>> > >>>> 0: https://github.com/cilium/ebpf/commit/a38fb6b5a46ab3b5639ea4d421232a10013596c0 > >>>> > >>>> Fixes: fd1894224407 ("bpf: Don't redirect packets with invalid pkt_len") > >>>> Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> > >>>> --- > >>>> net/bpf/test_run.c | 6 +++--- > >>>> 1 file changed, 3 insertions(+), 3 deletions(-) > >>>> > >>>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c > >>>> index 13d578ce2a09..aa1b49f19ca3 100644 > >>>> --- a/net/bpf/test_run.c > >>>> +++ b/net/bpf/test_run.c > >>>> @@ -979,9 +979,6 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb) > >>>> { > >>>> struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb; > >>>> > >>>> - if (!skb->len) > >>>> - return -EINVAL; > >>>> - > >>>> if (!__skb) > >>>> return 0; > >>>> > >>>> @@ -1102,6 +1099,9 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr, > >>>> if (IS_ERR(data)) > >>>> return PTR_ERR(data); > >>>> > >>>> + if (size == ETH_HLEN) > >>>> + is_l2 = true; > >>>> + > >>> > >>> Don't think this will work? That is_l2 is there to expose proper l2/l3 > >>> skb for specific hooks; we can't suddenly start exposing l2 headers to > >>> the hooks that don't expect it. > >>> Does it make sense to start with a small reproducer that triggers the > >>> issue first? We can have a couple of cases for > >>> len=0/ETH_HLEN-1/ETH_HLEN+1 and trigger them from the bpf program that > >>> redirects to different devices (to trigger dev_is_mac_header_xmit). > >>> > >>> > >> Hi Stanislav: > >> Thank you for your review. Is_l2 is the flag of a specific > >> hook. Therefore, do you mean that if skb->len is equal to 0, just > >> add the length back? > > > > Not sure I understand your question. All I'm saying is - you can't > > flip that flag arbitrarily. This flag depends on the attach point that > > you're running the prog against. Some attach points expect packets > > with l2, some expect packets without l2. > > > > What about starting with a small reproducer? Does it make sense to > > create a small selftest that adds net namespace + fq_codel + > > bpf_prog_test run and do redirect ingress/egress with len > > 0/1...tcphdr? Because I'm not sure I 100% understand whether it's only > > len=0 that's problematic or some other combination as well? > > > yes, only skb->len = 0 will cause null-ptr-deref issue. > The following is the process of triggering the problem: > enqueue a skb: > fq_codel_enqueue() > ... > idx = fq_codel_classify() --->if idx != 0 > flow = &q->flows[idx]; > flow_queue_add(flow, skb); --->add skb to flow[idex] > q->backlogs[idx] += qdisc_pkt_len(skb); --->backlogs = 0 > ... > fq_codel_drop() --->set sch->limit = 0, always > drop packets > ... > idx = i --->becuase backlogs in every > flows is 0, so idx = 0 > ... > flow = &q->flows[idx]; --->get idx=0 flow > ... > dequeue_head() > skb = flow->head; --->flow->head = NULL > flow->head = skb->next; --->cause null-ptr-deref > So, if skb->len !=0,fq_codel_drop() could get the correct idx, and > then skb!=NULL, it will be OK. > Maybe, I will fix it in fq_codel. I think the consensus here is that the stack, in general, doesn't expect the packets like this. So there are probably more broken things besides fq_codel. Thus, it's better if we remove the ability to generate them from the bpf side instead of fixing the individual users like fq_codel. > But, as I know, skb->len = 0 is just invalid packet. I prefer to add the > length back, like bellow: > if (is_l2 || !skb->len) > __skb_push(skb, hh_len); > is it OK? Probably not? Looking at the original syzkaller report, prog_type is BPF_PROG_TYPE_LWT_XMIT which does expect a packet without l2 header. Can we do something like: if (!is_l2 && !skb->len) { // append some dummy byte to the skb ? } } > >>> > >>> > >>> > >>>> ctx = bpf_ctx_init(kattr, sizeof(struct __sk_buff)); > >>>> if (IS_ERR(ctx)) { > >>>> kfree(data); > >>>> -- > >>>> 2.17.1 > >>>>
On 2022/10/22 2:16, Stanislav Fomichev wrote: > On Fri, Oct 21, 2022 at 12:25 AM shaozhengchao <shaozhengchao@huawei.com> wrote: >> >> >> >> On 2022/10/21 1:45, Stanislav Fomichev wrote: >>> On Wed, Oct 19, 2022 at 6:47 PM shaozhengchao <shaozhengchao@huawei.com> wrote: >>>> >>>> >>>> >>>> On 2022/10/18 0:36, Stanislav Fomichev wrote: >>>>> On Sat, Oct 15, 2022 at 2:16 AM Zhengchao Shao <shaozhengchao@huawei.com> wrote: >>>>>> >>>>>> As [0] see, bpf_prog_test_run_skb() should allow user space to forward >>>>>> 14-bytes packet via BPF_PROG_RUN instead of dropping packet directly. >>>>>> So fix it. >>>>>> >>>>>> 0: https://github.com/cilium/ebpf/commit/a38fb6b5a46ab3b5639ea4d421232a10013596c0 >>>>>> >>>>>> Fixes: fd1894224407 ("bpf: Don't redirect packets with invalid pkt_len") >>>>>> Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> >>>>>> --- >>>>>> net/bpf/test_run.c | 6 +++--- >>>>>> 1 file changed, 3 insertions(+), 3 deletions(-) >>>>>> >>>>>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c >>>>>> index 13d578ce2a09..aa1b49f19ca3 100644 >>>>>> --- a/net/bpf/test_run.c >>>>>> +++ b/net/bpf/test_run.c >>>>>> @@ -979,9 +979,6 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb) >>>>>> { >>>>>> struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb; >>>>>> >>>>>> - if (!skb->len) >>>>>> - return -EINVAL; >>>>>> - >>>>>> if (!__skb) >>>>>> return 0; >>>>>> >>>>>> @@ -1102,6 +1099,9 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr, >>>>>> if (IS_ERR(data)) >>>>>> return PTR_ERR(data); >>>>>> >>>>>> + if (size == ETH_HLEN) >>>>>> + is_l2 = true; >>>>>> + >>>>> >>>>> Don't think this will work? That is_l2 is there to expose proper l2/l3 >>>>> skb for specific hooks; we can't suddenly start exposing l2 headers to >>>>> the hooks that don't expect it. >>>>> Does it make sense to start with a small reproducer that triggers the >>>>> issue first? We can have a couple of cases for >>>>> len=0/ETH_HLEN-1/ETH_HLEN+1 and trigger them from the bpf program that >>>>> redirects to different devices (to trigger dev_is_mac_header_xmit). >>>>> >>>>> >>>> Hi Stanislav: >>>> Thank you for your review. Is_l2 is the flag of a specific >>>> hook. Therefore, do you mean that if skb->len is equal to 0, just >>>> add the length back? >>> >>> Not sure I understand your question. All I'm saying is - you can't >>> flip that flag arbitrarily. This flag depends on the attach point that >>> you're running the prog against. Some attach points expect packets >>> with l2, some expect packets without l2. >>> >>> What about starting with a small reproducer? Does it make sense to >>> create a small selftest that adds net namespace + fq_codel + >>> bpf_prog_test run and do redirect ingress/egress with len >>> 0/1...tcphdr? Because I'm not sure I 100% understand whether it's only >>> len=0 that's problematic or some other combination as well? >>> >> yes, only skb->len = 0 will cause null-ptr-deref issue. >> The following is the process of triggering the problem: >> enqueue a skb: >> fq_codel_enqueue() >> ... >> idx = fq_codel_classify() --->if idx != 0 >> flow = &q->flows[idx]; >> flow_queue_add(flow, skb); --->add skb to flow[idex] >> q->backlogs[idx] += qdisc_pkt_len(skb); --->backlogs = 0 >> ... >> fq_codel_drop() --->set sch->limit = 0, always >> drop packets >> ... >> idx = i --->becuase backlogs in every >> flows is 0, so idx = 0 >> ... >> flow = &q->flows[idx]; --->get idx=0 flow >> ... >> dequeue_head() >> skb = flow->head; --->flow->head = NULL >> flow->head = skb->next; --->cause null-ptr-deref >> So, if skb->len !=0,fq_codel_drop() could get the correct idx, and >> then skb!=NULL, it will be OK. >> Maybe, I will fix it in fq_codel. > > I think the consensus here is that the stack, in general, doesn't > expect the packets like this. So there are probably more broken things > besides fq_codel. Thus, it's better if we remove the ability to > generate them from the bpf side instead of fixing the individual users > like fq_codel. > >> But, as I know, skb->len = 0 is just invalid packet. I prefer to add the >> length back, like bellow: >> if (is_l2 || !skb->len) >> __skb_push(skb, hh_len); >> is it OK? > > Probably not? > > Looking at the original syzkaller report, prog_type is > BPF_PROG_TYPE_LWT_XMIT which does expect a packet without l2 header. > Can we do something like: > > if (!is_l2 && !skb->len) { > // append some dummy byte to the skb ? > } > > I pad one byte, and test OK. if (!is_l2 && !skb->len) __skb_push(skb, 1); Does it look OK to you? > } > >>>>> >>>>> >>>>> >>>>>> ctx = bpf_ctx_init(kattr, sizeof(struct __sk_buff)); >>>>>> if (IS_ERR(ctx)) { >>>>>> kfree(data); >>>>>> -- >>>>>> 2.17.1 >>>>>>
On Sat, Oct 22, 2022 at 4:36 AM shaozhengchao <shaozhengchao@huawei.com> wrote: > > > > On 2022/10/22 2:16, Stanislav Fomichev wrote: > > On Fri, Oct 21, 2022 at 12:25 AM shaozhengchao <shaozhengchao@huawei.com> wrote: > >> > >> > >> > >> On 2022/10/21 1:45, Stanislav Fomichev wrote: > >>> On Wed, Oct 19, 2022 at 6:47 PM shaozhengchao <shaozhengchao@huawei.com> wrote: > >>>> > >>>> > >>>> > >>>> On 2022/10/18 0:36, Stanislav Fomichev wrote: > >>>>> On Sat, Oct 15, 2022 at 2:16 AM Zhengchao Shao <shaozhengchao@huawei.com> wrote: > >>>>>> > >>>>>> As [0] see, bpf_prog_test_run_skb() should allow user space to forward > >>>>>> 14-bytes packet via BPF_PROG_RUN instead of dropping packet directly. > >>>>>> So fix it. > >>>>>> > >>>>>> 0: https://github.com/cilium/ebpf/commit/a38fb6b5a46ab3b5639ea4d421232a10013596c0 > >>>>>> > >>>>>> Fixes: fd1894224407 ("bpf: Don't redirect packets with invalid pkt_len") > >>>>>> Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> > >>>>>> --- > >>>>>> net/bpf/test_run.c | 6 +++--- > >>>>>> 1 file changed, 3 insertions(+), 3 deletions(-) > >>>>>> > >>>>>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c > >>>>>> index 13d578ce2a09..aa1b49f19ca3 100644 > >>>>>> --- a/net/bpf/test_run.c > >>>>>> +++ b/net/bpf/test_run.c > >>>>>> @@ -979,9 +979,6 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb) > >>>>>> { > >>>>>> struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb; > >>>>>> > >>>>>> - if (!skb->len) > >>>>>> - return -EINVAL; > >>>>>> - > >>>>>> if (!__skb) > >>>>>> return 0; > >>>>>> > >>>>>> @@ -1102,6 +1099,9 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr, > >>>>>> if (IS_ERR(data)) > >>>>>> return PTR_ERR(data); > >>>>>> > >>>>>> + if (size == ETH_HLEN) > >>>>>> + is_l2 = true; > >>>>>> + > >>>>> > >>>>> Don't think this will work? That is_l2 is there to expose proper l2/l3 > >>>>> skb for specific hooks; we can't suddenly start exposing l2 headers to > >>>>> the hooks that don't expect it. > >>>>> Does it make sense to start with a small reproducer that triggers the > >>>>> issue first? We can have a couple of cases for > >>>>> len=0/ETH_HLEN-1/ETH_HLEN+1 and trigger them from the bpf program that > >>>>> redirects to different devices (to trigger dev_is_mac_header_xmit). > >>>>> > >>>>> > >>>> Hi Stanislav: > >>>> Thank you for your review. Is_l2 is the flag of a specific > >>>> hook. Therefore, do you mean that if skb->len is equal to 0, just > >>>> add the length back? > >>> > >>> Not sure I understand your question. All I'm saying is - you can't > >>> flip that flag arbitrarily. This flag depends on the attach point that > >>> you're running the prog against. Some attach points expect packets > >>> with l2, some expect packets without l2. > >>> > >>> What about starting with a small reproducer? Does it make sense to > >>> create a small selftest that adds net namespace + fq_codel + > >>> bpf_prog_test run and do redirect ingress/egress with len > >>> 0/1...tcphdr? Because I'm not sure I 100% understand whether it's only > >>> len=0 that's problematic or some other combination as well? > >>> > >> yes, only skb->len = 0 will cause null-ptr-deref issue. > >> The following is the process of triggering the problem: > >> enqueue a skb: > >> fq_codel_enqueue() > >> ... > >> idx = fq_codel_classify() --->if idx != 0 > >> flow = &q->flows[idx]; > >> flow_queue_add(flow, skb); --->add skb to flow[idex] > >> q->backlogs[idx] += qdisc_pkt_len(skb); --->backlogs = 0 > >> ... > >> fq_codel_drop() --->set sch->limit = 0, always > >> drop packets > >> ... > >> idx = i --->becuase backlogs in every > >> flows is 0, so idx = 0 > >> ... > >> flow = &q->flows[idx]; --->get idx=0 flow > >> ... > >> dequeue_head() > >> skb = flow->head; --->flow->head = NULL > >> flow->head = skb->next; --->cause null-ptr-deref > >> So, if skb->len !=0,fq_codel_drop() could get the correct idx, and > >> then skb!=NULL, it will be OK. > >> Maybe, I will fix it in fq_codel. > > > > I think the consensus here is that the stack, in general, doesn't > > expect the packets like this. So there are probably more broken things > > besides fq_codel. Thus, it's better if we remove the ability to > > generate them from the bpf side instead of fixing the individual users > > like fq_codel. > > > >> But, as I know, skb->len = 0 is just invalid packet. I prefer to add the > >> length back, like bellow: > >> if (is_l2 || !skb->len) > >> __skb_push(skb, hh_len); > >> is it OK? > > > > Probably not? > > > > Looking at the original syzkaller report, prog_type is > > BPF_PROG_TYPE_LWT_XMIT which does expect a packet without l2 header. > > Can we do something like: > > > > if (!is_l2 && !skb->len) { > > // append some dummy byte to the skb ? > > } > > > > > I pad one byte, and test OK. > if (!is_l2 && !skb->len) > __skb_push(skb, 1); > > Does it look OK to you? Nope, this will eat a byte out of the l2 header. We need to skb_put and make sure we allocate enough to make that skb_put succeed. But stepping back a bit: it feels like it's all unnecessary? The only valid use-case of this is probing for the BPF_PROG_TEST_RUN as cilium does. This is mostly about testing, so fixing it in the users seems fair? No real production code is expected to generate these zero-len packets. Or are we concerned that this will leak into stable kernels? I feel like we are trying to add more complexity here for no apparent reason.
On 2022/10/25 1:13, Stanislav Fomichev wrote: > On Sat, Oct 22, 2022 at 4:36 AM shaozhengchao <shaozhengchao@huawei.com> wrote: >> >> >> >> On 2022/10/22 2:16, Stanislav Fomichev wrote: >>> On Fri, Oct 21, 2022 at 12:25 AM shaozhengchao <shaozhengchao@huawei.com> wrote: >>>> >>>> >>>> >>>> On 2022/10/21 1:45, Stanislav Fomichev wrote: >>>>> On Wed, Oct 19, 2022 at 6:47 PM shaozhengchao <shaozhengchao@huawei.com> wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 2022/10/18 0:36, Stanislav Fomichev wrote: >>>>>>> On Sat, Oct 15, 2022 at 2:16 AM Zhengchao Shao <shaozhengchao@huawei.com> wrote: >>>>>>>> >>>>>>>> As [0] see, bpf_prog_test_run_skb() should allow user space to forward >>>>>>>> 14-bytes packet via BPF_PROG_RUN instead of dropping packet directly. >>>>>>>> So fix it. >>>>>>>> >>>>>>>> 0: https://github.com/cilium/ebpf/commit/a38fb6b5a46ab3b5639ea4d421232a10013596c0 >>>>>>>> >>>>>>>> Fixes: fd1894224407 ("bpf: Don't redirect packets with invalid pkt_len") >>>>>>>> Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> >>>>>>>> --- >>>>>>>> net/bpf/test_run.c | 6 +++--- >>>>>>>> 1 file changed, 3 insertions(+), 3 deletions(-) >>>>>>>> >>>>>>>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c >>>>>>>> index 13d578ce2a09..aa1b49f19ca3 100644 >>>>>>>> --- a/net/bpf/test_run.c >>>>>>>> +++ b/net/bpf/test_run.c >>>>>>>> @@ -979,9 +979,6 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb) >>>>>>>> { >>>>>>>> struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb; >>>>>>>> >>>>>>>> - if (!skb->len) >>>>>>>> - return -EINVAL; >>>>>>>> - >>>>>>>> if (!__skb) >>>>>>>> return 0; >>>>>>>> >>>>>>>> @@ -1102,6 +1099,9 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr, >>>>>>>> if (IS_ERR(data)) >>>>>>>> return PTR_ERR(data); >>>>>>>> >>>>>>>> + if (size == ETH_HLEN) >>>>>>>> + is_l2 = true; >>>>>>>> + >>>>>>> >>>>>>> Don't think this will work? That is_l2 is there to expose proper l2/l3 >>>>>>> skb for specific hooks; we can't suddenly start exposing l2 headers to >>>>>>> the hooks that don't expect it. >>>>>>> Does it make sense to start with a small reproducer that triggers the >>>>>>> issue first? We can have a couple of cases for >>>>>>> len=0/ETH_HLEN-1/ETH_HLEN+1 and trigger them from the bpf program that >>>>>>> redirects to different devices (to trigger dev_is_mac_header_xmit). >>>>>>> >>>>>>> >>>>>> Hi Stanislav: >>>>>> Thank you for your review. Is_l2 is the flag of a specific >>>>>> hook. Therefore, do you mean that if skb->len is equal to 0, just >>>>>> add the length back? >>>>> >>>>> Not sure I understand your question. All I'm saying is - you can't >>>>> flip that flag arbitrarily. This flag depends on the attach point that >>>>> you're running the prog against. Some attach points expect packets >>>>> with l2, some expect packets without l2. >>>>> >>>>> What about starting with a small reproducer? Does it make sense to >>>>> create a small selftest that adds net namespace + fq_codel + >>>>> bpf_prog_test run and do redirect ingress/egress with len >>>>> 0/1...tcphdr? Because I'm not sure I 100% understand whether it's only >>>>> len=0 that's problematic or some other combination as well? >>>>> >>>> yes, only skb->len = 0 will cause null-ptr-deref issue. >>>> The following is the process of triggering the problem: >>>> enqueue a skb: >>>> fq_codel_enqueue() >>>> ... >>>> idx = fq_codel_classify() --->if idx != 0 >>>> flow = &q->flows[idx]; >>>> flow_queue_add(flow, skb); --->add skb to flow[idex] >>>> q->backlogs[idx] += qdisc_pkt_len(skb); --->backlogs = 0 >>>> ... >>>> fq_codel_drop() --->set sch->limit = 0, always >>>> drop packets >>>> ... >>>> idx = i --->becuase backlogs in every >>>> flows is 0, so idx = 0 >>>> ... >>>> flow = &q->flows[idx]; --->get idx=0 flow >>>> ... >>>> dequeue_head() >>>> skb = flow->head; --->flow->head = NULL >>>> flow->head = skb->next; --->cause null-ptr-deref >>>> So, if skb->len !=0,fq_codel_drop() could get the correct idx, and >>>> then skb!=NULL, it will be OK. >>>> Maybe, I will fix it in fq_codel. >>> >>> I think the consensus here is that the stack, in general, doesn't >>> expect the packets like this. So there are probably more broken things >>> besides fq_codel. Thus, it's better if we remove the ability to >>> generate them from the bpf side instead of fixing the individual users >>> like fq_codel. >>> >>>> But, as I know, skb->len = 0 is just invalid packet. I prefer to add the >>>> length back, like bellow: >>>> if (is_l2 || !skb->len) >>>> __skb_push(skb, hh_len); >>>> is it OK? >>> >>> Probably not? >>> >>> Looking at the original syzkaller report, prog_type is >>> BPF_PROG_TYPE_LWT_XMIT which does expect a packet without l2 header. >>> Can we do something like: >>> >>> if (!is_l2 && !skb->len) { >>> // append some dummy byte to the skb ? >>> } >>> >>> >> I pad one byte, and test OK. >> if (!is_l2 && !skb->len) >> __skb_push(skb, 1); >> >> Does it look OK to you? > > Nope, this will eat a byte out of the l2 header. We need to skb_put > and make sure we allocate enough to make that skb_put succeed. > > But stepping back a bit: it feels like it's all unnecessary? The only > valid use-case of this is probing for the BPF_PROG_TEST_RUN as cilium > does. This is mostly about testing, so fixing it in the users seems > fair? No real production code is expected to generate these zero-len > packets. Or are we concerned that this will leak into stable kernels? > > I feel like we are trying to add more complexity here for no apparent reason. > I agree with you. users should make sure the correct skb len and configurations are passed into kernel. Incorrect configurations should be discarded to ensure kernel stability. Lorenz, Can you modify the user-mode test code? Zhengchao Shao
On Thu, Oct 27, 2022 at 4:58 AM shaozhengchao <shaozhengchao@huawei.com> wrote: > > > > On 2022/10/25 1:13, Stanislav Fomichev wrote: > > On Sat, Oct 22, 2022 at 4:36 AM shaozhengchao <shaozhengchao@huawei.com> wrote: > >> > >> > >> > >> On 2022/10/22 2:16, Stanislav Fomichev wrote: > >>> On Fri, Oct 21, 2022 at 12:25 AM shaozhengchao <shaozhengchao@huawei.com> wrote: > >>>> > >>>> > >>>> > >>>> On 2022/10/21 1:45, Stanislav Fomichev wrote: > >>>>> On Wed, Oct 19, 2022 at 6:47 PM shaozhengchao <shaozhengchao@huawei.com> wrote: > >>>>>> > >>>>>> > >>>>>> > >>>>>> On 2022/10/18 0:36, Stanislav Fomichev wrote: > >>>>>>> On Sat, Oct 15, 2022 at 2:16 AM Zhengchao Shao <shaozhengchao@huawei.com> wrote: > >>>>>>>> > >>>>>>>> As [0] see, bpf_prog_test_run_skb() should allow user space to forward > >>>>>>>> 14-bytes packet via BPF_PROG_RUN instead of dropping packet directly. > >>>>>>>> So fix it. > >>>>>>>> > >>>>>>>> 0: https://github.com/cilium/ebpf/commit/a38fb6b5a46ab3b5639ea4d421232a10013596c0 > >>>>>>>> > >>>>>>>> Fixes: fd1894224407 ("bpf: Don't redirect packets with invalid pkt_len") > >>>>>>>> Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> > >>>>>>>> --- > >>>>>>>> net/bpf/test_run.c | 6 +++--- > >>>>>>>> 1 file changed, 3 insertions(+), 3 deletions(-) > >>>>>>>> > >>>>>>>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c > >>>>>>>> index 13d578ce2a09..aa1b49f19ca3 100644 > >>>>>>>> --- a/net/bpf/test_run.c > >>>>>>>> +++ b/net/bpf/test_run.c > >>>>>>>> @@ -979,9 +979,6 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb) > >>>>>>>> { > >>>>>>>> struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb; > >>>>>>>> > >>>>>>>> - if (!skb->len) > >>>>>>>> - return -EINVAL; > >>>>>>>> - > >>>>>>>> if (!__skb) > >>>>>>>> return 0; > >>>>>>>> > >>>>>>>> @@ -1102,6 +1099,9 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr, > >>>>>>>> if (IS_ERR(data)) > >>>>>>>> return PTR_ERR(data); > >>>>>>>> > >>>>>>>> + if (size == ETH_HLEN) > >>>>>>>> + is_l2 = true; > >>>>>>>> + > >>>>>>> > >>>>>>> Don't think this will work? That is_l2 is there to expose proper l2/l3 > >>>>>>> skb for specific hooks; we can't suddenly start exposing l2 headers to > >>>>>>> the hooks that don't expect it. > >>>>>>> Does it make sense to start with a small reproducer that triggers the > >>>>>>> issue first? We can have a couple of cases for > >>>>>>> len=0/ETH_HLEN-1/ETH_HLEN+1 and trigger them from the bpf program that > >>>>>>> redirects to different devices (to trigger dev_is_mac_header_xmit). > >>>>>>> > >>>>>>> > >>>>>> Hi Stanislav: > >>>>>> Thank you for your review. Is_l2 is the flag of a specific > >>>>>> hook. Therefore, do you mean that if skb->len is equal to 0, just > >>>>>> add the length back? > >>>>> > >>>>> Not sure I understand your question. All I'm saying is - you can't > >>>>> flip that flag arbitrarily. This flag depends on the attach point that > >>>>> you're running the prog against. Some attach points expect packets > >>>>> with l2, some expect packets without l2. > >>>>> > >>>>> What about starting with a small reproducer? Does it make sense to > >>>>> create a small selftest that adds net namespace + fq_codel + > >>>>> bpf_prog_test run and do redirect ingress/egress with len > >>>>> 0/1...tcphdr? Because I'm not sure I 100% understand whether it's only > >>>>> len=0 that's problematic or some other combination as well? > >>>>> > >>>> yes, only skb->len = 0 will cause null-ptr-deref issue. > >>>> The following is the process of triggering the problem: > >>>> enqueue a skb: > >>>> fq_codel_enqueue() > >>>> ... > >>>> idx = fq_codel_classify() --->if idx != 0 > >>>> flow = &q->flows[idx]; > >>>> flow_queue_add(flow, skb); --->add skb to flow[idex] > >>>> q->backlogs[idx] += qdisc_pkt_len(skb); --->backlogs = 0 > >>>> ... > >>>> fq_codel_drop() --->set sch->limit = 0, always > >>>> drop packets > >>>> ... > >>>> idx = i --->becuase backlogs in every > >>>> flows is 0, so idx = 0 > >>>> ... > >>>> flow = &q->flows[idx]; --->get idx=0 flow > >>>> ... > >>>> dequeue_head() > >>>> skb = flow->head; --->flow->head = NULL > >>>> flow->head = skb->next; --->cause null-ptr-deref > >>>> So, if skb->len !=0,fq_codel_drop() could get the correct idx, and > >>>> then skb!=NULL, it will be OK. > >>>> Maybe, I will fix it in fq_codel. > >>> > >>> I think the consensus here is that the stack, in general, doesn't > >>> expect the packets like this. So there are probably more broken things > >>> besides fq_codel. Thus, it's better if we remove the ability to > >>> generate them from the bpf side instead of fixing the individual users > >>> like fq_codel. > >>> > >>>> But, as I know, skb->len = 0 is just invalid packet. I prefer to add the > >>>> length back, like bellow: > >>>> if (is_l2 || !skb->len) > >>>> __skb_push(skb, hh_len); > >>>> is it OK? > >>> > >>> Probably not? > >>> > >>> Looking at the original syzkaller report, prog_type is > >>> BPF_PROG_TYPE_LWT_XMIT which does expect a packet without l2 header. > >>> Can we do something like: > >>> > >>> if (!is_l2 && !skb->len) { > >>> // append some dummy byte to the skb ? > >>> } > >>> > >>> > >> I pad one byte, and test OK. > >> if (!is_l2 && !skb->len) > >> __skb_push(skb, 1); > >> > >> Does it look OK to you? > > > > Nope, this will eat a byte out of the l2 header. We need to skb_put > > and make sure we allocate enough to make that skb_put succeed. > > > > But stepping back a bit: it feels like it's all unnecessary? The only > > valid use-case of this is probing for the BPF_PROG_TEST_RUN as cilium > > does. This is mostly about testing, so fixing it in the users seems > > fair? No real production code is expected to generate these zero-len > > packets. Or are we concerned that this will leak into stable kernels? > > > > I feel like we are trying to add more complexity here for no apparent reason. > > > I agree with you. users should make sure the correct skb len and > configurations are passed into kernel. Incorrect configurations should > be discarded to ensure kernel stability. > > Lorenz, Can you modify the user-mode test code? Lorenz already fixed it for Cilium. I think the discussion here is around other potential users out there. Let's wait for them to appear if it is indeed a problem?
On 2022/10/28 0:37, Stanislav Fomichev wrote: > On Thu, Oct 27, 2022 at 4:58 AM shaozhengchao <shaozhengchao@huawei.com> wrote: >> >> >> >> On 2022/10/25 1:13, Stanislav Fomichev wrote: >>> On Sat, Oct 22, 2022 at 4:36 AM shaozhengchao <shaozhengchao@huawei.com> wrote: >>>> >>>> >>>> >>>> On 2022/10/22 2:16, Stanislav Fomichev wrote: >>>>> On Fri, Oct 21, 2022 at 12:25 AM shaozhengchao <shaozhengchao@huawei.com> wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 2022/10/21 1:45, Stanislav Fomichev wrote: >>>>>>> On Wed, Oct 19, 2022 at 6:47 PM shaozhengchao <shaozhengchao@huawei.com> wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 2022/10/18 0:36, Stanislav Fomichev wrote: >>>>>>>>> On Sat, Oct 15, 2022 at 2:16 AM Zhengchao Shao <shaozhengchao@huawei.com> wrote: >>>>>>>>>> >>>>>>>>>> As [0] see, bpf_prog_test_run_skb() should allow user space to forward >>>>>>>>>> 14-bytes packet via BPF_PROG_RUN instead of dropping packet directly. >>>>>>>>>> So fix it. >>>>>>>>>> >>>>>>>>>> 0: https://github.com/cilium/ebpf/commit/a38fb6b5a46ab3b5639ea4d421232a10013596c0 >>>>>>>>>> >>>>>>>>>> Fixes: fd1894224407 ("bpf: Don't redirect packets with invalid pkt_len") >>>>>>>>>> Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> >>>>>>>>>> --- >>>>>>>>>> net/bpf/test_run.c | 6 +++--- >>>>>>>>>> 1 file changed, 3 insertions(+), 3 deletions(-) >>>>>>>>>> >>>>>>>>>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c >>>>>>>>>> index 13d578ce2a09..aa1b49f19ca3 100644 >>>>>>>>>> --- a/net/bpf/test_run.c >>>>>>>>>> +++ b/net/bpf/test_run.c >>>>>>>>>> @@ -979,9 +979,6 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb) >>>>>>>>>> { >>>>>>>>>> struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb; >>>>>>>>>> >>>>>>>>>> - if (!skb->len) >>>>>>>>>> - return -EINVAL; >>>>>>>>>> - >>>>>>>>>> if (!__skb) >>>>>>>>>> return 0; >>>>>>>>>> >>>>>>>>>> @@ -1102,6 +1099,9 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr, >>>>>>>>>> if (IS_ERR(data)) >>>>>>>>>> return PTR_ERR(data); >>>>>>>>>> >>>>>>>>>> + if (size == ETH_HLEN) >>>>>>>>>> + is_l2 = true; >>>>>>>>>> + >>>>>>>>> >>>>>>>>> Don't think this will work? That is_l2 is there to expose proper l2/l3 >>>>>>>>> skb for specific hooks; we can't suddenly start exposing l2 headers to >>>>>>>>> the hooks that don't expect it. >>>>>>>>> Does it make sense to start with a small reproducer that triggers the >>>>>>>>> issue first? We can have a couple of cases for >>>>>>>>> len=0/ETH_HLEN-1/ETH_HLEN+1 and trigger them from the bpf program that >>>>>>>>> redirects to different devices (to trigger dev_is_mac_header_xmit). >>>>>>>>> >>>>>>>>> >>>>>>>> Hi Stanislav: >>>>>>>> Thank you for your review. Is_l2 is the flag of a specific >>>>>>>> hook. Therefore, do you mean that if skb->len is equal to 0, just >>>>>>>> add the length back? >>>>>>> >>>>>>> Not sure I understand your question. All I'm saying is - you can't >>>>>>> flip that flag arbitrarily. This flag depends on the attach point that >>>>>>> you're running the prog against. Some attach points expect packets >>>>>>> with l2, some expect packets without l2. >>>>>>> >>>>>>> What about starting with a small reproducer? Does it make sense to >>>>>>> create a small selftest that adds net namespace + fq_codel + >>>>>>> bpf_prog_test run and do redirect ingress/egress with len >>>>>>> 0/1...tcphdr? Because I'm not sure I 100% understand whether it's only >>>>>>> len=0 that's problematic or some other combination as well? >>>>>>> >>>>>> yes, only skb->len = 0 will cause null-ptr-deref issue. >>>>>> The following is the process of triggering the problem: >>>>>> enqueue a skb: >>>>>> fq_codel_enqueue() >>>>>> ... >>>>>> idx = fq_codel_classify() --->if idx != 0 >>>>>> flow = &q->flows[idx]; >>>>>> flow_queue_add(flow, skb); --->add skb to flow[idex] >>>>>> q->backlogs[idx] += qdisc_pkt_len(skb); --->backlogs = 0 >>>>>> ... >>>>>> fq_codel_drop() --->set sch->limit = 0, always >>>>>> drop packets >>>>>> ... >>>>>> idx = i --->becuase backlogs in every >>>>>> flows is 0, so idx = 0 >>>>>> ... >>>>>> flow = &q->flows[idx]; --->get idx=0 flow >>>>>> ... >>>>>> dequeue_head() >>>>>> skb = flow->head; --->flow->head = NULL >>>>>> flow->head = skb->next; --->cause null-ptr-deref >>>>>> So, if skb->len !=0,fq_codel_drop() could get the correct idx, and >>>>>> then skb!=NULL, it will be OK. >>>>>> Maybe, I will fix it in fq_codel. >>>>> >>>>> I think the consensus here is that the stack, in general, doesn't >>>>> expect the packets like this. So there are probably more broken things >>>>> besides fq_codel. Thus, it's better if we remove the ability to >>>>> generate them from the bpf side instead of fixing the individual users >>>>> like fq_codel. >>>>> >>>>>> But, as I know, skb->len = 0 is just invalid packet. I prefer to add the >>>>>> length back, like bellow: >>>>>> if (is_l2 || !skb->len) >>>>>> __skb_push(skb, hh_len); >>>>>> is it OK? >>>>> >>>>> Probably not? >>>>> >>>>> Looking at the original syzkaller report, prog_type is >>>>> BPF_PROG_TYPE_LWT_XMIT which does expect a packet without l2 header. >>>>> Can we do something like: >>>>> >>>>> if (!is_l2 && !skb->len) { >>>>> // append some dummy byte to the skb ? >>>>> } >>>>> >>>>> >>>> I pad one byte, and test OK. >>>> if (!is_l2 && !skb->len) >>>> __skb_push(skb, 1); >>>> >>>> Does it look OK to you? >>> >>> Nope, this will eat a byte out of the l2 header. We need to skb_put >>> and make sure we allocate enough to make that skb_put succeed. >>> >>> But stepping back a bit: it feels like it's all unnecessary? The only >>> valid use-case of this is probing for the BPF_PROG_TEST_RUN as cilium >>> does. This is mostly about testing, so fixing it in the users seems >>> fair? No real production code is expected to generate these zero-len >>> packets. Or are we concerned that this will leak into stable kernels? >>> >>> I feel like we are trying to add more complexity here for no apparent reason. >>> >> I agree with you. users should make sure the correct skb len and >> configurations are passed into kernel. Incorrect configurations should >> be discarded to ensure kernel stability. >> >> Lorenz, Can you modify the user-mode test code? > > Lorenz already fixed it for Cilium. I think the discussion here is > around other potential users out there. > Let's wait for them to appear if it is indeed a problem? Currently, Lorenz only does the workaround. Maybe Lorenz still expects to change the value back to 14 bytes?
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c index 13d578ce2a09..aa1b49f19ca3 100644 --- a/net/bpf/test_run.c +++ b/net/bpf/test_run.c @@ -979,9 +979,6 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb) { struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb; - if (!skb->len) - return -EINVAL; - if (!__skb) return 0; @@ -1102,6 +1099,9 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr, if (IS_ERR(data)) return PTR_ERR(data); + if (size == ETH_HLEN) + is_l2 = true; + ctx = bpf_ctx_init(kattr, sizeof(struct __sk_buff)); if (IS_ERR(ctx)) { kfree(data);
As [0] see, bpf_prog_test_run_skb() should allow user space to forward 14-bytes packet via BPF_PROG_RUN instead of dropping packet directly. So fix it. 0: https://github.com/cilium/ebpf/commit/a38fb6b5a46ab3b5639ea4d421232a10013596c0 Fixes: fd1894224407 ("bpf: Don't redirect packets with invalid pkt_len") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> --- net/bpf/test_run.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)