Message ID | 20221021084058.223823-1-shaozhengchao@huawei.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Series | [net] net: sched: fq_codel: fix null-ptr-deref issue in fq_codel_enqueue() |
+Stanislav, bpf

From:   Zhengchao Shao <shaozhengchao@huawei.com>
Date:   Fri, 21 Oct 2022 16:40:58 +0800
> As [0] see, it will cause null-ptr-deref issue.
> The following is the process of triggering the problem:
> fq_codel_enqueue()
>     ...
>     idx = fq_codel_classify()               --->if idx != 0
>     flow = &q->flows[idx];
>     flow_queue_add(flow, skb);              --->add skb to flow[idex]
>     q->backlogs[idx] += qdisc_pkt_len(skb); --->backlogs = 0
>     ...
>     fq_codel_drop()                         --->set sch->limit = 0, always
>                                                 drop packets
>         ...
>         idx = i                             --->because backlogs in every
>                                                 flows is 0, so idx = 0
>         ...
>         flow = &q->flows[idx];              --->get idx=0 flow
>         ...
>         dequeue_head()
>             skb = flow->head;               --->flow->head = NULL
>             flow->head = skb->next;         --->cause null-ptr-deref
>
> So, only need to discard the packets whose len is 0 on dropping path of
> enqueue. Then, the correct flow id can be obtained by fq_codel_drop() on
> next enqueuing.
>
> [0]: https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5

This can be caused by BPF, but there seems to be no consensus yet.
https://lore.kernel.org/netdev/CAKH8qBsOMxVaemF0Oy=vE1V0vKO8ORUcVGB5YANS3HdKOhVjjw@mail.gmail.com/

"""
I think the consensus here is that the stack, in general, doesn't
expect the packets like this. So there are probably more broken things
besides fq_codel. Thus, it's better if we remove the ability to
generate them from the bpf side instead of fixing the individual users
like fq_codel.
"""

>
> Fixes: 4b549a2ef4be ("fq_codel: Fair Queue Codel AQM")
> Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
> ---
>  net/sched/sch_fq_codel.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
> index 99d318b60568..3bbe7f69dfb5 100644
> --- a/net/sched/sch_fq_codel.c
> +++ b/net/sched/sch_fq_codel.c
> @@ -187,6 +187,7 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch,
>  	struct fq_codel_sched_data *q = qdisc_priv(sch);
>  	unsigned int idx, prev_backlog, prev_qlen;
>  	struct fq_codel_flow *flow;
> +	struct sk_buff *drop_skb;

We can move this into the if-block below or remove.

>  	int ret;
>  	unsigned int pkt_len;
>  	bool memory_limited;
> @@ -222,6 +223,13 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch,
>
>  	/* save this packet length as it might be dropped by fq_codel_drop() */
>  	pkt_len = qdisc_pkt_len(skb);
> +
> +	/* drop skb if len = 0, so fq_codel_drop could get the right flow idx*/
> +	if (unlikely(!pkt_len)) {
> +		drop_skb = dequeue_head(flow);
> +		__qdisc_drop(drop_skb, to_free);

just __qdisc_drop(dequeue_head(flow), to_free);

> +		return NET_XMIT_SUCCESS;
> +	}
>  	/* fq_codel_drop() is quite expensive, as it performs a linear search
>  	 * in q->backlogs[] to find a fat flow.
>  	 * So instead of dropping a single packet, drop half of its backlog
> --
> 2.17.1
On Fri, Oct 21, 2022 at 5:13 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> +Stanislav, bpf
>
> From:   Zhengchao Shao <shaozhengchao@huawei.com>
> Date:   Fri, 21 Oct 2022 16:40:58 +0800
> > As [0] see, it will cause null-ptr-deref issue.
> > The following is the process of triggering the problem:
> > fq_codel_enqueue()
> >     ...
> >     idx = fq_codel_classify()               --->if idx != 0
> >     flow = &q->flows[idx];
> >     flow_queue_add(flow, skb);              --->add skb to flow[idex]
> >     q->backlogs[idx] += qdisc_pkt_len(skb); --->backlogs = 0
> >     ...
> >     fq_codel_drop()                         --->set sch->limit = 0, always
> >                                                 drop packets
> >         ...
> >         idx = i                             --->because backlogs in every
> >                                                 flows is 0, so idx = 0
> >         ...
> >         flow = &q->flows[idx];              --->get idx=0 flow
> >         ...
> >         dequeue_head()
> >             skb = flow->head;               --->flow->head = NULL
> >             flow->head = skb->next;         --->cause null-ptr-deref
> >
> > So, only need to discard the packets whose len is 0 on dropping path of
> > enqueue. Then, the correct flow id can be obtained by fq_codel_drop() on
> > next enqueuing.
> >
> > [0]: https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5
>
> This can be caused by BPF, but there seems to be no consensus yet.
> https://lore.kernel.org/netdev/CAKH8qBsOMxVaemF0Oy=vE1V0vKO8ORUcVGB5YANS3HdKOhVjjw@mail.gmail.com/
>
> """
> I think the consensus here is that the stack, in general, doesn't
> expect the packets like this. So there are probably more broken things
> besides fq_codel. Thus, it's better if we remove the ability to
> generate them from the bpf side instead of fixing the individual users
> like fq_codel.
> """

That shouldn't happen after commit fd1894224407 ("bpf: Don't redirect
packets with invalid pkt_len"), so not sure why this patch is needed
at all?

> >
> > Fixes: 4b549a2ef4be ("fq_codel: Fair Queue Codel AQM")
> > Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
> > ---
> >  net/sched/sch_fq_codel.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
> > index 99d318b60568..3bbe7f69dfb5 100644
> > --- a/net/sched/sch_fq_codel.c
> > +++ b/net/sched/sch_fq_codel.c
> > @@ -187,6 +187,7 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> >  	struct fq_codel_sched_data *q = qdisc_priv(sch);
> >  	unsigned int idx, prev_backlog, prev_qlen;
> >  	struct fq_codel_flow *flow;
> > +	struct sk_buff *drop_skb;
>
> We can move this into the if-block below or remove.
>
> >  	int ret;
> >  	unsigned int pkt_len;
> >  	bool memory_limited;
> > @@ -222,6 +223,13 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> >
> >  	/* save this packet length as it might be dropped by fq_codel_drop() */
> >  	pkt_len = qdisc_pkt_len(skb);
> > +
> > +	/* drop skb if len = 0, so fq_codel_drop could get the right flow idx*/
> > +	if (unlikely(!pkt_len)) {
> > +		drop_skb = dequeue_head(flow);
> > +		__qdisc_drop(drop_skb, to_free);
>
> just __qdisc_drop(dequeue_head(flow), to_free);
>
> > +		return NET_XMIT_SUCCESS;
> > +	}
> >  	/* fq_codel_drop() is quite expensive, as it performs a linear search
> >  	 * in q->backlogs[] to find a fat flow.
> >  	 * So instead of dropping a single packet, drop half of its backlog
> > --
> > 2.17.1
>
On Mon, Oct 24, 2022 at 9:54 AM Stanislav Fomichev <sdf@google.com> wrote:
>
> On Fri, Oct 21, 2022 at 5:13 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > +Stanislav, bpf
> >
> > From:   Zhengchao Shao <shaozhengchao@huawei.com>
> > Date:   Fri, 21 Oct 2022 16:40:58 +0800
> > > As [0] see, it will cause null-ptr-deref issue.
> > > The following is the process of triggering the problem:
> > > fq_codel_enqueue()
> > >     ...
> > >     idx = fq_codel_classify()               --->if idx != 0
> > >     flow = &q->flows[idx];
> > >     flow_queue_add(flow, skb);              --->add skb to flow[idex]
> > >     q->backlogs[idx] += qdisc_pkt_len(skb); --->backlogs = 0
> > >     ...
> > >     fq_codel_drop()                         --->set sch->limit = 0, always
> > >                                                 drop packets
> > >         ...
> > >         idx = i                             --->because backlogs in every
> > >                                                 flows is 0, so idx = 0
> > >         ...
> > >         flow = &q->flows[idx];              --->get idx=0 flow
> > >         ...
> > >         dequeue_head()
> > >             skb = flow->head;               --->flow->head = NULL
> > >             flow->head = skb->next;         --->cause null-ptr-deref
> > >
> > > So, only need to discard the packets whose len is 0 on dropping path of
> > > enqueue. Then, the correct flow id can be obtained by fq_codel_drop() on
> > > next enqueuing.
> > >
> > > [0]: https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5
> >
> > This can be caused by BPF, but there seems to be no consensus yet.
> > https://lore.kernel.org/netdev/CAKH8qBsOMxVaemF0Oy=vE1V0vKO8ORUcVGB5YANS3HdKOhVjjw@mail.gmail.com/
> >
> > """
> > I think the consensus here is that the stack, in general, doesn't
> > expect the packets like this. So there are probably more broken things
> > besides fq_codel. Thus, it's better if we remove the ability to
> > generate them from the bpf side instead of fixing the individual users
> > like fq_codel.
> > """
>
> That shouldn't happen after commit fd1894224407 ("bpf: Don't redirect
> packets with invalid pkt_len"), so not sure why this patch is needed
> at all?

This patch keeps coming, I have already explained it is silly to add
such tests. Most drivers will crash if skb->len == 0 anyway.

> > >
> > > Fixes: 4b549a2ef4be ("fq_codel: Fair Queue Codel AQM")
> > > Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
> > > ---
> > >  net/sched/sch_fq_codel.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > >
> > > diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
> > > index 99d318b60568..3bbe7f69dfb5 100644
> > > --- a/net/sched/sch_fq_codel.c
> > > +++ b/net/sched/sch_fq_codel.c
> > > @@ -187,6 +187,7 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> > >  	struct fq_codel_sched_data *q = qdisc_priv(sch);
> > >  	unsigned int idx, prev_backlog, prev_qlen;
> > >  	struct fq_codel_flow *flow;
> > > +	struct sk_buff *drop_skb;
> >
> > We can move this into the if-block below or remove.
> >
> > >  	int ret;
> > >  	unsigned int pkt_len;
> > >  	bool memory_limited;
> > > @@ -222,6 +223,13 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> > >
> > >  	/* save this packet length as it might be dropped by fq_codel_drop() */
> > >  	pkt_len = qdisc_pkt_len(skb);
> > > +
> > > +	/* drop skb if len = 0, so fq_codel_drop could get the right flow idx*/
> > > +	if (unlikely(!pkt_len)) {
> > > +		drop_skb = dequeue_head(flow);
> > > +		__qdisc_drop(drop_skb, to_free);
> >
> > just __qdisc_drop(dequeue_head(flow), to_free);
> >
> > > +		return NET_XMIT_SUCCESS;
> > > +	}
> > >  	/* fq_codel_drop() is quite expensive, as it performs a linear search
> > >  	 * in q->backlogs[] to find a fat flow.
> > >  	 * So instead of dropping a single packet, drop half of its backlog
> > > --
> > > 2.17.1
> >
diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
index 99d318b60568..3bbe7f69dfb5 100644
--- a/net/sched/sch_fq_codel.c
+++ b/net/sched/sch_fq_codel.c
@@ -187,6 +187,7 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 	struct fq_codel_sched_data *q = qdisc_priv(sch);
 	unsigned int idx, prev_backlog, prev_qlen;
 	struct fq_codel_flow *flow;
+	struct sk_buff *drop_skb;
 	int ret;
 	unsigned int pkt_len;
 	bool memory_limited;
@@ -222,6 +223,13 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 
 	/* save this packet length as it might be dropped by fq_codel_drop() */
 	pkt_len = qdisc_pkt_len(skb);
+
+	/* drop skb if len = 0, so fq_codel_drop could get the right flow idx*/
+	if (unlikely(!pkt_len)) {
+		drop_skb = dequeue_head(flow);
+		__qdisc_drop(drop_skb, to_free);
+		return NET_XMIT_SUCCESS;
+	}
 	/* fq_codel_drop() is quite expensive, as it performs a linear search
 	 * in q->backlogs[] to find a fat flow.
 	 * So instead of dropping a single packet, drop half of its backlog
As shown in [0], a zero-length packet causes a null-ptr-deref in
fq_codel_enqueue(). The problem is triggered as follows:
fq_codel_enqueue()
    ...
    idx = fq_codel_classify()               --->if idx != 0
    flow = &q->flows[idx];
    flow_queue_add(flow, skb);              --->add skb to flow[idx]
    q->backlogs[idx] += qdisc_pkt_len(skb); --->backlogs = 0
    ...
    fq_codel_drop()                         --->set sch->limit = 0, always
                                                drop packets
        ...
        idx = i                             --->because the backlogs of all
                                                flows are 0, idx = 0
        ...
        flow = &q->flows[idx];              --->get idx=0 flow
        ...
        dequeue_head()
            skb = flow->head;               --->flow->head = NULL
            flow->head = skb->next;         --->cause null-ptr-deref

So we only need to discard packets whose len is 0 on the dropping path of
enqueue. Then the correct flow id can be obtained by fq_codel_drop() on
the next enqueue.

[0]: https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5

Fixes: 4b549a2ef4be ("fq_codel: Fair Queue Codel AQM")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
---
 net/sched/sch_fq_codel.c | 8 ++++++++
 1 file changed, 8 insertions(+)
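
The trigger sequence described in the commit message can be illustrated with a small userspace model. This is only a sketch, not kernel code: struct pkt, struct flow, struct sched, FLOWS_CNT, pick_fat_flow() and repro_picks_empty_flow() are hypothetical stand-ins, and the selection loop is modeled on the linear search over q->backlogs[] that the patch context mentions. It assumes the search keeps the first index whose backlog is strictly greater than the running maximum, so an all-zero backlog array yields index 0.

```c
#include <assert.h>
#include <stddef.h>

#define FLOWS_CNT 4 /* hypothetical size, chosen only for the illustration */

struct pkt {
	struct pkt *next;
	unsigned int len;
};

struct flow {
	struct pkt *head;
	struct pkt *tail;
};

struct sched {
	struct flow flows[FLOWS_CNT];
	unsigned int backlogs[FLOWS_CNT];
};

/* Append a packet to flow idx and account its length in backlogs[idx]. */
static void flow_queue_add(struct sched *q, unsigned int idx, struct pkt *p)
{
	struct flow *f;

	assert(idx < FLOWS_CNT);
	f = &q->flows[idx];
	p->next = NULL;
	if (f->head == NULL)
		f->head = p;
	else
		f->tail->next = p;
	f->tail = p;
	q->backlogs[idx] += p->len;
}

/* Linear search for the flow with the largest backlog.
 * With a strict '>' comparison, all-zero backlogs leave idx at 0,
 * even when the only queued packet sits in another flow.
 */
static unsigned int pick_fat_flow(const struct sched *q)
{
	unsigned int maxbacklog = 0, idx = 0, i;

	for (i = 0; i < FLOWS_CNT; i++) {
		if (q->backlogs[i] > maxbacklog) {
			maxbacklog = q->backlogs[i];
			idx = i;
		}
	}
	return idx;
}

/* Queue one zero-length packet on flow 2, then ask which flow to drop from.
 * Returns 1 when the selected flow is empty, i.e. when dereferencing
 * its head (as dequeue_head() would) would hit NULL.
 */
static int repro_picks_empty_flow(void)
{
	struct sched q = {0};
	struct pkt zero_len = { NULL, 0 };

	flow_queue_add(&q, 2, &zero_len);
	return q.flows[pick_fat_flow(&q)].head == NULL;
}
```

In this model the zero-length packet sits on flow 2 while the search selects flow 0, whose head is NULL. That mismatch is what the proposed check in fq_codel_enqueue() works around, and what the reviewers argue should instead be prevented by never letting such packets reach the qdisc.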