
[net] net: sched: fq_codel: fix null-ptr-deref issue in fq_codel_enqueue()

Message ID 20221021084058.223823-1-shaozhengchao@huawei.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net] net: sched: fq_codel: fix null-ptr-deref issue in fq_codel_enqueue() | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers success CCed 8 of 8 maintainers
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 20 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

shaozhengchao Oct. 21, 2022, 8:40 a.m. UTC
As reported in [0], a null-ptr-deref can be triggered.
The following is the sequence of events leading to the problem:
fq_codel_enqueue()
	...
	idx = fq_codel_classify()        --->assume idx != 0
	flow = &q->flows[idx];
	flow_queue_add(flow, skb);       --->add skb to flows[idx]
	q->backlogs[idx] += qdisc_pkt_len(skb); --->pkt len is 0, so
				     backlogs[idx] stays 0
	...
	fq_codel_drop()          --->sch->limit is 0, so every
				     enqueue ends up dropping packets
		...
		idx = i          --->because the backlog of every
				     flow is 0, idx = 0
		...
		flow = &q->flows[idx];   --->get flow 0, which is empty
		...
		dequeue_head()
			skb = flow->head; --->flow->head is NULL
			flow->head = skb->next; --->null-ptr-deref
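
For reference, dequeue_head() in net/sched/sch_fq_codel.c dereferences
flow->head unconditionally; a sketch of its shape (simplified, see the
actual source):

	static inline struct sk_buff *dequeue_head(struct fq_codel_flow *flow)
	{
		struct sk_buff *skb = flow->head;

		/* NULL deref here when the chosen flow is empty (head == NULL) */
		flow->head = skb->next;
		skb_mark_not_on_list(skb);
		return skb;
	}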

So it is enough to discard packets whose length is 0 on the drop path of
enqueue. The correct flow index can then be obtained by fq_codel_drop()
on the next enqueue.

[0]: https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5

Fixes: 4b549a2ef4be ("fq_codel: Fair Queue Codel AQM")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
---
 net/sched/sch_fq_codel.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Kuniyuki Iwashima Oct. 22, 2022, 12:13 a.m. UTC | #1
+Stanislav, bpf

From:   Zhengchao Shao <shaozhengchao@huawei.com>
Date:   Fri, 21 Oct 2022 16:40:58 +0800
> As reported in [0], a null-ptr-deref can be triggered.
> The following is the sequence of events leading to the problem:
> fq_codel_enqueue()
> 	...
> 	idx = fq_codel_classify()        --->assume idx != 0
> 	flow = &q->flows[idx];
> 	flow_queue_add(flow, skb);       --->add skb to flows[idx]
> 	q->backlogs[idx] += qdisc_pkt_len(skb); --->pkt len is 0, so
> 				     backlogs[idx] stays 0
> 	...
> 	fq_codel_drop()          --->sch->limit is 0, so every
> 				     enqueue ends up dropping packets
> 		...
> 		idx = i          --->because the backlog of every
> 				     flow is 0, idx = 0
> 		...
> 		flow = &q->flows[idx];   --->get flow 0, which is empty
> 		...
> 		dequeue_head()
> 			skb = flow->head; --->flow->head is NULL
> 			flow->head = skb->next; --->null-ptr-deref
> 
> So it is enough to discard packets whose length is 0 on the drop path
> of enqueue. The correct flow index can then be obtained by
> fq_codel_drop() on the next enqueue.
> 
> [0]: https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5

This can be caused by BPF, but there seems to be no consensus yet.
https://lore.kernel.org/netdev/CAKH8qBsOMxVaemF0Oy=vE1V0vKO8ORUcVGB5YANS3HdKOhVjjw@mail.gmail.com/

"""
I think the consensus here is that the stack, in general, doesn't
expect the packets like this. So there are probably more broken things
besides fq_codel. Thus, it's better if we remove the ability to
generate them from the bpf side instead of fixing the individual users
like fq_codel.
"""


> 
> Fixes: 4b549a2ef4be ("fq_codel: Fair Queue Codel AQM")
> Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
> ---
>  net/sched/sch_fq_codel.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
> index 99d318b60568..3bbe7f69dfb5 100644
> --- a/net/sched/sch_fq_codel.c
> +++ b/net/sched/sch_fq_codel.c
> @@ -187,6 +187,7 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch,
>  	struct fq_codel_sched_data *q = qdisc_priv(sch);
>  	unsigned int idx, prev_backlog, prev_qlen;
>  	struct fq_codel_flow *flow;
> +	struct sk_buff *drop_skb;

We can move this into the if-block below, or remove it entirely.


>  	int ret;
>  	unsigned int pkt_len;
>  	bool memory_limited;
> @@ -222,6 +223,13 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch,
>  
>  	/* save this packet length as it might be dropped by fq_codel_drop() */
>  	pkt_len = qdisc_pkt_len(skb);
> +
> +	/* drop skb if len = 0, so fq_codel_drop could get the right flow idx*/
> +	if (unlikely(!pkt_len)) {
> +		drop_skb = dequeue_head(flow);
> +		__qdisc_drop(drop_skb, to_free);

just            __qdisc_drop(dequeue_head(flow), to_free);
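
With the suggestion above (and the drop_skb variable removed), the whole
hunk would reduce to something like this (untested sketch):

	/* drop zero-length skbs so fq_codel_drop() can find the right flow */
	if (unlikely(!pkt_len)) {
		__qdisc_drop(dequeue_head(flow), to_free);
		return NET_XMIT_SUCCESS;
	}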


> +		return NET_XMIT_SUCCESS;
> +	}
>  	/* fq_codel_drop() is quite expensive, as it performs a linear search
>  	 * in q->backlogs[] to find a fat flow.
>  	 * So instead of dropping a single packet, drop half of its backlog
> -- 
> 2.17.1
Stanislav Fomichev Oct. 24, 2022, 4:53 p.m. UTC | #2
On Fri, Oct 21, 2022 at 5:13 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> +Stanislav, bpf
>
> From:   Zhengchao Shao <shaozhengchao@huawei.com>
> Date:   Fri, 21 Oct 2022 16:40:58 +0800
> > As reported in [0], a null-ptr-deref can be triggered.
> > The following is the sequence of events leading to the problem:
> > fq_codel_enqueue()
> >       ...
> >       idx = fq_codel_classify()        --->assume idx != 0
> >       flow = &q->flows[idx];
> >       flow_queue_add(flow, skb);       --->add skb to flows[idx]
> >       q->backlogs[idx] += qdisc_pkt_len(skb); --->pkt len is 0, so
> >                                    backlogs[idx] stays 0
> >       ...
> >       fq_codel_drop()          --->sch->limit is 0, so every
> >                                    enqueue ends up dropping packets
> >               ...
> >               idx = i          --->because the backlog of every
> >                                    flow is 0, idx = 0
> >               ...
> >               flow = &q->flows[idx];   --->get flow 0, which is empty
> >               ...
> >               dequeue_head()
> >                       skb = flow->head; --->flow->head is NULL
> >                       flow->head = skb->next; --->null-ptr-deref
> >
> > So it is enough to discard packets whose length is 0 on the drop path
> > of enqueue. The correct flow index can then be obtained by
> > fq_codel_drop() on the next enqueue.
> >
> > [0]: https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5
>
> This can be caused by BPF, but there seems to be no consensus yet.
> https://lore.kernel.org/netdev/CAKH8qBsOMxVaemF0Oy=vE1V0vKO8ORUcVGB5YANS3HdKOhVjjw@mail.gmail.com/
>
> """
> I think the consensus here is that the stack, in general, doesn't
> expect the packets like this. So there are probably more broken things
> besides fq_codel. Thus, it's better if we remove the ability to
> generate them from the bpf side instead of fixing the individual users
> like fq_codel.
> """

That shouldn't happen after commit fd1894224407 ("bpf: Don't redirect
packets with invalid pkt_len"), so I'm not sure why this patch is needed
at all?
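
For context, that commit makes the BPF side reject zero-length packets
before they can be redirected into the stack; the guard it adds is
essentially of this shape (illustrative sketch only, exact functions and
placement per the commit itself):

	/* refuse to inject a zero-length packet into the stack */
	if (unlikely(!skb->len)) {
		kfree_skb(skb);
		return -EINVAL;
	}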

> > Fixes: 4b549a2ef4be ("fq_codel: Fair Queue Codel AQM")
> > Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
> > ---
> >  net/sched/sch_fq_codel.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
> > index 99d318b60568..3bbe7f69dfb5 100644
> > --- a/net/sched/sch_fq_codel.c
> > +++ b/net/sched/sch_fq_codel.c
> > @@ -187,6 +187,7 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> >       struct fq_codel_sched_data *q = qdisc_priv(sch);
> >       unsigned int idx, prev_backlog, prev_qlen;
> >       struct fq_codel_flow *flow;
> > +     struct sk_buff *drop_skb;
>
> We can move this into the if-block below, or remove it entirely.
>
>
> >       int ret;
> >       unsigned int pkt_len;
> >       bool memory_limited;
> > @@ -222,6 +223,13 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> >
> >       /* save this packet length as it might be dropped by fq_codel_drop() */
> >       pkt_len = qdisc_pkt_len(skb);
> > +
> > +     /* drop skb if len = 0, so fq_codel_drop could get the right flow idx*/
> > +     if (unlikely(!pkt_len)) {
> > +             drop_skb = dequeue_head(flow);
> > +             __qdisc_drop(drop_skb, to_free);
>
> just            __qdisc_drop(dequeue_head(flow), to_free);
>
>
> > +             return NET_XMIT_SUCCESS;
> > +     }
> >       /* fq_codel_drop() is quite expensive, as it performs a linear search
> >        * in q->backlogs[] to find a fat flow.
> >        * So instead of dropping a single packet, drop half of its backlog
> > --
> > 2.17.1
>
Eric Dumazet Oct. 24, 2022, 5:04 p.m. UTC | #3
On Mon, Oct 24, 2022 at 9:54 AM Stanislav Fomichev <sdf@google.com> wrote:
>
> On Fri, Oct 21, 2022 at 5:13 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > +Stanislav, bpf
> >
> > From:   Zhengchao Shao <shaozhengchao@huawei.com>
> > Date:   Fri, 21 Oct 2022 16:40:58 +0800
> > > As reported in [0], a null-ptr-deref can be triggered.
> > > The following is the sequence of events leading to the problem:
> > > fq_codel_enqueue()
> > >       ...
> > >       idx = fq_codel_classify()        --->assume idx != 0
> > >       flow = &q->flows[idx];
> > >       flow_queue_add(flow, skb);       --->add skb to flows[idx]
> > >       q->backlogs[idx] += qdisc_pkt_len(skb); --->pkt len is 0, so
> > >                                    backlogs[idx] stays 0
> > >       ...
> > >       fq_codel_drop()          --->sch->limit is 0, so every
> > >                                    enqueue ends up dropping packets
> > >               ...
> > >               idx = i          --->because the backlog of every
> > >                                    flow is 0, idx = 0
> > >               ...
> > >               flow = &q->flows[idx];   --->get flow 0, which is empty
> > >               ...
> > >               dequeue_head()
> > >                       skb = flow->head; --->flow->head is NULL
> > >                       flow->head = skb->next; --->null-ptr-deref
> > >
> > > So it is enough to discard packets whose length is 0 on the drop path
> > > of enqueue. The correct flow index can then be obtained by
> > > fq_codel_drop() on the next enqueue.
> > >
> > > [0]: https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5
> >
> > This can be caused by BPF, but there seems to be no consensus yet.
> > https://lore.kernel.org/netdev/CAKH8qBsOMxVaemF0Oy=vE1V0vKO8ORUcVGB5YANS3HdKOhVjjw@mail.gmail.com/
> >
> > """
> > I think the consensus here is that the stack, in general, doesn't
> > expect the packets like this. So there are probably more broken things
> > besides fq_codel. Thus, it's better if we remove the ability to
> > generate them from the bpf side instead of fixing the individual users
> > like fq_codel.
> > """
>
> That shouldn't happen after commit fd1894224407 ("bpf: Don't redirect
> packets with invalid pkt_len"), so I'm not sure why this patch is needed
> at all?

This patch keeps coming back; I have already explained that it is silly
to add such tests.

Most drivers will crash if skb->len == 0 anyway.
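
To illustrate the point with a hypothetical driver (example only, not
taken from any real driver): xmit paths commonly parse headers out of
the skb without re-checking the length, so a zero-length skb goes wrong
long before fq_codel is involved:

	/* hypothetical ndo_start_xmit, assuming <linux/etherdevice.h> */
	static netdev_tx_t foo_start_xmit(struct sk_buff *skb,
					  struct net_device *dev)
	{
		/* reading the Ethernet header assumes >= ETH_HLEN bytes;
		 * with skb->len == 0 this is already out of bounds
		 */
		const struct ethhdr *eth = (const struct ethhdr *)skb->data;

		if (eth->h_proto == htons(ETH_P_IP)) {
			/* set up IPv4 checksum offload, etc. */
		}

		dev_kfree_skb_any(skb);
		return NETDEV_TX_OK;
	}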

>
> > > Fixes: 4b549a2ef4be ("fq_codel: Fair Queue Codel AQM")
> > > Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
> > > ---
> > >  net/sched/sch_fq_codel.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > >
> > > diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
> > > index 99d318b60568..3bbe7f69dfb5 100644
> > > --- a/net/sched/sch_fq_codel.c
> > > +++ b/net/sched/sch_fq_codel.c
> > > @@ -187,6 +187,7 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> > >       struct fq_codel_sched_data *q = qdisc_priv(sch);
> > >       unsigned int idx, prev_backlog, prev_qlen;
> > >       struct fq_codel_flow *flow;
> > > +     struct sk_buff *drop_skb;
> >
> > We can move this into the if-block below, or remove it entirely.
> >
> >
> > >       int ret;
> > >       unsigned int pkt_len;
> > >       bool memory_limited;
> > > @@ -222,6 +223,13 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> > >
> > >       /* save this packet length as it might be dropped by fq_codel_drop() */
> > >       pkt_len = qdisc_pkt_len(skb);
> > > +
> > > +     /* drop skb if len = 0, so fq_codel_drop could get the right flow idx*/
> > > +     if (unlikely(!pkt_len)) {
> > > +             drop_skb = dequeue_head(flow);
> > > +             __qdisc_drop(drop_skb, to_free);
> >
> > just            __qdisc_drop(dequeue_head(flow), to_free);
> >
> >
> > > +             return NET_XMIT_SUCCESS;
> > > +     }
> > >       /* fq_codel_drop() is quite expensive, as it performs a linear search
> > >        * in q->backlogs[] to find a fat flow.
> > >        * So instead of dropping a single packet, drop half of its backlog
> > > --
> > > 2.17.1
> >

Patch

diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
index 99d318b60568..3bbe7f69dfb5 100644
--- a/net/sched/sch_fq_codel.c
+++ b/net/sched/sch_fq_codel.c
@@ -187,6 +187,7 @@  static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 	struct fq_codel_sched_data *q = qdisc_priv(sch);
 	unsigned int idx, prev_backlog, prev_qlen;
 	struct fq_codel_flow *flow;
+	struct sk_buff *drop_skb;
 	int ret;
 	unsigned int pkt_len;
 	bool memory_limited;
@@ -222,6 +223,13 @@  static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 
 	/* save this packet length as it might be dropped by fq_codel_drop() */
 	pkt_len = qdisc_pkt_len(skb);
+
+	/* drop skb if len = 0, so fq_codel_drop could get the right flow idx*/
+	if (unlikely(!pkt_len)) {
+		drop_skb = dequeue_head(flow);
+		__qdisc_drop(drop_skb, to_free);
+		return NET_XMIT_SUCCESS;
+	}
 	/* fq_codel_drop() is quite expensive, as it performs a linear search
 	 * in q->backlogs[] to find a fat flow.
 	 * So instead of dropping a single packet, drop half of its backlog