Message ID | 80dbe764b5ae660bba3cf6edcb045a74b0f85853.1618844973.git.dcaratti@redhat.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | fix stack OOB read while fragmenting IPv4 packets | expand |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | success | CCed 7 of 7 maintainers |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 1 this patch: 1 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 20 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 1 this patch: 1 |
netdev/header_inline | success | Link |
On Mon, Apr 19, 2021 at 05:23:44PM +0200, Davide Caratti wrote: > when 'act_mirred' tries to fragment IPv4 packets that had been previously > re-assembled using 'act_ct', splats like the following can be observed on > kernels built with KASAN: Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
On Mon, Apr 19, 2021 at 8:24 AM Davide Caratti <dcaratti@redhat.com> wrote: > diff --git a/net/sched/sch_frag.c b/net/sched/sch_frag.c > index e1e77d3fb6c0..8c06381391d6 100644 > --- a/net/sched/sch_frag.c > +++ b/net/sched/sch_frag.c > @@ -90,16 +90,16 @@ static int sch_fragment(struct net *net, struct sk_buff *skb, > } > > if (skb_protocol(skb, true) == htons(ETH_P_IP)) { > - struct dst_entry sch_frag_dst; > + struct rtable sch_frag_rt = { 0 }; Is setting these fields 0 sufficient here? Because normally a struct table is initialized by rt_dst_alloc() which sets several of them to non-zero, notably, rt->rt_type and rt->rt_uncached. Similar for the IPv6 part, which is initialized by rt6_info_init(). Thanks.
hello Cong, thanks for looking at this! On Mon, 2021-04-19 at 11:46 -0700, Cong Wang wrote: > On Mon, Apr 19, 2021 at 8:24 AM Davide Caratti <dcaratti@redhat.com> wrote: > > diff --git a/net/sched/sch_frag.c b/net/sched/sch_frag.c > > index e1e77d3fb6c0..8c06381391d6 100644 > > --- a/net/sched/sch_frag.c > > +++ b/net/sched/sch_frag.c > > @@ -90,16 +90,16 @@ static int sch_fragment(struct net *net, struct sk_buff *skb, > > } > > > > if (skb_protocol(skb, true) == htons(ETH_P_IP)) { > > - struct dst_entry sch_frag_dst; > > + struct rtable sch_frag_rt = { 0 }; > > Is setting these fields 0 sufficient here? Because normally a struct table > is initialized by rt_dst_alloc() which sets several of them to non-zero, > notably, rt->rt_type and rt->rt_uncached. > > Similar for the IPv6 part, which is initialized by rt6_info_init(). for what we do now in openvswitch and sch_frag, that should be sufficient: a similar thing is done by br_netfilter [1], apparently for the same "refragmentation" purposes. On a fedora host (running 5.10, but it shouldn't be much different than current Linux), I just dumped 'fake_rtable' from a bridge device: # ip link add name test-br0 type bridge # ip link set dev test-br0 up # ip link add name test-port0 type dummy # ip link set dev test-port0 master test-br0 up # crash [...] crash> net NET_DEVICE NAME IP ADDRESS(ES) [...] ffff89fb44ed8000 test-br0 ffff89fbfc45c000 test-port0 crash> p sizeof(struct net_device) $12 = 3200 crash> p ((struct net_bridge*)(0xffff89fb44ed8000 + 3200))->fake_rtable $13 = { dst = { dev = 0xffff89fb44ed8000, ops = 0xffffffffc0afef40, _metrics = 18446744072647256257, expires = 0, xfrm = 0x0, input = 0x0, output = 0x0, flags = 18, <-- that should be DST_NOXFRM | DST_FAKE_RTABLE obsolete = 0, header_len = 0, trailer_len = 0, __refcnt = { counter = 1 }, __use = 0, lastuse = 0, lwtstate = 0x0, callback_head = { next = 0x0, func = 0x0 }, error = 0, __pad = 0, tclassid = 0 }, rt_genid = 0, rt_flags = 0, rt_type = 0, rt_is_input = 0 '\000', rt_uses_gateway = 0 '\000', rt_iif = 0, rt_gw_family = 0 '\000', { rt_gw4 = 0, rt_gw6 = { in6_u = { u6_addr8 = "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000", u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, u6_addr32 = {0, 0, 0, 0} } } }, rt_mtu_locked = 0, rt_pmtu = 0, rt_uncached = { next = 0x0, prev = 0x0 }, rt_uncached_list = 0x0 } only fake_rtable.dst members are set to something, the remaining are all zero-ed.
On Tue, Apr 20, 2021 at 1:59 AM Davide Caratti <dcaratti@redhat.com> wrote: > > hello Cong, thanks for looking at this! > > On Mon, 2021-04-19 at 11:46 -0700, Cong Wang wrote: > > On Mon, Apr 19, 2021 at 8:24 AM Davide Caratti <dcaratti@redhat.com> wrote: > > > diff --git a/net/sched/sch_frag.c b/net/sched/sch_frag.c > > > index e1e77d3fb6c0..8c06381391d6 100644 > > > --- a/net/sched/sch_frag.c > > > +++ b/net/sched/sch_frag.c > > > @@ -90,16 +90,16 @@ static int sch_fragment(struct net *net, struct sk_buff *skb, > > > } > > > > > > if (skb_protocol(skb, true) == htons(ETH_P_IP)) { > > > - struct dst_entry sch_frag_dst; > > > + struct rtable sch_frag_rt = { 0 }; > > > > Is setting these fields 0 sufficient here? Because normally a struct table > > is initialized by rt_dst_alloc() which sets several of them to non-zero, > > notably, rt->rt_type and rt->rt_uncached. > > > > Similar for the IPv6 part, which is initialized by rt6_info_init(). > > for what we do now in openvswitch and sch_frag, that should be > sufficient: a similar thing is done by br_netfilter [1], apparently for > the same "refragmentation" purposes. On a fedora host (running 5.10, but > it shouldn't be much different than current Linux), I just dumped > 'fake_rtable' from a bridge device: Sounds fair. It looks like all of these cases merely use dst->dev->mtu, so it is overkill to pass a full dst just to satisfy some callees. Anyway, your patch should be okay as a fix for -net. Thanks.
On Mon, Apr 19, 2021 at 8:24 AM Davide Caratti <dcaratti@redhat.com> wrote: > > when 'act_mirred' tries to fragment IPv4 packets that had been previously > re-assembled using 'act_ct', splats like the following can be observed on > kernels built with KASAN: [...] > for IPv4 packets, sch_fragment() uses a temporary struct dst_entry. Then, > in the following call graph: > > ip_do_fragment() > ip_skb_dst_mtu() > ip_dst_mtu_maybe_forward() > ip_mtu_locked() > > the pointer to struct dst_entry is used as pointer to struct rtable: this > turns the access to struct members like rt_mtu_locked into an OOB read in > the stack. Fix this changing the temporary variable used for IPv4 packets > in sch_fragment(), similarly to what is done for IPv6 few lines below. > > Fixes: c129412f74e9 ("net/sched: sch_frag: add generic packet fragment support.") > Cc: <stable@vger.kernel.org> # 5.11 > Reported-by: Shuang Li <shuali@redhat.com> > Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Cong Wang <cong.wang@bytedance.com> Thanks.
diff --git a/net/sched/sch_frag.c b/net/sched/sch_frag.c index e1e77d3fb6c0..8c06381391d6 100644 --- a/net/sched/sch_frag.c +++ b/net/sched/sch_frag.c @@ -90,16 +90,16 @@ static int sch_fragment(struct net *net, struct sk_buff *skb, } if (skb_protocol(skb, true) == htons(ETH_P_IP)) { - struct dst_entry sch_frag_dst; + struct rtable sch_frag_rt = { 0 }; unsigned long orig_dst; sch_frag_prepare_frag(skb, xmit); - dst_init(&sch_frag_dst, &sch_frag_dst_ops, NULL, 1, + dst_init(&sch_frag_rt.dst, &sch_frag_dst_ops, NULL, 1, DST_OBSOLETE_NONE, DST_NOCOUNT); - sch_frag_dst.dev = skb->dev; + sch_frag_rt.dst.dev = skb->dev; orig_dst = skb->_skb_refdst; - skb_dst_set_noref(skb, &sch_frag_dst); + skb_dst_set_noref(skb, &sch_frag_rt.dst); IPCB(skb)->frag_max_size = mru; ret = ip_do_fragment(net, skb->sk, skb, sch_frag_xmit);