Message ID | 20231207221627.746324-1-thinker.li@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net-next] net/ipv6: insert the fib6 gc_link of a fib6_info only if in fib6. | expand |
Eric Dumazet, could you help to check if the patch works from your side with syzbot? Thanks a lot! On 12/7/23 14:16, thinker.li@gmail.com wrote: > From: Kui-Feng Lee <thinker.li@gmail.com> > > Check f6i->fib6_node before inserting a f6i (fib6_info) to tb6_gc_hlist. > > Previously, it checks only if f6i->fib6_table is not NULL, however it is > not enough. When a f6i is removed from a fib6_table, it's fib6_table is not > going to be reset. fib6_node is always reset when a f6i is removed from a > fib6_table and set when a f6i is added to a fib6_table. By checking > fib6_node, adding a f6i t0 tb6_gc_hlist only if f6i is in the table will be > enforced. > > Fixes: 3dec89b14d37 ("net/ipv6: Remove expired routes with a separated list of routes.") > Reported-by: syzbot+c15aa445274af8674f41@syzkaller.appspotmail.com > Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> > Cc: Eric Dumazet <edumazet@google.com> > Cc: dsahern@kernel.org > --- > include/net/ip6_fib.h | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h > index 95ed495c3a40..8477c9ff67ac 100644 > --- a/include/net/ip6_fib.h > +++ b/include/net/ip6_fib.h > @@ -512,7 +512,10 @@ static inline void fib6_set_expires_locked(struct fib6_info *f6i, > > tb6 = f6i->fib6_table; > f6i->expires = expires; > - if (tb6 && !fib6_has_expires(f6i)) > + if (tb6 && > + rcu_dereference_protected(f6i->fib6_node, > + lockdep_is_held(&tb6->tb6_lock)) && > + !fib6_has_expires(f6i)) > hlist_add_head(&f6i->gc_link, &tb6->tb6_gc_hlist); > f6i->fib6_flags |= RTF_EXPIRES; > }
On 12/7/23 3:16 PM, thinker.li@gmail.com wrote: > From: Kui-Feng Lee <thinker.li@gmail.com> > > Check f6i->fib6_node before inserting a f6i (fib6_info) to tb6_gc_hlist. any place setting expires should know if the entry is in a table or not. And the syzbot report contains a reproducer, a kernel config and other means to test a patch.
On 12/7/23 4:17 PM, David Ahern wrote: > On 12/7/23 3:16 PM, thinker.li@gmail.com wrote: >> From: Kui-Feng Lee <thinker.li@gmail.com> >> >> Check f6i->fib6_node before inserting a f6i (fib6_info) to tb6_gc_hlist. > > any place setting expires should know if the entry is in a table or not. > > And the syzbot report contains a reproducer, a kernel config and other > means to test a patch. > Fundamentally, the set and clear helpers are doing 2 things; they need to be split into separate helpers.
On 12/7/23 15:20, David Ahern wrote: > On 12/7/23 4:17 PM, David Ahern wrote: >> On 12/7/23 3:16 PM, thinker.li@gmail.com wrote: >>> From: Kui-Feng Lee <thinker.li@gmail.com> >>> >>> Check f6i->fib6_node before inserting a f6i (fib6_info) to tb6_gc_hlist. >> >> any place setting expires should know if the entry is in a table or not. >> >> And the syzbot report contains a reproducer, a kernel config and other >> means to test a patch. >> > > Fundamentally, the set and clear helpers are doing 2 things; they need > to be split into separate helpers. Sorry, I don't follow you. There are fib6_set_expires_locked()) and fib6_clean_expires_locked(), two separate helpers. Is this what you are saying? Doing checks of f6i->fib6_node in fib6_set_expires_locked() should already apply everywhere setting expires, right? Do I miss anything?
On 12/7/23 4:33 PM, Kui-Feng Lee wrote: > > > On 12/7/23 15:20, David Ahern wrote: >> On 12/7/23 4:17 PM, David Ahern wrote: >>> On 12/7/23 3:16 PM, thinker.li@gmail.com wrote: >>>> From: Kui-Feng Lee <thinker.li@gmail.com> >>>> >>>> Check f6i->fib6_node before inserting a f6i (fib6_info) to >>>> tb6_gc_hlist. >>> >>> any place setting expires should know if the entry is in a table or not. >>> >>> And the syzbot report contains a reproducer, a kernel config and other >>> means to test a patch. >>> >> >> Fundamentally, the set and clear helpers are doing 2 things; they need >> to be split into separate helpers. > > Sorry, I don't follow you. > > There are fib6_set_expires_locked()) and fib6_clean_expires_locked(), > two separate helpers. Is this what you are saying? > > Doing checks of f6i->fib6_node in fib6_set_expires_locked() should > already apply everywhere setting expires, right? > > Do I miss anything? static inline void fib6_set_expires_locked(struct fib6_info *f6i, unsigned long expires) { struct fib6_table *tb6; tb6 = f6i->fib6_table; f6i->expires = expires; if (tb6 && !fib6_has_expires(f6i)) hlist_add_head(&f6i->gc_link, &tb6->tb6_gc_hlist); f6i->fib6_flags |= RTF_EXPIRES; } 1. You are abusing this helper in create_info to set the value of expires knowing (expecting) tb6 to NOT be set and hence not setting gc_link so no cleanup is needed on errors. 2. You then open code gc_link when adding to the tree. I had my reservations when you sent this patch months ago, but I did not have time to suggest a cleaner approach and now this is where we are -- trying to find some race corrupting the list.
On 12/7/23 14:16, thinker.li@gmail.com wrote: > From: Kui-Feng Lee <thinker.li@gmail.com> > > Check f6i->fib6_node before inserting a f6i (fib6_info) to tb6_gc_hlist. > > Previously, it checks only if f6i->fib6_table is not NULL, however it is > not enough. When a f6i is removed from a fib6_table, it's fib6_table is not > going to be reset. fib6_node is always reset when a f6i is removed from a > fib6_table and set when a f6i is added to a fib6_table. By checking > fib6_node, adding a f6i t0 tb6_gc_hlist only if f6i is in the table will be > enforced. > > Fixes: 3dec89b14d37 ("net/ipv6: Remove expired routes with a separated list of routes.") > Reported-by: syzbot+c15aa445274af8674f41@syzkaller.appspotmail.com > Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> > Cc: Eric Dumazet <edumazet@google.com> > Cc: dsahern@kernel.org > --- > include/net/ip6_fib.h | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h > index 95ed495c3a40..8477c9ff67ac 100644 > --- a/include/net/ip6_fib.h > +++ b/include/net/ip6_fib.h > @@ -512,7 +512,10 @@ static inline void fib6_set_expires_locked(struct fib6_info *f6i, > > tb6 = f6i->fib6_table; > f6i->expires = expires; > - if (tb6 && !fib6_has_expires(f6i)) > + if (tb6 && > + rcu_dereference_protected(f6i->fib6_node, > + lockdep_is_held(&tb6->tb6_lock)) && > + !fib6_has_expires(f6i)) > hlist_add_head(&f6i->gc_link, &tb6->tb6_gc_hlist); > f6i->fib6_flags |= RTF_EXPIRES; > } The following is the result from syzbot for my branch with this patch. -------------- Hello, syzbot has tested the proposed patch and the reproducer did not trigger any issue: Reported-and-tested-by: syzbot+c15aa445274af8674f41@syzkaller.appspotmail.com Tested on: commit: ac408ca0 net/ipv6: insert the fib6 gc_link of a fib6_i.. git tree: https://github.com/ThinkerYzu/linux.git fix-fib6_set_expires_locked console output: https://syzkaller.appspot.com/x/log.txt?x=14a55d1ce80000 kernel config: https://syzkaller.appspot.com/x/.config?x=f8715b6ede5c4b90 dashboard link: https://syzkaller.appspot.com/bug?extid=c15aa445274af8674f41 compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 Note: no patches were applied. Note: testing is done by a robot and is best-effort only. -------------- How often can syzbot reproduce the issue described in [1]? [1] https://lore.kernel.org/all/20231205173250.2982846-1-edumazet@google.com/
On 12/7/23 17:02, David Ahern wrote: > On 12/7/23 4:33 PM, Kui-Feng Lee wrote: >> >> >> On 12/7/23 15:20, David Ahern wrote: >>> On 12/7/23 4:17 PM, David Ahern wrote: >>>> On 12/7/23 3:16 PM, thinker.li@gmail.com wrote: >>>>> From: Kui-Feng Lee <thinker.li@gmail.com> >>>>> >>>>> Check f6i->fib6_node before inserting a f6i (fib6_info) to >>>>> tb6_gc_hlist. >>>> >>>> any place setting expires should know if the entry is in a table or not. >>>> >>>> And the syzbot report contains a reproducer, a kernel config and other >>>> means to test a patch. >>>> >>> >>> Fundamentally, the set and clear helpers are doing 2 things; they need >>> to be split into separate helpers. >> >> Sorry, I don't follow you. >> >> There are fib6_set_expires_locked()) and fib6_clean_expires_locked(), >> two separate helpers. Is this what you are saying? >> >> Doing checks of f6i->fib6_node in fib6_set_expires_locked() should >> already apply everywhere setting expires, right? >> >> Do I miss anything? > > static inline void fib6_set_expires_locked(struct fib6_info *f6i, > unsigned long expires) > { > struct fib6_table *tb6; > > tb6 = f6i->fib6_table; > f6i->expires = expires; > if (tb6 && !fib6_has_expires(f6i)) > hlist_add_head(&f6i->gc_link, &tb6->tb6_gc_hlist); > f6i->fib6_flags |= RTF_EXPIRES; > } > > 1. You are abusing this helper in create_info to set the value of > expires knowing (expecting) tb6 to NOT be set and hence not setting > gc_link so no cleanup is needed on errors. > > 2. You then open code gc_link when adding to the tree. > > I had my reservations when you sent this patch months ago, but I did not > have time to suggest a cleaner approach and now this is where we are -- > trying to find some race corrupting the list. So, what you said is to split fib6_set_expires_locked() into two functions. One is setting expires, and the other is adding a f6i to tb6_gc_hlist. fib6_clean_expires_locked() should be split in the same way. Is it correct?
On 12/7/23 6:16 PM, Kui-Feng Lee wrote: > > > On 12/7/23 17:02, David Ahern wrote: >> On 12/7/23 4:33 PM, Kui-Feng Lee wrote: >>> >>> >>> On 12/7/23 15:20, David Ahern wrote: >>>> On 12/7/23 4:17 PM, David Ahern wrote: >>>>> On 12/7/23 3:16 PM, thinker.li@gmail.com wrote: >>>>>> From: Kui-Feng Lee <thinker.li@gmail.com> >>>>>> >>>>>> Check f6i->fib6_node before inserting a f6i (fib6_info) to >>>>>> tb6_gc_hlist. >>>>> >>>>> any place setting expires should know if the entry is in a table or >>>>> not. >>>>> >>>>> And the syzbot report contains a reproducer, a kernel config and other >>>>> means to test a patch. >>>>> >>>> >>>> Fundamentally, the set and clear helpers are doing 2 things; they need >>>> to be split into separate helpers. >>> >>> Sorry, I don't follow you. >>> >>> There are fib6_set_expires_locked()) and fib6_clean_expires_locked(), >>> two separate helpers. Is this what you are saying? >>> >>> Doing checks of f6i->fib6_node in fib6_set_expires_locked() should >>> already apply everywhere setting expires, right? >>> >>> Do I miss anything? >> >> static inline void fib6_set_expires_locked(struct fib6_info *f6i, >> unsigned long expires) >> { >> struct fib6_table *tb6; >> >> tb6 = f6i->fib6_table; >> f6i->expires = expires; >> if (tb6 && !fib6_has_expires(f6i)) >> hlist_add_head(&f6i->gc_link, &tb6->tb6_gc_hlist); --> no check that f6i is already in the list yet fib6_set_expires and fib6_set_expires_locked are called on existing entries - entries already in the tree and so *maybe* already linked. See fib6_add_rt2node and most of fib6_set_expires callers. Your selftests only check that entries are removed; it does not check updating the expires time on an existing entry. It does not check a route replace that toggles between no expires value or to an expires (fib6_add_rt2node use of fib6_set_expires_locked) and then replacing that route -- various permutations here. >> f6i->fib6_flags |= RTF_EXPIRES; >> } >> >> 1. You are abusing this helper in create_info to set the value of >> expires knowing (expecting) tb6 to NOT be set and hence not setting >> gc_link so no cleanup is needed on errors. >> >> 2. You then open code gc_link when adding to the tree. >> >> I had my reservations when you sent this patch months ago, but I did not >> have time to suggest a cleaner approach and now this is where we are -- >> trying to find some race corrupting the list. > > So, what you said is to split fib6_set_expires_locked() into two > functions. One is setting expires, and the other is adding a f6i to > tb6_gc_hlist. fib6_clean_expires_locked() should be split in the same > way. > > Is it correct? > one helper sets expires value and one helper that adds/removes from the gc_link. If it is already linked, then no need to do it again.
On 12/7/23 17:23, David Ahern wrote: > On 12/7/23 6:16 PM, Kui-Feng Lee wrote: >> >> >> On 12/7/23 17:02, David Ahern wrote: >>> On 12/7/23 4:33 PM, Kui-Feng Lee wrote: >>>> >>>> >>>> On 12/7/23 15:20, David Ahern wrote: >>>>> On 12/7/23 4:17 PM, David Ahern wrote: >>>>>> On 12/7/23 3:16 PM, thinker.li@gmail.com wrote: >>>>>>> From: Kui-Feng Lee <thinker.li@gmail.com> >>>>>>> >>>>>>> Check f6i->fib6_node before inserting a f6i (fib6_info) to >>>>>>> tb6_gc_hlist. >>>>>> >>>>>> any place setting expires should know if the entry is in a table or >>>>>> not. >>>>>> >>>>>> And the syzbot report contains a reproducer, a kernel config and other >>>>>> means to test a patch. >>>>>> >>>>> >>>>> Fundamentally, the set and clear helpers are doing 2 things; they need >>>>> to be split into separate helpers. >>>> >>>> Sorry, I don't follow you. >>>> >>>> There are fib6_set_expires_locked()) and fib6_clean_expires_locked(), >>>> two separate helpers. Is this what you are saying? >>>> >>>> Doing checks of f6i->fib6_node in fib6_set_expires_locked() should >>>> already apply everywhere setting expires, right? >>>> >>>> Do I miss anything? >>> >>> static inline void fib6_set_expires_locked(struct fib6_info *f6i, >>> unsigned long expires) >>> { >>> struct fib6_table *tb6; >>> >>> tb6 = f6i->fib6_table; >>> f6i->expires = expires; >>> if (tb6 && !fib6_has_expires(f6i)) >>> hlist_add_head(&f6i->gc_link, &tb6->tb6_gc_hlist); > > --> no check that f6i is already in the list yet fib6_set_expires and > fib6_set_expires_locked are called on existing entries - entries already > in the tree and so *maybe* already linked. See fib6_add_rt2node and most > of fib6_set_expires callers. Although it bases on a false assumption, checking tb6 ensures the entry is added to the list only if this f6i is already on the tree. The correct one should checks f6i->fib6_node. So, with checking f6i->fib6_ndoe and the open code you mentioned, fib6_has_expires() does check if a f6i is already in the list. But, like what you mentioned earlier, hlist_unhashed(&f6i->gc_link) is clearer. I will move to ti. > > Your selftests only check that entries are removed; it does not check > updating the expires time on an existing entry. It does not check a > route replace that toggles between no expires value or to an expires > (fib6_add_rt2node use of fib6_set_expires_locked) and then replacing > that route -- various permutations here. > > >>> f6i->fib6_flags |= RTF_EXPIRES; >>> } >>> >>> 1. You are abusing this helper in create_info to set the value of >>> expires knowing (expecting) tb6 to NOT be set and hence not setting >>> gc_link so no cleanup is needed on errors. >>> >>> 2. You then open code gc_link when adding to the tree. >>> >>> I had my reservations when you sent this patch months ago, but I did not >>> have time to suggest a cleaner approach and now this is where we are -- >>> trying to find some race corrupting the list. >> >> So, what you said is to split fib6_set_expires_locked() into two >> functions. One is setting expires, and the other is adding a f6i to >> tb6_gc_hlist. fib6_clean_expires_locked() should be split in the same >> way. >> >> Is it correct? >> > > one helper sets expires value and one helper that adds/removes from the > gc_link. If it is already linked, then no need to do it again. Got it!
diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h index 95ed495c3a40..8477c9ff67ac 100644 --- a/include/net/ip6_fib.h +++ b/include/net/ip6_fib.h @@ -512,7 +512,10 @@ static inline void fib6_set_expires_locked(struct fib6_info *f6i, tb6 = f6i->fib6_table; f6i->expires = expires; - if (tb6 && !fib6_has_expires(f6i)) + if (tb6 && + rcu_dereference_protected(f6i->fib6_node, + lockdep_is_held(&tb6->tb6_lock)) && + !fib6_has_expires(f6i)) hlist_add_head(&f6i->gc_link, &tb6->tb6_gc_hlist); f6i->fib6_flags |= RTF_EXPIRES; }