diff mbox series

[net-next] net/ipv6: insert the fib6 gc_link of a fib6_info only if in fib6.

Message ID 20231207221627.746324-1-thinker.li@gmail.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net-next] net/ipv6: insert the fib6 gc_link of a fib6_info only if in fib6. | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1616 this patch: 1616
netdev/cc_maintainers success CCed 6 of 6 maintainers
netdev/build_clang success Errors and warnings before: 1147 this patch: 1147
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1656 this patch: 1656
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 11 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Kui-Feng Lee Dec. 7, 2023, 10:16 p.m. UTC
From: Kui-Feng Lee <thinker.li@gmail.com>

Check f6i->fib6_node before inserting a f6i (fib6_info) to tb6_gc_hlist.

Previously, it checks only if f6i->fib6_table is not NULL, however it is
not enough. When a f6i is removed from a fib6_table, it's fib6_table is not
going to be reset. fib6_node is always reset when a f6i is removed from a
fib6_table and set when a f6i is added to a fib6_table. By checking
fib6_node, adding a f6i t0 tb6_gc_hlist only if f6i is in the table will be
enforced.

Fixes: 3dec89b14d37 ("net/ipv6: Remove expired routes with a separated list of routes.")
Reported-by: syzbot+c15aa445274af8674f41@syzkaller.appspotmail.com
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: dsahern@kernel.org
---
 include/net/ip6_fib.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Kui-Feng Lee Dec. 7, 2023, 10:19 p.m. UTC | #1
Eric Dumazet, could you help to check if the patch works from your side
with syzbot? Thanks a lot!


On 12/7/23 14:16, thinker.li@gmail.com wrote:
> From: Kui-Feng Lee <thinker.li@gmail.com>
> 
> Check f6i->fib6_node before inserting a f6i (fib6_info) to tb6_gc_hlist.
> 
> Previously, it checks only if f6i->fib6_table is not NULL, however it is
> not enough. When a f6i is removed from a fib6_table, it's fib6_table is not
> going to be reset. fib6_node is always reset when a f6i is removed from a
> fib6_table and set when a f6i is added to a fib6_table. By checking
> fib6_node, adding a f6i t0 tb6_gc_hlist only if f6i is in the table will be
> enforced.
> 
> Fixes: 3dec89b14d37 ("net/ipv6: Remove expired routes with a separated list of routes.")
> Reported-by: syzbot+c15aa445274af8674f41@syzkaller.appspotmail.com
> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: dsahern@kernel.org
> ---
>   include/net/ip6_fib.h | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
> index 95ed495c3a40..8477c9ff67ac 100644
> --- a/include/net/ip6_fib.h
> +++ b/include/net/ip6_fib.h
> @@ -512,7 +512,10 @@ static inline void fib6_set_expires_locked(struct fib6_info *f6i,
>   
>   	tb6 = f6i->fib6_table;
>   	f6i->expires = expires;
> -	if (tb6 && !fib6_has_expires(f6i))
> +	if (tb6 &&
> +	    rcu_dereference_protected(f6i->fib6_node,
> +				      lockdep_is_held(&tb6->tb6_lock)) &&
> +	    !fib6_has_expires(f6i))
>   		hlist_add_head(&f6i->gc_link, &tb6->tb6_gc_hlist);
>   	f6i->fib6_flags |= RTF_EXPIRES;
>   }
David Ahern Dec. 7, 2023, 11:17 p.m. UTC | #2
On 12/7/23 3:16 PM, thinker.li@gmail.com wrote:
> From: Kui-Feng Lee <thinker.li@gmail.com>
> 
> Check f6i->fib6_node before inserting a f6i (fib6_info) to tb6_gc_hlist.

any place setting expires should know if the entry is in a table or not.

And the syzbot report contains a reproducer, a kernel config and other
means to test a patch.
David Ahern Dec. 7, 2023, 11:20 p.m. UTC | #3
On 12/7/23 4:17 PM, David Ahern wrote:
> On 12/7/23 3:16 PM, thinker.li@gmail.com wrote:
>> From: Kui-Feng Lee <thinker.li@gmail.com>
>>
>> Check f6i->fib6_node before inserting a f6i (fib6_info) to tb6_gc_hlist.
> 
> any place setting expires should know if the entry is in a table or not.
> 
> And the syzbot report contains a reproducer, a kernel config and other
> means to test a patch.
> 

Fundamentally, the set and clear helpers are doing 2 things; they need
to be split into separate helpers.
Kui-Feng Lee Dec. 7, 2023, 11:33 p.m. UTC | #4
On 12/7/23 15:20, David Ahern wrote:
> On 12/7/23 4:17 PM, David Ahern wrote:
>> On 12/7/23 3:16 PM, thinker.li@gmail.com wrote:
>>> From: Kui-Feng Lee <thinker.li@gmail.com>
>>>
>>> Check f6i->fib6_node before inserting a f6i (fib6_info) to tb6_gc_hlist.
>>
>> any place setting expires should know if the entry is in a table or not.
>>
>> And the syzbot report contains a reproducer, a kernel config and other
>> means to test a patch.
>>
> 
> Fundamentally, the set and clear helpers are doing 2 things; they need
> to be split into separate helpers.

Sorry, I don't follow you.

There are fib6_set_expires_locked()) and fib6_clean_expires_locked(),
two separate helpers. Is this what you are saying?

Doing checks of f6i->fib6_node in fib6_set_expires_locked() should
already apply everywhere setting expires, right?

Do I miss anything?
David Ahern Dec. 8, 2023, 1:02 a.m. UTC | #5
On 12/7/23 4:33 PM, Kui-Feng Lee wrote:
> 
> 
> On 12/7/23 15:20, David Ahern wrote:
>> On 12/7/23 4:17 PM, David Ahern wrote:
>>> On 12/7/23 3:16 PM, thinker.li@gmail.com wrote:
>>>> From: Kui-Feng Lee <thinker.li@gmail.com>
>>>>
>>>> Check f6i->fib6_node before inserting a f6i (fib6_info) to
>>>> tb6_gc_hlist.
>>>
>>> any place setting expires should know if the entry is in a table or not.
>>>
>>> And the syzbot report contains a reproducer, a kernel config and other
>>> means to test a patch.
>>>
>>
>> Fundamentally, the set and clear helpers are doing 2 things; they need
>> to be split into separate helpers.
> 
> Sorry, I don't follow you.
> 
> There are fib6_set_expires_locked()) and fib6_clean_expires_locked(),
> two separate helpers. Is this what you are saying?
> 
> Doing checks of f6i->fib6_node in fib6_set_expires_locked() should
> already apply everywhere setting expires, right?
> 
> Do I miss anything?

static inline void fib6_set_expires_locked(struct fib6_info *f6i,
                                           unsigned long expires)
{
        struct fib6_table *tb6;

        tb6 = f6i->fib6_table;
        f6i->expires = expires;
        if (tb6 && !fib6_has_expires(f6i))
                hlist_add_head(&f6i->gc_link, &tb6->tb6_gc_hlist);
        f6i->fib6_flags |= RTF_EXPIRES;
}

1. You are abusing this helper in create_info to set the value of
expires knowing (expecting) tb6 to NOT be set and hence not setting
gc_link so no cleanup is needed on errors.

2. You then open code gc_link when adding to the tree.

I had my reservations when you sent this patch months ago, but I did not
have time to suggest a cleaner approach and now this is where we are --
trying to find some race corrupting the list.
Kui-Feng Lee Dec. 8, 2023, 1:03 a.m. UTC | #6
On 12/7/23 14:16, thinker.li@gmail.com wrote:
> From: Kui-Feng Lee <thinker.li@gmail.com>
> 
> Check f6i->fib6_node before inserting a f6i (fib6_info) to tb6_gc_hlist.
> 
> Previously, it checks only if f6i->fib6_table is not NULL, however it is
> not enough. When a f6i is removed from a fib6_table, it's fib6_table is not
> going to be reset. fib6_node is always reset when a f6i is removed from a
> fib6_table and set when a f6i is added to a fib6_table. By checking
> fib6_node, adding a f6i t0 tb6_gc_hlist only if f6i is in the table will be
> enforced.
> 
> Fixes: 3dec89b14d37 ("net/ipv6: Remove expired routes with a separated list of routes.")
> Reported-by: syzbot+c15aa445274af8674f41@syzkaller.appspotmail.com
> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: dsahern@kernel.org
> ---
>   include/net/ip6_fib.h | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
> index 95ed495c3a40..8477c9ff67ac 100644
> --- a/include/net/ip6_fib.h
> +++ b/include/net/ip6_fib.h
> @@ -512,7 +512,10 @@ static inline void fib6_set_expires_locked(struct fib6_info *f6i,
>   
>   	tb6 = f6i->fib6_table;
>   	f6i->expires = expires;
> -	if (tb6 && !fib6_has_expires(f6i))
> +	if (tb6 &&
> +	    rcu_dereference_protected(f6i->fib6_node,
> +				      lockdep_is_held(&tb6->tb6_lock)) &&
> +	    !fib6_has_expires(f6i))
>   		hlist_add_head(&f6i->gc_link, &tb6->tb6_gc_hlist);
>   	f6i->fib6_flags |= RTF_EXPIRES;
>   }

The following is the result from syzbot for my branch with this patch.

--------------
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger 
any issue:

Reported-and-tested-by: 
syzbot+c15aa445274af8674f41@syzkaller.appspotmail.com

Tested on:

commit:         ac408ca0 net/ipv6: insert the fib6 gc_link of a fib6_i..
git tree:       https://github.com/ThinkerYzu/linux.git 
fix-fib6_set_expires_locked
console output: https://syzkaller.appspot.com/x/log.txt?x=14a55d1ce80000
kernel config:  https://syzkaller.appspot.com/x/.config?x=f8715b6ede5c4b90
dashboard link: https://syzkaller.appspot.com/bug?extid=c15aa445274af8674f41
compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for 
Debian) 2.40

Note: no patches were applied.
Note: testing is done by a robot and is best-effort only.
--------------

How often can syzbot reproduce the issue described in [1]?


[1] 
https://lore.kernel.org/all/20231205173250.2982846-1-edumazet@google.com/
Kui-Feng Lee Dec. 8, 2023, 1:16 a.m. UTC | #7
On 12/7/23 17:02, David Ahern wrote:
> On 12/7/23 4:33 PM, Kui-Feng Lee wrote:
>>
>>
>> On 12/7/23 15:20, David Ahern wrote:
>>> On 12/7/23 4:17 PM, David Ahern wrote:
>>>> On 12/7/23 3:16 PM, thinker.li@gmail.com wrote:
>>>>> From: Kui-Feng Lee <thinker.li@gmail.com>
>>>>>
>>>>> Check f6i->fib6_node before inserting a f6i (fib6_info) to
>>>>> tb6_gc_hlist.
>>>>
>>>> any place setting expires should know if the entry is in a table or not.
>>>>
>>>> And the syzbot report contains a reproducer, a kernel config and other
>>>> means to test a patch.
>>>>
>>>
>>> Fundamentally, the set and clear helpers are doing 2 things; they need
>>> to be split into separate helpers.
>>
>> Sorry, I don't follow you.
>>
>> There are fib6_set_expires_locked()) and fib6_clean_expires_locked(),
>> two separate helpers. Is this what you are saying?
>>
>> Doing checks of f6i->fib6_node in fib6_set_expires_locked() should
>> already apply everywhere setting expires, right?
>>
>> Do I miss anything?
> 
> static inline void fib6_set_expires_locked(struct fib6_info *f6i,
>                                             unsigned long expires)
> {
>          struct fib6_table *tb6;
> 
>          tb6 = f6i->fib6_table;
>          f6i->expires = expires;
>          if (tb6 && !fib6_has_expires(f6i))
>                  hlist_add_head(&f6i->gc_link, &tb6->tb6_gc_hlist);
>          f6i->fib6_flags |= RTF_EXPIRES;
> }
> 
> 1. You are abusing this helper in create_info to set the value of
> expires knowing (expecting) tb6 to NOT be set and hence not setting
> gc_link so no cleanup is needed on errors.
> 
> 2. You then open code gc_link when adding to the tree.
> 
> I had my reservations when you sent this patch months ago, but I did not
> have time to suggest a cleaner approach and now this is where we are --
> trying to find some race corrupting the list.

So, what you said is to split fib6_set_expires_locked() into two
functions. One is setting expires, and the other is adding a f6i to
tb6_gc_hlist. fib6_clean_expires_locked() should be split in the same
way.

Is it correct?
David Ahern Dec. 8, 2023, 1:23 a.m. UTC | #8
On 12/7/23 6:16 PM, Kui-Feng Lee wrote:
> 
> 
> On 12/7/23 17:02, David Ahern wrote:
>> On 12/7/23 4:33 PM, Kui-Feng Lee wrote:
>>>
>>>
>>> On 12/7/23 15:20, David Ahern wrote:
>>>> On 12/7/23 4:17 PM, David Ahern wrote:
>>>>> On 12/7/23 3:16 PM, thinker.li@gmail.com wrote:
>>>>>> From: Kui-Feng Lee <thinker.li@gmail.com>
>>>>>>
>>>>>> Check f6i->fib6_node before inserting a f6i (fib6_info) to
>>>>>> tb6_gc_hlist.
>>>>>
>>>>> any place setting expires should know if the entry is in a table or
>>>>> not.
>>>>>
>>>>> And the syzbot report contains a reproducer, a kernel config and other
>>>>> means to test a patch.
>>>>>
>>>>
>>>> Fundamentally, the set and clear helpers are doing 2 things; they need
>>>> to be split into separate helpers.
>>>
>>> Sorry, I don't follow you.
>>>
>>> There are fib6_set_expires_locked()) and fib6_clean_expires_locked(),
>>> two separate helpers. Is this what you are saying?
>>>
>>> Doing checks of f6i->fib6_node in fib6_set_expires_locked() should
>>> already apply everywhere setting expires, right?
>>>
>>> Do I miss anything?
>>
>> static inline void fib6_set_expires_locked(struct fib6_info *f6i,
>>                                             unsigned long expires)
>> {
>>          struct fib6_table *tb6;
>>
>>          tb6 = f6i->fib6_table;
>>          f6i->expires = expires;
>>          if (tb6 && !fib6_has_expires(f6i))
>>                  hlist_add_head(&f6i->gc_link, &tb6->tb6_gc_hlist);

--> no check that f6i is already in the list yet fib6_set_expires and
fib6_set_expires_locked are called on existing entries - entries already
in the tree and so *maybe* already linked. See fib6_add_rt2node and most
of fib6_set_expires callers.

Your selftests only check that entries are removed; it does not check
updating the expires time on an existing entry. It does not check a
route replace that toggles between no expires value or to an expires
(fib6_add_rt2node use of fib6_set_expires_locked) and then replacing
that route -- various permutations here.


>>          f6i->fib6_flags |= RTF_EXPIRES;
>> }
>>
>> 1. You are abusing this helper in create_info to set the value of
>> expires knowing (expecting) tb6 to NOT be set and hence not setting
>> gc_link so no cleanup is needed on errors.
>>
>> 2. You then open code gc_link when adding to the tree.
>>
>> I had my reservations when you sent this patch months ago, but I did not
>> have time to suggest a cleaner approach and now this is where we are --
>> trying to find some race corrupting the list.
> 
> So, what you said is to split fib6_set_expires_locked() into two
> functions. One is setting expires, and the other is adding a f6i to
> tb6_gc_hlist. fib6_clean_expires_locked() should be split in the same
> way.
> 
> Is it correct?
> 

one helper sets expires value and one helper that adds/removes from the
gc_link. If it is already linked, then no need to do it again.
Kui-Feng Lee Dec. 8, 2023, 2:30 a.m. UTC | #9
On 12/7/23 17:23, David Ahern wrote:
> On 12/7/23 6:16 PM, Kui-Feng Lee wrote:
>>
>>
>> On 12/7/23 17:02, David Ahern wrote:
>>> On 12/7/23 4:33 PM, Kui-Feng Lee wrote:
>>>>
>>>>
>>>> On 12/7/23 15:20, David Ahern wrote:
>>>>> On 12/7/23 4:17 PM, David Ahern wrote:
>>>>>> On 12/7/23 3:16 PM, thinker.li@gmail.com wrote:
>>>>>>> From: Kui-Feng Lee <thinker.li@gmail.com>
>>>>>>>
>>>>>>> Check f6i->fib6_node before inserting a f6i (fib6_info) to
>>>>>>> tb6_gc_hlist.
>>>>>>
>>>>>> any place setting expires should know if the entry is in a table or
>>>>>> not.
>>>>>>
>>>>>> And the syzbot report contains a reproducer, a kernel config and other
>>>>>> means to test a patch.
>>>>>>
>>>>>
>>>>> Fundamentally, the set and clear helpers are doing 2 things; they need
>>>>> to be split into separate helpers.
>>>>
>>>> Sorry, I don't follow you.
>>>>
>>>> There are fib6_set_expires_locked()) and fib6_clean_expires_locked(),
>>>> two separate helpers. Is this what you are saying?
>>>>
>>>> Doing checks of f6i->fib6_node in fib6_set_expires_locked() should
>>>> already apply everywhere setting expires, right?
>>>>
>>>> Do I miss anything?
>>>
>>> static inline void fib6_set_expires_locked(struct fib6_info *f6i,
>>>                                              unsigned long expires)
>>> {
>>>           struct fib6_table *tb6;
>>>
>>>           tb6 = f6i->fib6_table;
>>>           f6i->expires = expires;
>>>           if (tb6 && !fib6_has_expires(f6i))
>>>                   hlist_add_head(&f6i->gc_link, &tb6->tb6_gc_hlist);
> 
> --> no check that f6i is already in the list yet fib6_set_expires and
> fib6_set_expires_locked are called on existing entries - entries already
> in the tree and so *maybe* already linked. See fib6_add_rt2node and most
> of fib6_set_expires callers.

Although it bases on a false assumption, checking tb6 ensures the
entry is added to the list only if this f6i is already on the tree.
The correct one should checks f6i->fib6_node.
So, with checking f6i->fib6_ndoe and the open code you mentioned,
fib6_has_expires() does check if a f6i is already in the list.

But, like what you mentioned earlier, hlist_unhashed(&f6i->gc_link) is
clearer. I will move to ti.

> 
> Your selftests only check that entries are removed; it does not check
> updating the expires time on an existing entry. It does not check a
> route replace that toggles between no expires value or to an expires
> (fib6_add_rt2node use of fib6_set_expires_locked) and then replacing
> that route -- various permutations here.
> 
> 
>>>           f6i->fib6_flags |= RTF_EXPIRES;
>>> }
>>>
>>> 1. You are abusing this helper in create_info to set the value of
>>> expires knowing (expecting) tb6 to NOT be set and hence not setting
>>> gc_link so no cleanup is needed on errors.
>>>
>>> 2. You then open code gc_link when adding to the tree.
>>>
>>> I had my reservations when you sent this patch months ago, but I did not
>>> have time to suggest a cleaner approach and now this is where we are --
>>> trying to find some race corrupting the list.
>>
>> So, what you said is to split fib6_set_expires_locked() into two
>> functions. One is setting expires, and the other is adding a f6i to
>> tb6_gc_hlist. fib6_clean_expires_locked() should be split in the same
>> way.
>>
>> Is it correct?
>>
> 
> one helper sets expires value and one helper that adds/removes from the
> gc_link. If it is already linked, then no need to do it again.

Got it!
diff mbox series

Patch

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 95ed495c3a40..8477c9ff67ac 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -512,7 +512,10 @@  static inline void fib6_set_expires_locked(struct fib6_info *f6i,
 
 	tb6 = f6i->fib6_table;
 	f6i->expires = expires;
-	if (tb6 && !fib6_has_expires(f6i))
+	if (tb6 &&
+	    rcu_dereference_protected(f6i->fib6_node,
+				      lockdep_is_held(&tb6->tb6_lock)) &&
+	    !fib6_has_expires(f6i))
 		hlist_add_head(&f6i->gc_link, &tb6->tb6_gc_hlist);
 	f6i->fib6_flags |= RTF_EXPIRES;
 }