diff mbox series

[net,v2] ipmr: Fix access to mfc_cache_list without lock held

Message ID 20241108-ipmr_rcu-v2-1-c718998e209b@debian.org (mailing list archive)
State Accepted
Commit e28acc9c1ccfcb24c08e020828f69d0a915b06ae
Delegated to: Netdev Maintainers
Headers show
Series [net,v2] ipmr: Fix access to mfc_cache_list without lock held | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present fail Series targets non-next tree, but doesn't contain any Fixes tags
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 3 this patch: 3
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 6 of 6 maintainers
netdev/build_clang success Errors and warnings before: 3 this patch: 3
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 6 this patch: 6
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 9 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-11-11--21-00 (tests: 787)

Commit Message

Breno Leitao Nov. 8, 2024, 2:08 p.m. UTC
Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the
following code flow, the RCU read lock is not held, causing the
following error when `RCU_PROVE` is not held. The same problem might
show up in the IPv6 code path.

	6.12.0-rc5-kbuilder-01145-gbac17284bdcb #33 Tainted: G            E    N
	-----------------------------
	net/ipv4/ipmr_base.c:313 RCU-list traversed in non-reader section!!

	rcu_scheduler_active = 2, debug_locks = 1
		   2 locks held by RetransmitAggre/3519:
		    #0: ffff88816188c6c0 (nlk_cb_mutex-ROUTE){+.+.}-{3:3}, at: __netlink_dump_start+0x8a/0x290
		    #1: ffffffff83fcf7a8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_dumpit+0x6b/0x90

	stack backtrace:
		    lockdep_rcu_suspicious
		    mr_table_dump
		    ipmr_rtm_dumproute
		    rtnl_dump_all
		    rtnl_dumpit
		    netlink_dump
		    __netlink_dump_start
		    rtnetlink_rcv_msg
		    netlink_rcv_skb
		    netlink_unicast
		    netlink_sendmsg

This is not a problem per see, since the RTNL lock is held here, so, it
is safe to iterate in the list without the RCU read lock, as suggested
by Eric.

To alleviate the concern, modify the code to use
list_for_each_entry_rcu() with the RTNL-held argument.

The annotation will raise an error only if RTNL or RCU read lock are
missing during iteration, signaling a legitimate problem, otherwise it
will avoid this false positive.

This will solve the IPv6 case as well, since ip6mr_rtm_dumproute() calls
this function as well.

Signed-off-by: Breno Leitao <leitao@debian.org>
---
Changes in v2:
- Instead of getting an RCU read lock, rely on rtnl mutex (Eric)
- Link to v1: https://lore.kernel.org/r/20241107-ipmr_rcu-v1-1-ad0cba8dffed@debian.org
- Still sending it against `net`, so, since this warning is annoying
---
 net/ipv4/ipmr_base.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)


---
base-commit: 25d70702142ac2115e75e01a0a985c6ea1d78033
change-id: 20241107-ipmr_rcu-291d85400b16

Best regards,

Comments

David Ahern Nov. 11, 2024, 1 a.m. UTC | #1
On 11/8/24 7:08 AM, Breno Leitao wrote:
> Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the
> following code flow, the RCU read lock is not held, causing the
> following error when `RCU_PROVE` is not held. The same problem might
> show up in the IPv6 code path.
> 
> 	6.12.0-rc5-kbuilder-01145-gbac17284bdcb #33 Tainted: G            E    N
> 	-----------------------------
> 	net/ipv4/ipmr_base.c:313 RCU-list traversed in non-reader section!!
> 
> 	rcu_scheduler_active = 2, debug_locks = 1
> 		   2 locks held by RetransmitAggre/3519:
> 		    #0: ffff88816188c6c0 (nlk_cb_mutex-ROUTE){+.+.}-{3:3}, at: __netlink_dump_start+0x8a/0x290
> 		    #1: ffffffff83fcf7a8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_dumpit+0x6b/0x90
> 
> 	stack backtrace:
> 		    lockdep_rcu_suspicious
> 		    mr_table_dump
> 		    ipmr_rtm_dumproute
> 		    rtnl_dump_all
> 		    rtnl_dumpit
> 		    netlink_dump
> 		    __netlink_dump_start
> 		    rtnetlink_rcv_msg
> 		    netlink_rcv_skb
> 		    netlink_unicast
> 		    netlink_sendmsg
> 
> This is not a problem per see, since the RTNL lock is held here, so, it
> is safe to iterate in the list without the RCU read lock, as suggested
> by Eric.
> 
> To alleviate the concern, modify the code to use
> list_for_each_entry_rcu() with the RTNL-held argument.
> 
> The annotation will raise an error only if RTNL or RCU read lock are
> missing during iteration, signaling a legitimate problem, otherwise it
> will avoid this false positive.
> 
> This will solve the IPv6 case as well, since ip6mr_rtm_dumproute() calls
> this function as well.
> 
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
> Changes in v2:
> - Instead of getting an RCU read lock, rely on rtnl mutex (Eric)
> - Link to v1: https://lore.kernel.org/r/20241107-ipmr_rcu-v1-1-ad0cba8dffed@debian.org
> - Still sending it against `net`, so, since this warning is annoying
> ---
>  net/ipv4/ipmr_base.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>
Jakub Kicinski Nov. 14, 2024, 3:10 a.m. UTC | #2
On Fri, 08 Nov 2024 06:08:36 -0800 Breno Leitao wrote:
> Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the
> following code flow, the RCU read lock is not held, causing the
> following error when `RCU_PROVE` is not held. The same problem might
> show up in the IPv6 code path.

good start, I hope someone can fix the gazillion warnings the CI 
is hitting on the table accesses :)
patchwork-bot+netdevbpf@kernel.org Nov. 14, 2024, 3:20 a.m. UTC | #3
Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Fri, 08 Nov 2024 06:08:36 -0800 you wrote:
> Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the
> following code flow, the RCU read lock is not held, causing the
> following error when `RCU_PROVE` is not held. The same problem might
> show up in the IPv6 code path.
> 
> 	6.12.0-rc5-kbuilder-01145-gbac17284bdcb #33 Tainted: G            E    N
> 
> [...]

Here is the summary with links:
  - [net,v2] ipmr: Fix access to mfc_cache_list without lock held
    https://git.kernel.org/netdev/net/c/e28acc9c1ccf

You are awesome, thank you!
Breno Leitao Nov. 14, 2024, 8:55 a.m. UTC | #4
Hello Jakub,

On Wed, Nov 13, 2024 at 07:10:23PM -0800, Jakub Kicinski wrote:
> On Fri, 08 Nov 2024 06:08:36 -0800 Breno Leitao wrote:
> > Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the
> > following code flow, the RCU read lock is not held, causing the
> > following error when `RCU_PROVE` is not held. The same problem might
> > show up in the IPv6 code path.
> 
> good start, I hope someone can fix the gazillion warnings the CI 
> is hitting on the table accesses :)

If you have an updated list, I'd be happy to take a look.

Last time, all the problems I found were being discussed upstream
already. I am wondering if they didn't land upstream, or, if you have
warnings that are not being currently fixed.
Jakub Kicinski Nov. 14, 2024, 3:03 p.m. UTC | #5
On Thu, 14 Nov 2024 00:55:57 -0800 Breno Leitao wrote:
> On Wed, Nov 13, 2024 at 07:10:23PM -0800, Jakub Kicinski wrote:
> > On Fri, 08 Nov 2024 06:08:36 -0800 Breno Leitao wrote:  
> > > Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the
> > > following code flow, the RCU read lock is not held, causing the
> > > following error when `RCU_PROVE` is not held. The same problem might
> > > show up in the IPv6 code path.  
> > 
> > good start, I hope someone can fix the gazillion warnings the CI 
> > is hitting on the table accesses :)  
> 
> If you have an updated list, I'd be happy to take a look.
> 
> Last time, all the problems I found were being discussed upstream
> already. I am wondering if they didn't land upstream, or, if you have
> warnings that are not being currently fixed.

https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/859762/82-router-multicast-sh/stderr
Breno Leitao Nov. 15, 2024, 9:16 a.m. UTC | #6
On Thu, Nov 14, 2024 at 07:03:08AM -0800, Jakub Kicinski wrote:
> On Thu, 14 Nov 2024 00:55:57 -0800 Breno Leitao wrote:
> > On Wed, Nov 13, 2024 at 07:10:23PM -0800, Jakub Kicinski wrote:
> > > On Fri, 08 Nov 2024 06:08:36 -0800 Breno Leitao wrote:  
> > > > Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the
> > > > following code flow, the RCU read lock is not held, causing the
> > > > following error when `RCU_PROVE` is not held. The same problem might
> > > > show up in the IPv6 code path.  
> > > 
> > > good start, I hope someone can fix the gazillion warnings the CI 
> > > is hitting on the table accesses :)  
> > 
> > If you have an updated list, I'd be happy to take a look.
> > 
> > Last time, all the problems I found were being discussed upstream
> > already. I am wondering if they didn't land upstream, or, if you have
> > warnings that are not being currently fixed.
> 
> https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/859762/82-router-multicast-sh/stderr

This one seems to be discussed in the following thread already.

https://lore.kernel.org/all/20241017174109.85717-1-stefan.wiehler@nokia.com/
Jakub Kicinski Nov. 15, 2024, 4 p.m. UTC | #7
On Fri, 15 Nov 2024 01:16:27 -0800 Breno Leitao wrote:
> This one seems to be discussed in the following thread already.
> 
> https://lore.kernel.org/all/20241017174109.85717-1-stefan.wiehler@nokia.com/

That's why it rung a bell..
Stefan, are you planning to continue with the series?
Stefan Wiehler Nov. 15, 2024, 4:07 p.m. UTC | #8
> On Fri, 15 Nov 2024 01:16:27 -0800 Breno Leitao wrote:
>> This one seems to be discussed in the following thread already.
>>
>> https://lore.kernel.org/all/20241017174109.85717-1-stefan.wiehler@nokia.com/
> 
> That's why it rung a bell..
> Stefan, are you planning to continue with the series?

Yes, sorry for the delay, went on vacation and was busy with other tasks, but
next week I plan to continue (i.e. refactor using refcount_t).

Kind regards,

Stefan
Paolo Abeni Nov. 15, 2024, 4:55 p.m. UTC | #9
On 11/15/24 17:07, Stefan Wiehler wrote:
>> On Fri, 15 Nov 2024 01:16:27 -0800 Breno Leitao wrote:
>>> This one seems to be discussed in the following thread already.
>>>
>>> https://lore.kernel.org/all/20241017174109.85717-1-stefan.wiehler@nokia.com/
>>
>> That's why it rung a bell..
>> Stefan, are you planning to continue with the series?
> 
> Yes, sorry for the delay, went on vacation and was busy with other tasks, but
> next week I plan to continue (i.e. refactor using refcount_t).

I forgot about that series and spent a little time investigating the
scenario.

I think we don't need a refcount: the tables are freed only at netns
cleanup time, so the netns refcount is enough to guarantee that the
tables are not deleted when escaping the RCU section.

Some debug assertions could help clarify, document and make the schema
more robust to later change.

Side note, I think we need to drop the RCU lock moved by:

https://lore.kernel.org/all/20241017174109.85717-2-stefan.wiehler@nokia.com/

as the seqfile core can call blocking functions - alloc(GFP_KERNEL) -
between ->start() and ->stop().

The issue is pre-existent to that patch, and even to the patch
introducing the original RCU() - the old read_lock() created an illegal
atomic scope - but I think we should address it while touching this code.

I have some patches implementing the above but I have hard times testing
vs net/forwarding, as the forwarding ST setup is eluding me - with
mausezahn internal errors.

@Jakub: do you have by chance any cheap tip handy about the forwarding
self-tests setup?

Thanks,

Paolo
Jakub Kicinski Nov. 15, 2024, 7:16 p.m. UTC | #10
On Fri, 15 Nov 2024 17:55:11 +0100 Paolo Abeni wrote:
> @Jakub: do you have by chance any cheap tip handy about the forwarding
> self-tests setup?

I presume you mean how to get all the weird tools it requires in place?

This is all the things we build on the worker:

https://github.com/linux-netdev/nipa/blob/main/deploy/contest/remote/worker-setup.sh

For mcast routing you only need a handful of those (mtools, smcroute,
ndisc6?), but we don't really track which is needed where :S

Ideally we'd use some image building to compose this automatically..
one day.
Paolo Abeni Nov. 20, 2024, 9:54 a.m. UTC | #11
On 11/15/24 17:55, Paolo Abeni wrote:
> On 11/15/24 17:07, Stefan Wiehler wrote:
>>> On Fri, 15 Nov 2024 01:16:27 -0800 Breno Leitao wrote:
>>>> This one seems to be discussed in the following thread already.
>>>>
>>>> https://lore.kernel.org/all/20241017174109.85717-1-stefan.wiehler@nokia.com/
>>>
>>> That's why it rung a bell..
>>> Stefan, are you planning to continue with the series?
>>
>> Yes, sorry for the delay, went on vacation and was busy with other tasks, but
>> next week I plan to continue (i.e. refactor using refcount_t).
> 
> I forgot about that series and spent a little time investigating the
> scenario.
> 
> I think we don't need a refcount: the tables are freed only at netns
> cleanup time, so the netns refcount is enough to guarantee that the
> tables are not deleted when escaping the RCU section.
> 
> Some debug assertions could help clarify, document and make the schema
> more robust to later change.
> 
> Side note, I think we need to drop the RCU lock moved by:
> 
> https://lore.kernel.org/all/20241017174109.85717-2-stefan.wiehler@nokia.com/
> 
> as the seqfile core can call blocking functions - alloc(GFP_KERNEL) -
> between ->start() and ->stop().
> 
> The issue is pre-existent to that patch, and even to the patch
> introducing the original RCU() - the old read_lock() created an illegal
> atomic scope - but I think we should address it while touching this code.

@Stefan: are you ok if I go ahead with this work, or do you prefer
finish it yourself?

Thanks,

Paolo
Stefan Wiehler Nov. 21, 2024, 2:47 p.m. UTC | #12
> On 11/15/24 17:55, Paolo Abeni wrote:
>> On 11/15/24 17:07, Stefan Wiehler wrote:
>>>> On Fri, 15 Nov 2024 01:16:27 -0800 Breno Leitao wrote:
>>>>> This one seems to be discussed in the following thread already.
>>>>>
>>>>> https://lore.kernel.org/all/20241017174109.85717-1-stefan.wiehler@nokia.com/
>>>>
>>>> That's why it rung a bell..
>>>> Stefan, are you planning to continue with the series?
>>>
>>> Yes, sorry for the delay, went on vacation and was busy with other tasks, but
>>> next week I plan to continue (i.e. refactor using refcount_t).
>>
>> I forgot about that series and spent a little time investigating the
>> scenario.
>>
>> I think we don't need a refcount: the tables are freed only at netns
>> cleanup time, so the netns refcount is enough to guarantee that the
>> tables are not deleted when escaping the RCU section.
>>
>> Some debug assertions could help clarify, document and make the schema
>> more robust to later change.
>>
>> Side note, I think we need to drop the RCU lock moved by:
>>
>> https://lore.kernel.org/all/20241017174109.85717-2-stefan.wiehler@nokia.com/
>>
>> as the seqfile core can call blocking functions - alloc(GFP_KERNEL) -
>> between ->start() and ->stop().
>>
>> The issue is pre-existent to that patch, and even to the patch
>> introducing the original RCU() - the old read_lock() created an illegal
>> atomic scope - but I think we should address it while touching this code.
> 
> @Stefan: are you ok if I go ahead with this work, or do you prefer
> finish it yourself?

Please go ahead, I have neither the expertise in the net subsystem nor the time
to ramp-up (since this is just a side finding for us right now) to proceed with
your proposal. I'll follow the discussion though and hope to learn something
along the way!

Kind regards,

Stefan
diff mbox series

Patch

diff --git a/net/ipv4/ipmr_base.c b/net/ipv4/ipmr_base.c
index 271dc03fc6dbd9b35db4d5782716679134f225e4..f0af12a2f70bcdf5ba54321bf7ebebe798318abb 100644
--- a/net/ipv4/ipmr_base.c
+++ b/net/ipv4/ipmr_base.c
@@ -310,7 +310,8 @@  int mr_table_dump(struct mr_table *mrt, struct sk_buff *skb,
 	if (filter->filter_set)
 		flags |= NLM_F_DUMP_FILTERED;
 
-	list_for_each_entry_rcu(mfc, &mrt->mfc_cache_list, list) {
+	list_for_each_entry_rcu(mfc, &mrt->mfc_cache_list, list,
+				lockdep_rtnl_is_held()) {
 		if (e < s_e)
 			goto next_entry;
 		if (filter->dev &&