diff mbox series

[net] net/sched: fix initialization order when updating chain 0 head

Message ID b97d5f4eaffeeb9d058155bcab63347527261abf.1649341369.git.marcelo.leitner@gmail.com (mailing list archive)
State Accepted
Commit e65812fd22eba32f11abe28cb377cbd64cfb1ba0
Delegated to: Netdev Maintainers
Headers show
Series [net] net/sched: fix initialization order when updating chain 0 head | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 9 this patch: 9
netdev/cc_maintainers fail 1 blamed authors not CCed: davem@davemloft.net; 3 maintainers not CCed: davem@davemloft.net kuba@kernel.org pabeni@redhat.com
netdev/build_clang success Errors and warnings before: 9 this patch: 9
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 9 this patch: 9
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 11 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Marcelo Ricardo Leitner April 7, 2022, 2:29 p.m. UTC
Currently, when inserting a new filter that needs to sit at the head
of chain 0, it will first update the heads pointer on all devices using
the (shared) block, and only then complete the initialization of the new
element so that it has a "next" element.

This can lead to a situation that the chain 0 head is propagated to
another CPU before the "next" initialization is done. When this race
condition is triggered, packets being matched on that CPU will simply
miss all other filters, and will flow through the stack as if there were
no other filters installed. If the system is using OVS + TC, such
packets will get handled by vswitchd via upcall, which results in much
higher latency and reordering. For other applications it may result in
packet drops.

This is reproducible with a tc only setup, but it varies from system to
system. It could be reproduced with a shared block amongst 10 veth
tunnels, and an ingress filter mirroring packets to another veth.
That's because using the last added veth tunnel to the shared block to
do the actual traffic, it makes the race window bigger and easier to
trigger.

The fix is rather simple, to just initialize the next pointer of the new
filter instance (tp) before propagating the head change.

The fixes tag is pointing to the original code though this issue should
only be observed when using it unlocked.

Fixes: 2190d1d0944f ("net: sched: introduce helpers to work with filter chains")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
---
 net/sched/cls_api.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

patchwork-bot+netdevbpf@kernel.org April 8, 2022, 10 p.m. UTC | #1
Hello:

This patch was applied to netdev/net.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Thu,  7 Apr 2022 11:29:23 -0300 you wrote:
> Currently, when inserting a new filter that needs to sit at the head
> of chain 0, it will first update the heads pointer on all devices using
> the (shared) block, and only then complete the initialization of the new
> element so that it has a "next" element.
> 
> This can lead to a situation that the chain 0 head is propagated to
> another CPU before the "next" initialization is done. When this race
> condition is triggered, packets being matched on that CPU will simply
> miss all other filters, and will flow through the stack as if there were
> no other filters installed. If the system is using OVS + TC, such
> packets will get handled by vswitchd via upcall, which results in much
> higher latency and reordering. For other applications it may result in
> packet drops.
> 
> [...]

Here is the summary with links:
  - [net] net/sched: fix initialization order when updating chain 0 head
    https://git.kernel.org/netdev/net/c/e65812fd22eb

You are awesome, thank you!
diff mbox series

Patch

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 2957f8f5cea759315463d5e61fa1db745746e6f7..f0699f39afdb082067e581a5ff1ce217351c4a19 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -1672,10 +1672,10 @@  static int tcf_chain_tp_insert(struct tcf_chain *chain,
 	if (chain->flushing)
 		return -EAGAIN;
 
+	RCU_INIT_POINTER(tp->next, tcf_chain_tp_prev(chain, chain_info));
 	if (*chain_info->pprev == chain->filter_chain)
 		tcf_chain0_head_change(chain, tp);
 	tcf_proto_get(tp);
-	RCU_INIT_POINTER(tp->next, tcf_chain_tp_prev(chain, chain_info));
 	rcu_assign_pointer(*chain_info->pprev, tp);
 
 	return 0;