Message ID | cover.1689541664.git.lucien.xin@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | net: handle the exp removal problem with ovs upcall properly | expand |
On Sun, 16 Jul 2023 17:09:16 -0400 Xin Long wrote: > With the OVS upcall, the original ct in the skb will be dropped, and when > the skb comes back from userspace it has to create a new ct again through > nf_conntrack_in() in either OVS __ovs_ct_lookup() or TC tcf_ct_act(). > > However, the new ct will not be able to have the exp as the original ct > has taken it away from the hash table in nf_ct_find_expectation(). This > will cause some flow never to be matched, like: > > 'ip,ct_state=-trk,in_port=1 actions=ct(zone=1)' > 'ip,ct_state=+trk+new+rel,in_port=1 actions=ct(commit,zone=1)' > 'ip,ct_state=+trk+new+rel,in_port=1 actions=ct(commit,zone=2),normal' > > if the 2nd flow triggers the OVS upcall, the 3rd flow will never get > matched. > > OVS conntrack works around this by adding its own exp lookup function to > not remove the exp from the hash table and saving the exp and its master > info to the flow keys instead of create a real ct. But this way doesn't > work for TC act_ct. > > The patch 1/3 allows nf_ct_find_expectation() not to remove the exp from > the hash table if tmpl is set with IPS_CONFIRMED when doing lookup. This > allows both OVS conntrack and TC act_ct to have a simple and clear fix > for this problem in the patch 2/3 and 3/3. Florian, Pablo, any opinion on these? Would you prefer to take them via netfilter?
Jakub Kicinski <kuba@kernel.org> wrote: > On Sun, 16 Jul 2023 17:09:16 -0400 Xin Long wrote: > > With the OVS upcall, the original ct in the skb will be dropped, and when > > the skb comes back from userspace it has to create a new ct again through > > nf_conntrack_in() in either OVS __ovs_ct_lookup() or TC tcf_ct_act(). > > > > However, the new ct will not be able to have the exp as the original ct > > has taken it away from the hash table in nf_ct_find_expectation(). This > > will cause some flow never to be matched, like: > > > > 'ip,ct_state=-trk,in_port=1 actions=ct(zone=1)' > > 'ip,ct_state=+trk+new+rel,in_port=1 actions=ct(commit,zone=1)' > > 'ip,ct_state=+trk+new+rel,in_port=1 actions=ct(commit,zone=2),normal' > > > > if the 2nd flow triggers the OVS upcall, the 3rd flow will never get > > matched. > > > > OVS conntrack works around this by adding its own exp lookup function to > > not remove the exp from the hash table and saving the exp and its master > > info to the flow keys instead of create a real ct. But this way doesn't > > work for TC act_ct. > > > > The patch 1/3 allows nf_ct_find_expectation() not to remove the exp from > > the hash table if tmpl is set with IPS_CONFIRMED when doing lookup. This > > allows both OVS conntrack and TC act_ct to have a simple and clear fix > > for this problem in the patch 2/3 and 3/3. > > Florian, Pablo, any opinion on these? Sorry for the silence. I dislike moving tc/ovs artifacts into the conntrack core. But so far I could not come up with a counter-proposal, and it doesn't look too outrageous. Let me look at it again later today (its early morning here) and I will get back to this in my late evening.
Xin Long <lucien.xin@gmail.com> writes: > With the OVS upcall, the original ct in the skb will be dropped, and when > the skb comes back from userspace it has to create a new ct again through > nf_conntrack_in() in either OVS __ovs_ct_lookup() or TC tcf_ct_act(). > > However, the new ct will not be able to have the exp as the original ct > has taken it away from the hash table in nf_ct_find_expectation(). This > will cause some flow never to be matched, like: > > 'ip,ct_state=-trk,in_port=1 actions=ct(zone=1)' > 'ip,ct_state=+trk+new+rel,in_port=1 actions=ct(commit,zone=1)' > 'ip,ct_state=+trk+new+rel,in_port=1 actions=ct(commit,zone=2),normal' > > if the 2nd flow triggers the OVS upcall, the 3rd flow will never get > matched. > > OVS conntrack works around this by adding its own exp lookup function to > not remove the exp from the hash table and saving the exp and its master > info to the flow keys instead of create a real ct. But this way doesn't > work for TC act_ct. > > The patch 1/3 allows nf_ct_find_expectation() not to remove the exp from > the hash table if tmpl is set with IPS_CONFIRMED when doing lookup. This > allows both OVS conntrack and TC act_ct to have a simple and clear fix > for this problem in the patch 2/3 and 3/3. > > Xin Long (3): > netfilter: allow exp not to be removed in nf_ct_find_expectation > net: sched: set IPS_CONFIRMED in tmpl status only when commit is set > in act_ct > openvswitch: set IPS_CONFIRMED in tmpl status only when commit is set > in conntrack > > include/net/netfilter/nf_conntrack_expect.h | 2 +- > net/netfilter/nf_conntrack_core.c | 2 +- > net/netfilter/nf_conntrack_expect.c | 4 +- > net/netfilter/nft_ct.c | 2 + > net/openvswitch/conntrack.c | 78 +++------------------ > net/sched/act_ct.c | 3 +- > 6 files changed, 18 insertions(+), 73 deletions(-) Hi Xin, The changes look okay to me, and I don't see anything that immediately jumps out. I would like to test it today / tomorrow with the ovs check-kernel testsuite if you haven't done so already. -Aaron
Florian Westphal <fw@strlen.de> wrote: > Jakub Kicinski <kuba@kernel.org> wrote: > > On Sun, 16 Jul 2023 17:09:16 -0400 Xin Long wrote: > > > With the OVS upcall, the original ct in the skb will be dropped, and when > > > the skb comes back from userspace it has to create a new ct again through > > > nf_conntrack_in() in either OVS __ovs_ct_lookup() or TC tcf_ct_act(). > > > > > > However, the new ct will not be able to have the exp as the original ct > > > has taken it away from the hash table in nf_ct_find_expectation(). This > > > will cause some flow never to be matched, like: > > > > > > 'ip,ct_state=-trk,in_port=1 actions=ct(zone=1)' > > > 'ip,ct_state=+trk+new+rel,in_port=1 actions=ct(commit,zone=1)' > > > 'ip,ct_state=+trk+new+rel,in_port=1 actions=ct(commit,zone=2),normal' > > > > > > if the 2nd flow triggers the OVS upcall, the 3rd flow will never get > > > matched. > > > > > > OVS conntrack works around this by adding its own exp lookup function to > > > not remove the exp from the hash table and saving the exp and its master > > > info to the flow keys instead of create a real ct. But this way doesn't > > > work for TC act_ct. > > > > > > The patch 1/3 allows nf_ct_find_expectation() not to remove the exp from > > > the hash table if tmpl is set with IPS_CONFIRMED when doing lookup. This > > > allows both OVS conntrack and TC act_ct to have a simple and clear fix > > > for this problem in the patch 2/3 and 3/3. > > > > Florian, Pablo, any opinion on these? > > Sorry for the silence. I dislike moving tc/ovs artifacts into > the conntrack core. Can't find a better solution, feel free to take this though the net-next tree. Acked-by: Florian Westphal <fw@strlen.de>
Hello: This series was applied to netdev/net-next.git (main) by Paolo Abeni <pabeni@redhat.com>: On Sun, 16 Jul 2023 17:09:16 -0400 you wrote: > With the OVS upcall, the original ct in the skb will be dropped, and when > the skb comes back from userspace it has to create a new ct again through > nf_conntrack_in() in either OVS __ovs_ct_lookup() or TC tcf_ct_act(). > > However, the new ct will not be able to have the exp as the original ct > has taken it away from the hash table in nf_ct_find_expectation(). This > will cause some flow never to be matched, like: > > [...] Here is the summary with links: - [net-next,1/3] netfilter: allow exp not to be removed in nf_ct_find_expectation https://git.kernel.org/netdev/net-next/c/4914109a8e1e - [net-next,2/3] net: sched: set IPS_CONFIRMED in tmpl status only when commit is set in act_ct https://git.kernel.org/netdev/net-next/c/76622ced50a1 - [net-next,3/3] openvswitch: set IPS_CONFIRMED in tmpl status only when commit is set in conntrack https://git.kernel.org/netdev/net-next/c/8c8b73320805 You are awesome, thank you!