Message ID | 20240607083205.3000-2-fw@strlen.de (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | net: flow dissector: allow explicit passing of netns |
On Fri, Jun 7, 2024 at 10:36 AM Florian Westphal <fw@strlen.de> wrote:
>
> Years ago flow dissector gained ability to delegate flow dissection
> to a bpf program, scoped per netns.
>
> Unfortunately, skb_get_hash() only gets an sk_buff argument instead
> of both net+skb. This means the flow dissector needs to obtain the
> netns pointer from somewhere else.
>
> The netns is derived from skb->dev, and if that is not available, from
> skb->sk. If neither is set, we hit a (benign) WARN_ON_ONCE().
>
> Trying both dev and sk covers most cases, but not all, as recently
> reported by Christoph Paasch.
>
> In case of nf-generated tcp reset, both sk and dev are NULL:
>
> WARNING: .. net/core/flow_dissector.c:1104
>  skb_flow_dissect_flow_keys include/linux/skbuff.h:1536 [inline]
>  skb_get_hash include/linux/skbuff.h:1578 [inline]
>  nft_trace_init+0x7d/0x120 net/netfilter/nf_tables_trace.c:320
>  nft_do_chain+0xb26/0xb90 net/netfilter/nf_tables_core.c:268
>  nft_do_chain_ipv4+0x7a/0xa0 net/netfilter/nft_chain_filter.c:23
>  nf_hook_slow+0x57/0x160 net/netfilter/core.c:626
>  __ip_local_out+0x21d/0x260 net/ipv4/ip_output.c:118
>  ip_local_out+0x26/0x1e0 net/ipv4/ip_output.c:127
>  nf_send_reset+0x58c/0x700 net/ipv4/netfilter/nf_reject_ipv4.c:308
>  nft_reject_ipv4_eval+0x53/0x90 net/ipv4/netfilter/nft_reject_ipv4.c:30
>  [..]
>
> syzkaller did something like this:
> table inet filter {
>   chain input {
>     type filter hook input priority filter; policy accept;
>     meta nftrace set 1                  # calls skb_get_hash
>     tcp dport 42 reject with tcp reset  # emits skb with NULL skb dev/sk
>   }
>   chain output {
>     type filter hook output priority filter; policy accept;
>     # empty chain is enough
>   }
> }
>
> ... then sends a tcp packet to port 42.
>
> Initial attempt to simply set skb->dev from nf_reject_ipv4 doesn't cover
> all cases: skbs generated via ipv4 igmp_send_report trigger similar splat.
>
> Moreover, Pablo Neira found that nft_hash.c uses __skb_get_hash_symmetric()
> which would trigger same warn splat for such skbs.
>
> Lets allow callers to pass the current netns explicitly.
> The nf_trace infrastructure is adjusted to use the new helper.
>
> __skb_get_hash_symmetric is handled in the next patch.
>
> Reported-by: Christoph Paasch <cpaasch@apple.com>
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/494
> Signed-off-by: Florian Westphal <fw@strlen.de>

Nice, I had an internal syzbot report about the same issue.

Reviewed-by: Eric Dumazet <edumazet@google.com>
Hi Florian,
kernel test robot noticed the following build warnings:
[auto build test WARNING on net-next/main]
url: https://github.com/intel-lab-lkp/linux/commits/Florian-Westphal/net-add-and-use-skb_get_hash_net/20240607-163738
base: net-next/main
patch link: https://lore.kernel.org/r/20240607083205.3000-2-fw%40strlen.de
patch subject: [PATCH net-next 1/2] net: add and use skb_get_hash_net
config: openrisc-defconfig (https://download.01.org/0day-ci/archive/20240607/202406072022.OkRGOAuS-lkp@intel.com/config)
compiler: or1k-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240607/202406072022.OkRGOAuS-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202406072022.OkRGOAuS-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> net/core/flow_dissector.c:1872: warning: Function parameter or struct member 'net' not described in '__skb_get_hash_net'
vim +1872 net/core/flow_dissector.c
eb70db8756717b David S. Miller 2016-07-01 1861
d4fd32757176d1 Jiri Pirko 2015-05-12 1862 /**
11b45a5b56dab6 Florian Westphal 2024-06-07 1863 * __skb_get_hash_net: calculate a flow hash
d4fd32757176d1 Jiri Pirko 2015-05-12 1864 * @skb: sk_buff to calculate flow hash from
d4fd32757176d1 Jiri Pirko 2015-05-12 1865 *
d4fd32757176d1 Jiri Pirko 2015-05-12 1866 * This function calculates a flow hash based on src/dst addresses
61b905da33ae25 Tom Herbert 2014-03-24 1867 * and src/dst port numbers. Sets hash in skb to non-zero hash value
61b905da33ae25 Tom Herbert 2014-03-24 1868 * on success, zero indicates no valid hash. Also, sets l4_hash in skb
441d9d327f1e77 Cong Wang 2013-01-21 1869 * if hash is a canonical 4-tuple hash over transport ports.
441d9d327f1e77 Cong Wang 2013-01-21 1870 */
11b45a5b56dab6 Florian Westphal 2024-06-07 1871 void __skb_get_hash_net(const struct net *net, struct sk_buff *skb)
441d9d327f1e77 Cong Wang 2013-01-21 @1872 {
441d9d327f1e77 Cong Wang 2013-01-21 1873 struct flow_keys keys;
635c223cfa05af Gao Feng 2016-08-31 1874 u32 hash;
441d9d327f1e77 Cong Wang 2013-01-21 1875
11b45a5b56dab6 Florian Westphal 2024-06-07 1876 memset(&keys, 0, sizeof(keys));
11b45a5b56dab6 Florian Westphal 2024-06-07 1877
11b45a5b56dab6 Florian Westphal 2024-06-07 1878 __skb_flow_dissect(net, skb, &flow_keys_dissector,
11b45a5b56dab6 Florian Westphal 2024-06-07 1879 &keys, NULL, 0, 0, 0,
11b45a5b56dab6 Florian Westphal 2024-06-07 1880 FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL);
11b45a5b56dab6 Florian Westphal 2024-06-07 1881
50fb799289501c Tom Herbert 2015-05-01 1882 __flow_hash_secret_init();
50fb799289501c Tom Herbert 2015-05-01 1883
11b45a5b56dab6 Florian Westphal 2024-06-07 1884 hash = __flow_hash_from_keys(&keys, &hashrnd);
635c223cfa05af Gao Feng 2016-08-31 1885
635c223cfa05af Gao Feng 2016-08-31 1886 __skb_set_sw_hash(skb, hash, flow_keys_have_l4(&keys));
441d9d327f1e77 Cong Wang 2013-01-21 1887 }
11b45a5b56dab6 Florian Westphal 2024-06-07 1888 EXPORT_SYMBOL(__skb_get_hash_net);
441d9d327f1e77 Cong Wang 2013-01-21 1889
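The W=1 warning above is raised by kernel-doc because the comment block documents only @skb while the function now also takes a net argument. One possible way to silence it is sketched below. The wording of the @net line is an assumption for illustration and is not taken from this thread; the forward declarations are only there so the fragment stands alone.

```c
struct net;
struct sk_buff;

/**
 * __skb_get_hash_net: calculate a flow hash
 * @net: associated network namespace, derived from @skb if NULL
 *       (assumed wording, not part of the posted patch)
 * @skb: sk_buff to calculate flow hash from
 *
 * This function calculates a flow hash based on src/dst addresses
 * and src/dst port numbers. Sets hash in skb to non-zero hash value
 * on success, zero indicates no valid hash. Also, sets l4_hash in skb
 * if hash is a canonical 4-tuple hash over transport ports.
 */
void __skb_get_hash_net(const struct net *net, struct sk_buff *skb);
```

Every parameter in the prototype needs a matching @name line in the kernel-doc block, so any later signature change would need the same treatment.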
Eric Dumazet wrote:
> On Fri, Jun 7, 2024 at 10:36 AM Florian Westphal <fw@strlen.de> wrote:
> >
> > Years ago flow dissector gained ability to delegate flow dissection
> > to a bpf program, scoped per netns.
> >
> > Unfortunately, skb_get_hash() only gets an sk_buff argument instead
> > of both net+skb. This means the flow dissector needs to obtain the
> > netns pointer from somewhere else.
> >
> > The netns is derived from skb->dev, and if that is not available, from
> > skb->sk. If neither is set, we hit a (benign) WARN_ON_ONCE().
> >
> > Trying both dev and sk covers most cases, but not all, as recently
> > reported by Christoph Paasch.
> >
> > In case of nf-generated tcp reset, both sk and dev are NULL:
> >
> > WARNING: .. net/core/flow_dissector.c:1104
> >  skb_flow_dissect_flow_keys include/linux/skbuff.h:1536 [inline]
> >  skb_get_hash include/linux/skbuff.h:1578 [inline]
> >  nft_trace_init+0x7d/0x120 net/netfilter/nf_tables_trace.c:320
> >  nft_do_chain+0xb26/0xb90 net/netfilter/nf_tables_core.c:268
> >  nft_do_chain_ipv4+0x7a/0xa0 net/netfilter/nft_chain_filter.c:23
> >  nf_hook_slow+0x57/0x160 net/netfilter/core.c:626
> >  __ip_local_out+0x21d/0x260 net/ipv4/ip_output.c:118
> >  ip_local_out+0x26/0x1e0 net/ipv4/ip_output.c:127
> >  nf_send_reset+0x58c/0x700 net/ipv4/netfilter/nf_reject_ipv4.c:308
> >  nft_reject_ipv4_eval+0x53/0x90 net/ipv4/netfilter/nft_reject_ipv4.c:30
> >  [..]
> >
> > syzkaller did something like this:
> > table inet filter {
> >   chain input {
> >     type filter hook input priority filter; policy accept;
> >     meta nftrace set 1                  # calls skb_get_hash
> >     tcp dport 42 reject with tcp reset  # emits skb with NULL skb dev/sk
> >   }
> >   chain output {
> >     type filter hook output priority filter; policy accept;
> >     # empty chain is enough
> >   }
> > }
> >
> > ... then sends a tcp packet to port 42.
> >
> > Initial attempt to simply set skb->dev from nf_reject_ipv4 doesn't cover
> > all cases: skbs generated via ipv4 igmp_send_report trigger similar splat.

Does this mean we have more non-nf callsites to convert?

> >
> > Moreover, Pablo Neira found that nft_hash.c uses __skb_get_hash_symmetric()
> > which would trigger same warn splat for such skbs.
> >
> > Lets allow callers to pass the current netns explicitly.
> > The nf_trace infrastructure is adjusted to use the new helper.
> >
> > __skb_get_hash_symmetric is handled in the next patch.
> >
> > Reported-by: Christoph Paasch <cpaasch@apple.com>
> > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/494
> > Signed-off-by: Florian Westphal <fw@strlen.de>
>
> Nice, I had an internal syzbot report about the same issue.
>
> Reviewed-by: Eric Dumazet <edumazet@google.com>

Subject to the documentation warning from the bot

Reviewed-by: Willem de Bruijn <willemb@google.com>

Thanks for fixing this, Florian.
Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> > > syzkaller did something like this:
> > > table inet filter {
> > >   chain input {
> > >     type filter hook input priority filter; policy accept;
> > >     meta nftrace set 1                  # calls skb_get_hash
> > >     tcp dport 42 reject with tcp reset  # emits skb with NULL skb dev/sk
> > >   }
> > >   chain output {
> > >     type filter hook output priority filter; policy accept;
> > >     # empty chain is enough
> > >   }
> > > }
> > >
> > > ... then sends a tcp packet to port 42.
> > >
> > > Initial attempt to simply set skb->dev from nf_reject_ipv4 doesn't cover
> > > all cases: skbs generated via ipv4 igmp_send_report trigger similar splat.
>
> Does this mean we have more non-nf callsites to convert?

There might be non-nf call sites that need skb_get_hash_net(), but I
don't know of any.

The above comment was meant to say that I tried to patch this
outside of flow dissector by setting skb->dev properly in nf_reject, but
that still triggers a slightly different WARN trace, this time due to
igmp_send_report also sending skb without dev+sk pointers.
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index fe7d8dbef77e..6e78019f899a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1498,7 +1498,7 @@ __skb_set_sw_hash(struct sk_buff *skb, __u32 hash, bool is_l4)
 	__skb_set_hash(skb, hash, true, is_l4);
 }
 
-void __skb_get_hash(struct sk_buff *skb);
+void __skb_get_hash_net(const struct net *net, struct sk_buff *skb);
 u32 __skb_get_hash_symmetric(const struct sk_buff *skb);
 u32 skb_get_poff(const struct sk_buff *skb);
 u32 __skb_get_poff(const struct sk_buff *skb, const void *data,
@@ -1578,10 +1578,18 @@ void skb_flow_dissect_hash(const struct sk_buff *skb,
 			   struct flow_dissector *flow_dissector,
 			   void *target_container);
 
+static inline __u32 skb_get_hash_net(const struct net *net, struct sk_buff *skb)
+{
+	if (!skb->l4_hash && !skb->sw_hash)
+		__skb_get_hash_net(net, skb);
+
+	return skb->hash;
+}
+
 static inline __u32 skb_get_hash(struct sk_buff *skb)
 {
 	if (!skb->l4_hash && !skb->sw_hash)
-		__skb_get_hash(skb);
+		__skb_get_hash_net(NULL, skb);
 
 	return skb->hash;
 }
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 59fe46077b3c..32454181be60 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -1860,7 +1860,7 @@ u32 __skb_get_hash_symmetric(const struct sk_buff *skb)
 EXPORT_SYMBOL_GPL(__skb_get_hash_symmetric);
 
 /**
- * __skb_get_hash: calculate a flow hash
+ * __skb_get_hash_net: calculate a flow hash
  * @skb: sk_buff to calculate flow hash from
  *
  * This function calculates a flow hash based on src/dst addresses
@@ -1868,18 +1868,24 @@ EXPORT_SYMBOL_GPL(__skb_get_hash_symmetric);
  * on success, zero indicates no valid hash. Also, sets l4_hash in skb
  * if hash is a canonical 4-tuple hash over transport ports.
  */
-void __skb_get_hash(struct sk_buff *skb)
+void __skb_get_hash_net(const struct net *net, struct sk_buff *skb)
 {
 	struct flow_keys keys;
 	u32 hash;
 
+	memset(&keys, 0, sizeof(keys));
+
+	__skb_flow_dissect(net, skb, &flow_keys_dissector,
+			   &keys, NULL, 0, 0, 0,
+			   FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL);
+
 	__flow_hash_secret_init();
 
-	hash = ___skb_get_hash(skb, &keys, &hashrnd);
+	hash = __flow_hash_from_keys(&keys, &hashrnd);
 
 	__skb_set_sw_hash(skb, hash, flow_keys_have_l4(&keys));
 }
-EXPORT_SYMBOL(__skb_get_hash);
+EXPORT_SYMBOL(__skb_get_hash_net);
 
 __u32 skb_get_hash_perturb(const struct sk_buff *skb,
 			   const siphash_key_t *perturb)
diff --git a/net/netfilter/nf_tables_trace.c b/net/netfilter/nf_tables_trace.c
index a83637e3f455..580c55268f65 100644
--- a/net/netfilter/nf_tables_trace.c
+++ b/net/netfilter/nf_tables_trace.c
@@ -317,7 +317,7 @@ void nft_trace_init(struct nft_traceinfo *info, const struct nft_pktinfo *pkt,
 	net_get_random_once(&trace_key, sizeof(trace_key));
 
 	info->skbid = (u32)siphash_3u32(hash32_ptr(skb),
-					skb_get_hash(skb),
+					skb_get_hash_net(nft_net(pkt), skb),
 					skb->skb_iif,
 					&trace_key);
 }
Years ago flow dissector gained ability to delegate flow dissection
to a bpf program, scoped per netns.

Unfortunately, skb_get_hash() only gets an sk_buff argument instead
of both net+skb. This means the flow dissector needs to obtain the
netns pointer from somewhere else.

The netns is derived from skb->dev, and if that is not available, from
skb->sk. If neither is set, we hit a (benign) WARN_ON_ONCE().

Trying both dev and sk covers most cases, but not all, as recently
reported by Christoph Paasch.

In case of nf-generated tcp reset, both sk and dev are NULL:

WARNING: .. net/core/flow_dissector.c:1104
 skb_flow_dissect_flow_keys include/linux/skbuff.h:1536 [inline]
 skb_get_hash include/linux/skbuff.h:1578 [inline]
 nft_trace_init+0x7d/0x120 net/netfilter/nf_tables_trace.c:320
 nft_do_chain+0xb26/0xb90 net/netfilter/nf_tables_core.c:268
 nft_do_chain_ipv4+0x7a/0xa0 net/netfilter/nft_chain_filter.c:23
 nf_hook_slow+0x57/0x160 net/netfilter/core.c:626
 __ip_local_out+0x21d/0x260 net/ipv4/ip_output.c:118
 ip_local_out+0x26/0x1e0 net/ipv4/ip_output.c:127
 nf_send_reset+0x58c/0x700 net/ipv4/netfilter/nf_reject_ipv4.c:308
 nft_reject_ipv4_eval+0x53/0x90 net/ipv4/netfilter/nft_reject_ipv4.c:30
 [..]

syzkaller did something like this:
table inet filter {
  chain input {
    type filter hook input priority filter; policy accept;
    meta nftrace set 1                  # calls skb_get_hash
    tcp dport 42 reject with tcp reset  # emits skb with NULL skb dev/sk
  }
  chain output {
    type filter hook output priority filter; policy accept;
    # empty chain is enough
  }
}

... then sends a tcp packet to port 42.

Initial attempt to simply set skb->dev from nf_reject_ipv4 doesn't cover
all cases: skbs generated via ipv4 igmp_send_report trigger similar splat.

Moreover, Pablo Neira found that nft_hash.c uses __skb_get_hash_symmetric()
which would trigger same warn splat for such skbs.

Lets allow callers to pass the current netns explicitly.
The nf_trace infrastructure is adjusted to use the new helper.

__skb_get_hash_symmetric is handled in the next patch.

Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/494
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/linux/skbuff.h          | 12 ++++++++++--
 net/core/flow_dissector.c       | 14 ++++++++++----
 net/netfilter/nf_tables_trace.c |  2 +-
 3 files changed, 21 insertions(+), 7 deletions(-)
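The netns lookup order described in the commit message (explicit net first, then skb->dev, then skb->sk, otherwise a warning) can be illustrated with a small userspace model. This is only a sketch and not kernel code: the struct layouts, the field names nd_net and sk_net, the function model_resolve_net() and the warned flag standing in for WARN_ON_ONCE() are all hypothetical simplifications.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct net { int id; };
struct net_device { const struct net *nd_net; };
struct sock { const struct net *sk_net; };
struct sk_buff {
	const struct net_device *dev;
	const struct sock *sk;
};

/*
 * Model of the lookup order: prefer the caller-supplied net, then
 * fall back to skb->dev, then to skb->sk. *warned is set when no
 * netns could be found, standing in for WARN_ON_ONCE().
 */
const struct net *model_resolve_net(const struct net *net,
				    const struct sk_buff *skb,
				    bool *warned)
{
	assert(skb != NULL);

	if (!net) {
		if (skb->dev)
			net = skb->dev->nd_net;
		else if (skb->sk)
			net = skb->sk->sk_net;
	}

	*warned = (net == NULL);
	return net;
}
```

In this model an skb with neither dev nor sk, as in the nf-generated tcp reset case, sets the warning flag when the caller passes NULL (the skb_get_hash() path), and resolves cleanly when the caller passes its own netns (the skb_get_hash_net() path used by nft_trace_init).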