diff mbox series

[iproute2-next,v7] ip-link: add support for nolocalbypass in vxlan

Message ID 20230604140051.4523-1-vladimir@nikishkin.pw (mailing list archive)
State Superseded
Delegated to: David Ahern
Series [iproute2-next,v7] ip-link: add support for nolocalbypass in vxlan

Checks

Context                 Check     Description
netdev/tree_selection   success   Not a local patch

Commit Message

Vladimir Nikishkin June 4, 2023, 2 p.m. UTC
Add userspace support for the [no]localbypass vxlan netlink
attribute. With localbypass on (default), the vxlan driver processes
the packets destined to the local machine by itself, bypassing the
userspace network stack. With nolocalbypass the packets are always
forwarded to the userspace network stack, so userspace programs,
such as tcpdump, have a chance to process them.

Signed-off-by: Vladimir Nikishkin <vladimir@nikishkin.pw>
---
v6=>v7:
Use the new vxlan_opts data structure. Rely on the printing loop
in vxlan_print_opt when printing the value of [no] localbypass.

 ip/iplink_vxlan.c     | 10 ++++++++++
 man/man8/ip-link.8.in | 10 ++++++++++
 2 files changed, 20 insertions(+)

Comments

Ido Schimmel June 5, 2023, 6:36 a.m. UTC | #1
On Sun, Jun 04, 2023 at 10:00:51PM +0800, Vladimir Nikishkin wrote:
> Add userspace support for the [no]localbypass vxlan netlink
> attribute. With localbypass on (default), the vxlan driver processes
> the packets destined to the local machine by itself, bypassing the
> userspace network stack. With nolocalbypass the packets are always
> forwarded to the userspace network stack, so userspace programs,
> such as tcpdump, have a chance to process them.
> 
> Signed-off-by: Vladimir Nikishkin <vladimir@nikishkin.pw>
> ---
> v6=>v7:
> Use the new vxlan_opts data structure. Rely on the printing loop
> in vxlan_print_opt when printing the value of [no] localbypass.

Stephen's changes are still not present in the next branch so this patch
does not apply
Vladimir Nikishkin June 5, 2023, 6:47 a.m. UTC | #2
Ido Schimmel <idosch@idosch.org> writes:

> On Sun, Jun 04, 2023 at 10:00:51PM +0800, Vladimir Nikishkin wrote:
>> Add userspace support for the [no]localbypass vxlan netlink
>> attribute. With localbypass on (default), the vxlan driver processes
>> the packets destined to the local machine by itself, bypassing the
>> userspace network stack. With nolocalbypass the packets are always
>> forwarded to the userspace network stack, so userspace programs,
>> such as tcpdump, have a chance to process them.
>> 
>> Signed-off-by: Vladimir Nikishkin <vladimir@nikishkin.pw>
>> ---
>> v6=>v7:
>> Use the new vxlan_opts data structure. Rely on the printing loop
>> in vxlan_print_opt when printing the value of [no] localbypass.
>
> Stephen's changes are still not present in the next branch so this patch
> does not apply

Sorry for the confusion, I thought that the tree to develop against is
git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git

Apologies.
Ido Schimmel June 5, 2023, 8:26 a.m. UTC | #3
On Mon, Jun 05, 2023 at 02:47:12PM +0800, Vladimir Nikishkin wrote:
> 
> Ido Schimmel <idosch@idosch.org> writes:
> 
> > On Sun, Jun 04, 2023 at 10:00:51PM +0800, Vladimir Nikishkin wrote:
> >> Add userspace support for the [no]localbypass vxlan netlink
> >> attribute. With localbypass on (default), the vxlan driver processes
> >> the packets destined to the local machine by itself, bypassing the
> >> userspace network stack. With nolocalbypass the packets are always
> >> forwarded to the userspace network stack, so userspace programs,
> >> such as tcpdump, have a chance to process them.
> >> 
> >> Signed-off-by: Vladimir Nikishkin <vladimir@nikishkin.pw>
> >> ---
> >> v6=>v7:
> >> Use the new vxlan_opts data structure. Rely on the printing loop
> >> in vxlan_print_opt when printing the value of [no] localbypass.
> >
> > Stephen's changes are still not present in the next branch so this patch
> > does not apply
> 
> Sorry for the confusion, I thought that the tree to develop against is
> git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git

iproute2-next is developed at
git://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git

See the README file.

Anyway, patch looks fine, but indentation is a bit off. Please fold this
in:

diff --git a/ip/iplink_vxlan.c b/ip/iplink_vxlan.c
index 70f38a866c3b..7781d60bbb52 100644
--- a/ip/iplink_vxlan.c
+++ b/ip/iplink_vxlan.c
@@ -36,7 +36,7 @@ static const struct vxlan_bool_opt {
        { "udp_zero_csum6_rx", IFLA_VXLAN_UDP_ZERO_CSUM6_RX, false },
        { "remcsum_tx", IFLA_VXLAN_REMCSUM_TX,          false },
        { "remcsum_rx", IFLA_VXLAN_REMCSUM_RX,          false },
-       { "localbypass", IFLA_VXLAN_LOCALBYPASS,                true },
+       { "localbypass", IFLA_VXLAN_LOCALBYPASS,        true },
 };

And the kernel selftest would need to be modified to use the JSON output
(it fails with this version). Something like this:

diff --git a/tools/testing/selftests/net/test_vxlan_nolocalbypass.sh b/tools/testing/selftests/net/test_vxlan_nolocalbypass.sh
index 46067db53068..f75212bf142c 100755
--- a/tools/testing/selftests/net/test_vxlan_nolocalbypass.sh
+++ b/tools/testing/selftests/net/test_vxlan_nolocalbypass.sh
@@ -130,7 +130,7 @@ nolocalbypass()
        run_cmd "tc -n ns1 qdisc add dev lo clsact"
        run_cmd "tc -n ns1 filter add dev lo ingress pref 1 handle 101 proto ip flower ip_proto udp dst_port 4790 action drop"
 
-       run_cmd "ip -n ns1 -d link show dev vx0 | grep ' localbypass'"
+       run_cmd "ip -n ns1 -d -j link show dev vx0 | jq -e '.[][\"linkinfo\"][\"info_data\"][\"localbypass\"] == true'"
        log_test $? 0 "localbypass enabled"
 
        run_cmd "ip netns exec ns1 mausezahn vx0 -a $smac -b $dmac -c 1 -p 100 -q"
@@ -140,7 +140,7 @@ nolocalbypass()
 
        run_cmd "ip -n ns1 link set dev vx0 type vxlan nolocalbypass"
 
-       run_cmd "ip -n ns1 -d link show dev vx0 | grep 'nolocalbypass'"
+       run_cmd "ip -n ns1 -d -j link show dev vx0 | jq -e '.[][\"linkinfo\"][\"info_data\"][\"localbypass\"] == false'"
        log_test $? 0 "localbypass disabled"
 
        run_cmd "ip netns exec ns1 mausezahn vx0 -a $smac -b $dmac -c 1 -p 100 -q"
@@ -150,7 +150,7 @@ nolocalbypass()
 
        run_cmd "ip -n ns1 link set dev vx0 type vxlan localbypass"
 
-       run_cmd "ip -n ns1 -d link show dev vx0 | grep ' localbypass'"
+       run_cmd "ip -n ns1 -d -j link show dev vx0 | jq -e '.[][\"linkinfo\"][\"info_data\"][\"localbypass\"] == true'"
        log_test $? 0 "localbypass enabled"
 
        run_cmd "ip netns exec ns1 mausezahn vx0 -a $smac -b $dmac -c 1 -p 100 -q"

Please submit it after the iproute2 changes are accepted.

Thanks
Stephen Hemminger June 5, 2023, 3:02 p.m. UTC | #4
On Mon, 5 Jun 2023 11:26:56 +0300
Ido Schimmel <idosch@idosch.org> wrote:

> On Mon, Jun 05, 2023 at 02:47:12PM +0800, Vladimir Nikishkin wrote:
> > 
> > Ido Schimmel <idosch@idosch.org> writes:
> >   
> > > On Sun, Jun 04, 2023 at 10:00:51PM +0800, Vladimir Nikishkin wrote:  
> > >> Add userspace support for the [no]localbypass vxlan netlink
> > >> attribute. With localbypass on (default), the vxlan driver processes
> > >> the packets destined to the local machine by itself, bypassing the
> > >> userspace network stack. With nolocalbypass the packets are always
> > >> forwarded to the userspace network stack, so userspace programs,
> > >> such as tcpdump, have a chance to process them.
> > >> 
> > >> Signed-off-by: Vladimir Nikishkin <vladimir@nikishkin.pw>
> > >> ---  
> > >> v6=>v7:  
> > >> Use the new vxlan_opts data structure. Rely on the printing loop
> > >> in vxlan_print_opt when printing the value of [no] localbypass.  
> > >
> > > Stephen's changes are still not present in the next branch so this patch
> > > does not apply  
> > 
> > Sorry for the confusion, I thought that the tree to develop against is
> > git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git  
> 
> iproute2-next is developed at
> git://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git
> 
> See the README file.
> 
> Anyway, patch looks fine, but indentation is a bit off. Please fold this
> in:

David will do a merge from main to next if asked.
Ido Schimmel June 5, 2023, 3:14 p.m. UTC | #5
On Mon, Jun 05, 2023 at 08:02:17AM -0700, Stephen Hemminger wrote:
> David will do a merge from main to next if asked.

I see he is not even copied. Let me add him.

David, can you please merge main into next? We need commit 1215e9d38623
("vxlan: make option printing more consistent") to make progress with
the "localbypass" patch.

Thanks
David Ahern June 5, 2023, 3:19 p.m. UTC | #6
On 6/5/23 9:14 AM, Ido Schimmel wrote:
> On Mon, Jun 05, 2023 at 08:02:17AM -0700, Stephen Hemminger wrote:
>> David will do a merge from main to next if asked.
> 
> I see he is not even copied. Let me add him.

Thanks.

> 
> David, can you please merge main into next? We need commit 1215e9d38623
> ("vxlan: make option printing more consistent") to make progress with
> the "localbypass" patch.
> 
> Thanks

Done.

Patch

diff --git a/ip/iplink_vxlan.c b/ip/iplink_vxlan.c
index 3053cdb8..70f38a86 100644
--- a/ip/iplink_vxlan.c
+++ b/ip/iplink_vxlan.c
@@ -36,6 +36,7 @@  static const struct vxlan_bool_opt {
 	{ "udp_zero_csum6_rx", IFLA_VXLAN_UDP_ZERO_CSUM6_RX, false },
 	{ "remcsum_tx", IFLA_VXLAN_REMCSUM_TX,		false },
 	{ "remcsum_rx", IFLA_VXLAN_REMCSUM_RX,		false },
+	{ "localbypass", IFLA_VXLAN_LOCALBYPASS,		true },
 };
 
 static void print_explain(FILE *f)
@@ -62,6 +63,7 @@  static void print_explain(FILE *f)
 		"		[ [no]udp6zerocsumtx ]\n"
 		"		[ [no]udp6zerocsumrx ]\n"
 		"		[ [no]remcsumtx ] [ [no]remcsumrx ]\n"
+		"		[ [no]localbypass ]\n"
 		"		[ [no]external ] [ gbp ] [ gpe ]\n"
 		"		[ [no]vnifilter ]\n"
 		"\n"
@@ -327,6 +329,14 @@  static int vxlan_parse_opt(struct link_util *lu, int argc, char **argv,
 			check_duparg(&attrs, IFLA_VXLAN_REMCSUM_RX,
 				     *argv, *argv);
 			addattr8(n, 1024, IFLA_VXLAN_REMCSUM_RX, 0);
+		} else if (strcmp(*argv, "localbypass") == 0) {
+			check_duparg(&attrs, IFLA_VXLAN_LOCALBYPASS,
+				     *argv, *argv);
+			addattr8(n, 1024, IFLA_VXLAN_LOCALBYPASS, 1);
+		} else if (strcmp(*argv, "nolocalbypass") == 0) {
+			check_duparg(&attrs, IFLA_VXLAN_LOCALBYPASS,
+				     *argv, *argv);
+			addattr8(n, 1024, IFLA_VXLAN_LOCALBYPASS, 0);
 		} else if (!matches(*argv, "external")) {
 			check_duparg(&attrs, IFLA_VXLAN_COLLECT_METADATA,
 				     *argv, *argv);
diff --git a/man/man8/ip-link.8.in b/man/man8/ip-link.8.in
index bf3605a9..6a82ddc4 100644
--- a/man/man8/ip-link.8.in
+++ b/man/man8/ip-link.8.in
@@ -634,6 +634,8 @@  the following additional arguments are supported:
 ] [
 .RB [ no ] udp6zerocsumrx
 ] [
+.RB [ no ] localbypass
+] [
 .BI ageing " SECONDS "
 ] [
 .BI maxaddress " NUMBER "
@@ -742,6 +744,14 @@  are entered into the VXLAN device forwarding database.
 .RB [ no ] udp6zerocsumrx
 - allow incoming UDP packets over IPv6 with zero checksum field.
 
+.sp
+.RB [ no ] localbypass
+- if FDB destination is local, with nolocalbypass set, forward encapsulated
+packets to the userspace network stack. If there is a userspace process
+listening for these packets, it will have a chance to process them. If
+localbypass is active (default), bypass the kernel network stack and
+inject the packets into the target VXLAN device, assuming one exists.
+
 .sp
 .BI ageing " SECONDS"
 - specifies the lifetime in seconds of FDB entries learnt by the kernel.