Message ID | 20240418104737.77914-5-donald.hunter@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | bf2ac490d28c21a349e9eef81edc45320fca4a3c |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | netlink: Add nftables spec w/ multi messages | expand |
Hi Donald, Apologies for a bit late jumping back on this, it took me a while. On Thu, Apr 18, 2024 at 11:47:37AM +0100, Donald Hunter wrote: > The NLM_F_ACK flag is ignored for nfnetlink batch begin and end > messages. This is a problem for ynl which wants to receive an ack for > every message it sends, not just the commands in between the begin/end > messages. Just a side note: Turning on NLM_F_ACK for every message fills up the receiver buffer very quickly, leading to ENOBUFS. Netlink, in general, supports batching (with non-atomic semantics), I did not look at ynl in detail, if it does send() + recv() for each message for other subsystem then fine, but if it uses batching to amortize the cost of the syscall then this explicit ACK could be an issue with very large batches. Out of curiosity: Why does the tool need an explicit ack for each command? As mentioned above, this consumes a lot netlink bandwidth. > Add processing for ACKs for begin/end messages and provide responses > when requested. > > I have checked that iproute2, pyroute2 and systemd are unaffected by > this change since none of them use NLM_F_ACK for batch begin/end. nitpick: Quick grep shows me iproute2 does not use the nfnetlink subsystem? If I am correct, maybe remove this. Thanks! One more comment below. > Signed-off-by: Donald Hunter <donald.hunter@gmail.com> > --- > net/netfilter/nfnetlink.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c > index c9fbe0f707b5..4abf660c7baf 100644 > --- a/net/netfilter/nfnetlink.c > +++ b/net/netfilter/nfnetlink.c > @@ -427,6 +427,9 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh, > > nfnl_unlock(subsys_id); > > + if (nlh->nlmsg_flags & NLM_F_ACK) > + nfnl_err_add(&err_list, nlh, 0, &extack); > + > while (skb->len >= nlmsg_total_size(0)) { > int msglen, type; > > @@ -573,6 +576,8 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh, > } else if (err) { > ss->abort(net, oskb, NFNL_ABORT_NONE); > netlink_ack(oskb, nlmsg_hdr(oskb), err, NULL); > + } else if (nlh->nlmsg_flags & NLM_F_ACK) { > + nfnl_err_add(&err_list, nlh, 0, &extack); > } > } else { > enum nfnl_abort_action abort_action; > -- > 2.44.0 >
On Thu, 18 Apr 2024 18:30:55 +0200 Pablo Neira Ayuso wrote: > Out of curiosity: Why does the tool need an explicit ack for each > command? As mentioned above, this consumes a lot netlink bandwidth. I think that the tool is sort of besides the point, it's just a PoC. The point is that we're trying to describe netlink protocols in machine readable fashion. Which in turn makes it possible to write netlink binding generators in any language, like modern RPC frameworks. For that to work we need protocol basics to be followed. That's not to say that we're going to force all netlink families to change to follow extra new rules. Just those that want to be accessed via the bindings.
Pablo Neira Ayuso <pablo@netfilter.org> writes: > Hi Donald, > > Apologies for a bit late jumping back on this, it took me a while. > > On Thu, Apr 18, 2024 at 11:47:37AM +0100, Donald Hunter wrote: >> The NLM_F_ACK flag is ignored for nfnetlink batch begin and end >> messages. This is a problem for ynl which wants to receive an ack for >> every message it sends, not just the commands in between the begin/end >> messages. > > Just a side note: Turning on NLM_F_ACK for every message fills up the > receiver buffer very quickly, leading to ENOBUFS. Netlink, in general, > supports batching (with non-atomic semantics), I did not look at ynl > in detail, if it does send() + recv() for each message for other > subsystem then fine, but if it uses batching to amortize the cost of > the syscall then this explicit ACK could be an issue with very large > batches. ynl is batching the messages for send() and will accept batches from recv() but nfnetlink is sending each ack separately. It is using netlink_ack() which uses a new skb for each message, for example: sudo strace -e sendto,recvfrom ./tools/net/ynl/cli.py \ --spec Documentation/netlink/specs/nftables.yaml \ --multi batch-begin '{"res-id": 10}' \ --multi newtable '{"name": "test", "nfgen-family": 1}' \ --multi newchain '{"name": "chain", "table": "test", "nfgen-family": 1}' \ --multi batch-end '{"res-id": 10}' sendto(3, [[{nlmsg_len=20, nlmsg_type=0x10 /* NLMSG_??? */, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28254, nlmsg_pid=0}, "\x00\x00\x00\x0a"], [{nlmsg_len=32, nlmsg_type=0xa00 /* NLMSG_??? */, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28255, nlmsg_pid=0}, "\x01\x00\x00\x00\x09\x00\x01\x00\x74\x65\x73\x74\x00\x00\x00\x00"], [{nlmsg_len=44, nlmsg_type=0xa03 /* NLMSG_??? */, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28256, nlmsg_pid=0}, "\x01\x00\x00\x00\x0a\x00\x03\x00\x63\x68\x61\x69\x6e\x00\x00\x00\x09\x00\x01\x00\x74\x65\x73\x74\x00\x00\x00\x00"], [{nlmsg_len=20, nlmsg_type=0x11 /* NLMSG_??? */, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28257, nlmsg_pid=0}, "\x00\x00\x00\x0a"]], 116, 0, NULL, 0) = 116 recvfrom(3, [{nlmsg_len=36, nlmsg_type=NLMSG_ERROR, nlmsg_flags=NLM_F_CAPPED, nlmsg_seq=28254, nlmsg_pid=997}, {error=0, msg={nlmsg_len=20, nlmsg_type=NFNL_MSG_BATCH_BEGIN, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28254, nlmsg_pid=0}}], 131072, 0, NULL, NULL) = 36 recvfrom(3, [{nlmsg_len=36, nlmsg_type=NLMSG_ERROR, nlmsg_flags=NLM_F_CAPPED, nlmsg_seq=28255, nlmsg_pid=997}, {error=0, msg={nlmsg_len=32, nlmsg_type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWTABLE, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28255, nlmsg_pid=0}}], 131072, 0, NULL, NULL) = 36 recvfrom(3, [{nlmsg_len=36, nlmsg_type=NLMSG_ERROR, nlmsg_flags=NLM_F_CAPPED, nlmsg_seq=28256, nlmsg_pid=997}, {error=0, msg={nlmsg_len=44, nlmsg_type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWCHAIN, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28256, nlmsg_pid=0}}], 131072, 0, NULL, NULL) = 36 recvfrom(3, [{nlmsg_len=36, nlmsg_type=NLMSG_ERROR, nlmsg_flags=NLM_F_CAPPED, nlmsg_seq=28257, nlmsg_pid=997}, {error=0, msg={nlmsg_len=20, nlmsg_type=NFNL_MSG_BATCH_END, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28257, nlmsg_pid=0}}], 131072, 0, NULL, NULL) = 36 > Out of curiosity: Why does the tool need an explicit ack for each > command? As mentioned above, this consumes a lot netlink bandwidth. For the ynl python library, I guess it was a design choice to request an ack for each command. Since the Netlink API allows a user to request acks, it seems necessary to be consistent about providing them. Otherwise we'd need to extend the netlink message specs to say which messages are ack capable and which are not. >> Add processing for ACKs for begin/end messages and provide responses >> when requested. >> >> I have checked that iproute2, pyroute2 and systemd are unaffected by >> this change since none of them use NLM_F_ACK for batch begin/end. > > nitpick: Quick grep shows me iproute2 does not use the nfnetlink > subsystem? If I am correct, maybe remove this. Yeah, my mistake. I did check iproute2 but didn't mean to add it to the list. For nft, NFNL_MSG_BATCH_* usage is contained in libnftnl from what I can see. I'll update the patch. > Thanks! > > One more comment below. Did you miss adding a comment? > >> Signed-off-by: Donald Hunter <donald.hunter@gmail.com> >> --- >> net/netfilter/nfnetlink.c | 5 +++++ >> 1 file changed, 5 insertions(+) >> >> diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c >> index c9fbe0f707b5..4abf660c7baf 100644 >> --- a/net/netfilter/nfnetlink.c >> +++ b/net/netfilter/nfnetlink.c >> @@ -427,6 +427,9 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh, >> >> nfnl_unlock(subsys_id); >> >> + if (nlh->nlmsg_flags & NLM_F_ACK) >> + nfnl_err_add(&err_list, nlh, 0, &extack); >> + >> while (skb->len >= nlmsg_total_size(0)) { >> int msglen, type; >> >> @@ -573,6 +576,8 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh, >> } else if (err) { >> ss->abort(net, oskb, NFNL_ABORT_NONE); >> netlink_ack(oskb, nlmsg_hdr(oskb), err, NULL); >> + } else if (nlh->nlmsg_flags & NLM_F_ACK) { >> + nfnl_err_add(&err_list, nlh, 0, &extack); >> } >> } else { >> enum nfnl_abort_action abort_action; >> -- >> 2.44.0 >>
On Thu, 18 Apr 2024 09:51:53 -0700 Jakub Kicinski wrote: > On Thu, 18 Apr 2024 18:30:55 +0200 Pablo Neira Ayuso wrote: > > Out of curiosity: Why does the tool need an explicit ack for each > > command? As mentioned above, this consumes a lot netlink bandwidth. > > I think that the tool is sort of besides the point, it's just a PoC. > The point is that we're trying to describe netlink protocols in machine > readable fashion. Which in turn makes it possible to write netlink > binding generators in any language, like modern RPC frameworks. > For that to work we need protocol basics to be followed. > > That's not to say that we're going to force all netlink families to > change to follow extra new rules. Just those that want to be accessed > via the bindings. Pablo, any thoughts? Convinced? Given this touches YNL in significant ways I'd prefer to merge it to net-next.
On Fri, Apr 19, 2024 at 12:20:25PM +0100, Donald Hunter wrote: > Pablo Neira Ayuso <pablo@netfilter.org> writes: > > > Hi Donald, > > > > Apologies for a bit late jumping back on this, it took me a while. > > > > On Thu, Apr 18, 2024 at 11:47:37AM +0100, Donald Hunter wrote: > >> The NLM_F_ACK flag is ignored for nfnetlink batch begin and end > >> messages. This is a problem for ynl which wants to receive an ack for > >> every message it sends, not just the commands in between the begin/end > >> messages. > > > > Just a side note: Turning on NLM_F_ACK for every message fills up the > > receiver buffer very quickly, leading to ENOBUFS. Netlink, in general, > > supports batching (with non-atomic semantics), I did not look at ynl > > in detail, if it does send() + recv() for each message for other > > subsystem then fine, but if it uses batching to amortize the cost of > > the syscall then this explicit ACK could be an issue with very large > > batches. > > ynl is batching the messages for send() and will accept batches from > recv() but nfnetlink is sending each ack separately. Yes, like it happens with other existing netlink interfaces when batching is used, nfnetlink is no different in such case. > It is using netlink_ack() which uses a new skb for each message, for > example: > > sudo strace -e sendto,recvfrom ./tools/net/ynl/cli.py \ > --spec Documentation/netlink/specs/nftables.yaml \ > --multi batch-begin '{"res-id": 10}' \ > --multi newtable '{"name": "test", "nfgen-family": 1}' \ > --multi newchain '{"name": "chain", "table": "test", "nfgen-family": 1}' \ > --multi batch-end '{"res-id": 10}' > sendto(3, [[{nlmsg_len=20, nlmsg_type=0x10 /* NLMSG_??? */, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28254, nlmsg_pid=0}, "\x00\x00\x00\x0a"], [{nlmsg_len=32, nlmsg_type=0xa00 /* NLMSG_??? */, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28255, nlmsg_pid=0}, "\x01\x00\x00\x00\x09\x00\x01\x00\x74\x65\x73\x74\x00\x00\x00\x00"], [{nlmsg_len=44, nlmsg_type=0xa03 /* NLMSG_??? */, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28256, nlmsg_pid=0}, "\x01\x00\x00\x00\x0a\x00\x03\x00\x63\x68\x61\x69\x6e\x00\x00\x00\x09\x00\x01\x00\x74\x65\x73\x74\x00\x00\x00\x00"], [{nlmsg_len=20, nlmsg_type=0x11 /* NLMSG_??? */, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28257, nlmsg_pid=0}, "\x00\x00\x00\x0a"]], 116, 0, NULL, 0) = 116 > recvfrom(3, [{nlmsg_len=36, nlmsg_type=NLMSG_ERROR, nlmsg_flags=NLM_F_CAPPED, nlmsg_seq=28254, nlmsg_pid=997}, {error=0, msg={nlmsg_len=20, nlmsg_type=NFNL_MSG_BATCH_BEGIN, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28254, nlmsg_pid=0}}], 131072, 0, NULL, NULL) = 36 > recvfrom(3, [{nlmsg_len=36, nlmsg_type=NLMSG_ERROR, nlmsg_flags=NLM_F_CAPPED, nlmsg_seq=28255, nlmsg_pid=997}, {error=0, msg={nlmsg_len=32, nlmsg_type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWTABLE, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28255, nlmsg_pid=0}}], 131072, 0, NULL, NULL) = 36 > recvfrom(3, [{nlmsg_len=36, nlmsg_type=NLMSG_ERROR, nlmsg_flags=NLM_F_CAPPED, nlmsg_seq=28256, nlmsg_pid=997}, {error=0, msg={nlmsg_len=44, nlmsg_type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWCHAIN, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28256, nlmsg_pid=0}}], 131072, 0, NULL, NULL) = 36 > recvfrom(3, [{nlmsg_len=36, nlmsg_type=NLMSG_ERROR, nlmsg_flags=NLM_F_CAPPED, nlmsg_seq=28257, nlmsg_pid=997}, {error=0, msg={nlmsg_len=20, nlmsg_type=NFNL_MSG_BATCH_END, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28257, nlmsg_pid=0}}], 131072, 0, NULL, NULL) = 36 > > > Out of curiosity: Why does the tool need an explicit ack for each > > command? As mentioned above, this consumes a lot netlink bandwidth. > > For the ynl python library, I guess it was a design choice to request an > ack for each command. > > Since the Netlink API allows a user to request acks, it seems necessary > to be consistent about providing them. Otherwise we'd need to extend the > netlink message specs to say which messages are ack capable and which > are not. I see. > >> Add processing for ACKs for begin/end messages and provide responses > >> when requested. > >> > >> I have checked that iproute2, pyroute2 and systemd are unaffected by > >> this change since none of them use NLM_F_ACK for batch begin/end. > > > > nitpick: Quick grep shows me iproute2 does not use the nfnetlink > > subsystem? If I am correct, maybe remove this. > > Yeah, my mistake. I did check iproute2 but didn't mean to add it to the > list. For nft, NFNL_MSG_BATCH_* usage is contained in libnftnl from what > I can see. I'll update the patch. > > > Thanks! > > > > One more comment below. > > Did you miss adding a comment? No more comments, thanks.
On Mon, Apr 22, 2024 at 01:33:28PM -0700, Jakub Kicinski wrote: > On Thu, 18 Apr 2024 09:51:53 -0700 Jakub Kicinski wrote: > > On Thu, 18 Apr 2024 18:30:55 +0200 Pablo Neira Ayuso wrote: > > > Out of curiosity: Why does the tool need an explicit ack for each > > > command? As mentioned above, this consumes a lot netlink bandwidth. > > > > I think that the tool is sort of besides the point, it's just a PoC. > > The point is that we're trying to describe netlink protocols in machine > > readable fashion. Which in turn makes it possible to write netlink > > binding generators in any language, like modern RPC frameworks. > > For that to work we need protocol basics to be followed. > > > > That's not to say that we're going to force all netlink families to > > change to follow extra new rules. Just those that want to be accessed > > via the bindings. > > Pablo, any thoughts? Convinced? Given this touches YNL in significant > ways I'd prefer to merge it to net-next. No objections, thanks for asking.
On 18. 04. 24, 12:47, Donald Hunter wrote: > The NLM_F_ACK flag is ignored for nfnetlink batch begin and end > messages. This is a problem for ynl which wants to receive an ack for > every message it sends, not just the commands in between the begin/end > messages. > > Add processing for ACKs for begin/end messages and provide responses > when requested. > > I have checked that iproute2, pyroute2 and systemd are unaffected by > this change since none of them use NLM_F_ACK for batch begin/end. > > Signed-off-by: Donald Hunter <donald.hunter@gmail.com> > --- > net/netfilter/nfnetlink.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c > index c9fbe0f707b5..4abf660c7baf 100644 > --- a/net/netfilter/nfnetlink.c > +++ b/net/netfilter/nfnetlink.c > @@ -427,6 +427,9 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh, > > nfnl_unlock(subsys_id); > > + if (nlh->nlmsg_flags & NLM_F_ACK) I believe a memset() is missing here: + memset(&extack, 0, sizeof(extack)); > + nfnl_err_add(&err_list, nlh, 0, &extack); > + Otherwise: > [ 36.330875][ T1048] Oops: general protection fault, probably for non-canonical address 0x339e5eab81f1f600: 0000 [#1] PREEMPT SMP NOPTI > [ 36.334610][ T1048] CPU: 1 PID: 1048 Comm: systemd-network Not tainted 6.10.3-1-default #1 openSUSE Tumbleweed 5d3a202ce24e9b465acfbb908cc2eb4f0547bea7 > [ 36.335330][ T1048] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 > [ 36.335906][ T1048] RIP: 0010:strlen+0x4/0x30 > [ 36.336204][ T1048] Code: f7 75 ec 31 c0 e9 17 e0 25 00 48 89 f8 e9 0f e0 25 00 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa <80> 3f 00 74 14 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 e9 de > [ 36.338921][ T1048] RSP: 0018:ffffb023808f3878 EFLAGS: 00010206 > [ 36.339802][ T1048] RAX: 00000000000000c2 RBX: 0000000000000000 RCX: ffff9291ca559620 > [ 36.340735][ T1048] RDX: ffff9291ca559620 RSI: 0000000000000000 RDI: 339e5eab81f1f600 > [ 36.341177][ T1048] RBP: ffff9291ca559620 R08: 0000000000000000 R09: ffff9291ce8a6500 > [ 36.341639][ T1048] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 > [ 36.342063][ T1048] R13: ffff9291c1015680 R14: dead000000000100 R15: ffff9291ce8a6500 > [ 36.342517][ T1048] FS: 00007f2ee943d900(0000) GS:ffff92923bd00000(0000) knlGS:0000000000000000 > [ 36.342732][ T1048] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 36.342868][ T1048] CR2: 00007f9d4769c000 CR3: 0000000100b82006 CR4: 0000000000370ef0 > [ 36.343044][ T1048] Call Trace: > [ 36.343329][ T1048] <TASK> > [ 36.344518][ T1048] ? __die_body.cold+0x14/0x24 > [ 36.344831][ T1048] ? die_addr+0x3c/0x60 > [ 36.345029][ T1048] ? exc_general_protection+0x1cc/0x3e0 > [ 36.345674][ T1048] ? asm_exc_general_protection+0x26/0x30 > [ 36.349001][ T1048] ? strlen+0x4/0x30 > [ 36.349423][ T1048] ? nf_tables_abort+0x67c/0xee0 [nf_tables c16b4fb993ee603261e060fba374eb60b413741a] > [ 36.350380][ T1048] netlink_ack_tlv_len+0x32/0xb0 > [ 36.352876][ T1048] netlink_ack+0x59/0x280 > [ 36.353269][ T1048] nfnetlink_rcv_batch+0x60c/0x7e0 [nfnetlink a5ded37673006e964178e189bb08592f3ffd89ce] extack->_msg is 0x339e5eab81f1f600 (garbage from stack). See: https://github.com/systemd/systemd/actions/runs/10282472628/job/28454253577?pr=33958#step:12:30 thanks,
Hi Jiri, On Fri, Aug 16, 2024 at 11:23:55AM +0200, Jiri Slaby wrote: > On 18. 04. 24, 12:47, Donald Hunter wrote: > > The NLM_F_ACK flag is ignored for nfnetlink batch begin and end > > messages. This is a problem for ynl which wants to receive an ack for > > every message it sends, not just the commands in between the begin/end > > messages. > > > > Add processing for ACKs for begin/end messages and provide responses > > when requested. > > > > I have checked that iproute2, pyroute2 and systemd are unaffected by > > this change since none of them use NLM_F_ACK for batch begin/end. > > > > Signed-off-by: Donald Hunter <donald.hunter@gmail.com> > > --- > > net/netfilter/nfnetlink.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c > > index c9fbe0f707b5..4abf660c7baf 100644 > > --- a/net/netfilter/nfnetlink.c > > +++ b/net/netfilter/nfnetlink.c > > @@ -427,6 +427,9 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh, > > nfnl_unlock(subsys_id); > > + if (nlh->nlmsg_flags & NLM_F_ACK) > > I believe a memset() is missing here: > + memset(&extack, 0, sizeof(extack)); Indeed, see below. > > + nfnl_err_add(&err_list, nlh, 0, &extack); > > + > > Otherwise: > > [ 36.330875][ T1048] Oops: general protection fault, probably for non-canonical address 0x339e5eab81f1f600: 0000 [#1] PREEMPT SMP NOPTI > > [ 36.334610][ T1048] CPU: 1 PID: 1048 Comm: systemd-network Not tainted 6.10.3-1-default #1 openSUSE Tumbleweed 5d3a202ce24e9b465acfbb908cc2eb4f0547bea7 > > [ 36.335330][ T1048] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 > > [ 36.335906][ T1048] RIP: 0010:strlen+0x4/0x30 > > [ 36.336204][ T1048] Code: f7 75 ec 31 c0 e9 17 e0 25 00 48 89 f8 e9 0f e0 25 00 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa <80> 3f 00 74 14 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 e9 de > > [ 36.338921][ T1048] RSP: 0018:ffffb023808f3878 EFLAGS: 00010206 > > [ 36.339802][ T1048] RAX: 00000000000000c2 RBX: 0000000000000000 RCX: ffff9291ca559620 > > [ 36.340735][ T1048] RDX: ffff9291ca559620 RSI: 0000000000000000 RDI: 339e5eab81f1f600 > > [ 36.341177][ T1048] RBP: ffff9291ca559620 R08: 0000000000000000 R09: ffff9291ce8a6500 > > [ 36.341639][ T1048] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 > > [ 36.342063][ T1048] R13: ffff9291c1015680 R14: dead000000000100 R15: ffff9291ce8a6500 > > [ 36.342517][ T1048] FS: 00007f2ee943d900(0000) GS:ffff92923bd00000(0000) knlGS:0000000000000000 > > [ 36.342732][ T1048] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [ 36.342868][ T1048] CR2: 00007f9d4769c000 CR3: 0000000100b82006 CR4: 0000000000370ef0 > > [ 36.343044][ T1048] Call Trace: > > [ 36.343329][ T1048] <TASK> > > [ 36.344518][ T1048] ? __die_body.cold+0x14/0x24 > > [ 36.344831][ T1048] ? die_addr+0x3c/0x60 > > [ 36.345029][ T1048] ? exc_general_protection+0x1cc/0x3e0 > > [ 36.345674][ T1048] ? asm_exc_general_protection+0x26/0x30 > > [ 36.349001][ T1048] ? strlen+0x4/0x30 > > [ 36.349423][ T1048] ? nf_tables_abort+0x67c/0xee0 [nf_tables c16b4fb993ee603261e060fba374eb60b413741a] > > [ 36.350380][ T1048] netlink_ack_tlv_len+0x32/0xb0 > > [ 36.352876][ T1048] netlink_ack+0x59/0x280 > > [ 36.353269][ T1048] nfnetlink_rcv_batch+0x60c/0x7e0 [nfnetlink a5ded37673006e964178e189bb08592f3ffd89ce] > > extack->_msg is 0x339e5eab81f1f600 (garbage from stack). > > See: > https://github.com/systemd/systemd/actions/runs/10282472628/job/28454253577?pr=33958#step:12:30 Fix: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=d1a7b382a9d3f0f3e5a80e0be2991c075fa4f618
Hi, On 16. 08. 24, 11:58, Pablo Neira Ayuso wrote: >> See: >> https://github.com/systemd/systemd/actions/runs/10282472628/job/28454253577?pr=33958#step:12:30 > > Fix: > > https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=d1a7b382a9d3f0f3e5a80e0be2991c075fa4f618 Ah, I checked my local -next clone, but had not updated, so I didn't see this. Taking into openSUSE. Hopefully, this gets into stable soon? 6.10.6-rc1 does not have it yet. thanks,
diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c index c9fbe0f707b5..4abf660c7baf 100644 --- a/net/netfilter/nfnetlink.c +++ b/net/netfilter/nfnetlink.c @@ -427,6 +427,9 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh, nfnl_unlock(subsys_id); + if (nlh->nlmsg_flags & NLM_F_ACK) + nfnl_err_add(&err_list, nlh, 0, &extack); + while (skb->len >= nlmsg_total_size(0)) { int msglen, type; @@ -573,6 +576,8 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh, } else if (err) { ss->abort(net, oskb, NFNL_ABORT_NONE); netlink_ack(oskb, nlmsg_hdr(oskb), err, NULL); + } else if (nlh->nlmsg_flags & NLM_F_ACK) { + nfnl_err_add(&err_list, nlh, 0, &extack); } } else { enum nfnl_abort_action abort_action;
The NLM_F_ACK flag is ignored for nfnetlink batch begin and end messages. This is a problem for ynl which wants to receive an ack for every message it sends, not just the commands in between the begin/end messages. Add processing for ACKs for begin/end messages and provide responses when requested. I have checked that iproute2, pyroute2 and systemd are unaffected by this change since none of them use NLM_F_ACK for batch begin/end. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> --- net/netfilter/nfnetlink.c | 5 +++++ 1 file changed, 5 insertions(+)