Message ID | 20240625013645.45034-4-kuniyu@amazon.com (mailing list archive) |
---|---|
State | Accepted |
Commit | b94038d841a91d0e3f59cfe4d073e210910366ee |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | af_unix: Fix bunch of MSG_OOB bugs and add new tests. | expand |
On Mon, 2024-06-24 at 18:36 -0700, Kuniyuki Iwashima wrote: > After consuming OOB data, recv() reading the preceding data must break at > the OOB skb regardless of MSG_PEEK. > > Currently, MSG_PEEK does not stop recv() for AF_UNIX, and the behaviour is > not compliant with TCP. I'm unsure we can change the MSG_PEEK behavior at this point: existing application(s?) could relay on that, regardless of how inconsistent such behavior is. I think we need at least an explicit ack from Rao, as the main known user. Paolo
On Wed, 2024-06-26 at 18:56 +0200, Paolo Abeni wrote: > On Mon, 2024-06-24 at 18:36 -0700, Kuniyuki Iwashima wrote: > > After consuming OOB data, recv() reading the preceding data must break at > > the OOB skb regardless of MSG_PEEK. > > > > Currently, MSG_PEEK does not stop recv() for AF_UNIX, and the behaviour is > > not compliant with TCP. > > I'm unsure we can change the MSG_PEEK behavior at this point: existing > application(s?) could relay on that, regardless of how inconsistent > such behavior is. > > I think we need at least an explicit ack from Rao, as the main known > user. I see Rao stated that the unix OoB implementation was designed to mimic the tcp one: https://lore.kernel.org/netdev/c5f6abbe-de43-48b8-856a-36ded227e94f@oracle.com/ so the series should be ok. Still given the size of the changes and the behavior change I'm wondering if the series should target net-next instead. That would offer some time cushion to deal with eventual regression. WDYT? Thanks, Paolo
From: Paolo Abeni <pabeni@redhat.com> Date: Wed, 26 Jun 2024 23:10:40 +0200 > On Wed, 2024-06-26 at 18:56 +0200, Paolo Abeni wrote: > > On Mon, 2024-06-24 at 18:36 -0700, Kuniyuki Iwashima wrote: > > > After consuming OOB data, recv() reading the preceding data must break at > > > the OOB skb regardless of MSG_PEEK. > > > > > > Currently, MSG_PEEK does not stop recv() for AF_UNIX, and the behaviour is > > > not compliant with TCP. > > > > I'm unsure we can change the MSG_PEEK behavior at this point: existing > > application(s?) could relay on that, regardless of how inconsistent > > such behavior is. > > > > I think we need at least an explicit ack from Rao, as the main known > > user. > > I see Rao stated that the unix OoB implementation was designed to mimic > the tcp one: > > https://lore.kernel.org/netdev/c5f6abbe-de43-48b8-856a-36ded227e94f@oracle.com/ > > so the series should be ok. > > Still given the size of the changes and the behavior change I'm > wondering if the series should target net-next instead. > That would offer some time cushion to deal with eventual regression. > WDYT? The actual change is 37 LoC and we recently have this kind of changes (30 LoC in total) in net.git. The last two were merged in April and we have no user report so far. a6736a0addd6 af_unix: Read with MSG_PEEK loops if the first unread byte is OOB 22dd70eb2c3d af_unix: Don't peek OOB data without MSG_OOB. 283454c8a123 af_unix: Call manage_oob() for every skb in unix_stream_read_generic(). Most of the changes are due to selftest. The original test repeated the same set of code but covered few cases. OTOH, the new test spends most of lines to add as many test cases as possible, which IMHO nicely covers regression if we want to mimic TCP. On net.git: # FAILED: 20 / 38 tests passed. # Totals: pass:20 fail:18 xfail:0 xpass:0 skip:0 error:0 Also, now the original test is broken in stable after the commits above, so I think it would be nice to have this series in stable. Thanks!
On Wed, 2024-06-26 at 14:47 -0700, Kuniyuki Iwashima wrote: > From: Paolo Abeni <pabeni@redhat.com> > Date: Wed, 26 Jun 2024 23:10:40 +0200 > > On Wed, 2024-06-26 at 18:56 +0200, Paolo Abeni wrote: > > > On Mon, 2024-06-24 at 18:36 -0700, Kuniyuki Iwashima wrote: > > > > After consuming OOB data, recv() reading the preceding data must break at > > > > the OOB skb regardless of MSG_PEEK. > > > > > > > > Currently, MSG_PEEK does not stop recv() for AF_UNIX, and the behaviour is > > > > not compliant with TCP. > > > > > > I'm unsure we can change the MSG_PEEK behavior at this point: existing > > > application(s?) could relay on that, regardless of how inconsistent > > > such behavior is. > > > > > > I think we need at least an explicit ack from Rao, as the main known > > > user. > > > > I see Rao stated that the unix OoB implementation was designed to mimic > > the tcp one: > > > > https://lore.kernel.org/netdev/c5f6abbe-de43-48b8-856a-36ded227e94f@oracle.com/ > > > > so the series should be ok. > > > > Still given the size of the changes and the behavior change I'm > > wondering if the series should target net-next instead. > > That would offer some time cushion to deal with eventual regression. > > WDYT? > > The actual change is 37 LoC ... excluding the other ~1200 LoC ;) > and we recently have this kind of changes > (30 LoC in total) in net.git. The last two were merged in April and > we have no user report so far. > > a6736a0addd6 af_unix: Read with MSG_PEEK loops if the first unread byte is OOB > 22dd70eb2c3d af_unix: Don't peek OOB data without MSG_OOB. > 283454c8a123 af_unix: Call manage_oob() for every skb in unix_stream_read_generic(). The last commit mentioned above actually sparked a bit of post merge discussion, which is IMHO sub-optimal. On the flip side, I think all the changes in this series make sense, and the self-tests refactor and extension is largish, but very nice. TL;DR: I'm going to apply this now. Thanks, Paolo
On 6/26/24 02:10, Paolo Abeni wrote: > On Wed, 2024-06-26 at 18:56 +0200, Paolo Abeni wrote: >> On Mon, 2024-06-24 at 18:36 -0700, Kuniyuki Iwashima wrote: >>> After consuming OOB data, recv() reading the preceding data must break at >>> the OOB skb regardless of MSG_PEEK. >>> >>> Currently, MSG_PEEK does not stop recv() for AF_UNIX, and the behaviour is >>> not compliant with TCP. >> >> I'm unsure we can change the MSG_PEEK behavior at this point: existing >> application(s?) could relay on that, regardless of how inconsistent >> such behavior is. >> >> I think we need at least an explicit ack from Rao, as the main known >> user. > > I see Rao stated that the unix OoB implementation was designed to mimic > the tcp one: > > https://urldefense.com/v3/__https://lore.kernel.org/netdev/c5f6abbe-de43-48b8-856a-36ded227e94f@oracle.com/__;!!ACWV5N9M2RV99hQ!MxZ1Tdors9BCjgG4K-LeD_fvtJ0mQ6jgbR5CfGYIouxV7LbYRiKf7zCml6SKYN7OLG7LZnZHnBnVfyo$ > > so the series should be ok. Yes. Thanks, Shoaib > > Still given the size of the changes and the behavior change I'm > wondering if the series should target net-next instead. > That would offer some time cushion to deal with eventual regression. > WDYT? > > Thanks, > > Paolo >
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 5e695a9a609c..2eaecf9d78a4 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2613,9 +2613,12 @@ static struct sk_buff *manage_oob(struct sk_buff *skb, struct sock *sk, { struct unix_sock *u = unix_sk(sk); - if (!unix_skb_len(skb) && !(flags & MSG_PEEK)) { - skb_unlink(skb, &sk->sk_receive_queue); - consume_skb(skb); + if (!unix_skb_len(skb)) { + if (!(flags & MSG_PEEK)) { + skb_unlink(skb, &sk->sk_receive_queue); + consume_skb(skb); + } + skb = NULL; } else { struct sk_buff *unlinked_skb = NULL; diff --git a/tools/testing/selftests/net/af_unix/msg_oob.c b/tools/testing/selftests/net/af_unix/msg_oob.c index d427d39d0806..de8d1fcde883 100644 --- a/tools/testing/selftests/net/af_unix/msg_oob.c +++ b/tools/testing/selftests/net/af_unix/msg_oob.c @@ -21,6 +21,21 @@ FIXTURE(msg_oob) */ }; +FIXTURE_VARIANT(msg_oob) +{ + bool peek; +}; + +FIXTURE_VARIANT_ADD(msg_oob, no_peek) +{ + .peek = false, +}; + +FIXTURE_VARIANT_ADD(msg_oob, peek) +{ + .peek = true +}; + static void create_unix_socketpair(struct __test_metadata *_metadata, FIXTURE_DATA(msg_oob) *self) { @@ -156,8 +171,14 @@ static void __recvpair(struct __test_metadata *_metadata, __sendpair(_metadata, self, buf, len, flags) #define recvpair(expected_buf, expected_len, buf_len, flags) \ - __recvpair(_metadata, self, \ - expected_buf, expected_len, buf_len, flags) + do { \ + if (variant->peek) \ + __recvpair(_metadata, self, \ + expected_buf, expected_len, \ + buf_len, (flags) | MSG_PEEK); \ + __recvpair(_metadata, self, \ + expected_buf, expected_len, buf_len, flags); \ + } while (0) TEST_F(msg_oob, non_oob) {