diff mbox series

[net] af_unix: Support POLLPRI for OOB.

Message ID 20220314052110.53634-1-kuniyu@amazon.co.jp (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [net] af_unix: Support POLLPRI for OOB. | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/cc_maintainers warning 3 maintainers not CCed: linux-kselftest@vger.kernel.org shuah@kernel.org viro@zeniv.linux.org.uk
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 21 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Iwashima, Kuniyuki March 14, 2022, 5:21 a.m. UTC
The commit 314001f0bf92 ("af_unix: Add OOB support") introduced OOB for
AF_UNIX, but it lacks some changes for POLLPRI.  Let's add the missing
piece.

In the selftest, normal datagrams are sent followed by OOB data, so this
commit replaces `POLLIN | POLLPRI` with just `POLLPRI` in the first test
case.

Fixes: 314001f0bf92 ("af_unix: Add OOB support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
---
 net/unix/af_unix.c                                  | 2 ++
 tools/testing/selftests/net/af_unix/test_unix_oob.c | 6 +++---
 2 files changed, 5 insertions(+), 3 deletions(-)

Comments

Shoaib Rao March 14, 2022, 4:37 p.m. UTC | #1
On 3/13/22 22:21, Kuniyuki Iwashima wrote:
> The commit 314001f0bf92 ("af_unix: Add OOB support") introduced OOB for
> AF_UNIX, but it lacks some changes for POLLPRI.  Let's add the missing
> piece.
>
> In the selftest, normal datagrams are sent followed by OOB data, so this
> commit replaces `POLLIN | POLLPRI` with just `POLLPRI` in the first test
> case.
>
> Fixes: 314001f0bf92 ("af_unix: Add OOB support")
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
> ---
>   net/unix/af_unix.c                                  | 2 ++
>   tools/testing/selftests/net/af_unix/test_unix_oob.c | 6 +++---
>   2 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index c19569819866..711d21b1c3e1 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -3139,6 +3139,8 @@ static __poll_t unix_poll(struct file *file, struct socket *sock, poll_table *wa
>   		mask |= EPOLLIN | EPOLLRDNORM;
>   	if (sk_is_readable(sk))
>   		mask |= EPOLLIN | EPOLLRDNORM;
> +	if (unix_sk(sk)->oob_skb)
> +		mask |= EPOLLPRI;
>   
>   	/* Connection-based need to check for termination and startup */
>   	if ((sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET) &&
> diff --git a/tools/testing/selftests/net/af_unix/test_unix_oob.c b/tools/testing/selftests/net/af_unix/test_unix_oob.c
> index 3dece8b29253..b57e91e1c3f2 100644
> --- a/tools/testing/selftests/net/af_unix/test_unix_oob.c
> +++ b/tools/testing/selftests/net/af_unix/test_unix_oob.c
> @@ -218,10 +218,10 @@ main(int argc, char **argv)
>   
>   	/* Test 1:
>   	 * veriyf that SIGURG is
> -	 * delivered and 63 bytes are
> -	 * read and oob is '@'
> +	 * delivered, 63 bytes are
> +	 * read, oob is '@', and POLLPRI works.
>   	 */
> -	wait_for_data(pfd, POLLIN | POLLPRI);
> +	wait_for_data(pfd, POLLPRI);
>   	read_oob(pfd, &oob);
>   	len = read_data(pfd, buf, 1024);
>   	if (!signal_recvd || len != 63 || oob != '@') {

Reviewed-by: Rao Shoaib <rao.shoaib@oracle.com>

Thanks for fixing this.

Shoaib
Eric Dumazet March 14, 2022, 5:42 p.m. UTC | #2
On 3/13/22 22:21, Kuniyuki Iwashima wrote:
> The commit 314001f0bf92 ("af_unix: Add OOB support") introduced OOB for
> AF_UNIX, but it lacks some changes for POLLPRI.  Let's add the missing
> piece.
>
> In the selftest, normal datagrams are sent followed by OOB data, so this
> commit replaces `POLLIN | POLLPRI` with just `POLLPRI` in the first test
> case.
>
> Fixes: 314001f0bf92 ("af_unix: Add OOB support")
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
> ---
>   net/unix/af_unix.c                                  | 2 ++
>   tools/testing/selftests/net/af_unix/test_unix_oob.c | 6 +++---
>   2 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index c19569819866..711d21b1c3e1 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -3139,6 +3139,8 @@ static __poll_t unix_poll(struct file *file, struct socket *sock, poll_table *wa
>   		mask |= EPOLLIN | EPOLLRDNORM;
>   	if (sk_is_readable(sk))
>   		mask |= EPOLLIN | EPOLLRDNORM;
> +	if (unix_sk(sk)->oob_skb)
> +		mask |= EPOLLPRI;


This adds another data-race, maybe add something to avoid another syzbot 
report ?


>   
>   	/* Connection-based need to check for termination and startup */
>   	if ((sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET) &&
> diff --git a/tools/testing/selftests/net/af_unix/test_unix_oob.c b/tools/testing/selftests/net/af_unix/test_unix_oob.c
> index 3dece8b29253..b57e91e1c3f2 100644
> --- a/tools/testing/selftests/net/af_unix/test_unix_oob.c
> +++ b/tools/testing/selftests/net/af_unix/test_unix_oob.c
> @@ -218,10 +218,10 @@ main(int argc, char **argv)
>   
>   	/* Test 1:
>   	 * veriyf that SIGURG is
> -	 * delivered and 63 bytes are
> -	 * read and oob is '@'
> +	 * delivered, 63 bytes are
> +	 * read, oob is '@', and POLLPRI works.
>   	 */
> -	wait_for_data(pfd, POLLIN | POLLPRI);
> +	wait_for_data(pfd, POLLPRI);
>   	read_oob(pfd, &oob);
>   	len = read_data(pfd, buf, 1024);
>   	if (!signal_recvd || len != 63 || oob != '@') {
Shoaib Rao March 14, 2022, 6:10 p.m. UTC | #3
On 3/14/22 10:42, Eric Dumazet wrote:
>
> On 3/13/22 22:21, Kuniyuki Iwashima wrote:
>> The commit 314001f0bf92 ("af_unix: Add OOB support") introduced OOB for
>> AF_UNIX, but it lacks some changes for POLLPRI.  Let's add the missing
>> piece.
>>
>> In the selftest, normal datagrams are sent followed by OOB data, so this
>> commit replaces `POLLIN | POLLPRI` with just `POLLPRI` in the first test
>> case.
>>
>> Fixes: 314001f0bf92 ("af_unix: Add OOB support")
>> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
>> ---
>>   net/unix/af_unix.c                                  | 2 ++
>>   tools/testing/selftests/net/af_unix/test_unix_oob.c | 6 +++---
>>   2 files changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
>> index c19569819866..711d21b1c3e1 100644
>> --- a/net/unix/af_unix.c
>> +++ b/net/unix/af_unix.c
>> @@ -3139,6 +3139,8 @@ static __poll_t unix_poll(struct file *file, 
>> struct socket *sock, poll_table *wa
>>           mask |= EPOLLIN | EPOLLRDNORM;
>>       if (sk_is_readable(sk))
>>           mask |= EPOLLIN | EPOLLRDNORM;
>> +    if (unix_sk(sk)->oob_skb)
>> +        mask |= EPOLLPRI;
>
>
> This adds another data-race, maybe add something to avoid another 
> syzbot report ?

It's not obvious to me how there would be a race as it is just a check.

Also this change should be under #if IS_ENABLED(CONFIG_AF_UNIX_OOB)

Thanks,

Shoaib

>
>
>>         /* Connection-based need to check for termination and startup */
>>       if ((sk->sk_type == SOCK_STREAM || sk->sk_type == 
>> SOCK_SEQPACKET) &&
>> diff --git a/tools/testing/selftests/net/af_unix/test_unix_oob.c 
>> b/tools/testing/selftests/net/af_unix/test_unix_oob.c
>> index 3dece8b29253..b57e91e1c3f2 100644
>> --- a/tools/testing/selftests/net/af_unix/test_unix_oob.c
>> +++ b/tools/testing/selftests/net/af_unix/test_unix_oob.c
>> @@ -218,10 +218,10 @@ main(int argc, char **argv)
>>         /* Test 1:
>>        * veriyf that SIGURG is
>> -     * delivered and 63 bytes are
>> -     * read and oob is '@'
>> +     * delivered, 63 bytes are
>> +     * read, oob is '@', and POLLPRI works.
>>        */
>> -    wait_for_data(pfd, POLLIN | POLLPRI);
>> +    wait_for_data(pfd, POLLPRI);
>>       read_oob(pfd, &oob);
>>       len = read_data(pfd, buf, 1024);
>>       if (!signal_recvd || len != 63 || oob != '@') {
Eric Dumazet March 15, 2022, 12:26 a.m. UTC | #4
On 3/14/22 11:10, Shoaib Rao wrote:
>
> On 3/14/22 10:42, Eric Dumazet wrote:
>>
>> On 3/13/22 22:21, Kuniyuki Iwashima wrote:
>>> The commit 314001f0bf92 ("af_unix: Add OOB support") introduced OOB for
>>> AF_UNIX, but it lacks some changes for POLLPRI.  Let's add the missing
>>> piece.
>>>
>>> In the selftest, normal datagrams are sent followed by OOB data, so 
>>> this
>>> commit replaces `POLLIN | POLLPRI` with just `POLLPRI` in the first 
>>> test
>>> case.
>>>
>>> Fixes: 314001f0bf92 ("af_unix: Add OOB support")
>>> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
>>> ---
>>>   net/unix/af_unix.c                                  | 2 ++
>>>   tools/testing/selftests/net/af_unix/test_unix_oob.c | 6 +++---
>>>   2 files changed, 5 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
>>> index c19569819866..711d21b1c3e1 100644
>>> --- a/net/unix/af_unix.c
>>> +++ b/net/unix/af_unix.c
>>> @@ -3139,6 +3139,8 @@ static __poll_t unix_poll(struct file *file, 
>>> struct socket *sock, poll_table *wa
>>>           mask |= EPOLLIN | EPOLLRDNORM;
>>>       if (sk_is_readable(sk))
>>>           mask |= EPOLLIN | EPOLLRDNORM;
>>> +    if (unix_sk(sk)->oob_skb)
>>> +        mask |= EPOLLPRI;
>>
>>
>> This adds another data-race, maybe add something to avoid another 
>> syzbot report ?
>
> It's not obvious to me how there would be a race as it is just a check.
>

KCSAN will detect that whenever unix_poll() reads oob_skb,

its value can be changed by another cpu.


unix_poll() runs without holding a lock.



> Also this change should be under #if IS_ENABLED(CONFIG_AF_UNIX_OOB)
>
> Thanks,
>
> Shoaib
>
>>
>>
>>>         /* Connection-based need to check for termination and 
>>> startup */
>>>       if ((sk->sk_type == SOCK_STREAM || sk->sk_type == 
>>> SOCK_SEQPACKET) &&
>>> diff --git a/tools/testing/selftests/net/af_unix/test_unix_oob.c 
>>> b/tools/testing/selftests/net/af_unix/test_unix_oob.c
>>> index 3dece8b29253..b57e91e1c3f2 100644
>>> --- a/tools/testing/selftests/net/af_unix/test_unix_oob.c
>>> +++ b/tools/testing/selftests/net/af_unix/test_unix_oob.c
>>> @@ -218,10 +218,10 @@ main(int argc, char **argv)
>>>         /* Test 1:
>>>        * veriyf that SIGURG is
>>> -     * delivered and 63 bytes are
>>> -     * read and oob is '@'
>>> +     * delivered, 63 bytes are
>>> +     * read, oob is '@', and POLLPRI works.
>>>        */
>>> -    wait_for_data(pfd, POLLIN | POLLPRI);
>>> +    wait_for_data(pfd, POLLPRI);
>>>       read_oob(pfd, &oob);
>>>       len = read_data(pfd, buf, 1024);
>>>       if (!signal_recvd || len != 63 || oob != '@') {
Iwashima, Kuniyuki March 15, 2022, 12:45 a.m. UTC | #5
From:   Eric Dumazet <eric.dumazet@gmail.com>
Date:   Mon, 14 Mar 2022 17:26:54 -0700
> On 3/14/22 11:10, Shoaib Rao wrote:
>>
>> On 3/14/22 10:42, Eric Dumazet wrote:
>>>
>>> On 3/13/22 22:21, Kuniyuki Iwashima wrote:
>>>> The commit 314001f0bf92 ("af_unix: Add OOB support") introduced OOB for
>>>> AF_UNIX, but it lacks some changes for POLLPRI.  Let's add the missing
>>>> piece.
>>>>
>>>> In the selftest, normal datagrams are sent followed by OOB data, so 
>>>> this
>>>> commit replaces `POLLIN | POLLPRI` with just `POLLPRI` in the first 
>>>> test
>>>> case.
>>>>
>>>> Fixes: 314001f0bf92 ("af_unix: Add OOB support")
>>>> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
>>>> ---
>>>>   net/unix/af_unix.c                                  | 2 ++
>>>>   tools/testing/selftests/net/af_unix/test_unix_oob.c | 6 +++---
>>>>   2 files changed, 5 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
>>>> index c19569819866..711d21b1c3e1 100644
>>>> --- a/net/unix/af_unix.c
>>>> +++ b/net/unix/af_unix.c
>>>> @@ -3139,6 +3139,8 @@ static __poll_t unix_poll(struct file *file, 
>>>> struct socket *sock, poll_table *wa
>>>>           mask |= EPOLLIN | EPOLLRDNORM;
>>>>       if (sk_is_readable(sk))
>>>>           mask |= EPOLLIN | EPOLLRDNORM;
>>>> +    if (unix_sk(sk)->oob_skb)
>>>> +        mask |= EPOLLPRI;
>>>
>>>
>>> This adds another data-race, maybe add something to avoid another 
>>> syzbot report ?
>>
>> It's not obvious to me how there would be a race as it is just a check.
>>
> 
> KCSAN will detect that whenever unix_poll() reads oob_skb,
> 
> its value can be changed by another cpu.
> 
> 
> unix_poll() runs without holding a lock.

Thanks for pointing out!
So, READ_ONCE() is necessary here, right?
oob_skb is written under the lock, so I think there is no paired
WRITE_ONCE(), but is it ok?


> 
> 
> 
>> Also this change should be under #if IS_ENABLED(CONFIG_AF_UNIX_OOB)

And thanks, Shoaib!
Will add the condition in v2.


>>
>> Thanks,
>>
>> Shoaib
>>
>>>
>>>
>>>>         /* Connection-based need to check for termination and 
>>>> startup */
>>>>       if ((sk->sk_type == SOCK_STREAM || sk->sk_type == 
>>>> SOCK_SEQPACKET) &&
>>>> diff --git a/tools/testing/selftests/net/af_unix/test_unix_oob.c 
>>>> b/tools/testing/selftests/net/af_unix/test_unix_oob.c
>>>> index 3dece8b29253..b57e91e1c3f2 100644
>>>> --- a/tools/testing/selftests/net/af_unix/test_unix_oob.c
>>>> +++ b/tools/testing/selftests/net/af_unix/test_unix_oob.c
>>>> @@ -218,10 +218,10 @@ main(int argc, char **argv)
>>>>         /* Test 1:
>>>>        * veriyf that SIGURG is
>>>> -     * delivered and 63 bytes are
>>>> -     * read and oob is '@'
>>>> +     * delivered, 63 bytes are
>>>> +     * read, oob is '@', and POLLPRI works.
>>>>        */
>>>> -    wait_for_data(pfd, POLLIN | POLLPRI);
>>>> +    wait_for_data(pfd, POLLPRI);
>>>>       read_oob(pfd, &oob);
>>>>       len = read_data(pfd, buf, 1024);
>>>>       if (!signal_recvd || len != 63 || oob != '@') {
Iwashima, Kuniyuki March 15, 2022, 5:30 a.m. UTC | #6
From:   Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Date:   Tue, 15 Mar 2022 09:45:03 +0900
> From:   Eric Dumazet <eric.dumazet@gmail.com>
> Date:   Mon, 14 Mar 2022 17:26:54 -0700
>> On 3/14/22 11:10, Shoaib Rao wrote:
>>>
>>> On 3/14/22 10:42, Eric Dumazet wrote:
>>>>
>>>> On 3/13/22 22:21, Kuniyuki Iwashima wrote:
>>>>> The commit 314001f0bf92 ("af_unix: Add OOB support") introduced OOB for
>>>>> AF_UNIX, but it lacks some changes for POLLPRI.  Let's add the missing
>>>>> piece.
>>>>>
>>>>> In the selftest, normal datagrams are sent followed by OOB data, so 
>>>>> this
>>>>> commit replaces `POLLIN | POLLPRI` with just `POLLPRI` in the first 
>>>>> test
>>>>> case.
>>>>>
>>>>> Fixes: 314001f0bf92 ("af_unix: Add OOB support")
>>>>> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
>>>>> ---
>>>>>   net/unix/af_unix.c                                  | 2 ++
>>>>>   tools/testing/selftests/net/af_unix/test_unix_oob.c | 6 +++---
>>>>>   2 files changed, 5 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
>>>>> index c19569819866..711d21b1c3e1 100644
>>>>> --- a/net/unix/af_unix.c
>>>>> +++ b/net/unix/af_unix.c
>>>>> @@ -3139,6 +3139,8 @@ static __poll_t unix_poll(struct file *file, 
>>>>> struct socket *sock, poll_table *wa
>>>>>           mask |= EPOLLIN | EPOLLRDNORM;
>>>>>       if (sk_is_readable(sk))
>>>>>           mask |= EPOLLIN | EPOLLRDNORM;
>>>>> +    if (unix_sk(sk)->oob_skb)
>>>>> +        mask |= EPOLLPRI;
>>>>
>>>>
>>>> This adds another data-race, maybe add something to avoid another 
>>>> syzbot report ?
>>>
>>> It's not obvious to me how there would be a race as it is just a check.
>>>
>> 
>> KCSAN will detect that whenever unix_poll() reads oob_skb,
>> 
>> its value can be changed by another cpu.
>> 
>> 
>> unix_poll() runs without holding a lock.
> 
> Thanks for pointing out!
> So, READ_ONCE() is necessary here, right?
> oob_skb is written under the lock, so I think there is no paired
> WRITE_ONCE(), but is it ok?

I've tested the prog below and KCSAN repoted the race.
Also, READ_ONCE() suppressed it.

Thank you Eric!
I'll post v2 with READ_ONCE().

---8<---
[   60.021825] ==================================================================
[   60.021999] BUG: KCSAN: data-race in unix_poll / unix_stream_sendmsg
[   60.021999] 
[   60.021999] write to 0xffff8880050d9ff0 of 8 bytes by task 175 on cpu 3:
[   60.021999]  unix_stream_sendmsg+0x9dc/0xbb0
[   60.021999]  sock_sendmsg+0x90/0xa0
[   60.021999]  __sys_sendto+0x138/0x190
[   60.021999]  __x64_sys_sendto+0x6d/0x80
[   60.021999]  do_syscall_64+0x38/0x80
[   60.021999]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   60.021999] 
[   60.021999] read to 0xffff8880050d9ff0 of 8 bytes by task 176 on cpu 2:
[   60.021999]  unix_poll+0xf4/0x1b0
[   60.021999]  sock_poll+0xa4/0x1e0
[   60.021999]  do_sys_poll+0x326/0x750
[   60.021999]  __x64_sys_poll+0x55/0x1f0
[   60.021999]  do_syscall_64+0x38/0x80
[   60.021999]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   60.021999] 
[   60.021999] value changed: 0xffff8880056bf000 -> 0xffff8880056bff00
[   60.021999] 
[   60.021999] Reported by Kernel Concurrency Sanitizer on:
[   60.021999] CPU: 2 PID: 176 Comm: unix_race_oob Not tainted 5.17.0-rc5-59531-gbdaabdfaadf8-dirty #9
[   60.021999] Hardware name: Red Hat KVM, BIOS 1.11.0-2.amzn2 04/01/2014
[   60.021999] ==================================================================
---8<---

---8<---
#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <sys/wait.h>

#define offsetof(type, member) ((size_t)&((type *)0)->member)
#define SUNADDR "\0test\0"
#define ADDRLEN (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 6)

int setup_server(void)
{
	struct sockaddr_un addr = {
		.sun_family = AF_UNIX,
		.sun_path = SUNADDR,
	};
	int err, fd;

	fd = socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK, 0);
	if (fd == -1) {
		perror("socket");
		goto out;
	}

	err = bind(fd, (struct sockaddr *)&addr, ADDRLEN);
	if (err == -1) {
		perror("bind");
		goto err;
	}

	err = listen(fd, 32);
	if (err == -1) {
		perror("listen");
		goto err;
	}

out:
	return fd;

err:
	close(fd);
	return err;
}

int setup_client(void)
{
	struct sockaddr_un addr = {
		.sun_family = AF_UNIX,
		.sun_path = SUNADDR,
	};
	int err, fd;

	fd = socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK, 0);
	if (fd == -1) {
		perror("socket");
		goto out;
	}

	err = connect(fd, (struct sockaddr *)&addr, ADDRLEN);
	if (err == -1) {
		perror("connect");
		goto err;
	}

out:
	return fd;

err:
	close(fd);
	return err;
}

int setup_child(int server)
{
	struct sockaddr_un addr;
	socklen_t len;
	int child;

	child = accept(server, (struct sockaddr *)&addr, &len);
	if (child == -1)
		perror("accept");

	return child;
}

int sender(int client)
{
	char c = 'a';
	int ret;

	printf("start sender\n");

	while (1) {
		ret = send(client, &c, sizeof(c), MSG_OOB);
		if (ret != 1 || errno)
			perror("send");
	}

	return 0;
}


int receiver(int child)
{
	struct pollfd pfds[1];
	char buf[1024];
	int ret;

	pfds[0].fd = child;
	pfds[0].events = POLLIN | POLLPRI;

	printf("start receiver\n");

	while (1) {
		poll(pfds, 1, -1);

		ret = recv(child, buf, sizeof(buf), MSG_OOB);
		if (ret < 0 || errno)
			perror("recv (MSG_OOB)");

		ret = recv(child, buf, sizeof(buf), 0);
		if (ret < 0 || errno)
			perror("recv");
	}

	return 0;
}

int main(void)
{
	int server, client, child;
	int status;
	pid_t pid;

	server = setup_server();
	if (server == -1)
		goto out;

	client = setup_client();
	if (client == -1)
		goto close_server;

	child = setup_child(server);
	if (child == -1)
		goto close_client;

	pid = fork();
	if (pid == 0)
		return sender(client);

	pid = fork();
	if (pid == 0)
		return receiver(child);

	wait(&status);

	close(child);
close_client:
	close(client);
close_server:
	close(server);
out:
	return 0;
}
---8<---
Iwashima, Kuniyuki March 15, 2022, 10:52 a.m. UTC | #7
From:   Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Date:   Tue, 15 Mar 2022 14:30:40 +0900
> From:   Kuniyuki Iwashima <kuniyu@amazon.co.jp>
> Date:   Tue, 15 Mar 2022 09:45:03 +0900
>> From:   Eric Dumazet <eric.dumazet@gmail.com>
>> Date:   Mon, 14 Mar 2022 17:26:54 -0700
>>> On 3/14/22 11:10, Shoaib Rao wrote:
>>>>
>>>> On 3/14/22 10:42, Eric Dumazet wrote:
>>>>>
>>>>> On 3/13/22 22:21, Kuniyuki Iwashima wrote:
>>>>>> The commit 314001f0bf92 ("af_unix: Add OOB support") introduced OOB for
>>>>>> AF_UNIX, but it lacks some changes for POLLPRI.  Let's add the missing
>>>>>> piece.
>>>>>>
>>>>>> In the selftest, normal datagrams are sent followed by OOB data, so 
>>>>>> this
>>>>>> commit replaces `POLLIN | POLLPRI` with just `POLLPRI` in the first 
>>>>>> test
>>>>>> case.
>>>>>>
>>>>>> Fixes: 314001f0bf92 ("af_unix: Add OOB support")
>>>>>> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
>>>>>> ---
>>>>>>   net/unix/af_unix.c                                  | 2 ++
>>>>>>   tools/testing/selftests/net/af_unix/test_unix_oob.c | 6 +++---
>>>>>>   2 files changed, 5 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
>>>>>> index c19569819866..711d21b1c3e1 100644
>>>>>> --- a/net/unix/af_unix.c
>>>>>> +++ b/net/unix/af_unix.c
>>>>>> @@ -3139,6 +3139,8 @@ static __poll_t unix_poll(struct file *file, 
>>>>>> struct socket *sock, poll_table *wa
>>>>>>           mask |= EPOLLIN | EPOLLRDNORM;
>>>>>>       if (sk_is_readable(sk))
>>>>>>           mask |= EPOLLIN | EPOLLRDNORM;
>>>>>> +    if (unix_sk(sk)->oob_skb)
>>>>>> +        mask |= EPOLLPRI;
>>>>>
>>>>>
>>>>> This adds another data-race, maybe add something to avoid another 
>>>>> syzbot report ?
>>>>
>>>> It's not obvious to me how there would be a race as it is just a check.
>>>>
>>> 
>>> KCSAN will detect that whenever unix_poll() reads oob_skb,
>>> 
>>> its value can be changed by another cpu.
>>> 
>>> 
>>> unix_poll() runs without holding a lock.
>> 
>> Thanks for pointing out!
>> So, READ_ONCE() is necessary here, right?
>> oob_skb is written under the lock, so I think there is no paired
>> WRITE_ONCE(), but is it ok?

I have misunderstood this, the lock has nothing to do with WRITE_ONCE()
this time.  The write of oob_skb can be teared by GCC, which leads to a
data race at the read side even with READ_ONCE(), so the paired one is
needed.

I will respin v3 with READ_ONCE() and WRITE_ONCE().


> 
> I've tested the prog below and KCSAN repoted the race.
> Also, READ_ONCE() suppressed it.
> 
> Thank you Eric!
> I'll post v2 with READ_ONCE().
diff mbox series

Patch

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index c19569819866..711d21b1c3e1 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -3139,6 +3139,8 @@  static __poll_t unix_poll(struct file *file, struct socket *sock, poll_table *wa
 		mask |= EPOLLIN | EPOLLRDNORM;
 	if (sk_is_readable(sk))
 		mask |= EPOLLIN | EPOLLRDNORM;
+	if (unix_sk(sk)->oob_skb)
+		mask |= EPOLLPRI;
 
 	/* Connection-based need to check for termination and startup */
 	if ((sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET) &&
diff --git a/tools/testing/selftests/net/af_unix/test_unix_oob.c b/tools/testing/selftests/net/af_unix/test_unix_oob.c
index 3dece8b29253..b57e91e1c3f2 100644
--- a/tools/testing/selftests/net/af_unix/test_unix_oob.c
+++ b/tools/testing/selftests/net/af_unix/test_unix_oob.c
@@ -218,10 +218,10 @@  main(int argc, char **argv)
 
 	/* Test 1:
 	 * veriyf that SIGURG is
-	 * delivered and 63 bytes are
-	 * read and oob is '@'
+	 * delivered, 63 bytes are
+	 * read, oob is '@', and POLLPRI works.
 	 */
-	wait_for_data(pfd, POLLIN | POLLPRI);
+	wait_for_data(pfd, POLLPRI);
 	read_oob(pfd, &oob);
 	len = read_data(pfd, buf, 1024);
 	if (!signal_recvd || len != 63 || oob != '@') {