diff mbox series

[net-next] net: define and implement new SOL_SOCKET SO_RX_IFINDEX option

Message ID 20241024154119.1096947-1-maze@google.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Series [net-next] net: define and implement new SOL_SOCKET SO_RX_IFINDEX option

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 69 this patch: 69
netdev/build_tools success Errors and warnings before: 157 (+1) this patch: 157 (+1)
netdev/cc_maintainers warning 17 maintainers not CCed: horms@kernel.org andreas@gaisler.com arnd@arndb.de James.Bottomley@HansenPartnership.com almasrymina@google.com deller@gmx.de linux-alpha@vger.kernel.org tsbogend@alpha.franken.de willemb@google.com sparclinux@vger.kernel.org linux-mips@vger.kernel.org linux-parisc@vger.kernel.org mattst88@gmail.com richard.henderson@linaro.org linux-arch@vger.kernel.org asml.silence@gmail.com kaiyuanz@google.com
netdev/build_clang success Errors and warnings before: 1010 this patch: 1010
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 14621 this patch: 14621
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 50 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 9 this patch: 9
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-10-31--09-00 (tests: 779)

Commit Message

Maciej Żenczykowski Oct. 24, 2024, 3:41 p.m. UTC
This is currently only implemented for TCP and is not
guaranteed to return correct information for a multitude
of reasons (including multipath reception), but there are
scenarios where it is useful: in particular a strong host
model where connections are only viable via a single interface,
for example a VPN interface.  One could for example choose
to use this to SO_BINDTODEVICE.

Test:
  // Python 2.7.18 (default, Jul 13 2022, 18:14:36)
  import socket
  SO_RX_IFINDEX=82
  s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM, 0)
  c = socket.socket(socket.AF_INET6, socket.SOCK_STREAM, 0)
  s.bind(('::', 8888))
  s.listen(128)
  c.connect(('::', 8888))
  a = s.accept()
  print a  # (<socket._socketobject object>, ('::1', 58144, 0, 0))
  p=a[0]
  p.getsockname()  # ('::1', 8888, 0, 0)
  p.getpeername()  # ('::1', 58144, 0, 0)
  c.getsockname()  # ('::1', 58144, 0, 0)
  c.getpeername()  # ('::1', 8888, 0, 0)
  p.getsockopt(socket.SOL_SOCKET, SO_RX_IFINDEX)  # 1 (lo)
  c.getsockopt(socket.SOL_SOCKET, SO_RX_IFINDEX)  # 0 (unknown)
  c.send(b'X')  # 1
  p.recv(2)  # 'X'
  p.getsockopt(socket.SOL_SOCKET, SO_RX_IFINDEX)  # 1 (lo)
  c.getsockopt(socket.SOL_SOCKET, SO_RX_IFINDEX)  # 0 (unknown)
  p.send(b'Z')  # 1
  c.recv(2)  # 'Z'
  p.getsockopt(socket.SOL_SOCKET, SO_RX_IFINDEX)  # 1 (lo)
  c.getsockopt(socket.SOL_SOCKET, SO_RX_IFINDEX)  # 1 (lo)

Which shows we should possibly fix the 3-way handshake SYN-ACK
to set sk->sk_rx_dst_ifindex.

Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
---
 arch/alpha/include/uapi/asm/socket.h  | 2 ++
 arch/mips/include/uapi/asm/socket.h   | 2 ++
 arch/parisc/include/uapi/asm/socket.h | 2 ++
 arch/sparc/include/uapi/asm/socket.h  | 2 ++
 include/uapi/asm-generic/socket.h     | 2 ++
 net/core/sock.c                       | 4 ++++
 6 files changed, 14 insertions(+)

Comments

Maciej Żenczykowski Oct. 24, 2024, 3:44 p.m. UTC | #1
On Thu, Oct 24, 2024 at 5:41 PM Maciej Żenczykowski <maze@google.com> wrote:
>
> This is currently only implemented for TCP and is not
> guaranteed to return correct information for a multitude
> of reasons (including multipath reception), but there are
> scenarios where it is useful: in particular a strong host
> model where connections are only viable via a single interface,
> for example a VPN interface.  One could for example choose
> to use this to SO_BINDTODEVICE.
>
> Test:
>   // Python 2.7.18 (default, Jul 13 2022, 18:14:36)
>   import socket
>   SO_RX_IFINDEX=82
>   s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM, 0)
>   c = socket.socket(socket.AF_INET6, socket.SOCK_STREAM, 0)
>   s.bind(('::', 8888))
>   s.listen(128)
>   c.connect(('::', 8888))
>   a = s.accept()
>   print a  # (<socket._socketobject object>, ('::1', 58144, 0, 0))
>   p=a[0]
>   p.getsockname()  # ('::1', 8888, 0, 0)
>   p.getpeername()  # ('::1', 58144, 0, 0)
>   c.getsockname()  # ('::1', 58144, 0, 0)
>   c.getpeername()  # ('::1', 8888, 0, 0)
>   p.getsockopt(socket.SOL_SOCKET, SO_RX_IFINDEX)  # 1 (lo)
>   c.getsockopt(socket.SOL_SOCKET, SO_RX_IFINDEX)  # 0 (unknown)
>   c.send(b'X')  # 1
>   p.recv(2)  # 'X'
>   p.getsockopt(socket.SOL_SOCKET, SO_RX_IFINDEX)  # 1 (lo)
>   c.getsockopt(socket.SOL_SOCKET, SO_RX_IFINDEX)  # 0 (unknown)
>   p.send(b'Z')  # 1
>   c.recv(2)  # 'Z'
>   p.getsockopt(socket.SOL_SOCKET, SO_RX_IFINDEX)  # 1 (lo)
>   c.getsockopt(socket.SOL_SOCKET, SO_RX_IFINDEX)  # 1 (lo)
>
> Which shows we should possibly fix the 3-way handshake SYN-ACK
> to set sk->sk_rx_dst_ifindex.
>
> Cc: Lorenzo Colitti <lorenzo@google.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Maciej Żenczykowski <maze@google.com>
> ---
>  arch/alpha/include/uapi/asm/socket.h  | 2 ++
>  arch/mips/include/uapi/asm/socket.h   | 2 ++
>  arch/parisc/include/uapi/asm/socket.h | 2 ++
>  arch/sparc/include/uapi/asm/socket.h  | 2 ++
>  include/uapi/asm-generic/socket.h     | 2 ++
>  net/core/sock.c                       | 4 ++++
>  6 files changed, 14 insertions(+)
>
> diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
> index 302507bf9b5d..5f139b095a49 100644
> --- a/arch/alpha/include/uapi/asm/socket.h
> +++ b/arch/alpha/include/uapi/asm/socket.h
> @@ -148,6 +148,8 @@
>
>  #define SCM_TS_OPT_ID          81
>
> +#define SO_RX_IFINDEX          82
> +
>  #if !defined(__KERNEL__)
>
>  #if __BITS_PER_LONG == 64
> diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
> index d118d4731580..ff25d24b4dea 100644
> --- a/arch/mips/include/uapi/asm/socket.h
> +++ b/arch/mips/include/uapi/asm/socket.h
> @@ -159,6 +159,8 @@
>
>  #define SCM_TS_OPT_ID          81
>
> +#define SO_RX_IFINDEX          82
> +
>  #if !defined(__KERNEL__)
>
>  #if __BITS_PER_LONG == 64
> diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
> index d268d69bfcd2..3f89c388e356 100644
> --- a/arch/parisc/include/uapi/asm/socket.h
> +++ b/arch/parisc/include/uapi/asm/socket.h
> @@ -140,6 +140,8 @@
>
>  #define SCM_TS_OPT_ID          0x404C
>
> +#define SO_RX_IFINDEX          82
> +
>  #if !defined(__KERNEL__)
>
>  #if __BITS_PER_LONG == 64
> diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
> index 113cd9f353e3..f1af74f5f1ad 100644
> --- a/arch/sparc/include/uapi/asm/socket.h
> +++ b/arch/sparc/include/uapi/asm/socket.h
> @@ -141,6 +141,8 @@
>
>  #define SCM_TS_OPT_ID            0x005a
>
> +#define SO_RX_IFINDEX            0x005b
> +
>  #if !defined(__KERNEL__)
>
>
> diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
> index deacfd6dd197..b16c69e22606 100644
> --- a/include/uapi/asm-generic/socket.h
> +++ b/include/uapi/asm-generic/socket.h
> @@ -143,6 +143,8 @@
>
>  #define SCM_TS_OPT_ID          81
>
> +#define SO_RX_IFINDEX          82
> +
>  #if !defined(__KERNEL__)
>
>  #if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 7f398bd07fb7..6c985413c21f 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1932,6 +1932,10 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
>                 v.val = READ_ONCE(sk->sk_mark);
>                 break;
>
> +       case SO_RX_IFINDEX:
> +               v.val = READ_ONCE(sk->sk_rx_dst_ifindex);
> +               break;
> +
>         case SO_RCVMARK:
>                 v.val = sock_flag(sk, SOCK_RCVMARK);
>                 break;
> --
> 2.47.0.105.g07ac214952-goog
>

Note: I'm not sure if I did the right thing with parisc...
It has:
#define SO_DEVMEM_LINEAR 78
#define SCM_DEVMEM_LINEAR SO_DEVMEM_LINEAR
#define SO_DEVMEM_DMABUF 79
#define SCM_DEVMEM_DMABUF SO_DEVMEM_DMABUF
#define SO_DEVMEM_DONTNEED 80
which is weird...
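
Because the option number is defined per architecture, the hardcoded
SO_RX_IFINDEX=82 in the Python test only matches the asm-generic, alpha,
mips and parisc values in this patch; the sparc hunk defines 0x005b.
A minimal sketch of picking the number by machine type follows. The helper
name so_rx_ifindex_for and the prefix match on platform.machine() are
assumptions made for illustration, and the numbers come from this
unmerged patch, not from any released header.

```python
import platform

# Values copied from the hunks of this patch; they are not in released headers.
_GENERIC_VALUE = 82                      # asm-generic, alpha, mips, parisc
_PER_ARCH_VALUES = {"sparc": 0x005B}     # sparc hunk uses its own numbering


def so_rx_ifindex_for(machine=None):
    """Return the SO_RX_IFINDEX number this patch defines for a machine type.

    `machine` defaults to platform.machine(); matching is by prefix, so
    both "sparc" and "sparc64" select the sparc value.
    """
    machine = (machine or platform.machine()).lower()
    for prefix, value in _PER_ARCH_VALUES.items():
        if machine.startswith(prefix):
            return value
    return _GENERIC_VALUE
```

A test script would then call so_rx_ifindex_for() once instead of
hardcoding 82.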

--
Maciej Żenczykowski, Kernel Networking Developer @ Google

Willem de Bruijn Oct. 25, 2024, 2:44 p.m. UTC | #2
Maciej Żenczykowski wrote:
> On Thu, Oct 24, 2024 at 5:41 PM Maciej Żenczykowski <maze@google.com> wrote:
> >
> > This is currently only implemented for TCP and is not
> > guaranteed to return correct information for a multitude
> > of reasons (including multipath reception), but there are
> > scenarios where it is useful: in particular a strong host
> > model where connections are only viable via a single interface,
> > for example a VPN interface.  One could for example choose
> > to use this to SO_BINDTODEVICE.

Is it fair to say that this is the equivalent of ipi_ifindex in IP_PKTINFO,
but for non-datagram sockets, where skb->skb_iif cannot be read directly?

> >
> > Test:
> >   // Python 2.7.18 (default, Jul 13 2022, 18:14:36)
> >   import socket
> >   SO_RX_IFINDEX=82
> >   s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM, 0)
> >   c = socket.socket(socket.AF_INET6, socket.SOCK_STREAM, 0)
> >   s.bind(('::', 8888))
> >   s.listen(128)
> >   c.connect(('::', 8888))
> >   a = s.accept()
> >   print a  # (<socket._socketobject object>, ('::1', 58144, 0, 0))
> >   p=a[0]
> >   p.getsockname()  # ('::1', 8888, 0, 0)
> >   p.getpeername()  # ('::1', 58144, 0, 0)
> >   c.getsockname()  # ('::1', 58144, 0, 0)
> >   c.getpeername()  # ('::1', 8888, 0, 0)
> >   p.getsockopt(socket.SOL_SOCKET, SO_RX_IFINDEX)  # 1 (lo)
> >   c.getsockopt(socket.SOL_SOCKET, SO_RX_IFINDEX)  # 0 (unknown)
> >   c.send(b'X')  # 1
> >   p.recv(2)  # 'X'
> >   p.getsockopt(socket.SOL_SOCKET, SO_RX_IFINDEX)  # 1 (lo)
> >   c.getsockopt(socket.SOL_SOCKET, SO_RX_IFINDEX)  # 0 (unknown)
> >   p.send(b'Z')  # 1
> >   c.recv(2)  # 'Z'
> >   p.getsockopt(socket.SOL_SOCKET, SO_RX_IFINDEX)  # 1 (lo)
> >   c.getsockopt(socket.SOL_SOCKET, SO_RX_IFINDEX)  # 1 (lo)
> >
> > Which shows we should possibly fix the 3-way handshake SYN-ACK
> > to set sk->sk_rx_dst_ifindex.
> >
> > Cc: Lorenzo Colitti <lorenzo@google.com>
> > Cc: Eric Dumazet <edumazet@google.com>
> > Signed-off-by: Maciej Żenczykowski <maze@google.com>
> > ---
> >  arch/alpha/include/uapi/asm/socket.h  | 2 ++
> >  arch/mips/include/uapi/asm/socket.h   | 2 ++
> >  arch/parisc/include/uapi/asm/socket.h | 2 ++
> >  arch/sparc/include/uapi/asm/socket.h  | 2 ++
> >  include/uapi/asm-generic/socket.h     | 2 ++
> >  net/core/sock.c                       | 4 ++++
> >  6 files changed, 14 insertions(+)
> >
> > diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
> > index 302507bf9b5d..5f139b095a49 100644
> > --- a/arch/alpha/include/uapi/asm/socket.h
> > +++ b/arch/alpha/include/uapi/asm/socket.h
> > @@ -148,6 +148,8 @@
> >
> >  #define SCM_TS_OPT_ID          81
> >
> > +#define SO_RX_IFINDEX          82
> > +
> >  #if !defined(__KERNEL__)
> >
> >  #if __BITS_PER_LONG == 64
> > diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
> > index d118d4731580..ff25d24b4dea 100644
> > --- a/arch/mips/include/uapi/asm/socket.h
> > +++ b/arch/mips/include/uapi/asm/socket.h
> > @@ -159,6 +159,8 @@
> >
> >  #define SCM_TS_OPT_ID          81
> >
> > +#define SO_RX_IFINDEX          82
> > +
> >  #if !defined(__KERNEL__)
> >
> >  #if __BITS_PER_LONG == 64
> > diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
> > index d268d69bfcd2..3f89c388e356 100644
> > --- a/arch/parisc/include/uapi/asm/socket.h
> > +++ b/arch/parisc/include/uapi/asm/socket.h
> > @@ -140,6 +140,8 @@
> >
> >  #define SCM_TS_OPT_ID          0x404C
> >
> > +#define SO_RX_IFINDEX          82
> > +
> >  #if !defined(__KERNEL__)
> >
> >  #if __BITS_PER_LONG == 64
> > diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
> > index 113cd9f353e3..f1af74f5f1ad 100644
> > --- a/arch/sparc/include/uapi/asm/socket.h
> > +++ b/arch/sparc/include/uapi/asm/socket.h
> > @@ -141,6 +141,8 @@
> >
> >  #define SCM_TS_OPT_ID            0x005a
> >
> > +#define SO_RX_IFINDEX            0x005b
> > +
> >  #if !defined(__KERNEL__)
> >
> >
> > diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
> > index deacfd6dd197..b16c69e22606 100644
> > --- a/include/uapi/asm-generic/socket.h
> > +++ b/include/uapi/asm-generic/socket.h
> > @@ -143,6 +143,8 @@
> >
> >  #define SCM_TS_OPT_ID          81
> >
> > +#define SO_RX_IFINDEX          82
> > +
> >  #if !defined(__KERNEL__)
> >
> >  #if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
> > diff --git a/net/core/sock.c b/net/core/sock.c
> > index 7f398bd07fb7..6c985413c21f 100644
> > --- a/net/core/sock.c
> > +++ b/net/core/sock.c
> > @@ -1932,6 +1932,10 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
> >                 v.val = READ_ONCE(sk->sk_mark);
> >                 break;
> >
> > +       case SO_RX_IFINDEX:
> > +               v.val = READ_ONCE(sk->sk_rx_dst_ifindex);
> > +               break;
> > +

If it is limited to TCP, return an error in all other cases, so that
we can extend it later with well-defined behavior.

> >         case SO_RCVMARK:
> >                 v.val = sock_flag(sk, SOCK_RCVMARK);
> >                 break;
> > --
> > 2.47.0.105.g07ac214952-goog
> >
> 
> Note: I'm not sure if I did the right thing with parisc...
> It has:
> #define SO_DEVMEM_LINEAR 78
> #define SCM_DEVMEM_LINEAR SO_DEVMEM_LINEAR
> #define SO_DEVMEM_DMABUF 79
> #define SCM_DEVMEM_DMABUF SO_DEVMEM_DMABUF
> #define SO_DEVMEM_DONTNEED 80
> which is weird...

This is a common pattern: separate SCM constants are defined for cmsg
fields, even though they have the same value as their [gs]etsockopt
counterparts.
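
If the option does return an error for unsupported sockets, as suggested
above, a caller has to treat "error" and "0 (unknown)" alike. The following
Python 3 sketch shows one way to do that and then apply the SO_BINDTODEVICE
use mentioned in the commit message. The helper names rx_ifindex and
bind_to_rx_device are hypothetical, the set of errno values is a guess at
what a revised patch might return, and 82 is the asm-generic number from
this unmerged patch: on a kernel without the patch that number may be
unassigned or may mean something else.

```python
import errno
import socket

# asm-generic value proposed by this patch; not present in released headers.
SO_RX_IFINDEX = 82

# Errors treated as "option not available here" (an assumption, see above).
_UNSUPPORTED = (errno.ENOPROTOOPT, errno.EOPNOTSUPP, errno.EINVAL)


def rx_ifindex(sock, optname=SO_RX_IFINDEX):
    """Return the receive ifindex, or None if it is unknown or unsupported."""
    try:
        idx = sock.getsockopt(socket.SOL_SOCKET, optname)
    except OSError as exc:
        if exc.errno in _UNSUPPORTED:
            return None
        raise
    # The test in the commit message shows 0 when no ifindex is recorded yet.
    return idx if idx > 0 else None


def bind_to_rx_device(sock):
    """Bind sock to the device it last received on; return its name or None."""
    idx = rx_ifindex(sock)
    if idx is None:
        return None
    name = socket.if_indextoname(idx)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE,
                    name.encode() + b"\0")
    return name
```

Returning None for both cases keeps the caller's fallback path identical
whether the kernel lacks the option or simply has not seen a packet yet.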

Patch

diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
index 302507bf9b5d..5f139b095a49 100644
--- a/arch/alpha/include/uapi/asm/socket.h
+++ b/arch/alpha/include/uapi/asm/socket.h
@@ -148,6 +148,8 @@ 
 
 #define SCM_TS_OPT_ID		81
 
+#define SO_RX_IFINDEX		82
+
 #if !defined(__KERNEL__)
 
 #if __BITS_PER_LONG == 64
diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
index d118d4731580..ff25d24b4dea 100644
--- a/arch/mips/include/uapi/asm/socket.h
+++ b/arch/mips/include/uapi/asm/socket.h
@@ -159,6 +159,8 @@ 
 
 #define SCM_TS_OPT_ID		81
 
+#define SO_RX_IFINDEX		82
+
 #if !defined(__KERNEL__)
 
 #if __BITS_PER_LONG == 64
diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
index d268d69bfcd2..3f89c388e356 100644
--- a/arch/parisc/include/uapi/asm/socket.h
+++ b/arch/parisc/include/uapi/asm/socket.h
@@ -140,6 +140,8 @@ 
 
 #define SCM_TS_OPT_ID		0x404C
 
+#define SO_RX_IFINDEX		82
+
 #if !defined(__KERNEL__)
 
 #if __BITS_PER_LONG == 64
diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
index 113cd9f353e3..f1af74f5f1ad 100644
--- a/arch/sparc/include/uapi/asm/socket.h
+++ b/arch/sparc/include/uapi/asm/socket.h
@@ -141,6 +141,8 @@ 
 
 #define SCM_TS_OPT_ID            0x005a
 
+#define SO_RX_IFINDEX            0x005b
+
 #if !defined(__KERNEL__)
 
 
diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index deacfd6dd197..b16c69e22606 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -143,6 +143,8 @@ 
 
 #define SCM_TS_OPT_ID		81
 
+#define SO_RX_IFINDEX		82
+
 #if !defined(__KERNEL__)
 
 #if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
diff --git a/net/core/sock.c b/net/core/sock.c
index 7f398bd07fb7..6c985413c21f 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1932,6 +1932,10 @@  int sk_getsockopt(struct sock *sk, int level, int optname,
 		v.val = READ_ONCE(sk->sk_mark);
 		break;
 
+	case SO_RX_IFINDEX:
+		v.val = READ_ONCE(sk->sk_rx_dst_ifindex);
+		break;
+
 	case SO_RCVMARK:
 		v.val = sock_flag(sk, SOCK_RCVMARK);
 		break;