diff mbox series

[net-next,v03,1/3] af_packet: allow fanout_add when socket is not RUNNING

Message ID 9e15c0c2cd19d94207a1791de0dc9051a5abb95a.1728555449.git.gur.stavi@huawei.com (mailing list archive)
State New
Headers show
Series net: af_packet: allow joining a fanout when link is down | expand

Commit Message

Gur Stavi Oct. 10, 2024, 10:25 a.m. UTC
PACKET socket can retain its fanout membership through link down and up
and leave a fanout while closed regardless of link state.
However, socket was forbidden from joining a fanout while it was not
RUNNING.

This patch allows PACKET socket to join fanout while not RUNNING.

The previous test for RUNNING also implicitly tested that the socket is
bound to a device. An explicit test of ifindex was added instead.

Signed-off-by: Gur Stavi <gur.stavi@huawei.com>
---
 net/packet/af_packet.c | 35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)

Comments

Gur Stavi Oct. 10, 2024, 10:38 a.m. UTC | #1
> Subject: [PATCH net-next v03 1/3] af_packet: allow fanout_add when socket
> is not RUNNING
> 
> PACKET socket can retain its fanout membership through link down and up
> and leave a fanout while closed regardless of link state.
> However, socket was forbidden from joining a fanout while it was not
> RUNNING.
> 
> This patch allows PACKET socket to join fanout while not RUNNING.
> 
> The previous test for RUNNING also implicitly tested that the socket is
> bound to a device. An explicit test of ifindex was added instead.

I had some mess up with the send-email. I prepared the following more
complete git comment but sent the wrong patch. The code is still the same.

---
PACKET socket can retain its fanout membership through link down and up
and leave a fanout while closed regardless of link state.
However, socket was forbidden from joining a fanout while it was not
RUNNING.

This patch allows PACKET socket to join fanout while not RUNNING.

Socket can be RUNNING if it has a specified protocol. Either directly 
from packet_create (being implicitly bound to any interface) or following
a successful bind. Socket RUNNING state is switched off if it is bound to
an interface that went down.

Instead of the test for RUNNING, this patch adds a test that socket can
receive packets before allowing it into a fanout group. Socket can
receive packets if it has a configured protocol and is bound to an
interface or bound to any interface.

The only difference between the previous test and the current test is
that the bound interface is allowed to have IFF_UP cleared.
---
Willem de Bruijn Oct. 10, 2024, 1:23 p.m. UTC | #2
Gur Stavi wrote:
> PACKET socket can retain its fanout membership through link down and up
> and leave a fanout while closed regardless of link state.
> However, socket was forbidden from joining a fanout while it was not
> RUNNING.
> 
> This patch allows PACKET socket to join fanout while not RUNNING.
> 
> The previous test for RUNNING also implicitly tested that the socket is
> bound to a device. An explicit test of ifindex was added instead.
> 
> Signed-off-by: Gur Stavi <gur.stavi@huawei.com>
> ---
>  net/packet/af_packet.c | 35 +++++++++++++++++++++--------------
>  1 file changed, 21 insertions(+), 14 deletions(-)
> 
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index f8942062f776..8137c33ab0fd 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -1843,26 +1843,29 @@ static int fanout_add(struct sock *sk, struct fanout_args *args)
>  		match->prot_hook.ignore_outgoing = type_flags & PACKET_FANOUT_FLAG_IGNORE_OUTGOING;
>  		list_add(&match->list, &fanout_list);
>  	}
> -	err = -EINVAL;
>  
>  	spin_lock(&po->bind_lock);
> -	if (packet_sock_flag(po, PACKET_SOCK_RUNNING) &&
> -	    match->type == type &&
> -	    match->prot_hook.type == po->prot_hook.type &&
> -	    match->prot_hook.dev == po->prot_hook.dev) {
> +	if (po->ifindex == -1 || po->num == 0) {

This patch is more complex than it needs to be.

No need to block the case of ETH_P_NONE or not bound to a socket.

I would have discussed that in v2, but you already respun.

> +		/* Socket can not receive packets */
> +		err = -ENXIO;
> +	} else if (match->type != type ||
> +		   match->prot_hook.type != po->prot_hook.type ||
> +		   match->prot_hook.dev != po->prot_hook.dev) {
> +		/* Joining an existing group, properties must be identical */
> +		err = -EINVAL;
> +	} else if (refcount_read(&match->sk_ref) >= match->max_num_members) {
>  		err = -ENOSPC;
> -		if (refcount_read(&match->sk_ref) < match->max_num_members) {
> +	} else {
> +		/* Paired with packet_setsockopt(PACKET_FANOUT_DATA) */
> +		WRITE_ONCE(po->fanout, match);
> +		po->rollover = rollover;
> +		rollover = NULL;
> +		refcount_set(&match->sk_ref, refcount_read(&match->sk_ref) + 1);
> +		if (packet_sock_flag(po, PACKET_SOCK_RUNNING)) {
>  			__dev_remove_pack(&po->prot_hook);
> -
> -			/* Paired with packet_setsockopt(PACKET_FANOUT_DATA) */
> -			WRITE_ONCE(po->fanout, match);
> -
> -			po->rollover = rollover;
> -			rollover = NULL;
> -			refcount_set(&match->sk_ref, refcount_read(&match->sk_ref) + 1);
>  			__fanout_link(sk, po);
> -			err = 0;
>  		}
> +		err = 0;
>  	}
>  	spin_unlock(&po->bind_lock);
>  
> @@ -3452,8 +3455,12 @@ static int packet_create(struct net *net, struct socket *sock, int protocol,
>  	po->prot_hook.af_packet_net = sock_net(sk);
>  
>  	if (proto) {
> +		/* Implicitly bind socket to "any interface" */
> +		po->ifindex = 0;
>  		po->prot_hook.type = proto;
>  		__register_prot_hook(sk);
> +	} else {
> +		po->ifindex = -1;
>  	}
>  
>  	mutex_lock(&net->packet.sklist_lock);
> -- 
> 2.45.2
>
Gur Stavi Oct. 10, 2024, 4 p.m. UTC | #3
> Gur Stavi wrote:
> > PACKET socket can retain its fanout membership through link down and up
> > and leave a fanout while closed regardless of link state.
> > However, socket was forbidden from joining a fanout while it was not
> > RUNNING.
> >
> > This patch allows PACKET socket to join fanout while not RUNNING.
> >
> > The previous test for RUNNING also implicitly tested that the socket is
> > bound to a device. An explicit test of ifindex was added instead.
> >
> > Signed-off-by: Gur Stavi <gur.stavi@huawei.com>
> > ---
> >  net/packet/af_packet.c | 35 +++++++++++++++++++++--------------
> >  1 file changed, 21 insertions(+), 14 deletions(-)
> >
> > diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> > index f8942062f776..8137c33ab0fd 100644
> > --- a/net/packet/af_packet.c
> > +++ b/net/packet/af_packet.c
> > @@ -1843,26 +1843,29 @@ static int fanout_add(struct sock *sk, struct
> fanout_args *args)
> >  		match->prot_hook.ignore_outgoing = type_flags &
> PACKET_FANOUT_FLAG_IGNORE_OUTGOING;
> >  		list_add(&match->list, &fanout_list);
> >  	}
> > -	err = -EINVAL;
> >
> >  	spin_lock(&po->bind_lock);
> > -	if (packet_sock_flag(po, PACKET_SOCK_RUNNING) &&
> > -	    match->type == type &&
> > -	    match->prot_hook.type == po->prot_hook.type &&
> > -	    match->prot_hook.dev == po->prot_hook.dev) {
> > +	if (po->ifindex == -1 || po->num == 0) {
> 
> This patch is more complex than it needs to be.
> 
> No need to block the case of ETH_P_NONE or not bound to a socket.


ETH_P_NONE was blocked before as well.
packet_do_bind will not switch socket to RUNNING when proto is 0.

	if (proto == 0 || !need_rehook)
		goto out_unlock;

Same for packet_create.

So the old condition could only pass the RUNNING condition if proto
was non-zero.
The new condition is exactly equivalent except for allowing IFF_UP
to be cleared in the bound device.


Yes, the same result could be achieved with a FEW less line changes
but I think that the new logic is more readable where every clause
explains itself with a comment instead of constructing one large if
statement. And since the solution does add another nested if for the
RUNNING the extra indentation started to look ugly.

> 
> I would have discussed that in v2, but you already respun.
> 
> > +		/* Socket can not receive packets */
> > +		err = -ENXIO;
> > +	} else if (match->type != type ||
> > +		   match->prot_hook.type != po->prot_hook.type ||
> > +		   match->prot_hook.dev != po->prot_hook.dev) {
> > +		/* Joining an existing group, properties must be identical */
> > +		err = -EINVAL;
> > +	} else if (refcount_read(&match->sk_ref) >= match->max_num_members)
> {
> >  		err = -ENOSPC;
> > -		if (refcount_read(&match->sk_ref) < match->max_num_members) {
> > +	} else {
> > +		/* Paired with packet_setsockopt(PACKET_FANOUT_DATA) */
> > +		WRITE_ONCE(po->fanout, match);
> > +		po->rollover = rollover;
> > +		rollover = NULL;
> > +		refcount_set(&match->sk_ref, refcount_read(&match->sk_ref) +
> 1);
> > +		if (packet_sock_flag(po, PACKET_SOCK_RUNNING)) {
> >  			__dev_remove_pack(&po->prot_hook);
> > -
> > -			/* Paired with packet_setsockopt(PACKET_FANOUT_DATA) */
> > -			WRITE_ONCE(po->fanout, match);
> > -
> > -			po->rollover = rollover;
> > -			rollover = NULL;
> > -			refcount_set(&match->sk_ref, refcount_read(&match-
> >sk_ref) + 1);
> >  			__fanout_link(sk, po);
> > -			err = 0;
> >  		}
> > +		err = 0;
> >  	}
> >  	spin_unlock(&po->bind_lock);
> >
> > @@ -3452,8 +3455,12 @@ static int packet_create(struct net *net, struct
> socket *sock, int protocol,
> >  	po->prot_hook.af_packet_net = sock_net(sk);
> >
> >  	if (proto) {
> > +		/* Implicitly bind socket to "any interface" */
> > +		po->ifindex = 0;
> >  		po->prot_hook.type = proto;
> >  		__register_prot_hook(sk);
> > +	} else {
> > +		po->ifindex = -1;
> >  	}
> >
> >  	mutex_lock(&net->packet.sklist_lock);
> > --
> > 2.45.2
> >
>
diff mbox series

Patch

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index f8942062f776..8137c33ab0fd 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1843,26 +1843,29 @@  static int fanout_add(struct sock *sk, struct fanout_args *args)
 		match->prot_hook.ignore_outgoing = type_flags & PACKET_FANOUT_FLAG_IGNORE_OUTGOING;
 		list_add(&match->list, &fanout_list);
 	}
-	err = -EINVAL;
 
 	spin_lock(&po->bind_lock);
-	if (packet_sock_flag(po, PACKET_SOCK_RUNNING) &&
-	    match->type == type &&
-	    match->prot_hook.type == po->prot_hook.type &&
-	    match->prot_hook.dev == po->prot_hook.dev) {
+	if (po->ifindex == -1 || po->num == 0) {
+		/* Socket can not receive packets */
+		err = -ENXIO;
+	} else if (match->type != type ||
+		   match->prot_hook.type != po->prot_hook.type ||
+		   match->prot_hook.dev != po->prot_hook.dev) {
+		/* Joining an existing group, properties must be identical */
+		err = -EINVAL;
+	} else if (refcount_read(&match->sk_ref) >= match->max_num_members) {
 		err = -ENOSPC;
-		if (refcount_read(&match->sk_ref) < match->max_num_members) {
+	} else {
+		/* Paired with packet_setsockopt(PACKET_FANOUT_DATA) */
+		WRITE_ONCE(po->fanout, match);
+		po->rollover = rollover;
+		rollover = NULL;
+		refcount_set(&match->sk_ref, refcount_read(&match->sk_ref) + 1);
+		if (packet_sock_flag(po, PACKET_SOCK_RUNNING)) {
 			__dev_remove_pack(&po->prot_hook);
-
-			/* Paired with packet_setsockopt(PACKET_FANOUT_DATA) */
-			WRITE_ONCE(po->fanout, match);
-
-			po->rollover = rollover;
-			rollover = NULL;
-			refcount_set(&match->sk_ref, refcount_read(&match->sk_ref) + 1);
 			__fanout_link(sk, po);
-			err = 0;
 		}
+		err = 0;
 	}
 	spin_unlock(&po->bind_lock);
 
@@ -3452,8 +3455,12 @@  static int packet_create(struct net *net, struct socket *sock, int protocol,
 	po->prot_hook.af_packet_net = sock_net(sk);
 
 	if (proto) {
+		/* Implicitly bind socket to "any interface" */
+		po->ifindex = 0;
 		po->prot_hook.type = proto;
 		__register_prot_hook(sk);
+	} else {
+		po->ifindex = -1;
 	}
 
 	mutex_lock(&net->packet.sklist_lock);