Message ID | 20250306183101.817063-1-edumazet@google.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net-next] udp: expand SKB_DROP_REASON_UDP_CSUM use | expand |
Eric Dumazet wrote: > Use SKB_DROP_REASON_UDP_CSUM in __first_packet_length() > and udp_read_skb() when dropping a packet because of > a wrong UDP checksum. > > Signed-off-by: Eric Dumazet <edumazet@google.com> > --- > net/ipv4/udp.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c > index 17c7736d8349433ad2d4cbcc9414b2f8112610af..39c3adf333b5f02ca53f768c918c75f2fc7f93ac 100644 > --- a/net/ipv4/udp.c > +++ b/net/ipv4/udp.c > @@ -1848,7 +1848,7 @@ static struct sk_buff *__first_packet_length(struct sock *sk, > atomic_inc(&sk->sk_drops); > __skb_unlink(skb, rcvq); > *total += skb->truesize; > - kfree_skb(skb); > + kfree_skb_reason(skb, SKB_DROP_REASON_UDP_CSUM); > } else { > udp_skb_csum_unnecessary_set(skb); > break; > @@ -2002,7 +2002,7 @@ int udp_read_skb(struct sock *sk, skb_read_actor_t recv_actor) > __UDP_INC_STATS(net, UDP_MIB_CSUMERRORS, is_udplite); > __UDP_INC_STATS(net, UDP_MIB_INERRORS, is_udplite); > atomic_inc(&sk->sk_drops); > - kfree_skb(skb); > + kfree_skb_reason(skb, SKB_DROP_REASON_UDP_CSUM); > goto try_again; > } From a quick search for UDP_MIB_CSUMERRORS, one more case with regular kfree_skb: csum_copy_err: if (!__sk_queue_drop_skb(sk, &udp_sk(sk)->reader_queue, skb, flags, udp_skb_destructor)) { UDP_INC_STATS(sock_net(sk), UDP_MIB_CSUMERRORS, is_udplite); UDP_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite); } kfree_skb(skb); And the same in net/ipv6/udp.c
On Thu, Mar 6, 2025 at 8:01 PM Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote: > > Eric Dumazet wrote: > > Use SKB_DROP_REASON_UDP_CSUM in __first_packet_length() > > and udp_read_skb() when dropping a packet because of > > a wrong UDP checksum. > > > > Signed-off-by: Eric Dumazet <edumazet@google.com> > > --- > > net/ipv4/udp.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c > > index 17c7736d8349433ad2d4cbcc9414b2f8112610af..39c3adf333b5f02ca53f768c918c75f2fc7f93ac 100644 > > --- a/net/ipv4/udp.c > > +++ b/net/ipv4/udp.c > > @@ -1848,7 +1848,7 @@ static struct sk_buff *__first_packet_length(struct sock *sk, > > atomic_inc(&sk->sk_drops); > > __skb_unlink(skb, rcvq); > > *total += skb->truesize; > > - kfree_skb(skb); > > + kfree_skb_reason(skb, SKB_DROP_REASON_UDP_CSUM); > > } else { > > udp_skb_csum_unnecessary_set(skb); > > break; > > @@ -2002,7 +2002,7 @@ int udp_read_skb(struct sock *sk, skb_read_actor_t recv_actor) > > __UDP_INC_STATS(net, UDP_MIB_CSUMERRORS, is_udplite); > > __UDP_INC_STATS(net, UDP_MIB_INERRORS, is_udplite); > > atomic_inc(&sk->sk_drops); > > - kfree_skb(skb); > > + kfree_skb_reason(skb, SKB_DROP_REASON_UDP_CSUM); > > goto try_again; > > } > > From a quick search for UDP_MIB_CSUMERRORS, one more case with regular > kfree_skb: > > csum_copy_err: > if (!__sk_queue_drop_skb(sk, &udp_sk(sk)->reader_queue, skb, flags, > udp_skb_destructor)) { > UDP_INC_STATS(sock_net(sk), UDP_MIB_CSUMERRORS, is_udplite); > UDP_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite); > } > kfree_skb(skb); Right, I was unsure because of the conditional SNMP updates.
Eric Dumazet wrote: > On Thu, Mar 6, 2025 at 8:01 PM Willem de Bruijn > <willemdebruijn.kernel@gmail.com> wrote: > > > > Eric Dumazet wrote: > > > Use SKB_DROP_REASON_UDP_CSUM in __first_packet_length() > > > and udp_read_skb() when dropping a packet because of > > > a wrong UDP checksum. > > > > > > Signed-off-by: Eric Dumazet <edumazet@google.com> > > > --- > > > net/ipv4/udp.c | 4 ++-- > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c > > > index 17c7736d8349433ad2d4cbcc9414b2f8112610af..39c3adf333b5f02ca53f768c918c75f2fc7f93ac 100644 > > > --- a/net/ipv4/udp.c > > > +++ b/net/ipv4/udp.c > > > @@ -1848,7 +1848,7 @@ static struct sk_buff *__first_packet_length(struct sock *sk, > > > atomic_inc(&sk->sk_drops); > > > __skb_unlink(skb, rcvq); > > > *total += skb->truesize; > > > - kfree_skb(skb); > > > + kfree_skb_reason(skb, SKB_DROP_REASON_UDP_CSUM); > > > } else { > > > udp_skb_csum_unnecessary_set(skb); > > > break; > > > @@ -2002,7 +2002,7 @@ int udp_read_skb(struct sock *sk, skb_read_actor_t recv_actor) > > > __UDP_INC_STATS(net, UDP_MIB_CSUMERRORS, is_udplite); > > > __UDP_INC_STATS(net, UDP_MIB_INERRORS, is_udplite); > > > atomic_inc(&sk->sk_drops); > > > - kfree_skb(skb); > > > + kfree_skb_reason(skb, SKB_DROP_REASON_UDP_CSUM); > > > goto try_again; > > > } > > > > From a quick search for UDP_MIB_CSUMERRORS, one more case with regular > > kfree_skb: > > > > csum_copy_err: > > if (!__sk_queue_drop_skb(sk, &udp_sk(sk)->reader_queue, skb, flags, > > udp_skb_destructor)) { > > UDP_INC_STATS(sock_net(sk), UDP_MIB_CSUMERRORS, is_udplite); > > UDP_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite); > > } > > kfree_skb(skb); > > Right, I was unsure because of the conditional SNMP updates. That seems to only suppress the update if peeking and the skb was already dequeued (ENOENT). Frankly, we probably never intended to increment this counter when peeking, as it is intended to be a per-datagram counter, not a per-recvmsg counter. I think ever erring on the side of extra increments in the unlikely case of ENOENT is fine. Cleaner is perhaps to have a kfree_skb_reason inside that branch and a consume_skb in the else.
On Thu, Mar 6, 2025 at 8:57 PM Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote: > > Eric Dumazet wrote: > > On Thu, Mar 6, 2025 at 8:01 PM Willem de Bruijn > > <willemdebruijn.kernel@gmail.com> wrote: > > > > > > Eric Dumazet wrote: > > > > Use SKB_DROP_REASON_UDP_CSUM in __first_packet_length() > > > > and udp_read_skb() when dropping a packet because of > > > > a wrong UDP checksum. > > > > > > > > Signed-off-by: Eric Dumazet <edumazet@google.com> > > > > --- > > > > net/ipv4/udp.c | 4 ++-- > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c > > > > index 17c7736d8349433ad2d4cbcc9414b2f8112610af..39c3adf333b5f02ca53f768c918c75f2fc7f93ac 100644 > > > > --- a/net/ipv4/udp.c > > > > +++ b/net/ipv4/udp.c > > > > @@ -1848,7 +1848,7 @@ static struct sk_buff *__first_packet_length(struct sock *sk, > > > > atomic_inc(&sk->sk_drops); > > > > __skb_unlink(skb, rcvq); > > > > *total += skb->truesize; > > > > - kfree_skb(skb); > > > > + kfree_skb_reason(skb, SKB_DROP_REASON_UDP_CSUM); > > > > } else { > > > > udp_skb_csum_unnecessary_set(skb); > > > > break; > > > > @@ -2002,7 +2002,7 @@ int udp_read_skb(struct sock *sk, skb_read_actor_t recv_actor) > > > > __UDP_INC_STATS(net, UDP_MIB_CSUMERRORS, is_udplite); > > > > __UDP_INC_STATS(net, UDP_MIB_INERRORS, is_udplite); > > > > atomic_inc(&sk->sk_drops); > > > > - kfree_skb(skb); > > > > + kfree_skb_reason(skb, SKB_DROP_REASON_UDP_CSUM); > > > > goto try_again; > > > > } > > > > > > From a quick search for UDP_MIB_CSUMERRORS, one more case with regular > > > kfree_skb: > > > > > > csum_copy_err: > > > if (!__sk_queue_drop_skb(sk, &udp_sk(sk)->reader_queue, skb, flags, > > > udp_skb_destructor)) { > > > UDP_INC_STATS(sock_net(sk), UDP_MIB_CSUMERRORS, is_udplite); > > > UDP_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite); > > > } > > > kfree_skb(skb); > > > > Right, I was unsure because of the conditional SNMP updates. > > That seems to only suppress the update if peeking and the skb was > already dequeued (ENOENT). > > Frankly, we probably never intended to increment this counter when > peeking, as it is intended to be a per-datagram counter, not a > per-recvmsg counter. > > I think ever erring on the side of extra increments in the unlikely > case of ENOENT is fine. > > Cleaner is perhaps to have a kfree_skb_reason inside that branch and > a consume_skb in the else. I think it should be a kfree_skb_reason(skb, SKB_DROP_REASON_UDP_CSUM), because only one real 'free' will happen, if two or more threads were using MSG_PEEK for this packet. Using a consume_skb() would be racy, because if the skb refcount reaches 0, a wrong event would be generated. I will squash in v2 : diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 39c3adf333b5f02ca53f768c918c75f2fc7f93ac..d0bffcfa56d8deb14f38cc48a4a2b1d899ad6af4 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -2117,7 +2117,7 @@ int udp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags, UDP_INC_STATS(sock_net(sk), UDP_MIB_CSUMERRORS, is_udplite); UDP_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite); } - kfree_skb(skb); + kfree_skb_reason(skb, SKB_DROP_REASON_UDP_CSUM); /* starting over for a new packet, but check if we need to yield */ cond_resched(); diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 3a0d6c5a8286b1685e8a1dec50365fe392ab9a87..024458ef163c9e24dfb37aea2690b2030f6a0fbc 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -586,7 +586,7 @@ int udpv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, SNMP_INC_STATS(mib, UDP_MIB_CSUMERRORS); SNMP_INC_STATS(mib, UDP_MIB_INERRORS); } - kfree_skb(skb); + kfree_skb_reason(skb, SKB_DROP_REASON_UDP_CSUM); /* starting over for a new packet, but check if we need to yield */ cond_resched();
Eric Dumazet wrote: > On Thu, Mar 6, 2025 at 8:57 PM Willem de Bruijn > <willemdebruijn.kernel@gmail.com> wrote: > > > > Eric Dumazet wrote: > > > On Thu, Mar 6, 2025 at 8:01 PM Willem de Bruijn > > > <willemdebruijn.kernel@gmail.com> wrote: > > > > > > > > Eric Dumazet wrote: > > > > > Use SKB_DROP_REASON_UDP_CSUM in __first_packet_length() > > > > > and udp_read_skb() when dropping a packet because of > > > > > a wrong UDP checksum. > > > > > > > > > > Signed-off-by: Eric Dumazet <edumazet@google.com> > > > > > --- > > > > > net/ipv4/udp.c | 4 ++-- > > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > > > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c > > > > > index 17c7736d8349433ad2d4cbcc9414b2f8112610af..39c3adf333b5f02ca53f768c918c75f2fc7f93ac 100644 > > > > > --- a/net/ipv4/udp.c > > > > > +++ b/net/ipv4/udp.c > > > > > @@ -1848,7 +1848,7 @@ static struct sk_buff *__first_packet_length(struct sock *sk, > > > > > atomic_inc(&sk->sk_drops); > > > > > __skb_unlink(skb, rcvq); > > > > > *total += skb->truesize; > > > > > - kfree_skb(skb); > > > > > + kfree_skb_reason(skb, SKB_DROP_REASON_UDP_CSUM); > > > > > } else { > > > > > udp_skb_csum_unnecessary_set(skb); > > > > > break; > > > > > @@ -2002,7 +2002,7 @@ int udp_read_skb(struct sock *sk, skb_read_actor_t recv_actor) > > > > > __UDP_INC_STATS(net, UDP_MIB_CSUMERRORS, is_udplite); > > > > > __UDP_INC_STATS(net, UDP_MIB_INERRORS, is_udplite); > > > > > atomic_inc(&sk->sk_drops); > > > > > - kfree_skb(skb); > > > > > + kfree_skb_reason(skb, SKB_DROP_REASON_UDP_CSUM); > > > > > goto try_again; > > > > > } > > > > > > > > From a quick search for UDP_MIB_CSUMERRORS, one more case with regular > > > > kfree_skb: > > > > > > > > csum_copy_err: > > > > if (!__sk_queue_drop_skb(sk, &udp_sk(sk)->reader_queue, skb, flags, > > > > udp_skb_destructor)) { > > > > UDP_INC_STATS(sock_net(sk), UDP_MIB_CSUMERRORS, is_udplite); > > > > UDP_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite); > > > > } > > > > kfree_skb(skb); > > > > > > Right, I was unsure because of the conditional SNMP updates. > > > > That seems to only suppress the update if peeking and the skb was > > already dequeued (ENOENT). > > > > Frankly, we probably never intended to increment this counter when > > peeking, as it is intended to be a per-datagram counter, not a > > per-recvmsg counter. > > > > I think ever erring on the side of extra increments in the unlikely > > case of ENOENT is fine. > > > > Cleaner is perhaps to have a kfree_skb_reason inside that branch and > > a consume_skb in the else. > > I think it should be a kfree_skb_reason(skb, SKB_DROP_REASON_UDP_CSUM), > because only one real 'free' will happen, if two or more threads were > using MSG_PEEK for this packet. Oh right. > Using a consume_skb() would be racy, because if the skb refcount > reaches 0, a wrong event would be generated. > > I will squash in v2 : > > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c > index 39c3adf333b5f02ca53f768c918c75f2fc7f93ac..d0bffcfa56d8deb14f38cc48a4a2b1d899ad6af4 > 100644 > --- a/net/ipv4/udp.c > +++ b/net/ipv4/udp.c > @@ -2117,7 +2117,7 @@ int udp_recvmsg(struct sock *sk, struct msghdr > *msg, size_t len, int flags, > UDP_INC_STATS(sock_net(sk), UDP_MIB_CSUMERRORS, is_udplite); > UDP_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite); > } > - kfree_skb(skb); > + kfree_skb_reason(skb, SKB_DROP_REASON_UDP_CSUM); > > /* starting over for a new packet, but check if we need to yield */ > cond_resched(); > diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c > index 3a0d6c5a8286b1685e8a1dec50365fe392ab9a87..024458ef163c9e24dfb37aea2690b2030f6a0fbc > 100644 > --- a/net/ipv6/udp.c > +++ b/net/ipv6/udp.c > @@ -586,7 +586,7 @@ int udpv6_recvmsg(struct sock *sk, struct msghdr > *msg, size_t len, > SNMP_INC_STATS(mib, UDP_MIB_CSUMERRORS); > SNMP_INC_STATS(mib, UDP_MIB_INERRORS); > } > - kfree_skb(skb); > + kfree_skb_reason(skb, SKB_DROP_REASON_UDP_CSUM); > > /* starting over for a new packet, but check if we need to yield */ > cond_resched(); Sounds good. Thanks Eric.
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 17c7736d8349433ad2d4cbcc9414b2f8112610af..39c3adf333b5f02ca53f768c918c75f2fc7f93ac 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1848,7 +1848,7 @@ static struct sk_buff *__first_packet_length(struct sock *sk, atomic_inc(&sk->sk_drops); __skb_unlink(skb, rcvq); *total += skb->truesize; - kfree_skb(skb); + kfree_skb_reason(skb, SKB_DROP_REASON_UDP_CSUM); } else { udp_skb_csum_unnecessary_set(skb); break; @@ -2002,7 +2002,7 @@ int udp_read_skb(struct sock *sk, skb_read_actor_t recv_actor) __UDP_INC_STATS(net, UDP_MIB_CSUMERRORS, is_udplite); __UDP_INC_STATS(net, UDP_MIB_INERRORS, is_udplite); atomic_inc(&sk->sk_drops); - kfree_skb(skb); + kfree_skb_reason(skb, SKB_DROP_REASON_UDP_CSUM); goto try_again; }
Use SKB_DROP_REASON_UDP_CSUM in __first_packet_length() and udp_read_skb() when dropping a packet because of a wrong UDP checksum. Signed-off-by: Eric Dumazet <edumazet@google.com> --- net/ipv4/udp.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)