Message ID | 1649847244-5738-1-git-send-email-yangpc@wangsu.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net-next] tcp: ensure to use the most recently sent skb when filling the rate sample | expand |
On Wed, Apr 13, 2022 at 6:54 AM Pengcheng Yang <yangpc@wangsu.com> wrote: > > If an ACK (s)acks multiple skbs, we favor the information > from the most recently sent skb by choosing the skb with > the highest prior_delivered count. > But prior_delivered may be equal, because tp->delivered only > changes when receiving, which requires further comparison of > skb timestamp. > > Signed-off-by: Pengcheng Yang <yangpc@wangsu.com> > --- > net/ipv4/tcp_rate.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/net/ipv4/tcp_rate.c b/net/ipv4/tcp_rate.c > index 617b818..ad893ad 100644 > --- a/net/ipv4/tcp_rate.c > +++ b/net/ipv4/tcp_rate.c > @@ -86,7 +86,9 @@ void tcp_rate_skb_delivered(struct sock *sk, struct sk_buff *skb, > return; > > if (!rs->prior_delivered || > - after(scb->tx.delivered, rs->prior_delivered)) { > + after(scb->tx.delivered, rs->prior_delivered) || > + (scb->tx.delivered == rs->prior_delivered && > + tcp_skb_timestamp_us(skb) > tp->first_tx_mstamp)) { > rs->prior_delivered_ce = scb->tx.delivered_ce; > rs->prior_delivered = scb->tx.delivered; > rs->prior_mstamp = scb->tx.delivered_mstamp; > -- Thank you for posting this patch! I agree there is a bug there, and your patch is an improvement. However, I think this patch is not a complete solution, since it does not handle the case where there are multiple skbs with the tcp_skb_timestamp_us() (which can happen if a outgoing buffered TSO/GSO skb is later split into multiple skbs with the same timestamp). RACK has to deal with the same question "which skb was sent first?", and already has a solution in tcp_rack_sent_after(). I suggest we share code between RACK and tcp_rate_skb_delivered() to make this check. This might involve making a copy of tcp_rack_sent_after() in include/net/tcp.h, naming the .h copy to tcp_skb_sent_after(), and reworking this logic to save and use the sequence number and timestamp so that it can use the new tcp_skb_sent_after() helper. After this fix propagates to net-next we could later then change RACK to use the new tcp_skb_sent_after() function, so we have a single helper used in two places. If you want to post a version of this patch that uses some approach like that, IMHO that would be welcome. If you do not have cycles, I am happy to post one when I get a moment. thanks, neal
On Fri, Apr 15, 2022 at 12:12 AM Neal Cardwell <ncardwell@google.com> wrote: > > On Wed, Apr 13, 2022 at 6:54 AM Pengcheng Yang <yangpc@wangsu.com> wrote: > > > > If an ACK (s)acks multiple skbs, we favor the information > > from the most recently sent skb by choosing the skb with > > the highest prior_delivered count. > > But prior_delivered may be equal, because tp->delivered only > > changes when receiving, which requires further comparison of > > skb timestamp. > > > > Signed-off-by: Pengcheng Yang <yangpc@wangsu.com> > > Thank you for posting this patch! I agree there is a bug there, and > your patch is an improvement. However, I think this patch is not a > complete solution, since it does not handle the case where there are > multiple skbs with the tcp_skb_timestamp_us() (which can happen if a > outgoing buffered TSO/GSO skb is later split into multiple skbs with > the same timestamp). > My initial thought was that this case would not affect the correctness of rate_sample, since the timestamp of these skbs are the same. So I am confused whether we have to find the *real* most recently (with the highest seq) skb (at the cost of a little extra cost)? > RACK has to deal with the same question "which skb was sent first?", > and already has a solution in tcp_rack_sent_after(). I suggest we > share code between RACK and tcp_rate_skb_delivered() to make this > check. This might involve making a copy of tcp_rack_sent_after() in > include/net/tcp.h, naming the .h copy to tcp_skb_sent_after(), and > reworking this logic to save and use the sequence number and timestamp > so that it can use the new tcp_skb_sent_after() helper. After this fix > propagates to net-next we could later then change RACK to use the new > tcp_skb_sent_after() function, so we have a single helper used in two > places. > Ok. I will send the V2 later according to your suggestion. Thanks neal. > If you want to post a version of this patch that uses some approach > like that, IMHO that would be welcome. If you do not have cycles, I am > happy to post one when I get a moment. > > thanks, > neal
diff --git a/net/ipv4/tcp_rate.c b/net/ipv4/tcp_rate.c index 617b818..ad893ad 100644 --- a/net/ipv4/tcp_rate.c +++ b/net/ipv4/tcp_rate.c @@ -86,7 +86,9 @@ void tcp_rate_skb_delivered(struct sock *sk, struct sk_buff *skb, return; if (!rs->prior_delivered || - after(scb->tx.delivered, rs->prior_delivered)) { + after(scb->tx.delivered, rs->prior_delivered) || + (scb->tx.delivered == rs->prior_delivered && + tcp_skb_timestamp_us(skb) > tp->first_tx_mstamp)) { rs->prior_delivered_ce = scb->tx.delivered_ce; rs->prior_delivered = scb->tx.delivered; rs->prior_mstamp = scb->tx.delivered_mstamp;
If an ACK (s)acks multiple skbs, we favor the information from the most recently sent skb by choosing the skb with the highest prior_delivered count. But prior_delivered may be equal, because tp->delivered only changes when receiving, which requires further comparison of skb timestamp. Signed-off-by: Pengcheng Yang <yangpc@wangsu.com> --- net/ipv4/tcp_rate.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)