Message ID | 20250218105824.34511-1-wanghai38@huawei.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] tcp: Fix error ts_recent time during three-way handshake | expand |
Hi Wang, On Tue, Feb 18, 2025 at 7:02 PM Wang Hai <wanghai38@huawei.com> wrote: > > If two ack packets from a connection enter tcp_check_req at the same time > through different cpu, it may happen that req->ts_recent is updated with > with a more recent time and the skb with an older time creates a new sock, > which will cause the tcp_validate_incoming check to fail. > > cpu1 cpu2 > tcp_check_req > tcp_check_req > req->ts_recent = tmp_opt.rcv_tsval = t1 > req->ts_recent = tmp_opt.rcv_tsval = t2 > > newsk->ts_recent = req->ts_recent = t2 // t1 < t2 > tcp_child_process > tcp_rcv_state_process > tcp_validate_incoming > tcp_paws_check > if ((s32)(rx_opt->ts_recent - rx_opt->rcv_tsval) <= paws_win) // failed > > In tcp_check_req, restore ts_recent to this skb's to fix this bug. It's known that it's possible to receive two packets in two different cpus like this case nearly at the same time. I'm curious if the socket was running on the loopback? Even if the above check that you commented in tcp_paws_check() fails, how about the rest of the checks in tcp_validate_incoming()? In your test, I doubt if really that check failed which finally caused the skb from CPU2 to be discarded. Let me put it this way, is the paws_win check the root cause? What is the drop reason for tcp_validate_incoming()? Thanks, Jason > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > Signed-off-by: Wang Hai <wanghai38@huawei.com> > --- > net/ipv4/tcp_minisocks.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c > index b089b08e9617..0208455f9eb8 100644 > --- a/net/ipv4/tcp_minisocks.c > +++ b/net/ipv4/tcp_minisocks.c > @@ -878,6 +878,10 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, > sock_rps_save_rxhash(child, skb); > tcp_synack_rtt_meas(child, req); > *req_stolen = !own_req; > + if (own_req && tcp_sk(child)->rx_opt.tstamp_ok && > + unlikely(tcp_sk(child)->rx_opt.ts_recent != tmp_opt.rcv_tsval)) > + tcp_sk(child)->rx_opt.ts_recent = tmp_opt.rcv_tsval; > + > return inet_csk_complete_hashdance(sk, child, req, own_req); > > listen_overflow: > -- > 2.17.1 > >
On Tue, Feb 18, 2025 at 12:00 PM Wang Hai <wanghai38@huawei.com> wrote: > > If two ack packets from a connection enter tcp_check_req at the same time > through different cpu, it may happen that req->ts_recent is updated with > with a more recent time and the skb with an older time creates a new sock, > which will cause the tcp_validate_incoming check to fail. > > cpu1 cpu2 > tcp_check_req > tcp_check_req > req->ts_recent = tmp_opt.rcv_tsval = t1 > req->ts_recent = tmp_opt.rcv_tsval = t2 > > newsk->ts_recent = req->ts_recent = t2 // t1 < t2 > tcp_child_process > tcp_rcv_state_process > tcp_validate_incoming > tcp_paws_check > if ((s32)(rx_opt->ts_recent - rx_opt->rcv_tsval) <= paws_win) // failed > > In tcp_check_req, restore ts_recent to this skb's to fix this bug. > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > Signed-off-by: Wang Hai <wanghai38@huawei.com> > --- > net/ipv4/tcp_minisocks.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c > index b089b08e9617..0208455f9eb8 100644 > --- a/net/ipv4/tcp_minisocks.c > +++ b/net/ipv4/tcp_minisocks.c > @@ -878,6 +878,10 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, > sock_rps_save_rxhash(child, skb); > tcp_synack_rtt_meas(child, req); > *req_stolen = !own_req; > + if (own_req && tcp_sk(child)->rx_opt.tstamp_ok && > + unlikely(tcp_sk(child)->rx_opt.ts_recent != tmp_opt.rcv_tsval)) > + tcp_sk(child)->rx_opt.ts_recent = tmp_opt.rcv_tsval; > + > return inet_csk_complete_hashdance(sk, child, req, own_req); Have you seen the comment at line 818 ? /* TODO: We probably should defer ts_recent change once * we take ownership of @req. */ Plan was clear and explained. Why implement something else (and buggy) ?
On 2025/2/18 20:02, Jason Xing wrote: > Hi Wang, > > On Tue, Feb 18, 2025 at 7:02 PM Wang Hai <wanghai38@huawei.com> wrote: >> >> If two ack packets from a connection enter tcp_check_req at the same time >> through different cpu, it may happen that req->ts_recent is updated with >> with a more recent time and the skb with an older time creates a new sock, >> which will cause the tcp_validate_incoming check to fail. >> >> cpu1 cpu2 >> tcp_check_req >> tcp_check_req >> req->ts_recent = tmp_opt.rcv_tsval = t1 >> req->ts_recent = tmp_opt.rcv_tsval = t2 >> >> newsk->ts_recent = req->ts_recent = t2 // t1 < t2 >> tcp_child_process >> tcp_rcv_state_process >> tcp_validate_incoming >> tcp_paws_check >> if ((s32)(rx_opt->ts_recent - rx_opt->rcv_tsval) <= paws_win) // failed >> >> In tcp_check_req, restore ts_recent to this skb's to fix this bug. > > It's known that it's possible to receive two packets in two different > cpus like this case nearly at the same time. I'm curious if the socket > was running on the loopback? > Hi Jason, Yeah, it's running in loopback. > Even if the above check that you commented in tcp_paws_check() fails, > how about the rest of the checks in tcp_validate_incoming()? The skb from cpu1 is a valid skb, so it should have passed the tcp_validate_incoming check, but the current concurrency issue prevented it from passing the check. 5951 static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb, 5952 const struct tcphdr *th, int syn_inerr) 5953 { 5954 struct tcp_sock *tp = tcp_sk(sk); 5955 SKB_DR(reason); 5956 5957 /* RFC1323: H1. Apply PAWS check first. */ 5958 if (!tcp_fast_parse_options(sock_net(sk), skb, th, tp) || 5959 !tp->rx_opt.saw_tstamp || 5960 tcp_paws_check(&tp->rx_opt, TCP_PAWS_WINDOW)) // failed 5961 goto step1; 5962 5963 reason = tcp_disordered_ack_check(sk, skb); 5964 if (!reason) // failed 5965 goto step1; 5966 /* Reset is accepted even if it did not pass PAWS. */ 5967 if (th->rst) // failed 5968 goto step1; 5969 if (unlikely(th->syn)) // failed 5970 goto syn_challenge; 5971 5972 /* Old ACK are common, increment PAWS_OLD_ACK 5973 * and do not send a dupack. 5974 */ 5975 if (reason == SKB_DROP_REASON_TCP_RFC7323_PAWS_ACK) { 5976 NET_INC_STATS(sock_net(sk), LINUX_MIB_PAWS_OLD_ACK); 5977 goto discard; 5978 } 5979 NET_INC_STATS(sock_net(sk), LINUX_MIB_PAWSESTABREJECTED); 5980 if (!tcp_oow_rate_limited(sock_net(sk), skb, 5981 LINUX_MIB_TCPACKSKIPPEDPAWS, 5982 &tp->last_oow_ack_time)) 5983 tcp_send_dupack(sk, skb); 5984 goto discard; // Drop the skb from here. > test, I doubt if really that check failed which finally caused the skb > from CPU2 to be discarded. Let me put it this way, is the paws_win > check the root cause? What is the drop reason for > tcp_validate_incoming()? > The values of ts_recent and rcv_tsval are compared in tcp_paws_check, where ts_recent comes from req->ts_recent = t2 and rcv_tsval is the current cpu1 skb time. ts_recent - rcv_tsval may be greater than 1 and here will fail here. Tcp_paws_check fails causing tcp_validate_incoming to fail too. > Thanks, > Jason > >> >> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") >> Signed-off-by: Wang Hai <wanghai38@huawei.com> >> --- >> net/ipv4/tcp_minisocks.c | 4 ++++ >> 1 file changed, 4 insertions(+) >> >> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c >> index b089b08e9617..0208455f9eb8 100644 >> --- a/net/ipv4/tcp_minisocks.c >> +++ b/net/ipv4/tcp_minisocks.c >> @@ -878,6 +878,10 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, >> sock_rps_save_rxhash(child, skb); >> tcp_synack_rtt_meas(child, req); >> *req_stolen = !own_req; >> + if (own_req && tcp_sk(child)->rx_opt.tstamp_ok && >> + unlikely(tcp_sk(child)->rx_opt.ts_recent != tmp_opt.rcv_tsval)) >> + tcp_sk(child)->rx_opt.ts_recent = tmp_opt.rcv_tsval; >> + >> return inet_csk_complete_hashdance(sk, child, req, own_req); >> >> listen_overflow: >> -- >> 2.17.1 >> >>
On 2025/2/18 21:35, Eric Dumazet wrote: > On Tue, Feb 18, 2025 at 12:00 PM Wang Hai <wanghai38@huawei.com> wrote: >> >> If two ack packets from a connection enter tcp_check_req at the same time >> through different cpu, it may happen that req->ts_recent is updated with >> with a more recent time and the skb with an older time creates a new sock, >> which will cause the tcp_validate_incoming check to fail. >> >> cpu1 cpu2 >> tcp_check_req >> tcp_check_req >> req->ts_recent = tmp_opt.rcv_tsval = t1 >> req->ts_recent = tmp_opt.rcv_tsval = t2 >> >> newsk->ts_recent = req->ts_recent = t2 // t1 < t2 >> tcp_child_process >> tcp_rcv_state_process >> tcp_validate_incoming >> tcp_paws_check >> if ((s32)(rx_opt->ts_recent - rx_opt->rcv_tsval) <= paws_win) // failed >> >> In tcp_check_req, restore ts_recent to this skb's to fix this bug. >> >> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") >> Signed-off-by: Wang Hai <wanghai38@huawei.com> >> --- >> net/ipv4/tcp_minisocks.c | 4 ++++ >> 1 file changed, 4 insertions(+) >> >> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c >> index b089b08e9617..0208455f9eb8 100644 >> --- a/net/ipv4/tcp_minisocks.c >> +++ b/net/ipv4/tcp_minisocks.c >> @@ -878,6 +878,10 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, >> sock_rps_save_rxhash(child, skb); >> tcp_synack_rtt_meas(child, req); >> *req_stolen = !own_req; >> + if (own_req && tcp_sk(child)->rx_opt.tstamp_ok && >> + unlikely(tcp_sk(child)->rx_opt.ts_recent != tmp_opt.rcv_tsval)) >> + tcp_sk(child)->rx_opt.ts_recent = tmp_opt.rcv_tsval; >> + >> return inet_csk_complete_hashdance(sk, child, req, own_req); > > Have you seen the comment at line 818 ? > > /* TODO: We probably should defer ts_recent change once > * we take ownership of @req. > */ > > Plan was clear and explained. Why implement something else (and buggy) ? > Hi Eric, Currently we have a real problem, so we want to solve it. This bug causes the upper layers to be unable to be notified to call accept after the successful three-way handshake. Skb from cpu1 that fails at tcp_paws_check (which it could have succeeded) will not be able to enter the TCP_ESTABLISHED state, and therefore parent->sk_data_ready(parent) will not be triggered, and skb from cpu2 can complete the three-way handshake, but there is also no way to call parent->sk_data_ready(parent) to notify the upper layer, which will result in the upper layer not being able to sense and call accept to obtain the nsk. cpu1 cpu2 tcp_check_req tcp_check_req req->ts_recent = tmp_opt.rcv_tsval = t1 req->ts_recent=tmp_opt.rcv_tsval= t2 newsk->ts_recent = req->ts_recent = t2 // t1 < t2 tcp_child_process tcp_rcv_state_process tcp_validate_incoming tcp_paws_check // failed parent->sk_data_ready(parent); // will not be called tcp_v4_do_rcv tcp_rcv_state_process // Complete the three-way handshake // missing parent->sk_data_ready(parent);
On 2025/2/18 21:35, Eric Dumazet wrote: > On Tue, Feb 18, 2025 at 12:00 PM Wang Hai <wanghai38@huawei.com> wrote: >> >> If two ack packets from a connection enter tcp_check_req at the same time >> through different cpu, it may happen that req->ts_recent is updated with >> with a more recent time and the skb with an older time creates a new sock, >> which will cause the tcp_validate_incoming check to fail. >> >> cpu1 cpu2 >> tcp_check_req >> tcp_check_req >> req->ts_recent = tmp_opt.rcv_tsval = t1 >> req->ts_recent = tmp_opt.rcv_tsval = t2 >> >> newsk->ts_recent = req->ts_recent = t2 // t1 < t2 >> tcp_child_process >> tcp_rcv_state_process >> tcp_validate_incoming >> tcp_paws_check >> if ((s32)(rx_opt->ts_recent - rx_opt->rcv_tsval) <= paws_win) // failed >> >> In tcp_check_req, restore ts_recent to this skb's to fix this bug. >> >> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") >> Signed-off-by: Wang Hai <wanghai38@huawei.com> >> --- >> net/ipv4/tcp_minisocks.c | 4 ++++ >> 1 file changed, 4 insertions(+) >> >> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c >> index b089b08e9617..0208455f9eb8 100644 >> --- a/net/ipv4/tcp_minisocks.c >> +++ b/net/ipv4/tcp_minisocks.c >> @@ -878,6 +878,10 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, >> sock_rps_save_rxhash(child, skb); >> tcp_synack_rtt_meas(child, req); >> *req_stolen = !own_req; >> + if (own_req && tcp_sk(child)->rx_opt.tstamp_ok && >> + unlikely(tcp_sk(child)->rx_opt.ts_recent != tmp_opt.rcv_tsval)) >> + tcp_sk(child)->rx_opt.ts_recent = tmp_opt.rcv_tsval; >> + >> return inet_csk_complete_hashdance(sk, child, req, own_req); > > Have you seen the comment at line 818 ? > > /* TODO: We probably should defer ts_recent change once > * we take ownership of @req. > */ > > Plan was clear and explained. Why implement something else (and buggy) ? Hi Eric, According to the plan, can we fix it like this? diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index b089b08e9617..1210d4967b94 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -814,13 +814,6 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, } /* In sequence, PAWS is OK. */ - - /* TODO: We probably should defer ts_recent change once - * we take ownership of @req. - */ - if (tmp_opt.saw_tstamp && !after(TCP_SKB_CB(skb)->seq, tcp_rsk(req)->rcv_nxt)) - WRITE_ONCE(req->ts_recent, tmp_opt.rcv_tsval); - if (TCP_SKB_CB(skb)->seq == tcp_rsk(req)->rcv_isn) { /* Truncate SYN, it is out of window starting at tcp_rsk(req)->rcv_isn + 1. */ @@ -878,6 +871,9 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, sock_rps_save_rxhash(child, skb); tcp_synack_rtt_meas(child, req); *req_stolen = !own_req; + if (own_req && tmp_opt.saw_tstamp && !after(TCP_SKB_CB(skb)- seq, tcp_rsk(req)->rcv_nxt)) + tcp_sk(child)->rx_opt.ts_recent = tmp_opt.rcv_tsval; + return inet_csk_complete_hashdance(sk, child, req, own_req); listen_overflow: >
On Wed, Feb 19, 2025 at 10:16 AM Wang Hai <wanghai38@huawei.com> wrote: > > > > On 2025/2/18 21:35, Eric Dumazet wrote: > > On Tue, Feb 18, 2025 at 12:00 PM Wang Hai <wanghai38@huawei.com> wrote: > >> > >> If two ack packets from a connection enter tcp_check_req at the same time > >> through different cpu, it may happen that req->ts_recent is updated with > >> with a more recent time and the skb with an older time creates a new sock, > >> which will cause the tcp_validate_incoming check to fail. > >> > >> cpu1 cpu2 > >> tcp_check_req > >> tcp_check_req > >> req->ts_recent = tmp_opt.rcv_tsval = t1 > >> req->ts_recent = tmp_opt.rcv_tsval = t2 > >> > >> newsk->ts_recent = req->ts_recent = t2 // t1 < t2 > >> tcp_child_process > >> tcp_rcv_state_process > >> tcp_validate_incoming > >> tcp_paws_check > >> if ((s32)(rx_opt->ts_recent - rx_opt->rcv_tsval) <= paws_win) // failed > >> > >> In tcp_check_req, restore ts_recent to this skb's to fix this bug. > >> > >> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > >> Signed-off-by: Wang Hai <wanghai38@huawei.com> > >> --- > >> net/ipv4/tcp_minisocks.c | 4 ++++ > >> 1 file changed, 4 insertions(+) > >> > >> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c > >> index b089b08e9617..0208455f9eb8 100644 > >> --- a/net/ipv4/tcp_minisocks.c > >> +++ b/net/ipv4/tcp_minisocks.c > >> @@ -878,6 +878,10 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, > >> sock_rps_save_rxhash(child, skb); > >> tcp_synack_rtt_meas(child, req); > >> *req_stolen = !own_req; > >> + if (own_req && tcp_sk(child)->rx_opt.tstamp_ok && > >> + unlikely(tcp_sk(child)->rx_opt.ts_recent != tmp_opt.rcv_tsval)) > >> + tcp_sk(child)->rx_opt.ts_recent = tmp_opt.rcv_tsval; > >> + > >> return inet_csk_complete_hashdance(sk, child, req, own_req); > > > > Have you seen the comment at line 818 ? > > > > /* TODO: We probably should defer ts_recent change once > > * we take ownership of @req. > > */ > > > > Plan was clear and explained. Why implement something else (and buggy) ? > > > Hi Eric, > > Currently we have a real problem, so we want to solve it. This bug > causes the upper layers to be unable to be notified to call accept after > the successful three-way handshake. > > Skb from cpu1 that fails at tcp_paws_check (which it could have > succeeded) will not be able to enter the TCP_ESTABLISHED state, and > therefore parent->sk_data_ready(parent) will not be triggered, and skb > from cpu2 can complete the three-way handshake, but there is also no way > to call parent->sk_data_ready(parent) to notify the upper layer, which > will result > in the upper layer not being able to sense and call accept to obtain the > nsk. > > cpu1 cpu2 > tcp_check_req > tcp_check_req > req->ts_recent = tmp_opt.rcv_tsval = t1 > req->ts_recent=tmp_opt.rcv_tsval= t2 > > newsk->ts_recent = req->ts_recent = t2 // t1 < t2 > tcp_child_process > tcp_rcv_state_process > tcp_validate_incoming > tcp_paws_check // failed > parent->sk_data_ready(parent); // will not be called > tcp_v4_do_rcv > tcp_rcv_state_process // Complete the three-way handshake > // missing parent->sk_data_ready(parent); IIUC, the ack received from cpu1 triggered calling inet_csk_complete_hashdance() so its state transited from TCP_NEW_SYN_RECV to TCP_SYN_RECV, right? If so, the reason why not call sk_data_ready() if the skb entered into tcp_child_process() is that its state failed to transit to TCP_ESTABLISHED? Here is another question. How did the skb on the right side enter into tcp_v4_do_rcv() after entering tcp_check_req() if the state of sk which the skb belongs to is TCP_NEW_SYN_RECV? Could you elaborate more on this point? Thanks, Jason
On 2025/2/19 11:31, Jason Xing wrote: > On Wed, Feb 19, 2025 at 10:16 AM Wang Hai <wanghai38@huawei.com> wrote: >> >> >> >> On 2025/2/18 21:35, Eric Dumazet wrote: >>> On Tue, Feb 18, 2025 at 12:00 PM Wang Hai <wanghai38@huawei.com> wrote: >>>> >>>> If two ack packets from a connection enter tcp_check_req at the same time >>>> through different cpu, it may happen that req->ts_recent is updated with >>>> with a more recent time and the skb with an older time creates a new sock, >>>> which will cause the tcp_validate_incoming check to fail. >>>> >>>> cpu1 cpu2 >>>> tcp_check_req >>>> tcp_check_req >>>> req->ts_recent = tmp_opt.rcv_tsval = t1 >>>> req->ts_recent = tmp_opt.rcv_tsval = t2 >>>> >>>> newsk->ts_recent = req->ts_recent = t2 // t1 < t2 >>>> tcp_child_process >>>> tcp_rcv_state_process >>>> tcp_validate_incoming >>>> tcp_paws_check >>>> if ((s32)(rx_opt->ts_recent - rx_opt->rcv_tsval) <= paws_win) // failed >>>> >>>> In tcp_check_req, restore ts_recent to this skb's to fix this bug. >>>> >>>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") >>>> Signed-off-by: Wang Hai <wanghai38@huawei.com> >>>> --- >>>> net/ipv4/tcp_minisocks.c | 4 ++++ >>>> 1 file changed, 4 insertions(+) >>>> >>>> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c >>>> index b089b08e9617..0208455f9eb8 100644 >>>> --- a/net/ipv4/tcp_minisocks.c >>>> +++ b/net/ipv4/tcp_minisocks.c >>>> @@ -878,6 +878,10 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, >>>> sock_rps_save_rxhash(child, skb); >>>> tcp_synack_rtt_meas(child, req); >>>> *req_stolen = !own_req; >>>> + if (own_req && tcp_sk(child)->rx_opt.tstamp_ok && >>>> + unlikely(tcp_sk(child)->rx_opt.ts_recent != tmp_opt.rcv_tsval)) >>>> + tcp_sk(child)->rx_opt.ts_recent = tmp_opt.rcv_tsval; >>>> + >>>> return inet_csk_complete_hashdance(sk, child, req, own_req); >>> >>> Have you seen the comment at line 818 ? >>> >>> /* TODO: We probably should defer ts_recent change once >>> * we take ownership of @req. >>> */ >>> >>> Plan was clear and explained. Why implement something else (and buggy) ? >>> >> Hi Eric, >> >> Currently we have a real problem, so we want to solve it. This bug >> causes the upper layers to be unable to be notified to call accept after >> the successful three-way handshake. >> >> Skb from cpu1 that fails at tcp_paws_check (which it could have >> succeeded) will not be able to enter the TCP_ESTABLISHED state, and >> therefore parent->sk_data_ready(parent) will not be triggered, and skb >> from cpu2 can complete the three-way handshake, but there is also no way >> to call parent->sk_data_ready(parent) to notify the upper layer, which >> will result >> in the upper layer not being able to sense and call accept to obtain the >> nsk. >> >> cpu1 cpu2 >> tcp_check_req >> tcp_check_req >> req->ts_recent = tmp_opt.rcv_tsval = t1 >> req->ts_recent=tmp_opt.rcv_tsval= t2 >> >> newsk->ts_recent = req->ts_recent = t2 // t1 < t2 >> tcp_child_process >> tcp_rcv_state_process >> tcp_validate_incoming >> tcp_paws_check // failed >> parent->sk_data_ready(parent); // will not be called >> tcp_v4_do_rcv >> tcp_rcv_state_process // Complete the three-way handshake >> // missing parent->sk_data_ready(parent); > > IIUC, the ack received from cpu1 triggered calling > inet_csk_complete_hashdance() so its state transited from > TCP_NEW_SYN_RECV to TCP_SYN_RECV, right? If so, the reason why not > call sk_data_ready() if the skb entered into tcp_child_process() is > that its state failed to transit to TCP_ESTABLISHED? > Yes, because it didn't switch to TCP_ESTABLISHED > Here is another question. How did the skb on the right side enter into > tcp_v4_do_rcv() after entering tcp_check_req() if the state of sk > which the skb belongs to is TCP_NEW_SYN_RECV? Could you elaborate more > on this point? Since cpu1 successfully created the child sock, cpu2 will return null in tcp_check_req and req_stolen is set to true, so that it will subsequently go to 'goto lookup' to re-process the packet, and at this point, sk->sk_state is already in TCP_SYN_RECV state, and then then tcp_v4_do_rcv is called. > > Thanks, > Jason
On Wed, Feb 19, 2025 at 9:11 PM Wang Hai <wanghai38@huawei.com> wrote: > > > > On 2025/2/19 11:31, Jason Xing wrote: > > On Wed, Feb 19, 2025 at 10:16 AM Wang Hai <wanghai38@huawei.com> wrote: > >> > >> > >> > >> On 2025/2/18 21:35, Eric Dumazet wrote: > >>> On Tue, Feb 18, 2025 at 12:00 PM Wang Hai <wanghai38@huawei.com> wrote: > >>>> > >>>> If two ack packets from a connection enter tcp_check_req at the same time > >>>> through different cpu, it may happen that req->ts_recent is updated with > >>>> with a more recent time and the skb with an older time creates a new sock, > >>>> which will cause the tcp_validate_incoming check to fail. > >>>> > >>>> cpu1 cpu2 > >>>> tcp_check_req > >>>> tcp_check_req > >>>> req->ts_recent = tmp_opt.rcv_tsval = t1 > >>>> req->ts_recent = tmp_opt.rcv_tsval = t2 > >>>> > >>>> newsk->ts_recent = req->ts_recent = t2 // t1 < t2 > >>>> tcp_child_process > >>>> tcp_rcv_state_process > >>>> tcp_validate_incoming > >>>> tcp_paws_check > >>>> if ((s32)(rx_opt->ts_recent - rx_opt->rcv_tsval) <= paws_win) // failed > >>>> > >>>> In tcp_check_req, restore ts_recent to this skb's to fix this bug. > >>>> > >>>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > >>>> Signed-off-by: Wang Hai <wanghai38@huawei.com> > >>>> --- > >>>> net/ipv4/tcp_minisocks.c | 4 ++++ > >>>> 1 file changed, 4 insertions(+) > >>>> > >>>> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c > >>>> index b089b08e9617..0208455f9eb8 100644 > >>>> --- a/net/ipv4/tcp_minisocks.c > >>>> +++ b/net/ipv4/tcp_minisocks.c > >>>> @@ -878,6 +878,10 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, > >>>> sock_rps_save_rxhash(child, skb); > >>>> tcp_synack_rtt_meas(child, req); > >>>> *req_stolen = !own_req; > >>>> + if (own_req && tcp_sk(child)->rx_opt.tstamp_ok && > >>>> + unlikely(tcp_sk(child)->rx_opt.ts_recent != tmp_opt.rcv_tsval)) > >>>> + tcp_sk(child)->rx_opt.ts_recent = tmp_opt.rcv_tsval; > >>>> + > >>>> return inet_csk_complete_hashdance(sk, child, req, own_req); > >>> > >>> Have you seen the comment at line 818 ? > >>> > >>> /* TODO: We probably should defer ts_recent change once > >>> * we take ownership of @req. > >>> */ > >>> > >>> Plan was clear and explained. Why implement something else (and buggy) ? > >>> > >> Hi Eric, > >> > >> Currently we have a real problem, so we want to solve it. This bug > >> causes the upper layers to be unable to be notified to call accept after > >> the successful three-way handshake. > >> > >> Skb from cpu1 that fails at tcp_paws_check (which it could have > >> succeeded) will not be able to enter the TCP_ESTABLISHED state, and > >> therefore parent->sk_data_ready(parent) will not be triggered, and skb > >> from cpu2 can complete the three-way handshake, but there is also no way > >> to call parent->sk_data_ready(parent) to notify the upper layer, which > >> will result > >> in the upper layer not being able to sense and call accept to obtain the > >> nsk. > >> > >> cpu1 cpu2 > >> tcp_check_req > >> tcp_check_req > >> req->ts_recent = tmp_opt.rcv_tsval = t1 > >> req->ts_recent=tmp_opt.rcv_tsval= t2 > >> > >> newsk->ts_recent = req->ts_recent = t2 // t1 < t2 > >> tcp_child_process > >> tcp_rcv_state_process > >> tcp_validate_incoming > >> tcp_paws_check // failed > >> parent->sk_data_ready(parent); // will not be called > >> tcp_v4_do_rcv > >> tcp_rcv_state_process // Complete the three-way handshake > >> // missing parent->sk_data_ready(parent); > > > > IIUC, the ack received from cpu1 triggered calling > > inet_csk_complete_hashdance() so its state transited from > > TCP_NEW_SYN_RECV to TCP_SYN_RECV, right? If so, the reason why not > > call sk_data_ready() if the skb entered into tcp_child_process() is > > that its state failed to transit to TCP_ESTABLISHED? > > > Yes, because it didn't switch to TCP_ESTABLISHED > > Here is another question. How did the skb on the right side enter into > > tcp_v4_do_rcv() after entering tcp_check_req() if the state of sk > > which the skb belongs to is TCP_NEW_SYN_RECV? Could you elaborate more > > on this point? > Since cpu1 successfully created the child sock, cpu2 will return > null in tcp_check_req and req_stolen is set to true, so that it will > subsequently go to 'goto lookup' to re-process the packet, and at > this point, sk->sk_state is already in TCP_SYN_RECV state, and then > then tcp_v4_do_rcv is called. Now I can see what happened there. Perhaps it would be good to update the commit message in the next iteration. Another key information I notice is that the second lookup process loses the chance to call sk_data_ready() for its parent socket. It's the one of the main reasons that cause your application to be unable to get notified. Taking a rough look at tcp_rcv_state_process(), I think it's not easy to acquire the parent socket there and then call sk_data_ready() without modifying more codes compared to the current solution. It's a different solution in theory. If your new approach (like your previous reply) works, the following commit[1] will be reverted/overwritten. commit e0f9759f530bf789e984961dce79f525b151ecf3 Author: Eric Dumazet <edumazet@google.com> Date: Tue Feb 13 06:14:12 2018 -0800 tcp: try to keep packet if SYN_RCV race is lost 배석진 reported that in some situations, packets for a given 5-tuple end up being processed by different CPUS. This involves RPS, and fragmentation. 배석진 is seeing packet drops when a SYN_RECV request socket is moved into ESTABLISH state. Other states are protected by socket lock. This is caused by a CPU losing the race, and simply not caring enough. Since this seems to occur frequently, we can do better and perform a second lookup. Note that all needed memory barriers are already in the existing code, thanks to the spin_lock()/spin_unlock() pair in inet_ehash_insert() and reqsk_put(). The second lookup must find the new socket, unless it has already been accepted and closed by another cpu. Note that the fragmentation could be avoided in the first place by use of a correct TCP MSS option in the SYN{ACK} packet, but this does not mean we can not be more robust. Please repost a V2 when you think it's ready, then TCP maintainers will handle it:) Thanks, Jason
On Wed, Feb 19, 2025 at 4:08 AM Wang Hai <wanghai38@huawei.com> wrote: > > Hi Eric, > > According to the plan, can we fix it like this? > > diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c > index b089b08e9617..1210d4967b94 100644 > --- a/net/ipv4/tcp_minisocks.c > +++ b/net/ipv4/tcp_minisocks.c > @@ -814,13 +814,6 @@ struct sock *tcp_check_req(struct sock *sk, struct > sk_buff *skb, > } > > /* In sequence, PAWS is OK. */ > - > - /* TODO: We probably should defer ts_recent change once > - * we take ownership of @req. > - */ > - if (tmp_opt.saw_tstamp && !after(TCP_SKB_CB(skb)->seq, > tcp_rsk(req)->rcv_nxt)) > - WRITE_ONCE(req->ts_recent, tmp_opt.rcv_tsval); > - > if (TCP_SKB_CB(skb)->seq == tcp_rsk(req)->rcv_isn) { > /* Truncate SYN, it is out of window starting > at tcp_rsk(req)->rcv_isn + 1. */ > @@ -878,6 +871,9 @@ struct sock *tcp_check_req(struct sock *sk, struct > sk_buff *skb, > sock_rps_save_rxhash(child, skb); > tcp_synack_rtt_meas(child, req); > *req_stolen = !own_req; > + if (own_req && tmp_opt.saw_tstamp && !after(TCP_SKB_CB(skb)- > seq, tcp_rsk(req)->rcv_nxt)) > + tcp_sk(child)->rx_opt.ts_recent = tmp_opt.rcv_tsval; > + > return inet_csk_complete_hashdance(sk, child, req, own_req); Yes, this was the plan ;)
On 2025/2/20 11:04, Jason Xing wrote: > On Wed, Feb 19, 2025 at 9:11 PM Wang Hai <wanghai38@huawei.com> wrote: >> >> >> >> On 2025/2/19 11:31, Jason Xing wrote: >>> On Wed, Feb 19, 2025 at 10:16 AM Wang Hai <wanghai38@huawei.com> wrote: >>>> >>>> >>>> >>>> On 2025/2/18 21:35, Eric Dumazet wrote: >>>>> On Tue, Feb 18, 2025 at 12:00 PM Wang Hai <wanghai38@huawei.com> wrote: >>>>>> >>>>>> If two ack packets from a connection enter tcp_check_req at the same time >>>>>> through different cpu, it may happen that req->ts_recent is updated with >>>>>> with a more recent time and the skb with an older time creates a new sock, >>>>>> which will cause the tcp_validate_incoming check to fail. >>>>>> >>>>>> cpu1 cpu2 >>>>>> tcp_check_req >>>>>> tcp_check_req >>>>>> req->ts_recent = tmp_opt.rcv_tsval = t1 >>>>>> req->ts_recent = tmp_opt.rcv_tsval = t2 >>>>>> >>>>>> newsk->ts_recent = req->ts_recent = t2 // t1 < t2 >>>>>> tcp_child_process >>>>>> tcp_rcv_state_process >>>>>> tcp_validate_incoming >>>>>> tcp_paws_check >>>>>> if ((s32)(rx_opt->ts_recent - rx_opt->rcv_tsval) <= paws_win) // failed >>>>>> >>>>>> In tcp_check_req, restore ts_recent to this skb's to fix this bug. >>>>>> >>>>>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") >>>>>> Signed-off-by: Wang Hai <wanghai38@huawei.com> >>>>>> --- >>>>>> net/ipv4/tcp_minisocks.c | 4 ++++ >>>>>> 1 file changed, 4 insertions(+) >>>>>> >>>>>> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c >>>>>> index b089b08e9617..0208455f9eb8 100644 >>>>>> --- a/net/ipv4/tcp_minisocks.c >>>>>> +++ b/net/ipv4/tcp_minisocks.c >>>>>> @@ -878,6 +878,10 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, >>>>>> sock_rps_save_rxhash(child, skb); >>>>>> tcp_synack_rtt_meas(child, req); >>>>>> *req_stolen = !own_req; >>>>>> + if (own_req && tcp_sk(child)->rx_opt.tstamp_ok && >>>>>> + unlikely(tcp_sk(child)->rx_opt.ts_recent != tmp_opt.rcv_tsval)) >>>>>> + tcp_sk(child)->rx_opt.ts_recent = tmp_opt.rcv_tsval; >>>>>> + >>>>>> return inet_csk_complete_hashdance(sk, child, req, own_req); >>>>> >>>>> Have you seen the comment at line 818 ? >>>>> >>>>> /* TODO: We probably should defer ts_recent change once >>>>> * we take ownership of @req. >>>>> */ >>>>> >>>>> Plan was clear and explained. Why implement something else (and buggy) ? >>>>> >>>> Hi Eric, >>>> >>>> Currently we have a real problem, so we want to solve it. This bug >>>> causes the upper layers to be unable to be notified to call accept after >>>> the successful three-way handshake. >>>> >>>> Skb from cpu1 that fails at tcp_paws_check (which it could have >>>> succeeded) will not be able to enter the TCP_ESTABLISHED state, and >>>> therefore parent->sk_data_ready(parent) will not be triggered, and skb >>>> from cpu2 can complete the three-way handshake, but there is also no way >>>> to call parent->sk_data_ready(parent) to notify the upper layer, which >>>> will result >>>> in the upper layer not being able to sense and call accept to obtain the >>>> nsk. >>>> >>>> cpu1 cpu2 >>>> tcp_check_req >>>> tcp_check_req >>>> req->ts_recent = tmp_opt.rcv_tsval = t1 >>>> req->ts_recent=tmp_opt.rcv_tsval= t2 >>>> >>>> newsk->ts_recent = req->ts_recent = t2 // t1 < t2 >>>> tcp_child_process >>>> tcp_rcv_state_process >>>> tcp_validate_incoming >>>> tcp_paws_check // failed >>>> parent->sk_data_ready(parent); // will not be called >>>> tcp_v4_do_rcv >>>> tcp_rcv_state_process // Complete the three-way handshake >>>> // missing parent->sk_data_ready(parent); >>> >>> IIUC, the ack received from cpu1 triggered calling >>> inet_csk_complete_hashdance() so its state transited from >>> TCP_NEW_SYN_RECV to TCP_SYN_RECV, right? If so, the reason why not >>> call sk_data_ready() if the skb entered into tcp_child_process() is >>> that its state failed to transit to TCP_ESTABLISHED? >>> >> Yes, because it didn't switch to TCP_ESTABLISHED >>> Here is another question. How did the skb on the right side enter into >>> tcp_v4_do_rcv() after entering tcp_check_req() if the state of sk >>> which the skb belongs to is TCP_NEW_SYN_RECV? Could you elaborate more >>> on this point? >> Since cpu1 successfully created the child sock, cpu2 will return >> null in tcp_check_req and req_stolen is set to true, so that it will >> subsequently go to 'goto lookup' to re-process the packet, and at >> this point, sk->sk_state is already in TCP_SYN_RECV state, and then >> then tcp_v4_do_rcv is called. > > Now I can see what happened there. Perhaps it would be good to update > the commit message > in the next iteration. Hi Jason, Thanks for the suggestion, I'll test it out and improve the commit message to send v2. > > Another key information I notice is that the second lookup process > loses the chance to call sk_data_ready() for its parent socket. It's > the one of the main reasons that cause your application to be unable > to get notified. Taking a rough look at tcp_rcv_state_process(), I > think it's not easy to acquire the parent socket there and then call > sk_data_ready() without modifying more codes compared to the current > solution. It's a different solution in theory. Yes, I have considered this fix before, but the complexity of the fix would be higher. > > If your new approach (like your previous reply) works, the following > commit[1] will be reverted/overwritten. I'm sorry, I may not have understood what you meant. Applying my fix, commit[1] is still necessary because it doesn't solve the bug that commit[1] fixes. can you explain in detail, why commit[1] will be reverted/overwritten. Thanks, Wang
On Thu, Feb 20, 2025 at 10:03 PM Wang Hai <wanghai38@huawei.com> wrote: > > > > On 2025/2/20 11:04, Jason Xing wrote: > > On Wed, Feb 19, 2025 at 9:11 PM Wang Hai <wanghai38@huawei.com> wrote: > >> > >> > >> > >> On 2025/2/19 11:31, Jason Xing wrote: > >>> On Wed, Feb 19, 2025 at 10:16 AM Wang Hai <wanghai38@huawei.com> wrote: > >>>> > >>>> > >>>> > >>>> On 2025/2/18 21:35, Eric Dumazet wrote: > >>>>> On Tue, Feb 18, 2025 at 12:00 PM Wang Hai <wanghai38@huawei.com> wrote: > >>>>>> > >>>>>> If two ack packets from a connection enter tcp_check_req at the same time > >>>>>> through different cpu, it may happen that req->ts_recent is updated with > >>>>>> with a more recent time and the skb with an older time creates a new sock, > >>>>>> which will cause the tcp_validate_incoming check to fail. > >>>>>> > >>>>>> cpu1 cpu2 > >>>>>> tcp_check_req > >>>>>> tcp_check_req > >>>>>> req->ts_recent = tmp_opt.rcv_tsval = t1 > >>>>>> req->ts_recent = tmp_opt.rcv_tsval = t2 > >>>>>> > >>>>>> newsk->ts_recent = req->ts_recent = t2 // t1 < t2 > >>>>>> tcp_child_process > >>>>>> tcp_rcv_state_process > >>>>>> tcp_validate_incoming > >>>>>> tcp_paws_check > >>>>>> if ((s32)(rx_opt->ts_recent - rx_opt->rcv_tsval) <= paws_win) // failed > >>>>>> > >>>>>> In tcp_check_req, restore ts_recent to this skb's to fix this bug. > >>>>>> > >>>>>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > >>>>>> Signed-off-by: Wang Hai <wanghai38@huawei.com> > >>>>>> --- > >>>>>> net/ipv4/tcp_minisocks.c | 4 ++++ > >>>>>> 1 file changed, 4 insertions(+) > >>>>>> > >>>>>> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c > >>>>>> index b089b08e9617..0208455f9eb8 100644 > >>>>>> --- a/net/ipv4/tcp_minisocks.c > >>>>>> +++ b/net/ipv4/tcp_minisocks.c > >>>>>> @@ -878,6 +878,10 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, > >>>>>> sock_rps_save_rxhash(child, skb); > >>>>>> tcp_synack_rtt_meas(child, req); > >>>>>> *req_stolen = !own_req; > >>>>>> + if (own_req && tcp_sk(child)->rx_opt.tstamp_ok && > >>>>>> + unlikely(tcp_sk(child)->rx_opt.ts_recent != tmp_opt.rcv_tsval)) > >>>>>> + tcp_sk(child)->rx_opt.ts_recent = tmp_opt.rcv_tsval; > >>>>>> + > >>>>>> return inet_csk_complete_hashdance(sk, child, req, own_req); > >>>>> > >>>>> Have you seen the comment at line 818 ? > >>>>> > >>>>> /* TODO: We probably should defer ts_recent change once > >>>>> * we take ownership of @req. > >>>>> */ > >>>>> > >>>>> Plan was clear and explained. Why implement something else (and buggy) ? > >>>>> > >>>> Hi Eric, > >>>> > >>>> Currently we have a real problem, so we want to solve it. This bug > >>>> causes the upper layers to be unable to be notified to call accept after > >>>> the successful three-way handshake. > >>>> > >>>> Skb from cpu1 that fails at tcp_paws_check (which it could have > >>>> succeeded) will not be able to enter the TCP_ESTABLISHED state, and > >>>> therefore parent->sk_data_ready(parent) will not be triggered, and skb > >>>> from cpu2 can complete the three-way handshake, but there is also no way > >>>> to call parent->sk_data_ready(parent) to notify the upper layer, which > >>>> will result > >>>> in the upper layer not being able to sense and call accept to obtain the > >>>> nsk. > >>>> > >>>> cpu1 cpu2 > >>>> tcp_check_req > >>>> tcp_check_req > >>>> req->ts_recent = tmp_opt.rcv_tsval = t1 > >>>> req->ts_recent=tmp_opt.rcv_tsval= t2 > >>>> > >>>> newsk->ts_recent = req->ts_recent = t2 // t1 < t2 > >>>> tcp_child_process > >>>> tcp_rcv_state_process > >>>> tcp_validate_incoming > >>>> tcp_paws_check // failed > >>>> parent->sk_data_ready(parent); // will not be called > >>>> tcp_v4_do_rcv > >>>> tcp_rcv_state_process // Complete the three-way handshake > >>>> // missing parent->sk_data_ready(parent); > >>> > >>> IIUC, the ack received from cpu1 triggered calling > >>> inet_csk_complete_hashdance() so its state transited from > >>> TCP_NEW_SYN_RECV to TCP_SYN_RECV, right? If so, the reason why not > >>> call sk_data_ready() if the skb entered into tcp_child_process() is > >>> that its state failed to transit to TCP_ESTABLISHED? > >>> > >> Yes, because it didn't switch to TCP_ESTABLISHED > >>> Here is another question. How did the skb on the right side enter into > >>> tcp_v4_do_rcv() after entering tcp_check_req() if the state of sk > >>> which the skb belongs to is TCP_NEW_SYN_RECV? Could you elaborate more > >>> on this point? > >> Since cpu1 successfully created the child sock, cpu2 will return > >> null in tcp_check_req and req_stolen is set to true, so that it will > >> subsequently go to 'goto lookup' to re-process the packet, and at > >> this point, sk->sk_state is already in TCP_SYN_RECV state, and then > >> then tcp_v4_do_rcv is called. > > > > Now I can see what happened there. Perhaps it would be good to update > > the commit message > > in the next iteration. > Hi Jason, > > Thanks for the suggestion, I'll test it out and improve the commit > message to send v2. > > > > Another key information I notice is that the second lookup process > > loses the chance to call sk_data_ready() for its parent socket. It's > > the one of the main reasons that cause your application to be unable > > to get notified. Taking a rough look at tcp_rcv_state_process(), I > > think it's not easy to acquire the parent socket there and then call > > sk_data_ready() without modifying more codes compared to the current > > solution. It's a different solution in theory. > Yes, I have considered this fix before, but the complexity of the fix > would be higher. Agreed. > > > > If your new approach (like your previous reply) works, the following > > commit[1] will be reverted/overwritten. > I'm sorry, I may not have understood what you meant. Applying my fix, > commit[1] is still necessary because it doesn't solve the bug that > commit[1] fixes. can you explain in detail, why commit[1] will be > reverted/overwritten. Right, what I was trying to say is that the commit[1] explains how it happened which is quite similar to your case from the commit message while your approach can avoid the failure of calling sk_data_ready() on the basis of commit[1]. IIUC, you postpone updating the recent time of the skb received from cpu1 which could let this pass tcp_paws_check() and have the chance to call sk_data_ready(). Then the issue you reported will be solved. After your patch, we will not resort to a second lookup. I'm just murmuring alone. Sorry for my previous inaccurate/wrong words. The above is what I meant. So let it go. Please go ahead and post your patch :) Thanks, Jason > > Thanks, > Wang >
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index b089b08e9617..0208455f9eb8 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -878,6 +878,10 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, sock_rps_save_rxhash(child, skb); tcp_synack_rtt_meas(child, req); *req_stolen = !own_req; + if (own_req && tcp_sk(child)->rx_opt.tstamp_ok && + unlikely(tcp_sk(child)->rx_opt.ts_recent != tmp_opt.rcv_tsval)) + tcp_sk(child)->rx_opt.ts_recent = tmp_opt.rcv_tsval; + return inet_csk_complete_hashdance(sk, child, req, own_req); listen_overflow:
If two ack packets from a connection enter tcp_check_req at the same time through different cpu, it may happen that req->ts_recent is updated with with a more recent time and the skb with an older time creates a new sock, which will cause the tcp_validate_incoming check to fail. cpu1 cpu2 tcp_check_req tcp_check_req req->ts_recent = tmp_opt.rcv_tsval = t1 req->ts_recent = tmp_opt.rcv_tsval = t2 newsk->ts_recent = req->ts_recent = t2 // t1 < t2 tcp_child_process tcp_rcv_state_process tcp_validate_incoming tcp_paws_check if ((s32)(rx_opt->ts_recent - rx_opt->rcv_tsval) <= paws_win) // failed In tcp_check_req, restore ts_recent to this skb's to fix this bug. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Wang Hai <wanghai38@huawei.com> --- net/ipv4/tcp_minisocks.c | 4 ++++ 1 file changed, 4 insertions(+)