Message ID | 1605440628-1283-1-git-send-email-vfedorenko@novek.ru (mailing list archive)
---|---
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | [net] net/tls: missing received data after fast remote close
Context | Check | Description |
---|---|---
netdev/cover_letter | success | Link |
netdev/fixes_present | fail | Series targets non-next tree, but doesn't contain any Fixes tags |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net |
netdev/subject_prefix | success | Link |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 0 this patch: 0 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 9 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 0 this patch: 0 |
netdev/header_inline | success | Link |
netdev/stable | success | Stable not CCed |
On Sun, 15 Nov 2020 14:43:48 +0300 Vadim Fedorenko wrote:
> In case when tcp socket received FIN after some data and the
> parser haven't started before reading data caller will receive
> an empty buffer.

This is pretty terse, too terse for me to understand..

> This behavior differs from plain TCP socket and
> leads to special treating in user-space. Patch unpauses parser
> directly if we have unparsed data in tcp receive queue.

Sure, but why is the parser paused? Does it pause itself on FIN?

> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 2fe9e2c..4db6943 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -1289,6 +1289,9 @@ static struct sk_buff *tls_wait_data(struct sock *sk, struct sk_psock *psock,
>  	struct sk_buff *skb;
>  	DEFINE_WAIT_FUNC(wait, woken_wake_function);
>  
> +	if (!ctx->recv_pkt && skb_queue_empty(&sk->sk_receive_queue))
> +		__strp_unpause(&ctx->strp);
> +
>  	while (!(skb = ctx->recv_pkt) && sk_psock_queue_empty(psock)) {
>  		if (sk->sk_err) {
>  			*err = sock_error(sk);
On 17.11.2020 22:38, Jakub Kicinski wrote:
> On Sun, 15 Nov 2020 14:43:48 +0300 Vadim Fedorenko wrote:
>> In case when tcp socket received FIN after some data and the
>> parser haven't started before reading data caller will receive
>> an empty buffer.
> This is pretty terse, too terse for me to understand..
The flow is simple. The server sends a small amount of data right after
the connection is configured and closes the connection. In this case the
receiver sees the TLS handshake data and configures the TLS socket right
after the Change Cipher Spec record. While the configuration is in
process, the TCP socket receives a small Application Data record, an
Encrypted Alert record and a FIN packet. So the TCP socket sets
RCV_SHUTDOWN in sk_shutdown and the SK_DONE bit in sk_flags.
>> This behavior differs from plain TCP socket and
>> leads to special treating in user-space. Patch unpauses parser
>> directly if we have unparsed data in tcp receive queue.
> Sure, but why is the parser paused? Does it pause itself on FIN?
No, it doesn't start even once. The trace looks like:

tcp_recvmsg is called
tcp_recvmsg returns 1 (Change Cipher Spec record data)
tls_setsockopt is called
tls_setsockopt returns
tls_recvmsg is called
tls_recvmsg returns 0
__strp_recv is called
stack:
  __strp_recv+1
  tcp_read_sock+169
  strp_read_sock+104
  strp_work+68
  process_one_work+436
  worker_thread+80
  kthread+276
  ret_from_fork+34
tls_read_size called

So it looks like strp_work was scheduled after tls_recvmsg and
nothing triggered the parser because all the data was received before
tls_setsockopt finished the configuration process.
>
>> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
>> index 2fe9e2c..4db6943 100644
>> --- a/net/tls/tls_sw.c
>> +++ b/net/tls/tls_sw.c
>> @@ -1289,6 +1289,9 @@ static struct sk_buff *tls_wait_data(struct sock *sk, struct sk_psock *psock,
>>  	struct sk_buff *skb;
>>  	DEFINE_WAIT_FUNC(wait, woken_wake_function);
>>  
>> +	if (!ctx->recv_pkt && skb_queue_empty(&sk->sk_receive_queue))
>> +		__strp_unpause(&ctx->strp);
>> +
>>  	while (!(skb = ctx->recv_pkt) && sk_psock_queue_empty(psock)) {
>>  		if (sk->sk_err) {
>>  			*err = sock_error(sk);
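To make the window concrete, here is a minimal receiver-side sketch; it is illustrative, not code from the thread. It assumes fd is a TCP socket that already had the "tls" ULP attached via setsockopt(fd, SOL_TCP, TCP_ULP, "tls", sizeof("tls")) while the connection was still in ESTABLISHED state (the ULP cannot be attached afterwards), that crypto_info was filled in from the user-space handshake, and that the peer behaves like the server described above; the sleep() stands in for the configuration gap, and error handling is trimmed.

#include <linux/tls.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef SOL_TLS
#define SOL_TLS 282
#endif

/* Hypothetical repro sketch; "fd" and "crypto_info" are the
 * assumptions described above, not names from the patch. */
static ssize_t recv_after_fast_close(int fd,
		struct tls12_crypto_info_aes_gcm_128 *crypto_info)
{
	char buf[256];

	/* While we sit here, the server's small Application Data record,
	 * its Encrypted Alert and the FIN all land in the TCP receive
	 * queue; RCV_SHUTDOWN gets set before TLS_RX is configured. */
	sleep(1);

	if (setsockopt(fd, SOL_TLS, TLS_RX, crypto_info,
		       sizeof(*crypto_info)))
		return -1;

	/* Before the fix this returns 0 (EOF) even though a full record
	 * sits unread in the queue - the strparser never ran.  With the
	 * fix it returns the record's plaintext, like plain TCP would. */
	return recv(fd, buf, sizeof(buf), 0);
}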
On Wed, 18 Nov 2020 00:50:48 +0000 Vadim Fedorenko wrote:
> On 17.11.2020 22:38, Jakub Kicinski wrote:
> > On Sun, 15 Nov 2020 14:43:48 +0300 Vadim Fedorenko wrote:
> >> In case when tcp socket received FIN after some data and the
> >> parser haven't started before reading data caller will receive
> >> an empty buffer.
> > This is pretty terse, too terse for me to understand..
> The flow is simple. The server sends a small amount of data right after
> the connection is configured and closes the connection. In this case the
> receiver sees the TLS handshake data and configures the TLS socket right
> after the Change Cipher Spec record. While the configuration is in
> process, the TCP socket receives a small Application Data record, an
> Encrypted Alert record and a FIN packet. So the TCP socket sets
> RCV_SHUTDOWN in sk_shutdown and the SK_DONE bit in sk_flags.

Thanks! That's clear. This is a race, right, you can't trigger
it reliably?

BTW please feel free to add your cases to the tls selftest in
tools/testing/selftests.

> >> This behavior differs from plain TCP socket and
> >> leads to special treating in user-space. Patch unpauses parser
> >> directly if we have unparsed data in tcp receive queue.
> > Sure, but why is the parser paused? Does it pause itself on FIN?
> No, it doesn't start even once. The trace looks like:
>
> tcp_recvmsg is called
> tcp_recvmsg returns 1 (Change Cipher Spec record data)
> tls_setsockopt is called
> tls_setsockopt returns
> tls_recvmsg is called
> tls_recvmsg returns 0
> __strp_recv is called
> stack:
>   __strp_recv+1
>   tcp_read_sock+169
>   strp_read_sock+104
>   strp_work+68
>   process_one_work+436
>   worker_thread+80
>   kthread+276
>   ret_from_fork+34
> tls_read_size called
>
> So it looks like strp_work was scheduled after tls_recvmsg and
> nothing triggered the parser because all the data was received before
> tls_setsockopt finished the configuration process.

Um. That makes me think we need to flush_work() on the strparser after
we configure rx tls, no? Or __unpause at the right time instead of
dealing with the async nature of strp_check_rcv()?
On 18.11.2020 01:53, Jakub Kicinski wrote:
> On Wed, 18 Nov 2020 00:50:48 +0000 Vadim Fedorenko wrote:
>> On 17.11.2020 22:38, Jakub Kicinski wrote:
>>> On Sun, 15 Nov 2020 14:43:48 +0300 Vadim Fedorenko wrote:
>>>> In case when tcp socket received FIN after some data and the
>>>> parser haven't started before reading data caller will receive
>>>> an empty buffer.
>>> This is pretty terse, too terse for me to understand..
>> The flow is simple. The server sends a small amount of data right after
>> the connection is configured and closes the connection. In this case the
>> receiver sees the TLS handshake data and configures the TLS socket right
>> after the Change Cipher Spec record. While the configuration is in
>> process, the TCP socket receives a small Application Data record, an
>> Encrypted Alert record and a FIN packet. So the TCP socket sets
>> RCV_SHUTDOWN in sk_shutdown and the SK_DONE bit in sk_flags.
> Thanks! That's clear. This is a race, right, you can't trigger
> it reliably?
It is triggered in the test cases usually, but not always. To trigger
this problem reliably I just add some delay before the setsockopt() call.
The delay leaves some time for the kernel to receive all the data from
the sender in the test case.
> BTW please feel free to add your cases to the tls selftest in
> tools/testing/selftests.
Sure, I will add this case to the selftests.
>
>>>> This behavior differs from plain TCP socket and
>>>> leads to special treating in user-space. Patch unpauses parser
>>>> directly if we have unparsed data in tcp receive queue.
>>> Sure, but why is the parser paused? Does it pause itself on FIN?
>> No, it doesn't start even once. The trace looks like:
>>
>> tcp_recvmsg is called
>> tcp_recvmsg returns 1 (Change Cipher Spec record data)
>> tls_setsockopt is called
>> tls_setsockopt returns
>> tls_recvmsg is called
>> tls_recvmsg returns 0
>> __strp_recv is called
>> stack:
>>   __strp_recv+1
>>   tcp_read_sock+169
>>   strp_read_sock+104
>>   strp_work+68
>>   process_one_work+436
>>   worker_thread+80
>>   kthread+276
>>   ret_from_fork+34
>> tls_read_size called
>>
>> So it looks like strp_work was scheduled after tls_recvmsg and
>> nothing triggered the parser because all the data was received before
>> tls_setsockopt finished the configuration process.
> Um. That makes me think we need to flush_work() on the strparser after
> we configure rx tls, no? Or __unpause at the right time instead of
> dealing with the async nature of strp_check_rcv()?
I'm not sure that flush_work() will do the right thing in this case. Because:
1. The work is idle after tls_sw_strparser_arm()
2. The strparser needs the socket lock to do its work - that could be a
deadlock because setsockopt_conf already holds the socket lock. I'm not
sure that it's a good idea to unlock the socket just to wait for the
strparser.
The async nature of the parser is OK for the classic HTTPS server/client
case because it's very good to have the record parsed before the actual
call to recvmsg or splice_read is done. The code inside the loop in
tls_wait_data is the slow path - maybe just move the check and the
__unpause into this slow path?
On Wed, 18 Nov 2020 02:47:24 +0000 Vadim Fedorenko wrote:
> >>>> This behavior differs from plain TCP socket and
> >>>> leads to special treating in user-space. Patch unpauses parser
> >>>> directly if we have unparsed data in tcp receive queue.
> >>> Sure, but why is the parser paused? Does it pause itself on FIN?
> >> No, it doesn't start even once. The trace looks like:
> >>
> >> tcp_recvmsg is called
> >> tcp_recvmsg returns 1 (Change Cipher Spec record data)
> >> tls_setsockopt is called
> >> tls_setsockopt returns
> >> tls_recvmsg is called
> >> tls_recvmsg returns 0
> >> __strp_recv is called
> >> stack:
> >>   __strp_recv+1
> >>   tcp_read_sock+169
> >>   strp_read_sock+104
> >>   strp_work+68
> >>   process_one_work+436
> >>   worker_thread+80
> >>   kthread+276
> >>   ret_from_fork+34
> >> tls_read_size called
> >>
> >> So it looks like strp_work was scheduled after tls_recvmsg and
> >> nothing triggered the parser because all the data was received before
> >> tls_setsockopt finished the configuration process.
> > Um. That makes me think we need to flush_work() on the strparser after
> > we configure rx tls, no? Or __unpause at the right time instead of
> > dealing with the async nature of strp_check_rcv()?
> I'm not sure that flush_work() will do the right thing in this case. Because:
> 1. The work is idle after tls_sw_strparser_arm()

Not sure what you mean by idle, it queues the work via strp_check_rcv().

> 2. The strparser needs the socket lock to do its work - that could be a
> deadlock because setsockopt_conf already holds the socket lock. I'm not
> sure that it's a good idea to unlock the socket just to wait for the
> strparser.

Ack, I meant in do_tls_setsockopt() after the lock is released.

> The async nature of the parser is OK for the classic HTTPS server/client
> case because it's very good to have the record parsed before the actual
> call to recvmsg or splice_read is done. The code inside the loop in
> tls_wait_data is the slow path - maybe just move the check and the
> __unpause into this slow path?

Yeah, looking closer this problem can arise after we start as well :/

How about we move the __unpause code into the loop tho? Seems like this
could help with latency. Right now AFAICT we get a TCP socket ready
notification, which wakes the waiter for no good reason and makes
strparser queue its work, the work then will call tls_queue() ->
data_ready waking the waiting thread a second time, this time with the
record actually parsed.

Did I get that right? Should the waiter not cancel the work on the
first wake up and just kick off the parsing itself?
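For comparison, here is a sketch of the flush_work() placement Jakub describes above ("in do_tls_setsockopt() after the lock is released"). The function shape and the tls_sw_ctx_rx() accessor path are assumptions for illustration, not code proposed in the thread - and the discussion below settles on __strp_unpause() in tls_wait_data() instead.

/* Hypothetical fragment of net/tls/tls_main.c, not a patch from this
 * thread.  Assumes the rx strparser's work item is reachable via
 * tls_sw_ctx_rx(ctx)->strp.work. */
static int do_tls_setsockopt(struct sock *sk, int optname,
			     sockptr_t optval, unsigned int optlen)
{
	int rc;

	lock_sock(sk);
	rc = do_tls_setsockopt_conf(sk, optval, optlen,
				    optname == TLS_TX);
	release_sock(sk);

	/* Flush only after release_sock(): strp_work takes the socket
	 * lock itself, so flushing under lock_sock() could deadlock,
	 * which is the concern raised above. */
	if (!rc && optname == TLS_RX) {
		struct tls_context *ctx = tls_get_ctx(sk);

		flush_work(&tls_sw_ctx_rx(ctx)->strp.work);
	}

	return rc;
}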
On 18.11.2020 16:23, Jakub Kicinski wrote:
> On Wed, 18 Nov 2020 02:47:24 +0000 Vadim Fedorenko wrote:
>>>>>> This behavior differs from plain TCP socket and
>>>>>> leads to special treating in user-space. Patch unpauses parser
>>>>>> directly if we have unparsed data in tcp receive queue.
>>>>> Sure, but why is the parser paused? Does it pause itself on FIN?
>>>> No, it doesn't start even once. The trace looks like:
>>>>
>>>> tcp_recvmsg is called
>>>> tcp_recvmsg returns 1 (Change Cipher Spec record data)
>>>> tls_setsockopt is called
>>>> tls_setsockopt returns
>>>> tls_recvmsg is called
>>>> tls_recvmsg returns 0
>>>> __strp_recv is called
>>>> stack:
>>>>   __strp_recv+1
>>>>   tcp_read_sock+169
>>>>   strp_read_sock+104
>>>>   strp_work+68
>>>>   process_one_work+436
>>>>   worker_thread+80
>>>>   kthread+276
>>>>   ret_from_fork+34
>>>> tls_read_size called
>>>>
>>>> So it looks like strp_work was scheduled after tls_recvmsg and
>>>> nothing triggered the parser because all the data was received before
>>>> tls_setsockopt finished the configuration process.
>>> Um. That makes me think we need to flush_work() on the strparser after
>>> we configure rx tls, no? Or __unpause at the right time instead of
>>> dealing with the async nature of strp_check_rcv()?
>> I'm not sure that flush_work() will do the right thing in this case. Because:
>> 1. The work is idle after tls_sw_strparser_arm()
> Not sure what you mean by idle, it queues the work via strp_check_rcv().
>
>> 2. The strparser needs the socket lock to do its work - that could be a
>> deadlock because setsockopt_conf already holds the socket lock. I'm not
>> sure that it's a good idea to unlock the socket just to wait for the
>> strparser.
> Ack, I meant in do_tls_setsockopt() after the lock is released.
>
>> The async nature of the parser is OK for the classic HTTPS server/client
>> case because it's very good to have the record parsed before the actual
>> call to recvmsg or splice_read is done. The code inside the loop in
>> tls_wait_data is the slow path - maybe just move the check and the
>> __unpause into this slow path?
> Yeah, looking closer this problem can arise after we start as well :/
>
> How about we move the __unpause code into the loop tho? Seems like this
> could help with latency. Right now AFAICT we get a TCP socket ready
> notification, which wakes the waiter for no good reason and makes
> strparser queue its work, the work then will call tls_queue() ->
> data_ready waking the waiting thread a second time, this time with the
> record actually parsed.
>
> Did I get that right? Should the waiter not cancel the work on the
> first wake up and just kick off the parsing itself?
I was thinking of the same solution too, but a simple check of the
emptiness of the socket's receive queue does not work when we have a
partial record in the queue - __unpause will return without setting
ctx->recv_pkt while the socket queue is still non-empty, and we will have
a blocked loop until new data is received or strp_abort_strp fires
because of a timeout. If you could suggest a proper condition to break
the loop it would be great.

Or perhaps I misunderstood which loop you meant exactly?
On Wed, 18 Nov 2020 20:51:30 +0000 Vadim Fedorenko wrote:
> >> The async nature of the parser is OK for the classic HTTPS server/client
> >> case because it's very good to have the record parsed before the actual
> >> call to recvmsg or splice_read is done. The code inside the loop in
> >> tls_wait_data is the slow path - maybe just move the check and the
> >> __unpause into this slow path?
> > Yeah, looking closer this problem can arise after we start as well :/
> >
> > How about we move the __unpause code into the loop tho? Seems like this
> > could help with latency. Right now AFAICT we get a TCP socket ready
> > notification, which wakes the waiter for no good reason and makes
> > strparser queue its work, the work then will call tls_queue() ->
> > data_ready waking the waiting thread a second time, this time with the
> > record actually parsed.
> >
> > Did I get that right? Should the waiter not cancel the work on the
> > first wake up and just kick off the parsing itself?
> I was thinking of the same solution too, but a simple check of the
> emptiness of the socket's receive queue does not work when we have a
> partial record in the queue - __unpause will return without setting
> ctx->recv_pkt while the socket queue is still non-empty, and we will have
> a blocked loop until new data is received or strp_abort_strp fires
> because of a timeout. If you could suggest a proper condition to break
> the loop it would be great.
>
> Or perhaps I misunderstood which loop you meant exactly?

Damn, you may be seeing some problem I'm missing again ;) Running
__unpause can be opportunistic, if it doesn't parse anything that's
fine. I was thinking:

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 95ab5545a931..6478bd968506 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1295,6 +1295,10 @@ static struct sk_buff *tls_wait_data(struct sock *sk, struct sk_psock *psock,
 			return NULL;
 		}
 
+		__strp_unpause(&ctx->strp);
+		if (ctx->recv_pkt)
+			return ctx->recv_pkt;
+
 		if (sk->sk_shutdown & RCV_SHUTDOWN)
 			return NULL;

Optionally it would be nice if __unpause cancelled the work if it managed
to parse the record out.
On 18.11.2020 23:39, Jakub Kicinski wrote:
> On Wed, 18 Nov 2020 20:51:30 +0000 Vadim Fedorenko wrote:
>>>> The async nature of the parser is OK for the classic HTTPS server/client
>>>> case because it's very good to have the record parsed before the actual
>>>> call to recvmsg or splice_read is done. The code inside the loop in
>>>> tls_wait_data is the slow path - maybe just move the check and the
>>>> __unpause into this slow path?
>>> Yeah, looking closer this problem can arise after we start as well :/
>>>
>>> How about we move the __unpause code into the loop tho? Seems like this
>>> could help with latency. Right now AFAICT we get a TCP socket ready
>>> notification, which wakes the waiter for no good reason and makes
>>> strparser queue its work, the work then will call tls_queue() ->
>>> data_ready waking the waiting thread a second time, this time with the
>>> record actually parsed.
>>>
>>> Did I get that right? Should the waiter not cancel the work on the
>>> first wake up and just kick off the parsing itself?
>> I was thinking of the same solution too, but a simple check of the
>> emptiness of the socket's receive queue does not work when we have a
>> partial record in the queue - __unpause will return without setting
>> ctx->recv_pkt while the socket queue is still non-empty, and we will have
>> a blocked loop until new data is received or strp_abort_strp fires
>> because of a timeout. If you could suggest a proper condition to break
>> the loop it would be great.
>>
>> Or perhaps I misunderstood which loop you meant exactly?
> Damn, you may be seeing some problem I'm missing again ;) Running
> __unpause can be opportunistic, if it doesn't parse anything that's
> fine. I was thinking:
>
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 95ab5545a931..6478bd968506 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -1295,6 +1295,10 @@ static struct sk_buff *tls_wait_data(struct sock *sk, struct sk_psock *psock,
>  			return NULL;
>  		}
>  
> +		__strp_unpause(&ctx->strp);
> +		if (ctx->recv_pkt)
> +			return ctx->recv_pkt;
> +
>  		if (sk->sk_shutdown & RCV_SHUTDOWN)
>  			return NULL;
>
> Optionally it would be nice if __unpause cancelled the work if it managed
> to parse the record out.
Oh, a simple and fine solution. But is it better to unpause the parser
conditionally, when there is something in the socket queue? Otherwise this
call will just be wasting cycles. Maybe like this:

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 2fe9e2c..97c5f6e 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1295,6 +1295,12 @@ static struct sk_buff *tls_wait_data(struct sock *sk, struct sk_psock *psock,
 			return NULL;
 		}
 
+		if (!skb_queue_empty(&sk->sk_receive_queue)) {
+			__strp_unpause(&ctx->strp);
+			if (ctx->recv_pkt)
+				return ctx->recv_pkt;
+		}
+
 		if (sk->sk_shutdown & RCV_SHUTDOWN)
 			return NULL;
On Thu, 19 Nov 2020 00:26:52 +0000 Vadim Fedorenko wrote:
> > Damn, you may be seeing some problem I'm missing again ;) Running
> > __unpause can be opportunistic, if it doesn't parse anything that's
> > fine. I was thinking:
> >
> > diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> > index 95ab5545a931..6478bd968506 100644
> > --- a/net/tls/tls_sw.c
> > +++ b/net/tls/tls_sw.c
> > @@ -1295,6 +1295,10 @@ static struct sk_buff *tls_wait_data(struct sock *sk, struct sk_psock *psock,
> >  			return NULL;
> >  		}
> >  
> > +		__strp_unpause(&ctx->strp);
> > +		if (ctx->recv_pkt)
> > +			return ctx->recv_pkt;
> > +
> >  		if (sk->sk_shutdown & RCV_SHUTDOWN)
> >  			return NULL;
> >
> > Optionally it would be nice if __unpause cancelled the work if it managed
> > to parse the record out.
> Oh, a simple and fine solution. But is it better to unpause the parser
> conditionally, when there is something in the socket queue? Otherwise this
> call will just be wasting cycles. Maybe like this:
>
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 2fe9e2c..97c5f6e 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -1295,6 +1295,12 @@ static struct sk_buff *tls_wait_data(struct sock *sk, struct sk_psock *psock,
>  			return NULL;
>  		}
>  
> +		if (!skb_queue_empty(&sk->sk_receive_queue)) {
> +			__strp_unpause(&ctx->strp);
> +			if (ctx->recv_pkt)
> +				return ctx->recv_pkt;
> +		}
> +
>  		if (sk->sk_shutdown & RCV_SHUTDOWN)
>  			return NULL;
>

LGTM!
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 2fe9e2c..4db6943 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1289,6 +1289,9 @@ static struct sk_buff *tls_wait_data(struct sock *sk, struct sk_psock *psock,
 	struct sk_buff *skb;
 	DEFINE_WAIT_FUNC(wait, woken_wake_function);
 
+	if (!ctx->recv_pkt && skb_queue_empty(&sk->sk_receive_queue))
+		__strp_unpause(&ctx->strp);
+
 	while (!(skb = ctx->recv_pkt) && sk_psock_queue_empty(psock)) {
 		if (sk->sk_err) {
 			*err = sock_error(sk);
In case when tcp socket received FIN after some data and the
parser haven't started before reading data caller will receive
an empty buffer. This behavior differs from plain TCP socket
and leads to special treating in user-space. Patch unpauses
parser directly if we have unparsed data in tcp receive queue.

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
---
 net/tls/tls_sw.c | 3 +++
 1 file changed, 3 insertions(+)
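As agreed in the thread, the case belongs in the kernel TLS selftest. Below is a hedged sketch of what such a test could look like, written to drop into tools/testing/selftests/net/tls.c (which supplies TEST_HARNESS_MAIN); the test name, the all-zero key material, and the sleep()-based widening of the race window are illustrative choices, not the test that eventually landed.

#include <linux/tls.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <string.h>
#include <unistd.h>

#include "../kselftest_harness.h"

#ifndef SOL_TLS
#define SOL_TLS 282
#endif

/* Hedged sketch, not the merged selftest.  Both sides use the same
 * all-zero AES-GCM key material, so the receiver can decrypt what the
 * sender's kTLS TX path encrypts. */
TEST(tls_rx_data_then_fin)
{
	struct tls12_crypto_info_aes_gcm_128 tls12 = {
		.info.version = TLS_1_2_VERSION,
		.info.cipher_type = TLS_CIPHER_AES_GCM_128,
	};
	struct sockaddr_in addr = {
		.sin_family = AF_INET,
		.sin_addr.s_addr = htonl(INADDR_LOOPBACK),
	};
	socklen_t len = sizeof(addr);
	char buf[16];
	int sfd, afd, cfd;

	sfd = socket(AF_INET, SOCK_STREAM, 0);
	ASSERT_GE(sfd, 0);
	ASSERT_EQ(bind(sfd, (void *)&addr, sizeof(addr)), 0);
	ASSERT_EQ(getsockname(sfd, (void *)&addr, &len), 0);
	ASSERT_EQ(listen(sfd, 1), 0);

	cfd = socket(AF_INET, SOCK_STREAM, 0);
	ASSERT_GE(cfd, 0);
	ASSERT_EQ(connect(cfd, (void *)&addr, sizeof(addr)), 0);
	afd = accept(sfd, NULL, NULL);
	ASSERT_GE(afd, 0);

	/* Receiver: attach the TLS ULP while still ESTABLISHED (it
	 * cannot be attached later), but don't configure TLS_RX yet. */
	ASSERT_EQ(setsockopt(cfd, SOL_TCP, TCP_ULP, "tls", sizeof("tls")), 0);

	/* Sender: enable kTLS TX, send one small record, then close()
	 * so the FIN chases the data. */
	ASSERT_EQ(setsockopt(afd, SOL_TCP, TCP_ULP, "tls", sizeof("tls")), 0);
	ASSERT_EQ(setsockopt(afd, SOL_TLS, TLS_TX, &tls12, sizeof(tls12)), 0);
	ASSERT_EQ(send(afd, "hello", 5, 0), 5);
	close(afd);

	/* Let the record and the FIN land before TLS_RX is configured -
	 * the window described in the commit message. */
	sleep(1);
	ASSERT_EQ(setsockopt(cfd, SOL_TLS, TLS_RX, &tls12, sizeof(tls12)), 0);

	/* Before the fix this recv() saw EOF and the record was lost;
	 * with the fix it must return the plaintext. */
	EXPECT_EQ(recv(cfd, buf, sizeof(buf), 0), 5);
	EXPECT_EQ(memcmp(buf, "hello", 5), 0);

	close(cfd);
	close(sfd);
}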