[net-next,0/3] net/smc: Improvements for TCP_CORK and sendfile()

Message ID 20220130180256.28303-1-tonylu@linux.alibaba.com (mailing list archive)

Message

Tony Lu Jan. 30, 2022, 6:02 p.m. UTC
Currently, SMC uses the default implementation of the sendfile() syscall
[1], which is widely used in nginx and big data scenarios. Usually,
applications use sendfile() with TCP_CORK:

fstat(20, {st_mode=S_IFREG|0644, st_size=4096, ...}) = 0
setsockopt(19, SOL_TCP, TCP_CORK, [1], 4) = 0
writev(19, [{iov_base="HTTP/1.1 200 OK\r\nServer: nginx/1"..., iov_len=240}], 1) = 240
sendfile(19, 20, [0] => [4096], 4096)   = 4096
close(20)                               = 0
setsockopt(19, SOL_TCP, TCP_CORK, [0], 4) = 0

The above is an example from Nginx. With sendfile on, Nginx first
enables TCP_CORK and writes the headers, which are not sent yet. It then
calls sendfile(), which reads the file and writes it to sndbuf. When
TCP_CORK is cleared, all pending data is sent out.
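
The sequence in the strace above can be sketched from userspace roughly as
follows. This is only an illustration: send_corked() and demo_loopback()
are hypothetical names, the demo runs over a plain TCP loopback connection
rather than SMC, and IPPROTO_TCP is used where strace prints SOL_TCP (the
two have the same value on Linux).

```c
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <string.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Set (on = 1) or clear (on = 0) TCP_CORK on a connected TCP socket. */
static int set_cork(int sock, int on)
{
    return setsockopt(sock, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}

/*
 * Hypothetical helper: cork, write the headers, sendfile() the body, then
 * uncork so that all pending data is pushed out.
 * Returns the number of bytes queued, or -1 on error.
 */
static ssize_t send_corked(int sock, int file_fd, const char *hdr,
                           size_t hdr_len, size_t body_len)
{
    struct iovec iov = { .iov_base = (void *)hdr, .iov_len = hdr_len };
    off_t off = 0;
    ssize_t n;

    assert(hdr != NULL);
    if (set_cork(sock, 1) < 0)
        return -1;
    n = writev(sock, &iov, 1);
    if (n != (ssize_t)hdr_len) {        /* short header writes not retried */
        set_cork(sock, 0);
        return -1;
    }
    while ((size_t)off < body_len) {    /* sendfile() may send less than asked */
        n = sendfile(sock, file_fd, &off, body_len - (size_t)off);
        if (n <= 0) {
            set_cork(sock, 0);
            return -1;
        }
    }
    if (set_cork(sock, 0) < 0)
        return -1;
    return (ssize_t)(hdr_len + body_len);
}

/*
 * Demo only (error paths do not clean up): send a 240-byte header and a
 * 4096-byte temporary file over loopback, return the bytes received.
 */
static long demo_loopback(void)
{
    char hdr[240], body[4096], buf[8192];
    struct sockaddr_in addr = { .sin_family = AF_INET };
    socklen_t alen = sizeof(addr);
    long got = 0;
    ssize_t n;

    memset(hdr, 'h', sizeof(hdr));
    memset(body, 'b', sizeof(body));
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* port 0: kernel picks */

    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    int cfd = socket(AF_INET, SOCK_STREAM, 0);
    FILE *tmp = tmpfile();
    if (lfd < 0 || cfd < 0 || tmp == NULL)
        return -1;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &alen) < 0 ||
        connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return -1;
    int afd = accept(lfd, NULL, NULL);
    if (afd < 0)
        return -1;
    if (fwrite(body, 1, sizeof(body), tmp) != sizeof(body) || fflush(tmp) != 0)
        return -1;
    if (send_corked(cfd, fileno(tmp), hdr, sizeof(hdr), sizeof(body)) < 0)
        return -1;
    close(cfd);                         /* signals EOF to the reader */
    while ((n = read(afd, buf, sizeof(buf))) > 0)
        got += n;
    close(afd);
    close(lfd);
    fclose(tmp);
    return got;
}
```

Calling demo_loopback() should return 240 + 4096 bytes, matching the
header and file sizes in the strace.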

The performance of the default sendfile implementation is lower than
with sendfile off. Investigation shows two areas to improve:
- unnecessary lock contention of delayed work
- less data per send than when sendfile off

Patch #1 tries to reduce lock_sock() contention in smc_tx_work().
Patch #2 removes timed work for corking, and lets applications control
it. See TCP_CORK [2] and MSG_MORE [3].
Patch #3 adds MSG_SENDPAGE_NOTLAST for corking more data during
sendfile().
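
For the MSG_MORE side of application-controlled corking mentioned for
patch #2, a minimal userspace sketch might look like this. The helper
names are hypothetical, and the demo uses an AF_UNIX socketpair only so
that it runs without a network; on that socket type the MSG_MORE hint is
assumed to have no effect, so the demo checks the byte count and nothing
about batching.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send the whole buffer with the given flags, retrying on short sends. */
static ssize_t send_all(int sock, const char *buf, size_t len, int flags)
{
    size_t done = 0;

    assert(buf != NULL);
    while (done < len) {
        ssize_t n = send(sock, buf + done, len - done, flags);
        if (n < 0)
            return -1;
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/*
 * Hypothetical helper: the header is sent with MSG_MORE ("more data
 * follows, hold it back"), the body without it, which ends the hint.
 */
static ssize_t send_hdr_then_body(int sock, const char *hdr, size_t hdr_len,
                                  const char *body, size_t body_len)
{
    if (send_all(sock, hdr, hdr_len, MSG_MORE) < 0)
        return -1;
    if (send_all(sock, body, body_len, 0) < 0)
        return -1;
    return (ssize_t)(hdr_len + body_len);
}

/* Demo only: push 4 + 4 bytes through a socketpair, return bytes read. */
static long demo_msg_more(void)
{
    int sv[2];
    char buf[64];
    long got = 0;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (send_hdr_then_body(sv[0], "HDR:", 4, "body", 4) < 0)
        return -1;
    close(sv[0]);                       /* signals EOF to the reader */
    while ((n = read(sv[1], buf, sizeof(buf))) > 0)
        got += n;
    close(sv[1]);
    return got;
}
```

Compared with TCP_CORK, MSG_MORE is a per-call hint, so no extra
setsockopt() calls are needed around the writes.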

Test environments:
- CPU Intel Xeon Platinum 8 core, mem 32 GiB, nic Mellanox CX4
- socket sndbuf / rcvbuf: 16384 / 131072 bytes
- server: smc_run nginx
- client: smc_run ./wrk -c 100 -t 2 -d 30 http://192.168.100.1:8080/4k.html
- payload: 4KB local disk file

Items                     QPS
sendfile off        272477.10
sendfile on (orig)  223622.79
sendfile on (this)  395847.21

This benchmark shows a +45.28% improvement compared with sendfile off,
and +77.02% compared with the original sendfile implementation.
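
The two percentages follow directly from the QPS table above. A small
sketch of the arithmetic (pct_gain() is a hypothetical helper name; the
numbers are the ones from the table):

```c
#include <assert.h>
#include <stdio.h>

/* Relative gain of `value` over `baseline`, in percent. */
static double pct_gain(double value, double baseline)
{
    assert(baseline > 0.0);
    return (value / baseline - 1.0) * 100.0;
}

/* Print both comparisons using the QPS values from the table. */
static void print_gains(void)
{
    const double off = 272477.10;   /* sendfile off */
    const double orig = 223622.79;  /* sendfile on (orig) */
    const double this_ = 395847.21; /* sendfile on (this) */

    printf("vs sendfile off:  %+.2f%%\n", pct_gain(this_, off));
    printf("vs sendfile orig: %+.2f%%\n", pct_gain(this_, orig));
}
```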

[1] https://man7.org/linux/man-pages/man2/sendfile.2.html
[2] https://linux.die.net/man/7/tcp
[3] https://man7.org/linux/man-pages/man2/send.2.html

Tony Lu (3):
  net/smc: Send directly when TCP_CORK is cleared
  net/smc: Remove corked dealyed work
  net/smc: Cork when sendpage with MSG_SENDPAGE_NOTLAST flag

 net/smc/af_smc.c |  8 ++++---
 net/smc/smc_tx.c | 59 ++++++++++++++++++++++++++++++++----------------
 net/smc/smc_tx.h |  3 +++
 3 files changed, 47 insertions(+), 23 deletions(-)

Comments

Jakub Kicinski Jan. 31, 2022, 7:42 p.m. UTC | #1
I believe this is now commit 780bf05f44c2 ("Merge branch
'smc-improvements'") in net-next. Thanks!