From patchwork Thu May 12 05:06:52 2022
X-Patchwork-Submitter: Arseniy Krasnov
X-Patchwork-Id: 12847037
From: Arseniy Krasnov
To: Stefan Hajnoczi, Stefano Garzarella, "Michael S. Tsirkin", Jason Wang, "David S. Miller", "Jakub Kicinski", Paolo Abeni
CC: "linux-kernel@vger.kernel.org", "kvm@vger.kernel.org", "virtualization@lists.linux-foundation.org", "netdev@vger.kernel.org", kernel
Subject: [RFC PATCH v1 1/8] virtio/vsock: rework packet allocation logic
Date: Thu, 12 May 2022 05:06:52 +0000
Message-ID: <3a8d9936-fc88-62ce-8c35-060b7d09b1bc@sberdevices.ru>
In-Reply-To: <7cdcb1e1-7c97-c054-19cf-5caeacae981d@sberdevices.ru>
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-State: RFC

To support zerocopy receive, packet buffer allocation is changed: buffers
which could be mapped to the user's vma cannot come from 'kmalloc()' (the
kernel does not allow slab pages to be mapped to a user's vma), so the raw
buddy allocator is called instead. For tx packets (such packets are never
mapped to the user), the previous 'kmalloc()' path is kept, together with a
special flag in the packet structure which distinguishes 'kmalloc()'
buffers from raw page buffers.

Signed-off-by: Arseniy Krasnov
---
 include/linux/virtio_vsock.h            | 1 +
 net/vmw_vsock/virtio_transport.c        | 8 ++++++--
 net/vmw_vsock/virtio_transport_common.c | 9 ++++++++-
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index 35d7eedb5e8e..d02cb7aa922f 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -50,6 +50,7 @@ struct virtio_vsock_pkt {
 	u32 off;
 	bool reply;
 	bool tap_delivered;
+	bool slab_buf;
 };
 
 struct virtio_vsock_pkt_info {
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index fb3302fff627..43b7b09b4a0a 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -254,16 +254,20 @@ static void virtio_vsock_rx_fill(struct virtio_vsock *vsock)
 	vq = vsock->vqs[VSOCK_VQ_RX];
 
 	do {
+		struct page *buf_page;
+
 		pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);
 		if (!pkt)
 			break;
 
-		pkt->buf = kmalloc(buf_len, GFP_KERNEL);
-		if (!pkt->buf) {
+		buf_page = alloc_page(GFP_KERNEL);
+
+		if (!buf_page) {
 			virtio_transport_free_pkt(pkt);
 			break;
 		}
 
+		pkt->buf = page_to_virt(buf_page);
 		pkt->buf_len = buf_len;
 		pkt->len = buf_len;
 
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index ec2c2afbf0d0..278567f748f2 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -69,6 +69,7 @@ virtio_transport_alloc_pkt(struct virtio_vsock_pkt_info *info,
 		if (!pkt->buf)
 			goto out_pkt;
 
+		pkt->slab_buf = true;
 		pkt->buf_len = len;
 
 		err = memcpy_from_msg(pkt->buf, info->msg, len);
@@ -1342,7 +1343,13 @@ EXPORT_SYMBOL_GPL(virtio_transport_recv_pkt);
 
 void virtio_transport_free_pkt(struct virtio_vsock_pkt *pkt)
 {
-	kfree(pkt->buf);
+	if (pkt->buf_len) {
+		if (pkt->slab_buf)
+			kfree(pkt->buf);
+		else
+			free_pages(buf, get_order(pkt->buf_len));
+	}
+
 	kfree(pkt);
 }
 EXPORT_SYMBOL_GPL(virtio_transport_free_pkt);

From patchwork Thu May 12 05:09:19 2022
X-Patchwork-Submitter: Arseniy Krasnov
X-Patchwork-Id: 12847038
From: Arseniy Krasnov
Subject: [RFC PATCH v1 2/8] vhost/vsock: rework packet allocation logic
Date: Thu, 12 May 2022 05:09:19 +0000
Message-ID: <988e9e3c-7993-d6e2-626d-deb46248ed9f@sberdevices.ru>
In-Reply-To: <7cdcb1e1-7c97-c054-19cf-5caeacae981d@sberdevices.ru>
X-Patchwork-State: RFC

For packets received from the virtio RX queue, use the buddy allocator
instead of 'kmalloc()' so that such pages can be inserted into a
user-provided vma. The single call to 'copy_from_iter()' is replaced with
a per-page loop.
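
As a rough userspace analogue of the per-page loop (a sketch only:
'memcpy()' stands in for 'copy_from_iter()', the page size is assumed to
be 4096, and the 'demo_*' names are hypothetical):

```c
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/*
 * Copy len bytes from src into an array of separate page-sized buffers,
 * at most one page per iteration. Returns the number of pages written.
 * The caller must supply enough pages to hold len bytes.
 */
static size_t demo_copy_per_page(unsigned char **pages,
				 const unsigned char *src, size_t len)
{
	size_t page_idx = 0;

	while (len > 0) {
		size_t to_copy = len < DEMO_PAGE_SIZE ? len : DEMO_PAGE_SIZE;

		memcpy(pages[page_idx], src, to_copy);

		src += to_copy;
		len -= to_copy;
		page_idx++;
	}

	return page_idx;
}
```

Only the last page may be partially filled; every earlier page receives
exactly one page of payload.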

Signed-off-by: Arseniy Krasnov
---
 drivers/vhost/vsock.c | 49 ++++++++++++++++++++++++++++++++++++-------
 1 file changed, 41 insertions(+), 8 deletions(-)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 37f0b4274113..157798985389 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -360,6 +360,9 @@ vhost_vsock_alloc_pkt(struct vhost_virtqueue *vq,
 	struct iov_iter iov_iter;
 	size_t nbytes;
 	size_t len;
+	struct page *buf_page;
+	ssize_t pkt_len;
+	int page_idx;
 
 	if (in != 0) {
 		vq_err(vq, "Expected 0 input buffers, got %u\n", in);
@@ -393,20 +396,50 @@ vhost_vsock_alloc_pkt(struct vhost_virtqueue *vq,
 		return NULL;
 	}
 
-	pkt->buf = kmalloc(pkt->len, GFP_KERNEL);
-	if (!pkt->buf) {
+	/* This creates memory overhead, as we allocate
+	 * at least one page for each packet.
+	 */
+	buf_page = alloc_pages(GFP_KERNEL, get_order(pkt->len));
+
+	if (buf_page == NULL) {
 		kfree(pkt);
 		return NULL;
 	}
 
+	pkt->buf = page_to_virt(buf_page);
 	pkt->buf_len = pkt->len;
 
-	nbytes = copy_from_iter(pkt->buf, pkt->len, &iov_iter);
-	if (nbytes != pkt->len) {
-		vq_err(vq, "Expected %u byte payload, got %zu bytes\n",
-		       pkt->len, nbytes);
-		virtio_transport_free_pkt(pkt);
-		return NULL;
+	page_idx = 0;
+	pkt_len = pkt->len;
+
+	/* As allocated pages are not mapped, process
+	 * pages one by one.
+	 */
+	while (pkt_len > 0) {
+		void *mapped;
+		size_t to_copy;
+
+		mapped = kmap(buf_page + page_idx);
+
+		if (mapped == NULL) {
+			virtio_transport_free_pkt(pkt);
+			return NULL;
+		}
+
+		to_copy = min(pkt_len, ((ssize_t)PAGE_SIZE));
+
+		nbytes = copy_from_iter(mapped, to_copy, &iov_iter);
+		if (nbytes != to_copy) {
+			vq_err(vq, "Expected %zu byte payload, got %zu bytes\n",
+			       to_copy, nbytes);
+			virtio_transport_free_pkt(pkt);
+			return NULL;
+		}
+
+		kunmap(buf_page + page_idx);
+
+		pkt_len -= to_copy;
+		page_idx++;
 	}
 
 	return pkt;

From patchwork Thu May 12 05:12:40 2022
X-Patchwork-Submitter: Arseniy Krasnov
X-Patchwork-Id: 12847041
From: Arseniy Krasnov
Subject: [RFC PATCH v1 3/8] af_vsock: add zerocopy receive logic
Date: Thu, 12 May 2022 05:12:40 +0000
Message-ID: <44d2404f-dc4f-f42c-1235-2ad7f537a030@sberdevices.ru>
In-Reply-To: <7cdcb1e1-7c97-c054-19cf-5caeacae981d@sberdevices.ru>
X-Patchwork-State: RFC

This patch:
1) Adds a callback for the 'mmap()' call on a socket. It checks the vm
   area flags and sets the vm area ops.
2) Adds a special 'getsockopt()' case which calls the transport zerocopy
   callback. The input argument is the vm area address.
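
From userspace, the intended call sequence would presumably look like the
sketch below. This is an assumption-laden illustration, not a tested
client: 'SO_VM_SOCKETS_ZEROCOPY' exists only with this RFC applied (the
value 9 is taken from the diff), the use of 'MAP_SHARED' is a guess, and
'demo_vsock_zerocopy_rx()' is a hypothetical helper name:

```c
#include <stddef.h>
#include <sys/mman.h>
#include <sys/socket.h>

#ifndef AF_VSOCK
#define AF_VSOCK 40
#endif

#ifndef SO_VM_SOCKETS_ZEROCOPY
#define SO_VM_SOCKETS_ZEROCOPY 9	/* value proposed by this RFC */
#endif

/*
 * Map a read-only area on a connected vsock socket and ask the kernel to
 * fill it with received pages. Returns the mapping, or NULL on failure.
 */
static void *demo_vsock_zerocopy_rx(int fd, size_t map_len)
{
	unsigned long addr;
	socklen_t optlen = sizeof(addr);
	void *area;

	/* The mmap callback rejects writable or executable mappings. */
	area = mmap(NULL, map_len, PROT_READ, MAP_SHARED, fd, 0);
	if (area == MAP_FAILED)
		return NULL;

	/* The option value carries the start address of the mapping. */
	addr = (unsigned long)area;
	if (getsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_ZEROCOPY, &addr, &optlen)) {
		munmap(area, map_len);
		return NULL;
	}

	return area;
}
```

On a kernel without this series the 'mmap()' call on the socket is
expected to fail, in which case the helper simply returns NULL.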

Signed-off-by: Arseniy Krasnov
---
 include/net/af_vsock.h          |  4 +++
 include/uapi/linux/vm_sockets.h |  2 ++
 net/vmw_vsock/af_vsock.c        | 61 +++++++++++++++++++++++++++++++++
 3 files changed, 67 insertions(+)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index ab207677e0a8..d0aefb9ee4cf 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -135,6 +135,10 @@ struct vsock_transport {
 	bool (*stream_is_active)(struct vsock_sock *);
 	bool (*stream_allow)(u32 cid, u32 port);
 
+	int (*zerocopy_dequeue)(struct vsock_sock *vsk,
+				struct vm_area_struct *vma,
+				unsigned long addr);
+
 	/* SEQ_PACKET. */
 	ssize_t (*seqpacket_dequeue)(struct vsock_sock *vsk, struct msghdr *msg,
 				     int flags);
diff --git a/include/uapi/linux/vm_sockets.h b/include/uapi/linux/vm_sockets.h
index c60ca33eac59..62aec51a2bc3 100644
--- a/include/uapi/linux/vm_sockets.h
+++ b/include/uapi/linux/vm_sockets.h
@@ -83,6 +83,8 @@
 
 #define SO_VM_SOCKETS_CONNECT_TIMEOUT_NEW 8
 
+#define SO_VM_SOCKETS_ZEROCOPY 9
+
 #if !defined(__KERNEL__)
 #if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
 #define SO_VM_SOCKETS_CONNECT_TIMEOUT SO_VM_SOCKETS_CONNECT_TIMEOUT_OLD
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 38baeb189d4e..3f98477ea546 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1652,6 +1652,42 @@ static int vsock_connectible_setsockopt(struct socket *sock,
 	return err;
 }
 
+static const struct vm_operations_struct afvsock_vm_ops = {
+};
+
+static int vsock_recv_zerocopy(struct socket *sock,
+			       unsigned long address)
+{
+	struct sock *sk = sock->sk;
+	struct vsock_sock *vsk = vsock_sk(sk);
+	struct vm_area_struct *vma;
+	const struct vsock_transport *transport;
+	int res;
+
+	transport = vsk->transport;
+
+	if (!transport->zerocopy_dequeue)
+		return -EOPNOTSUPP;
+
+	lock_sock(sk);
+	mmap_write_lock(current->mm);
+
+	vma = vma_lookup(current->mm, address);
+
+	if (!vma || vma->vm_ops != &afvsock_vm_ops) {
+		mmap_write_unlock(current->mm);
+		release_sock(sk);
+		return -EINVAL;
+	}
+
+	res = transport->zerocopy_dequeue(vsk, vma, address);
+
+	mmap_write_unlock(current->mm);
+	release_sock(sk);
+
+	return res;
+}
+
 static int vsock_connectible_getsockopt(struct socket *sock,
 					int level, int optname,
 					char __user *optval,
@@ -1696,6 +1732,17 @@ static int vsock_connectible_getsockopt(struct socket *sock,
 		lv = sock_get_timeout(vsk->connect_timeout, &v,
 				      optname == SO_VM_SOCKETS_CONNECT_TIMEOUT_OLD);
 		break;
+	case SO_VM_SOCKETS_ZEROCOPY: {
+		unsigned long vma_addr;
+
+		if (len < sizeof(vma_addr))
+			return -EINVAL;
+
+		if (copy_from_user(&vma_addr, optval, sizeof(vma_addr)))
+			return -EFAULT;
+
+		return vsock_recv_zerocopy(sock, vma_addr);
+	}
 
 	default:
 		return -ENOPROTOOPT;
@@ -2124,6 +2171,19 @@ vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
 	return err;
 }
 
+static int afvsock_mmap(struct file *file, struct socket *sock,
+			struct vm_area_struct *vma)
+{
+	if (vma->vm_flags & (VM_WRITE | VM_EXEC))
+		return -EPERM;
+
+	vma->vm_flags &= ~(VM_MAYWRITE | VM_MAYEXEC);
+	vma->vm_flags |= (VM_MIXEDMAP);
+	vma->vm_ops = &afvsock_vm_ops;
+
+	return 0;
+}
+
 static const struct proto_ops vsock_stream_ops = {
 	.family = PF_VSOCK,
 	.owner = THIS_MODULE,
@@ -2143,6 +2203,7 @@ static const struct proto_ops vsock_stream_ops = {
 	.recvmsg = vsock_connectible_recvmsg,
 	.mmap = sock_no_mmap,
 	.sendpage = sock_no_sendpage,
+	.mmap = afvsock_mmap,
 };
 
 static const struct proto_ops vsock_seqpacket_ops = {

From patchwork Thu May 12 05:14:49 2022
X-Patchwork-Submitter: Arseniy Krasnov
X-Patchwork-Id: 12847042
From: Arseniy Krasnov
Subject: [RFC PATCH v1 4/8] virtio/vsock: add transport zerocopy callback
Date: Thu, 12 May 2022 05:14:49 +0000
Message-ID: <9c1fa0ba-76b3-6214-4b9f-879bf932fa9c@sberdevices.ru>
In-Reply-To: <7cdcb1e1-7c97-c054-19cf-5caeacae981d@sberdevices.ru>
X-Patchwork-State: RFC

This adds a transport callback which processes the rx queue of the socket
and, instead of copying data to a user-provided buffer, inserts the data
pages of each packet into the user's vm area.
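
Judging by the code in this patch, the first page of the mapped area
holds an array of 'struct virtio_vsock_usr_hdr' entries terminated by an
entry with 'len' == 0, and the data pages follow in the same order. A
reader on the user side might walk that layout roughly as follows (a
sketch under those assumptions, with a 4096-byte page size; all 'demo_*'
names are hypothetical):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Userspace mirror of struct virtio_vsock_usr_hdr from this patch. */
struct demo_usr_hdr {
	uint32_t flags;
	uint32_t len;
};

typedef void (*demo_pkt_cb)(const unsigned char *data, uint32_t len,
			    uint32_t flags, void *ctx);

/*
 * Walk the header page and call cb once per header entry. Each entry
 * covers ceil(len / page size) data pages. Returns the number of entries
 * visited.
 */
static size_t demo_walk_mapping(const unsigned char *area, size_t area_len,
				demo_pkt_cb cb, void *ctx)
{
	size_t max_hdrs = DEMO_PAGE_SIZE / sizeof(struct demo_usr_hdr);
	size_t data_off = DEMO_PAGE_SIZE;	/* data starts after header page */
	size_t n;

	for (n = 0; n < max_hdrs; n++) {
		struct demo_usr_hdr hdr;
		size_t span;

		memcpy(&hdr, area + n * sizeof(hdr), sizeof(hdr));
		if (hdr.len == 0)
			break;	/* terminating entry */

		span = ((size_t)hdr.len + DEMO_PAGE_SIZE - 1) /
		       DEMO_PAGE_SIZE * DEMO_PAGE_SIZE;
		if (data_off + span > area_len)
			break;	/* entry would run past the mapping */

		cb(area + data_off, hdr.len, hdr.flags, ctx);
		data_off += span;
	}

	return n;
}

/* Example callback: accumulate the total payload length into *ctx. */
static void demo_sum_len(const unsigned char *data, uint32_t len,
			 uint32_t flags, void *ctx)
{
	(void)data;
	(void)flags;
	*(size_t *)ctx += len;
}
```

Headers are read with 'memcpy()' so the sketch does not depend on the
alignment or effective type of the mapped bytes.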

Signed-off-by: Arseniy Krasnov
---
 include/linux/virtio_vsock.h            |   4 +
 include/uapi/linux/virtio_vsock.h       |   5 +
 net/vmw_vsock/virtio_transport_common.c | 195 +++++++++++++++++++++++-
 3 files changed, 201 insertions(+), 3 deletions(-)

diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index d02cb7aa922f..47a68a2ea838 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -51,6 +51,7 @@ struct virtio_vsock_pkt {
 	bool reply;
 	bool tap_delivered;
 	bool slab_buf;
+	bool split;
 };
 
 struct virtio_vsock_pkt_info {
@@ -131,6 +132,9 @@ int virtio_transport_dgram_bind(struct vsock_sock *vsk,
 				struct sockaddr_vm *addr);
 bool virtio_transport_dgram_allow(u32 cid, u32 port);
 
+int virtio_transport_zerocopy_dequeue(struct vsock_sock *vsk,
+				      struct vm_area_struct *vma,
+				      unsigned long addr);
 int virtio_transport_connect(struct vsock_sock *vsk);
 
 int virtio_transport_shutdown(struct vsock_sock *vsk, int mode);
diff --git a/include/uapi/linux/virtio_vsock.h b/include/uapi/linux/virtio_vsock.h
index 64738838bee5..214ac9727307 100644
--- a/include/uapi/linux/virtio_vsock.h
+++ b/include/uapi/linux/virtio_vsock.h
@@ -66,6 +66,11 @@ struct virtio_vsock_hdr {
 	__le32 fwd_cnt;
 } __attribute__((packed));
 
+struct virtio_vsock_usr_hdr {
+	u32 flags;
+	u32 len;
+} __attribute__((packed));
+
 enum virtio_vsock_type {
 	VIRTIO_VSOCK_TYPE_STREAM = 1,
 	VIRTIO_VSOCK_TYPE_SEQPACKET = 2,
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 278567f748f2..3c7ac47a8672 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -347,6 +348,183 @@ virtio_transport_stream_do_peek(struct vsock_sock *vsk,
 	return err;
 }
 
+#define MAX_PAGES_TO_MAP 256
+
+int virtio_transport_zerocopy_dequeue(struct vsock_sock *vsk,
+				      struct vm_area_struct *vma,
+				      unsigned long addr)
+{
+	struct virtio_vsock_sock *vvs = vsk->trans;
+	struct virtio_vsock_usr_hdr *usr_hdr_buffer;
+	unsigned long max_pages_to_insert;
+	unsigned long tmp_pages_inserted;
+	unsigned long pages_to_insert = 0;
+	struct page *usr_hdr_page;
+	unsigned long vma_size;
+	struct page **pages;
+	int max_vma_pages;
+	int max_usr_hdrs;
+	int res;
+	int err;
+	int i;
+
+	/* Only use VMA from first page. */
+	if (vma->vm_start != addr)
+		return -EFAULT;
+
+	vma_size = vma->vm_end - vma->vm_start;
+
+	/* Too small vma(at least one page for headers
+	 * and one page for data).
+	 */
+	if (vma_size < 2 * PAGE_SIZE)
+		return -EFAULT;
+
+	/* Page for meta data. */
+	usr_hdr_page = alloc_page(GFP_KERNEL);
+
+	if (!usr_hdr_page)
+		return -EFAULT;
+
+	pages = kmalloc_array(MAX_PAGES_TO_MAP, sizeof(pages[0]), GFP_KERNEL);
+
+	if (!pages)
+		return -EFAULT;
+
+	pages[pages_to_insert++] = usr_hdr_page;
+
+	usr_hdr_buffer = page_to_virt(usr_hdr_page);
+
+	err = 0;
+
+	/* As we use first page for headers, so total number of
+	 * pages for user is min between number of headers in
+	 * first page and size of vma(in pages, except first page).
+	 */
+	max_usr_hdrs = PAGE_SIZE / sizeof(*usr_hdr_buffer);
+	max_vma_pages = (vma_size / PAGE_SIZE) - 1;
+	max_pages_to_insert = min(max_usr_hdrs, max_vma_pages);
+
+	if (max_pages_to_insert > MAX_PAGES_TO_MAP)
+		max_pages_to_insert = MAX_PAGES_TO_MAP;
+
+	spin_lock_bh(&vvs->rx_lock);
+
+	while (!list_empty(&vvs->rx_queue) &&
+	       pages_to_insert < max_pages_to_insert) {
+		struct virtio_vsock_pkt *pkt;
+		ssize_t rest_data_bytes;
+		size_t moved_data_bytes;
+		unsigned long pg_offs;
+
+		pkt = list_first_entry(&vvs->rx_queue,
+				       struct virtio_vsock_pkt, list);
+
+		/* This could happen, when packet was dequeued before
+		 * by an ordinary 'read()' call. We can't handle such
+		 * packet. Drop it.
+		 */
+		if (pkt->off % PAGE_SIZE) {
+			list_del(&pkt->list);
+			virtio_transport_dec_rx_pkt(vvs, pkt);
+			virtio_transport_free_pkt(pkt);
+			continue;
+		}
+
+		rest_data_bytes = le32_to_cpu(pkt->hdr.len) - pkt->off;
+
+		/* For packets, bigger than one page, split its
+		 * high order allocated buffer to 0 order pages.
+		 * Otherwise 'vm_insert_pages()' will fail, for
+		 * all pages except first.
+		 */
+		if (rest_data_bytes > PAGE_SIZE) {
+			/* High order buffer not split yet. */
+			if (!pkt->split) {
+				split_page(virt_to_page(pkt->buf),
+					   get_order(le32_to_cpu(pkt->hdr.len)));
+				pkt->split = true;
+			}
+		}
+
+		pg_offs = pkt->off;
+		moved_data_bytes = 0;
+
+		while (rest_data_bytes &&
+		       pages_to_insert < max_pages_to_insert) {
+			struct page *buf_page;
+
+			buf_page = virt_to_page(pkt->buf + pg_offs);
+
+			pages[pages_to_insert++] = buf_page;
+			/* Get reference to prevent this page being
+			 * returned to page allocator when packet will
+			 * be freed. Ref count will be 2.
+			 */
+			get_page(buf_page);
+			pg_offs += PAGE_SIZE;
+
+			if (rest_data_bytes >= PAGE_SIZE) {
+				moved_data_bytes += PAGE_SIZE;
+				rest_data_bytes -= PAGE_SIZE;
+			} else {
+				moved_data_bytes += rest_data_bytes;
+				rest_data_bytes = 0;
+			}
+		}
+
+		usr_hdr_buffer->flags = le32_to_cpu(pkt->hdr.flags);
+		usr_hdr_buffer->len = moved_data_bytes;
+		usr_hdr_buffer++;
+
+		pkt->off = pg_offs;
+
+		if (rest_data_bytes == 0) {
+			list_del(&pkt->list);
+			virtio_transport_dec_rx_pkt(vvs, pkt);
+			virtio_transport_free_pkt(pkt);
+		}
+
+		/* Now ref count for all pages of packet is 1. */
+	}
+
+	/* Set last buffer empty(if we have one). */
+	if (pages_to_insert - 1 < max_usr_hdrs)
+		usr_hdr_buffer->len = 0;
+
+	spin_unlock_bh(&vvs->rx_lock);
+
+	tmp_pages_inserted = pages_to_insert;
+
+	res = vm_insert_pages(vma, addr, pages, &tmp_pages_inserted);
+
+	if (res || tmp_pages_inserted) {
+		/* Failed to insert some pages, we have "partially"
+		 * mapped vma. Do not return, set error code. This
+		 * code will be returned to user. User needs to call
+		 * 'madvise()/mmap()' to clear this vma. Anyway,
+		 * references to all pages will be dropped below.
+		 */
+		err = -EFAULT;
+	}
+
+	/* Put reference for every page. */
+	for (i = 0; i < pages_to_insert; i++) {
+		/* Ref count is 2 ('get_page()' + 'vm_insert_pages()' above).
+		 * Put reference once, page will be returned to allocator
+		 * after user's 'madvise()/munmap()' call(or it wasn't mapped
+		 * if 'vm_insert_pages()' failed).
+		 */
+		put_page(pages[i]);
+	}
+
+	virtio_transport_send_credit_update(vsk);
+	kfree(pages);
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_zerocopy_dequeue);
+
 static ssize_t
 virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
 				   struct msghdr *msg,
@@ -1344,10 +1522,21 @@ EXPORT_SYMBOL_GPL(virtio_transport_recv_pkt);
 void virtio_transport_free_pkt(struct virtio_vsock_pkt *pkt)
 {
 	if (pkt->buf_len) {
-		if (pkt->slab_buf)
+		if (pkt->slab_buf) {
 			kfree(pkt->buf);
-		else
-			free_pages(buf, get_order(pkt->buf_len));
+		} else {
+			unsigned int order = get_order(pkt->buf_len);
+			unsigned long buf = (unsigned long)pkt->buf;
+
+			if (pkt->split) {
+				int i;
+
+				for (i = 0; i < (1 << order); i++)
+					free_page(buf + i * PAGE_SIZE);
+			} else {
+				free_pages(buf, order);
+			}
+		}
 	}
 
 	kfree(pkt);

From patchwork Thu May 12 05:17:01 2022
X-Patchwork-Submitter: Arseniy Krasnov
X-Patchwork-Id: 12847045
From: Arseniy Krasnov
Subject: [RFC PATCH v1 5/8] vhost/vsock: enable zerocopy callback
Date: Thu, 12 May 2022 05:17:01 +0000
In-Reply-To: <7cdcb1e1-7c97-c054-19cf-5caeacae981d@sberdevices.ru>
X-Patchwork-State: RFC

This adds the zerocopy callback to the vhost transport.

Signed-off-by: Arseniy Krasnov
---
 drivers/vhost/vsock.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 157798985389..93119d529fb0 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -484,6 +484,7 @@ static struct virtio_transport vhost_transport = {
 		.stream_rcvhiwat = virtio_transport_stream_rcvhiwat,
 		.stream_is_active = virtio_transport_stream_is_active,
 		.stream_allow = virtio_transport_stream_allow,
+		.zerocopy_dequeue = virtio_transport_zerocopy_dequeue,
 
 		.seqpacket_dequeue = virtio_transport_seqpacket_dequeue,
 		.seqpacket_enqueue = virtio_transport_seqpacket_enqueue,

From patchwork Thu May 12 05:18:47 2022
X-Patchwork-Submitter: Arseniy Krasnov
X-Patchwork-Id: 12847046
From: Arseniy Krasnov
Subject: [RFC PATCH v1 6/8] virtio/vsock: enable zerocopy callback
Date: Thu, 12 May 2022 05:18:47 +0000
In-Reply-To: <7cdcb1e1-7c97-c054-19cf-5caeacae981d@sberdevices.ru>
X-Patchwork-State: RFC

This adds the zerocopy callback to the virtio transport.
Signed-off-by: Arseniy Krasnov --- net/vmw_vsock/virtio_transport.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index 43b7b09b4a0a..ea0e1567cfa8 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -478,6 +478,7 @@ static struct virtio_transport virtio_transport = { .stream_rcvhiwat = virtio_transport_stream_rcvhiwat, .stream_is_active = virtio_transport_stream_is_active, .stream_allow = virtio_transport_stream_allow, + .zerocopy_dequeue = virtio_transport_zerocopy_dequeue, .seqpacket_dequeue = virtio_transport_seqpacket_dequeue, .seqpacket_enqueue = virtio_transport_seqpacket_enqueue, From patchwork Thu May 12 05:20:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 12847052 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 12E8CC433F5 for ; Thu, 12 May 2022 05:21:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1349738AbiELFVl (ORCPT ); Thu, 12 May 2022 01:21:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36370 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238423AbiELFVj (ORCPT ); Thu, 12 May 2022 01:21:39 -0400 Received: from mail.sberdevices.ru (mail.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D130640A04; Wed, 11 May 2022 22:21:36 -0700 (PDT) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mail.sberdevices.ru (Postfix) with ESMTP id E25FA5FD06; Thu, 12 May 2022 08:21:34 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1652332894; 
bh=T/Ygb1LAWbzswgKlynNdfb6VDE5z2e3CZmOLTaulrb4=; h=From:To:Subject:Date:Message-ID:Content-Type:MIME-Version; b=NLSAwBbdGFb85oQ5zFdOtmmE549pp/g9T7kZhziRH88ZPj95RJpTkfU8bMw2wRejt 3HYGfHOAvzNZD9EChANNZmm4gPBAvOYgS7ctm2OHEUlEoeiNvTgQ3TIFBcmL92MgBN iyCcyKARJJpFn5Q6woWSdPShPnRrU/yS8LpbiCvh7/TbCmhhvBzA76QN+/Cv8WGDDB 6+z9D4rBpQapP3Pcm1e2B0W5cBT6AbF9X+kA+tDigIDvPiHOX6VeQF7QhaQPjr6R3Y QD3kTlBshi4i4nPxkiSH7kFkNe5y1OonpgxMRKvW+duqZ73zxEgjoW4yMbe9Gv8RoU yhXHtog/gUb5g== Received: from S-MS-EXCH02.sberdevices.ru (S-MS-EXCH02.sberdevices.ru [172.16.1.5]) by mail.sberdevices.ru (Postfix) with ESMTP; Thu, 12 May 2022 08:21:33 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "Michael S. Tsirkin" , Jason Wang , "David S. Miller" , "Jakub Kicinski" , Paolo Abeni CC: "linux-kernel@vger.kernel.org" , "kvm@vger.kernel.org" , "virtualization@lists.linux-foundation.org" , "netdev@vger.kernel.org" , kernel , Krasnov Arseniy , Arseniy Krasnov Subject: [RFC PATCH v1 7/8] test/vsock: add receive zerocopy tests Thread-Topic: [RFC PATCH v1 7/8] test/vsock: add receive zerocopy tests Thread-Index: AQHYZcAL6Dv8pZsrWUCyze6KSb02uQ== Date: Thu, 12 May 2022 05:20:50 +0000 Message-ID: In-Reply-To: <7cdcb1e1-7c97-c054-19cf-5caeacae981d@sberdevices.ru> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [172.16.1.12] Content-ID: <94E418106F5A834CA13E1A9A556CC82E@sberdevices.ru> MIME-Version: 1.0 X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2022/05/12 02:55:00 #19424207 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-State: RFC This adds tests for zerocopy feature: one test checks data transmission with simple integrity control. 
The second test covers the error branches of the zerocopy logic (to check handling of invalid arguments). Signed-off-by: Arseniy Krasnov --- tools/include/uapi/linux/virtio_vsock.h | 10 + tools/include/uapi/linux/vm_sockets.h | 7 + tools/testing/vsock/control.c | 34 +++ tools/testing/vsock/control.h | 2 + tools/testing/vsock/vsock_test.c | 284 ++++++++++++++++++++++++ 5 files changed, 337 insertions(+) create mode 100644 tools/include/uapi/linux/virtio_vsock.h create mode 100644 tools/include/uapi/linux/vm_sockets.h diff --git a/tools/include/uapi/linux/virtio_vsock.h b/tools/include/uapi/linux/virtio_vsock.h new file mode 100644 index 000000000000..df04e50b3dd7 --- /dev/null +++ b/tools/include/uapi/linux/virtio_vsock.h @@ -0,0 +1,10 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef _UAPI_LINUX_VIRTIO_VSOCK_H +#define _UAPI_LINUX_VIRTIO_VSOCK_H +#include + +struct virtio_vsock_usr_hdr { + u32 flags; + u32 len; +} __attribute__((packed)); +#endif /* _UAPI_LINUX_VIRTIO_VSOCK_H */ diff --git a/tools/include/uapi/linux/vm_sockets.h b/tools/include/uapi/linux/vm_sockets.h new file mode 100644 index 000000000000..001aaba03eea --- /dev/null +++ b/tools/include/uapi/linux/vm_sockets.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef _UAPI_LINUX_VM_SOCKETS_H +#define _UAPI_LINUX_VM_SOCKETS_H + +#define SO_VM_SOCKETS_ZEROCOPY 9 + +#endif /* _UAPI_LINUX_VM_SOCKETS_H */ diff --git a/tools/testing/vsock/control.c b/tools/testing/vsock/control.c index 4874872fc5a3..00a654e8f137 100644 --- a/tools/testing/vsock/control.c +++ b/tools/testing/vsock/control.c @@ -141,6 +141,40 @@ void control_writeln(const char *str) timeout_end(); } +void control_writelong(long value) +{ + char str[32]; + + if (snprintf(str, sizeof(str), "%li", value) >= sizeof(str)) { + perror("snprintf"); + exit(EXIT_FAILURE); + } + + control_writeln(str); +} + +long control_readlong(bool *ok) +{ + long value = -1; + char *str; + + if (ok) + *ok = false; + + str = control_readln(); + + if (str == 
NULL) + return value; + + value = strtol(str, NULL, 10); + free(str); + + if (ok) + *ok = true; + + return value; +} + /* Return the next line from the control socket (without the trailing newline). * * The program terminates if a timeout occurs. diff --git a/tools/testing/vsock/control.h b/tools/testing/vsock/control.h index 51814b4f9ac1..5272ad20e850 100644 --- a/tools/testing/vsock/control.h +++ b/tools/testing/vsock/control.h @@ -9,7 +9,9 @@ void control_init(const char *control_host, const char *control_port, void control_cleanup(void); void control_writeln(const char *str); char *control_readln(void); +long control_readlong(bool *ok); void control_expectln(const char *str); bool control_cmpln(char *line, const char *str, bool fail); +void control_writelong(long value); #endif /* CONTROL_H */ diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c index dc577461afc2..0d887c7d9474 100644 --- a/tools/testing/vsock/vsock_test.c +++ b/tools/testing/vsock/vsock_test.c @@ -18,11 +18,16 @@ #include #include #include +#include +#include +#include #include "timeout.h" #include "control.h" #include "util.h" +#define PAGE_SIZE 4096 + static void test_stream_connection_reset(const struct test_opts *opts) { union { @@ -596,6 +601,274 @@ static void test_seqpacket_invalid_rec_buffer_server(const struct test_opts *opt close(fd); } +static void test_stream_zerocopy_rx_client(const struct test_opts *opts) +{ + unsigned long total_sum; + size_t rx_map_len; + long rec_value; + void *rx_va; + int fd; + + fd = vsock_stream_connect(opts->peer_cid, 1234); + if (fd < 0) { + perror("connect"); + exit(EXIT_FAILURE); + } + + rx_map_len = PAGE_SIZE * 3; + + rx_va = mmap(NULL, rx_map_len, PROT_READ, MAP_SHARED, fd, 0); + if (rx_va == MAP_FAILED) { + perror("mmap"); + exit(EXIT_FAILURE); + } + + total_sum = 0; + + while (1) { + struct pollfd fds = { 0 }; + int hungup = 0; + int res; + + fds.fd = fd; + fds.events = POLLIN | POLLERR | POLLHUP | + POLLRDHUP | 
POLLNVAL; + + res = poll(&fds, 1, -1); + + if (res < 0) { + perror("poll"); + exit(EXIT_FAILURE); + } + + if (fds.revents & POLLERR) { + fprintf(stderr, "poll error\n"); + exit(EXIT_FAILURE); + } + + if (fds.revents & POLLIN) { + struct virtio_vsock_usr_hdr *hdr; + uintptr_t tmp_rx_va = (uintptr_t)rx_va; + unsigned char *data_va; + unsigned char *end_va; + socklen_t len = sizeof(tmp_rx_va); + + if (getsockopt(fd, AF_VSOCK, + SO_VM_SOCKETS_ZEROCOPY, + &tmp_rx_va, &len) < 0) { + perror("getsockopt"); + exit(EXIT_FAILURE); + } + + hdr = (struct virtio_vsock_usr_hdr *)rx_va; + /* Skip headers page for data. */ + data_va = rx_va + PAGE_SIZE; + end_va = (unsigned char *)(tmp_rx_va + rx_map_len); + + while (data_va != end_va) { + int data_len = hdr->len; + + if (!hdr->len) { + if (fds.revents & (POLLHUP | POLLRDHUP)) { + if (hdr == rx_va) + hungup = 1; + } + + break; + } + + while (data_len > 0) { + int i; + int to_read = (data_len < PAGE_SIZE) ? + data_len : PAGE_SIZE; + + for (i = 0; i < to_read; i++) + total_sum += data_va[i]; + + data_va += PAGE_SIZE; + data_len -= PAGE_SIZE; + } + + hdr++; + } + + if (madvise((void *)rx_va, rx_map_len, + MADV_DONTNEED)) { + perror("madvise"); + exit(EXIT_FAILURE); + } + + if (hungup) + break; + } + } + + if (munmap(rx_va, rx_map_len)) { + perror("munmap"); + exit(EXIT_FAILURE); + } + + rec_value = control_readlong(NULL); + + if (total_sum != rec_value) { + fprintf(stderr, "sum mismatch %lu != %ld\n", + total_sum, rec_value); + exit(EXIT_FAILURE); + } + + close(fd); +} + +static void test_stream_zerocopy_rx_server(const struct test_opts *opts) +{ + size_t max_buf_size = 40000; + long total_sum = 0; + int n = 10; + int fd; + + fd = vsock_stream_accept(VMADDR_CID_ANY, 1234, NULL); + if (fd < 0) { + perror("accept"); + exit(EXIT_FAILURE); + } + + while (n) { + unsigned char *data; + size_t buf_size; + int i; + + buf_size = 1 + rand() % max_buf_size; + + data = malloc(buf_size); + + if (!data) { + perror("malloc"); + exit(EXIT_FAILURE); + } + + for 
(i = 0; i < buf_size; i++) { + data[i] = rand() & 0xff; + total_sum += data[i]; + } + + if (write(fd, data, buf_size) != buf_size) { + perror("write"); + exit(EXIT_FAILURE); + } + + free(data); + n--; + } + + control_writelong(total_sum); + + close(fd); +} + +static void test_stream_zerocopy_rx_inv_client(const struct test_opts *opts) +{ + size_t map_size = PAGE_SIZE * 5; + socklen_t len; + void *map_va; + int fd; + + fd = vsock_stream_connect(opts->peer_cid, 1234); + if (fd < 0) { + perror("connect"); + exit(EXIT_FAILURE); + } + + len = sizeof(map_va); + map_va = 0; + + /* Try zerocopy with invalid mapping address. */ + if (getsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_ZEROCOPY, + &map_va, &len) == 0) { + perror("getsockopt"); + exit(EXIT_FAILURE); + } + + /* Try zerocopy with valid, but not socket mapping. */ + map_va = mmap(NULL, map_size, PROT_READ, + MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); + if (map_va == MAP_FAILED) { + perror("anon mmap"); + exit(EXIT_FAILURE); + } + + if (getsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_ZEROCOPY, + &map_va, &len) == 0) { + perror("getsockopt"); + exit(EXIT_FAILURE); + } + + if (munmap(map_va, map_size)) { + perror("munmap"); + exit(EXIT_FAILURE); + } + + /* Try zerocopy with valid, but too small mapping. */ + map_va = mmap(NULL, PAGE_SIZE, PROT_READ, MAP_SHARED, fd, 0); + if (map_va == MAP_FAILED) { + perror("socket mmap"); + exit(EXIT_FAILURE); + } + + //tmp_rx_va = (uintptr_t)map_va; + + if (getsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_ZEROCOPY, + &map_va, &len) == 0) { + perror("getsockopt"); + exit(EXIT_FAILURE); + } + + if (munmap(map_va, PAGE_SIZE)) { + perror("munmap"); + exit(EXIT_FAILURE); + } + + /* Try zerocopy with valid mapping, but not from first byte. 
*/ + map_va = mmap(NULL, map_size, PROT_READ, MAP_SHARED, fd, 0); + if (map_va == MAP_FAILED) { + perror("socket mmap"); + exit(EXIT_FAILURE); + } + + //tmp_rx_va = (uintptr_t)map_va + PAGE_SIZE; + map_va += PAGE_SIZE; + + if (getsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_ZEROCOPY, + &map_va, &len) == 0) { + perror("getsockopt"); + exit(EXIT_FAILURE); + } + + if (munmap(map_va - PAGE_SIZE, map_size)) { + perror("munmap"); + exit(EXIT_FAILURE); + } + + control_writeln("DONE"); + + close(fd); +} + +static void test_stream_zerocopy_rx_inv_server(const struct test_opts *opts) +{ + int fd; + + fd = vsock_stream_accept(VMADDR_CID_ANY, 1234, NULL); + + if (fd < 0) { + perror("accept"); + exit(EXIT_FAILURE); + } + + control_expectln("DONE"); + + close(fd); +} + static struct test_case test_cases[] = { { .name = "SOCK_STREAM connection reset", @@ -646,6 +919,16 @@ static struct test_case test_cases[] = { .run_client = test_seqpacket_invalid_rec_buffer_client, .run_server = test_seqpacket_invalid_rec_buffer_server, }, + { + .name = "SOCK_STREAM zerocopy receive", + .run_client = test_stream_zerocopy_rx_client, + .run_server = test_stream_zerocopy_rx_server, + }, + { + .name = "SOCK_STREAM zerocopy invalid", + .run_client = test_stream_zerocopy_rx_inv_client, + .run_server = test_stream_zerocopy_rx_inv_server, + }, {}, }; @@ -729,6 +1012,7 @@ int main(int argc, char **argv) .peer_cid = VMADDR_CID_ANY, }; + srand(time(NULL)); init_signals(); for (;;) { From patchwork Thu May 12 05:22:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 12847053 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CB35CC433EF for ; Thu, 12 May 2022 05:23:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via 
listexpand id S1349787AbiELFX2 (ORCPT ); Thu, 12 May 2022 01:23:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42982 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238423AbiELFX1 (ORCPT ); Thu, 12 May 2022 01:23:27 -0400 Received: from mail.sberdevices.ru (mail.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 037B166AF4; Wed, 11 May 2022 22:23:24 -0700 (PDT) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mail.sberdevices.ru (Postfix) with ESMTP id 1DB655FD06; Thu, 12 May 2022 08:23:22 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1652333002; bh=PmEc4Pwi+Le/+bdbdUbiEIApLyKor2XoQjIS3kLu5ng=; h=From:To:Subject:Date:Message-ID:Content-Type:MIME-Version; b=hFHdhWAaQIf7R+kXW9+l9b9fFJkMqZoOSu9F8GRdQ2ihhhapKFdIK18nxtpMDzAua scoiixfteCuiuUSOHUfvDXeFpzvFn8kqbz+MtcAcuH6JTfkd0XzYzPiFi5pGX+D4CC B0H8OI923CoUL+z1yUOqCEQey2Xl60JC0Yb743bKNnsRIIkko4wJ43dZseaUPjzh/o ve6B1BQCXpoOu1w/CaIE0130yYaBeEE1rDM4pde303xisYFYPY3tFLZsMNNlKo5oh7 9zwvOY1WwsBx64qYZKeKhlE/d2Srnngn/R95/jWesIQ/RABtMt/Cp7wCI39tCF3hOK 067hqTyuldu6Q== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mail.sberdevices.ru (Postfix) with ESMTP; Thu, 12 May 2022 08:23:21 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "Michael S. Tsirkin" , Jason Wang , "David S. 
Miller" , "Jakub Kicinski" , Paolo Abeni CC: "linux-kernel@vger.kernel.org" , "kvm@vger.kernel.org" , "virtualization@lists.linux-foundation.org" , "netdev@vger.kernel.org" , kernel , Arseniy Krasnov , Krasnov Arseniy Subject: [RFC PATCH v1 8/8] test/vsock: vsock rx zerocopy utility Thread-Topic: [RFC PATCH v1 8/8] test/vsock: vsock rx zerocopy utility Thread-Index: AQHYZcBLyUYWHYMAWk+cQqC3/CFMhg== Date: Thu, 12 May 2022 05:22:38 +0000 Message-ID: In-Reply-To: <7cdcb1e1-7c97-c054-19cf-5caeacae981d@sberdevices.ru> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [172.16.1.12] Content-ID: MIME-Version: 1.0 X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2022/05/12 02:55:00 #19424207 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-State: RFC This adds simple util for zerocopy benchmarking. Signed-off-by: Arseniy Krasnov --- tools/testing/vsock/Makefile | 1 + tools/testing/vsock/rx_zerocopy.c | 356 ++++++++++++++++++++++++++++++ 2 files changed, 357 insertions(+) create mode 100644 tools/testing/vsock/rx_zerocopy.c diff --git a/tools/testing/vsock/Makefile b/tools/testing/vsock/Makefile index f8293c6910c9..2cb5820ca2f3 100644 --- a/tools/testing/vsock/Makefile +++ b/tools/testing/vsock/Makefile @@ -3,6 +3,7 @@ all: test test: vsock_test vsock_diag_test vsock_test: vsock_test.o timeout.o control.o util.o vsock_diag_test: vsock_diag_test.o timeout.o control.o util.o +rx_zerocopy: rx_zerocopy.o timeout.o control.o util.o CFLAGS += -g -O2 -Werror -Wall -I. 
-I../../include -I../../../usr/include -Wno-pointer-sign -fno-strict-overflow -fno-strict-aliasing -fno-common -MMD -U_FORTIFY_SOURCE -D_GNU_SOURCE .PHONY: all test clean diff --git a/tools/testing/vsock/rx_zerocopy.c b/tools/testing/vsock/rx_zerocopy.c new file mode 100644 index 000000000000..4db1a7f3a1af --- /dev/null +++ b/tools/testing/vsock/rx_zerocopy.c @@ -0,0 +1,356 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * rx_zerocopy - benchmark utility for zerocopy + * receive. + * + * Copyright (C) 2022 SberDevices. + * + * Author: Arseniy Krasnov + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "util.h" + +#define PAGE_SIZE 4096 + +#define DEFAULT_TX_SIZE 128 +#define DEFAULT_RX_SIZE 128 +#define DEFAULT_PORT 1234 + +static int client_mode = 1; +static int peer_cid = -1; +static int port = DEFAULT_PORT; +static unsigned long tx_buf_size; +static unsigned long rx_buf_size; +static unsigned long mb_to_send = 40; + +static time_t current_nsec(void) +{ + struct timespec ts; + + if (clock_gettime(CLOCK_REALTIME, &ts)) { + perror("clock_gettime"); + exit(EXIT_FAILURE); + } + + return (ts.tv_sec * 1000000000ULL) + ts.tv_nsec; +} + +/* Server accepts a connection and sends about 'mb_to_send' MB of random data. */ +static void run_server(void) +{ + int fd; + char *data; + int client_fd; + union { + struct sockaddr sa; + struct sockaddr_vm svm; + } addr = { + .svm = { + .svm_family = AF_VSOCK, + .svm_port = port, + .svm_cid = VMADDR_CID_ANY, + }, + }; + union { + struct sockaddr sa; + struct sockaddr_vm svm; + } clientaddr; + + socklen_t clientaddr_len = sizeof(clientaddr.svm); + time_t tx_begin_ns; + ssize_t total_send = 0; + unsigned long sum; + + fprintf(stderr, "Running server, listen %i, mb %lu tx buf %lu\n", + port, mb_to_send, tx_buf_size); + + fd = socket(AF_VSOCK, SOCK_STREAM, 0); + + if (fd < 0) { + perror("socket"); + exit(EXIT_FAILURE); + } + + if (bind(fd, &addr.sa, sizeof(addr.svm)) < 0) { + 
perror("bind"); + exit(EXIT_FAILURE); + } + + if (listen(fd, 1) < 0) { + perror("listen"); + exit(EXIT_FAILURE); + } + + client_fd = accept(fd, &clientaddr.sa, &clientaddr_len); + + if (client_fd < 0) { + perror("accept"); + exit(EXIT_FAILURE); + } + + data = malloc(tx_buf_size); + + if (data == NULL) { + fprintf(stderr, "malloc failed\n"); + exit(EXIT_FAILURE); + } + + sum = 0; + tx_begin_ns = current_nsec(); + + while (1) { + int i; + ssize_t sent; + + if (total_send > mb_to_send * 1024 * 1024ULL) + break; + + for (i = 0; i < tx_buf_size; i++) { + data[i] = rand() & 0xff; + sum += data[i]; + } + + sent = write(client_fd, data, tx_buf_size); + + if (sent <= 0) { + perror("write"); + exit(EXIT_FAILURE); + } + + total_send += sent; + } + + free(data); + + fprintf(stderr, "Total %lu MB, time %f\n", mb_to_send, + (float)(current_nsec() - tx_begin_ns)/1000.0/1000.0/1000.0); + + close(fd); + close(client_fd); +} + +static void run_client(int zerocopy) +{ + int fd; + union { + struct sockaddr sa; + struct sockaddr_vm svm; + } addr = { + .svm = { + .svm_family = AF_VSOCK, + .svm_port = port, + .svm_cid = peer_cid, + }, + }; + unsigned long sum = 0; + void *rx_va = NULL; + + printf("Running client, %s mode, peer %i:%i, rx buf %lu\n", + zerocopy ? 
"zerocopy" : "copy", peer_cid, port, + rx_buf_size); + + fd = socket(AF_VSOCK, SOCK_STREAM, 0); + + if (fd < 0) { + perror("socket"); + exit(EXIT_FAILURE); + } + + if (connect(fd, &addr.sa, sizeof(addr.svm))) { + perror("connect"); + exit(EXIT_FAILURE); + } + + if (zerocopy) { + rx_va = mmap(NULL, rx_buf_size, + PROT_READ, MAP_SHARED, fd, 0); + + if (rx_va == MAP_FAILED) { + perror("mmap"); + exit(EXIT_FAILURE); + } + } + + while (1) { + struct pollfd fds = { 0 }; + int done = 0; + + fds.fd = fd; + fds.events = POLLIN | POLLERR | POLLHUP | + POLLRDHUP | POLLNVAL; + + if (poll(&fds, 1, -1) < 0) { + perror("poll"); + exit(EXIT_FAILURE); + } + + if (fds.revents & (POLLHUP | POLLRDHUP)) + done = 1; + + if (fds.revents & POLLERR) { + fprintf(stderr, "Done error\n"); + break; + } + + if (fds.revents & POLLIN) { + if (zerocopy) { + struct virtio_vsock_usr_hdr *hdr; + uintptr_t tmp_rx_va = (uintptr_t)rx_va; + socklen_t len = sizeof(tmp_rx_va); + + if (getsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_ZEROCOPY, + &tmp_rx_va, &len) < 0) { + perror("getsockopt"); + exit(EXIT_FAILURE); + } + + hdr = (struct virtio_vsock_usr_hdr *)tmp_rx_va; + + if (!hdr->len) { + if (done) { + fprintf(stderr, "Done, sum %lu\n", sum); + break; + } + } + + tmp_rx_va += PAGE_SIZE; + + if (madvise((void *)rx_va, rx_buf_size, + MADV_DONTNEED)) { + perror("madvise"); + exit(EXIT_FAILURE); + } + } else { + char data[rx_buf_size - PAGE_SIZE]; + ssize_t bytes_read; + + bytes_read = read(fd, data, sizeof(data)); + + if (bytes_read <= 0) + break; + } + } + } +} + +static const char optstring[] = ""; +static const struct option longopts[] = { + { + .name = "mode", + .has_arg = required_argument, + .val = 'm', + }, + { + .name = "zerocopy", + .has_arg = no_argument, + .val = 'z', + }, + { + .name = "cid", + .has_arg = required_argument, + .val = 'c', + }, + { + .name = "port", + .has_arg = required_argument, + .val = 'p', + }, + { + .name = "mb", + .has_arg = required_argument, + .val = 's', + }, + { + .name = "tx", 
+ .has_arg = required_argument, + .val = 't', + }, + { + .name = "rx", + .has_arg = required_argument, + .val = 'r', + }, + { + .name = "help", + .has_arg = no_argument, + .val = '?', + }, + {}, +}; + +int main(int argc, char **argv) +{ + int zerocopy = 0; + + for (;;) { + int opt = getopt_long(argc, argv, optstring, longopts, NULL); + + if (opt == -1) + break; + + switch (opt) { + case 's': + mb_to_send = atoi(optarg); + break; + case 'c': + peer_cid = atoi(optarg); + break; + case 'p': + port = atoi(optarg); + break; + case 'r': + rx_buf_size = atoi(optarg); + break; + case 't': + tx_buf_size = atoi(optarg); + break; + case 'm': + if (strcmp(optarg, "client") == 0) + client_mode = 1; + else if (strcmp(optarg, "server") == 0) + client_mode = 0; + else { + fprintf(stderr, "--mode must be \"client\" or \"server\"\n"); + return EXIT_FAILURE; + } + break; + case 'z': + zerocopy = 1; + break; + default: + break; + } + + } + + if (!tx_buf_size) + tx_buf_size = DEFAULT_TX_SIZE; + + if (!rx_buf_size) + rx_buf_size = DEFAULT_RX_SIZE; + + tx_buf_size *= PAGE_SIZE; + rx_buf_size *= PAGE_SIZE; + + srand(time(NULL)); + + if (client_mode) + run_client(zerocopy); + else + run_server(); + + return 0; +}
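
For readers following the receive loop in test_stream_zerocopy_rx_client(), here is a minimal sketch of the same walk over an ordinary in-memory buffer instead of a real vsock mapping. It assumes only what the test code implies: page 0 holds an array of headers, data starts at page 1, every packet starts on a page boundary, and a header with len == 0 ends the list. The names zc_walk(), zc_demo(), struct usr_hdr and ZC_PAGE_SIZE are made up for this illustration and are not part of the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ZC_PAGE_SIZE 4096u	/* assumed page size, same value the tests use */

/* Hypothetical stand-in for struct virtio_vsock_usr_hdr from the patch. */
struct usr_hdr {
	uint32_t flags;
	uint32_t len;
} __attribute__((packed));

/*
 * Walk a buffer laid out like the zerocopy mapping (assumed layout):
 * page 0 is the header array, pages 1.. carry packet data, and each
 * packet occupies whole pages. Returns the byte sum of all payloads
 * and stores the number of packets seen in *npkts.
 */
unsigned long zc_walk(const unsigned char *map, size_t map_len, size_t *npkts)
{
	const unsigned char *data = map + ZC_PAGE_SIZE;
	const unsigned char *end = map + map_len;
	size_t max_hdrs = ZC_PAGE_SIZE / sizeof(struct usr_hdr);
	unsigned long sum = 0;
	size_t n = 0;

	assert(map_len >= ZC_PAGE_SIZE && map_len % ZC_PAGE_SIZE == 0);

	while (n < max_hdrs && data < end) {
		struct usr_hdr hdr;
		size_t left;

		/* Copy the header out to avoid aliasing issues. */
		memcpy(&hdr, map + n * sizeof(hdr), sizeof(hdr));
		if (hdr.len == 0)
			break;	/* empty header terminates the list */

		/* Consume the payload one page at a time. */
		for (left = hdr.len; left > 0 && data < end; data += ZC_PAGE_SIZE) {
			size_t chunk = left < ZC_PAGE_SIZE ? left : ZC_PAGE_SIZE;
			size_t i;

			for (i = 0; i < chunk; i++)
				sum += data[i];
			left -= chunk;
		}
		n++;
	}

	if (npkts)
		*npkts = n;
	return sum;
}

/* Build a 4-page fake mapping with two packets and walk it. */
unsigned long zc_demo(size_t *npkts)
{
	static unsigned char map[4 * ZC_PAGE_SIZE];
	struct usr_hdr hdrs[2] = { { 0, 5000 }, { 0, 100 } };

	memset(map, 0, sizeof(map));
	memcpy(map, hdrs, sizeof(hdrs));
	memset(map + ZC_PAGE_SIZE, 1, 5000);	/* packet 0: pages 1 and 2 */
	memset(map + 3 * ZC_PAGE_SIZE, 2, 100);	/* packet 1: page 3 */

	return zc_walk(map, sizeof(map), npkts);
}
```

In zc_demo() the first packet spills from page 1 into page 2, so the second packet's data begins at page 3. That is why the walk advances the data pointer by whole pages rather than by hdr.len. Unlike the loop in the test, this sketch also bounds the walk by the header page size and by the end of the buffer.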