From patchwork Thu May 12 05:12:40 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Arseniy Krasnov
X-Patchwork-Id: 12847041
From: Arseniy Krasnov
To: Stefan Hajnoczi, Stefano Garzarella, "Michael S. Tsirkin", Jason Wang, "David S. Miller", "Jakub Kicinski", Paolo Abeni
CC: "linux-kernel@vger.kernel.org", "kvm@vger.kernel.org", "virtualization@lists.linux-foundation.org", "netdev@vger.kernel.org", kernel, Arseniy Krasnov, Krasnov Arseniy
Subject: [RFC PATCH v1 3/8] af_vsock: add zerocopy receive logic
Date: Thu, 12 May 2022 05:12:40 +0000
Message-ID: <44d2404f-dc4f-f42c-1235-2ad7f537a030@sberdevices.ru>
In-Reply-To: <7cdcb1e1-7c97-c054-19cf-5caeacae981d@sberdevices.ru>
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-State: RFC

This patch:
1) Adds a callback for the 'mmap()' call on a socket. It checks the vm
   area flags (writable or executable mappings are rejected with -EPERM)
   and sets the vm area ops.
2) Adds a special 'getsockopt()' case, SO_VM_SOCKETS_ZEROCOPY, which
   calls the transport's zerocopy callback. The input argument is the
   vm area address.

Signed-off-by: Arseniy Krasnov
---
 include/net/af_vsock.h          |  4 +++
 include/uapi/linux/vm_sockets.h |  2 ++
 net/vmw_vsock/af_vsock.c        | 61 +++++++++++++++++++++++++++++++++
 3 files changed, 67 insertions(+)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index ab207677e0a8..d0aefb9ee4cf 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -135,6 +135,10 @@ struct vsock_transport {
 	bool (*stream_is_active)(struct vsock_sock *);
 	bool (*stream_allow)(u32 cid, u32 port);
 
+	int (*zerocopy_dequeue)(struct vsock_sock *vsk,
+				struct vm_area_struct *vma,
+				unsigned long addr);
+
 	/* SEQ_PACKET. */
 	ssize_t (*seqpacket_dequeue)(struct vsock_sock *vsk, struct msghdr *msg,
 				     int flags);
diff --git a/include/uapi/linux/vm_sockets.h b/include/uapi/linux/vm_sockets.h
index c60ca33eac59..62aec51a2bc3 100644
--- a/include/uapi/linux/vm_sockets.h
+++ b/include/uapi/linux/vm_sockets.h
@@ -83,6 +83,8 @@
 
 #define SO_VM_SOCKETS_CONNECT_TIMEOUT_NEW 8
 
+#define SO_VM_SOCKETS_ZEROCOPY 9
+
 #if !defined(__KERNEL__)
 #if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
 #define SO_VM_SOCKETS_CONNECT_TIMEOUT SO_VM_SOCKETS_CONNECT_TIMEOUT_OLD
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 38baeb189d4e..3f98477ea546 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1652,6 +1652,42 @@ static int vsock_connectible_setsockopt(struct socket *sock,
 	return err;
 }
 
+static const struct vm_operations_struct afvsock_vm_ops = {
+};
+
+static int vsock_recv_zerocopy(struct socket *sock,
+			       unsigned long address)
+{
+	struct sock *sk = sock->sk;
+	struct vsock_sock *vsk = vsock_sk(sk);
+	struct vm_area_struct *vma;
+	const struct vsock_transport *transport;
+	int res;
+
+	transport = vsk->transport;
+
+	if (!transport->zerocopy_dequeue)
+		return -EOPNOTSUPP;
+
+	lock_sock(sk);
+	mmap_write_lock(current->mm);
+
+	vma = vma_lookup(current->mm, address);
+
+	if (!vma || vma->vm_ops != &afvsock_vm_ops) {
+		mmap_write_unlock(current->mm);
+		release_sock(sk);
+		return -EINVAL;
+	}
+
+	res = transport->zerocopy_dequeue(vsk, vma, address);
+
+	mmap_write_unlock(current->mm);
+	release_sock(sk);
+
+	return res;
+}
+
 static int vsock_connectible_getsockopt(struct socket *sock,
 					int level, int optname,
 					char __user *optval,
@@ -1696,6 +1732,17 @@ static int vsock_connectible_getsockopt(struct socket *sock,
 		lv = sock_get_timeout(vsk->connect_timeout, &v,
 				      optname == SO_VM_SOCKETS_CONNECT_TIMEOUT_OLD);
 		break;
+	case SO_VM_SOCKETS_ZEROCOPY: {
+		unsigned long vma_addr;
+
+		if (len < sizeof(vma_addr))
+			return -EINVAL;
+
+		if (copy_from_user(&vma_addr, optval, sizeof(vma_addr)))
+			return -EFAULT;
+
+		return vsock_recv_zerocopy(sock, vma_addr);
+	}
 
 	default:
 		return -ENOPROTOOPT;
@@ -2124,6 +2171,19 @@ vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
 	return err;
 }
 
+static int afvsock_mmap(struct file *file, struct socket *sock,
+			struct vm_area_struct *vma)
+{
+	if (vma->vm_flags & (VM_WRITE | VM_EXEC))
+		return -EPERM;
+
+	vma->vm_flags &= ~(VM_MAYWRITE | VM_MAYEXEC);
+	vma->vm_flags |= (VM_MIXEDMAP);
+	vma->vm_ops = &afvsock_vm_ops;
+
+	return 0;
+}
+
 static const struct proto_ops vsock_stream_ops = {
 	.family = PF_VSOCK,
 	.owner = THIS_MODULE,
@@ -2143,6 +2203,7 @@ static const struct proto_ops vsock_stream_ops = {
 	.recvmsg = vsock_connectible_recvmsg,
 	.mmap = sock_no_mmap,
 	.sendpage = sock_no_sendpage,
+	.mmap = afvsock_mmap,
 };
 
 static const struct proto_ops vsock_seqpacket_ops = {
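
For illustration only, here is a minimal userspace sketch of how the two
entry points added above might be driven. It is not part of the patch.
The helper names vsock_zc_map() and vsock_zc_dequeue() are hypothetical,
the use of MAP_SHARED is an assumption, and SO_VM_SOCKETS_ZEROCOPY (9) is
the value proposed by this RFC, which released UAPI headers may not
define. The layout of whatever the transport places in the mapped area is
defined by the transport patches of this series and is not assumed here.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/socket.h>

/* Assumption: option number proposed by this RFC, absent from released headers. */
#ifndef SO_VM_SOCKETS_ZEROCOPY
#define SO_VM_SOCKETS_ZEROCOPY 9
#endif

/* Fallback for old libc headers; AF_VSOCK is 40 on Linux. */
#ifndef AF_VSOCK
#define AF_VSOCK 40
#endif

/*
 * Hypothetical helper: map a receive area on a vsock stream socket.
 * afvsock_mmap() returns -EPERM for VM_WRITE or VM_EXEC, so only
 * PROT_READ is requested. MAP_SHARED is an assumption of this sketch.
 * Returns NULL on failure with errno set by mmap().
 */
void *vsock_zc_map(int fd, size_t len)
{
	void *area = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);

	return area == MAP_FAILED ? NULL : area;
}

/*
 * Hypothetical helper: pass the address of the mapped area to the kernel.
 * The getsockopt() case reads an unsigned long from optval, so optlen
 * must be at least sizeof(unsigned long). The vsock socket options use
 * level AF_VSOCK. Returns the getsockopt() result unchanged.
 */
int vsock_zc_dequeue(int fd, void *area)
{
	unsigned long addr = (unsigned long)area;
	socklen_t len = sizeof(addr);

	return getsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_ZEROCOPY, &addr, &len);
}
```

A caller would connect an AF_VSOCK stream socket, call vsock_zc_map() once,
and then call vsock_zc_dequeue() with the returned address whenever data is
expected. With a kernel lacking this patch, mmap() on the socket fails
because vsock_stream_ops uses sock_no_mmap.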