From patchwork Wed Jun 9 23:24:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jiang Wang ." X-Patchwork-Id: 12311435 X-Patchwork-Delegate: kuba@kernel.org Received: from n124-121-013.byted.org (ec2-54-241-92-238.us-west-1.compute.amazonaws.com.
[54.241.92.238]) by smtp.gmail.com with ESMTPSA id k1sm526783pfa.30.2021.06.09.16.27.28 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 09 Jun 2021 16:27:29 -0700 (PDT) From: Jiang Wang To: sgarzare@redhat.com Cc: virtualization@lists.linux-foundation.org, stefanha@redhat.com, mst@redhat.com, arseny.krasnov@kaspersky.com, jhansen@vmware.comments, cong.wang@bytedance.com, duanxiongchun@bytedance.com, xieyongji@bytedance.com, chaiwen.cc@bytedance.com, Jason Wang , "David S. Miller" , Jakub Kicinski , Steven Rostedt , Ingo Molnar , Andra Paraschiv , Norbert Slusarek , Colin Ian King , Alexander Popov , kvm@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [RFC v1 1/6] virtio/vsock: add VIRTIO_VSOCK_F_DGRAM feature bit Date: Wed, 9 Jun 2021 23:24:53 +0000 Message-Id: <20210609232501.171257-2-jiang.wang@bytedance.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20210609232501.171257-1-jiang.wang@bytedance.com> References: <20210609232501.171257-1-jiang.wang@bytedance.com> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org X-Patchwork-State: RFC When this feature is enabled, allocate 5 queues, otherwise, allocate 3 queues to be compatible with old QEMU versions. Signed-off-by: Jiang Wang --- drivers/vhost/vsock.c | 3 +- include/linux/virtio_vsock.h | 9 +++++ include/uapi/linux/virtio_vsock.h | 3 ++ net/vmw_vsock/virtio_transport.c | 73 +++++++++++++++++++++++++++++++++++---- 4 files changed, 80 insertions(+), 8 deletions(-) diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index 5e78fb719602..81d064601093 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -31,7 +31,8 @@ enum { VHOST_VSOCK_FEATURES = VHOST_FEATURES | - (1ULL << VIRTIO_F_ACCESS_PLATFORM) + (1ULL << VIRTIO_F_ACCESS_PLATFORM) | + (1ULL << VIRTIO_VSOCK_F_DGRAM) }; enum { diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h index dc636b727179..ba3189ed9345 100644 --- a/include/linux/virtio_vsock.h +++ b/include/linux/virtio_vsock.h @@ -18,6 +18,15 @@ enum { VSOCK_VQ_MAX = 3, }; +enum { + VSOCK_VQ_STREAM_RX = 0, /* for host to guest data */ + VSOCK_VQ_STREAM_TX = 1, /* for guest to host data */ + VSOCK_VQ_DGRAM_RX = 2, + VSOCK_VQ_DGRAM_TX = 3, + VSOCK_VQ_EX_EVENT = 4, + VSOCK_VQ_EX_MAX = 5, +}; + /* Per-socket state (accessed via vsk->trans) */ struct virtio_vsock_sock { struct vsock_sock *vsk; diff --git a/include/uapi/linux/virtio_vsock.h b/include/uapi/linux/virtio_vsock.h index 1d57ed3d84d2..b56614dff1c9 100644 --- a/include/uapi/linux/virtio_vsock.h +++ b/include/uapi/linux/virtio_vsock.h @@ -38,6 +38,9 @@ #include #include +/* The feature bitmap for virtio net */ +#define VIRTIO_VSOCK_F_DGRAM 0 /* Host support dgram vsock */ + struct virtio_vsock_config { __le64 guest_cid; } __attribute__((packed)); diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index 2700a63ab095..7dcb8db23305 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -27,7 +27,8 @@ static DEFINE_MUTEX(the_virtio_vsock_mutex); /* protects the_virtio_vsock */ struct virtio_vsock { struct virtio_device *vdev; - struct virtqueue *vqs[VSOCK_VQ_MAX]; + struct virtqueue **vqs; + bool has_dgram; /* Virtqueue processing is deferred to a workqueue */ struct work_struct tx_work; @@ -333,7 +334,10 @@ static int virtio_vsock_event_fill_one(struct virtio_vsock *vsock, struct scatterlist sg; struct virtqueue *vq; - vq = vsock->vqs[VSOCK_VQ_EVENT]; + if 
(vsock->has_dgram) + vq = vsock->vqs[VSOCK_VQ_EX_EVENT]; + else + vq = vsock->vqs[VSOCK_VQ_EVENT]; sg_init_one(&sg, event, sizeof(*event)); @@ -351,7 +355,10 @@ static void virtio_vsock_event_fill(struct virtio_vsock *vsock) virtio_vsock_event_fill_one(vsock, event); } - virtqueue_kick(vsock->vqs[VSOCK_VQ_EVENT]); + if (vsock->has_dgram) + virtqueue_kick(vsock->vqs[VSOCK_VQ_EX_EVENT]); + else + virtqueue_kick(vsock->vqs[VSOCK_VQ_EVENT]); } static void virtio_vsock_reset_sock(struct sock *sk) @@ -391,7 +398,10 @@ static void virtio_transport_event_work(struct work_struct *work) container_of(work, struct virtio_vsock, event_work); struct virtqueue *vq; - vq = vsock->vqs[VSOCK_VQ_EVENT]; + if (vsock->has_dgram) + vq = vsock->vqs[VSOCK_VQ_EX_EVENT]; + else + vq = vsock->vqs[VSOCK_VQ_EVENT]; mutex_lock(&vsock->event_lock); @@ -411,7 +421,10 @@ static void virtio_transport_event_work(struct work_struct *work) } } while (!virtqueue_enable_cb(vq)); - virtqueue_kick(vsock->vqs[VSOCK_VQ_EVENT]); + if (vsock->has_dgram) + virtqueue_kick(vsock->vqs[VSOCK_VQ_EX_EVENT]); + else + virtqueue_kick(vsock->vqs[VSOCK_VQ_EVENT]); out: mutex_unlock(&vsock->event_lock); } @@ -434,6 +447,10 @@ static void virtio_vsock_tx_done(struct virtqueue *vq) queue_work(virtio_vsock_workqueue, &vsock->tx_work); } +static void virtio_vsock_dgram_tx_done(struct virtqueue *vq) +{ +} + static void virtio_vsock_rx_done(struct virtqueue *vq) { struct virtio_vsock *vsock = vq->vdev->priv; @@ -443,6 +460,10 @@ static void virtio_vsock_rx_done(struct virtqueue *vq) queue_work(virtio_vsock_workqueue, &vsock->rx_work); } +static void virtio_vsock_dgram_rx_done(struct virtqueue *vq) +{ +} + static struct virtio_transport virtio_transport = { .transport = { .module = THIS_MODULE, @@ -545,13 +566,29 @@ static int virtio_vsock_probe(struct virtio_device *vdev) virtio_vsock_tx_done, virtio_vsock_event_done, }; + vq_callback_t *ex_callbacks[] = { + virtio_vsock_rx_done, + virtio_vsock_tx_done, + virtio_vsock_dgram_rx_done, + virtio_vsock_dgram_tx_done, + virtio_vsock_event_done, + }; + static const char * const names[] = { "rx", "tx", "event", }; + static const char * const ex_names[] = { + "rx", + "tx", + "dgram_rx", + "dgram_tx", + "event", + }; + struct virtio_vsock *vsock = NULL; - int ret; + int ret, max_vq; ret = mutex_lock_interruptible(&the_virtio_vsock_mutex); if (ret) @@ -572,9 +609,30 @@ static int virtio_vsock_probe(struct virtio_device *vdev) vsock->vdev = vdev; - ret = virtio_find_vqs(vsock->vdev, VSOCK_VQ_MAX, + if (virtio_has_feature(vdev, VIRTIO_VSOCK_F_DGRAM)) + vsock->has_dgram = true; + + if (vsock->has_dgram) + max_vq = VSOCK_VQ_EX_MAX; + else + max_vq = VSOCK_VQ_MAX; + + vsock->vqs = kmalloc_array(max_vq, sizeof(struct virtqueue *), GFP_KERNEL); + if (!vsock->vqs) { + ret = -ENOMEM; + goto out; + } + + if (vsock->has_dgram) { + ret = virtio_find_vqs(vsock->vdev, max_vq, + vsock->vqs, ex_callbacks, ex_names, + NULL); + } else { + ret = virtio_find_vqs(vsock->vdev, max_vq, vsock->vqs, callbacks, names, NULL); + } + if (ret < 0) goto out; @@ -695,6 +753,7 @@ static struct virtio_device_id id_table[] = { }; static unsigned int features[] = { + VIRTIO_VSOCK_F_DGRAM, }; static struct virtio_driver virtio_vsock_driver = { From patchwork Wed Jun 9 23:24:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jiang Wang ." 
X-Patchwork-Id: 12311437 X-Patchwork-Delegate: kuba@kernel.org From: Jiang Wang To: sgarzare@redhat.com Cc: virtualization@lists.linux-foundation.org, stefanha@redhat.com, mst@redhat.com, arseny.krasnov@kaspersky.com, jhansen@vmware.comments, cong.wang@bytedance.com, duanxiongchun@bytedance.com, xieyongji@bytedance.com, chaiwen.cc@bytedance.com, Jason Wang , "David S.
Miller" , Jakub Kicinski , Steven Rostedt , Ingo Molnar , Colin Ian King , Andra Paraschiv , Norbert Slusarek , Alexander Popov , kvm@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [RFC v1 2/6] virtio/vsock: add support for virtio datagram Date: Wed, 9 Jun 2021 23:24:54 +0000 Message-Id: <20210609232501.171257-3-jiang.wang@bytedance.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20210609232501.171257-1-jiang.wang@bytedance.com> References: <20210609232501.171257-1-jiang.wang@bytedance.com> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org X-Patchwork-State: RFC This patch add support for virtio dgram for the driver. Implemented related functions for tx and rx, enqueue and dequeue. Send packets synchronously to give sender indication when the virtqueue is full. Refactored virtio_transport_send_pkt_work() a little bit but no functions changes for it. Support for the host/device side is in another patch. Signed-off-by: Jiang Wang --- include/net/af_vsock.h | 1 + .../trace/events/vsock_virtio_transport_common.h | 5 +- include/uapi/linux/virtio_vsock.h | 1 + net/vmw_vsock/af_vsock.c | 12 + net/vmw_vsock/virtio_transport.c | 325 ++++++++++++++++++--- net/vmw_vsock/virtio_transport_common.c | 184 ++++++++++-- 6 files changed, 466 insertions(+), 62 deletions(-) diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h index b1c717286993..fcae7bca9609 100644 --- a/include/net/af_vsock.h +++ b/include/net/af_vsock.h @@ -200,6 +200,7 @@ void vsock_remove_sock(struct vsock_sock *vsk); void vsock_for_each_connected_socket(void (*fn)(struct sock *sk)); int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk); bool vsock_find_cid(unsigned int cid); +int vsock_bind_stream(struct vsock_sock *vsk, struct sockaddr_vm *addr); /**** TAP ****/ diff --git a/include/trace/events/vsock_virtio_transport_common.h b/include/trace/events/vsock_virtio_transport_common.h index 6782213778be..b1be25b327a1 100644 --- a/include/trace/events/vsock_virtio_transport_common.h +++ b/include/trace/events/vsock_virtio_transport_common.h @@ -9,9 +9,12 @@ #include TRACE_DEFINE_ENUM(VIRTIO_VSOCK_TYPE_STREAM); +TRACE_DEFINE_ENUM(VIRTIO_VSOCK_TYPE_DGRAM); #define show_type(val) \ - __print_symbolic(val, { VIRTIO_VSOCK_TYPE_STREAM, "STREAM" }) + __print_symbolic(val, \ + { VIRTIO_VSOCK_TYPE_STREAM, "STREAM" }, \ + { VIRTIO_VSOCK_TYPE_DGRAM, "DGRAM" }) TRACE_DEFINE_ENUM(VIRTIO_VSOCK_OP_INVALID); TRACE_DEFINE_ENUM(VIRTIO_VSOCK_OP_REQUEST); diff --git a/include/uapi/linux/virtio_vsock.h b/include/uapi/linux/virtio_vsock.h index b56614dff1c9..5503585b26e8 100644 --- a/include/uapi/linux/virtio_vsock.h +++ b/include/uapi/linux/virtio_vsock.h @@ -68,6 +68,7 @@ struct virtio_vsock_hdr { enum virtio_vsock_type { VIRTIO_VSOCK_TYPE_STREAM = 1, + VIRTIO_VSOCK_TYPE_DGRAM = 3, }; enum virtio_vsock_op { diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 92a72f0e0d94..c1f512291b94 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -659,6 +659,18 @@ static int __vsock_bind_stream(struct vsock_sock *vsk, return 0; } +int vsock_bind_stream(struct vsock_sock *vsk, + struct sockaddr_vm *addr) +{ + int retval; + + spin_lock_bh(&vsock_table_lock); + retval = __vsock_bind_stream(vsk, addr); + spin_unlock_bh(&vsock_table_lock); + return retval; +} +EXPORT_SYMBOL(vsock_bind_stream); + static int __vsock_bind_dgram(struct vsock_sock *vsk, struct sockaddr_vm *addr) { diff --git a/net/vmw_vsock/virtio_transport.c 
b/net/vmw_vsock/virtio_transport.c index 7dcb8db23305..cf47aadb0c34 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -20,21 +20,29 @@ #include #include #include +#include +#include +#include static struct workqueue_struct *virtio_vsock_workqueue; static struct virtio_vsock __rcu *the_virtio_vsock; +static struct virtio_vsock *the_virtio_vsock_dgram; static DEFINE_MUTEX(the_virtio_vsock_mutex); /* protects the_virtio_vsock */ struct virtio_vsock { struct virtio_device *vdev; struct virtqueue **vqs; bool has_dgram; + refcount_t active; /* Virtqueue processing is deferred to a workqueue */ struct work_struct tx_work; struct work_struct rx_work; struct work_struct event_work; + struct work_struct dgram_tx_work; + struct work_struct dgram_rx_work; + /* The following fields are protected by tx_lock. vqs[VSOCK_VQ_TX] * must be accessed with tx_lock held. */ @@ -55,6 +63,22 @@ struct virtio_vsock { int rx_buf_nr; int rx_buf_max_nr; + /* The following fields are protected by dgram_tx_lock. vqs[VSOCK_VQ_DGRAM_TX] + * must be accessed with dgram_tx_lock held. + */ + struct mutex dgram_tx_lock; + bool dgram_tx_run; + + atomic_t dgram_queued_replies; + + /* The following fields are protected by dgram_rx_lock. vqs[VSOCK_VQ_DGRAM_RX] + * must be accessed with dgram_rx_lock held. + */ + struct mutex dgram_rx_lock; + bool dgram_rx_run; + int dgram_rx_buf_nr; + int dgram_rx_buf_max_nr; + /* The following fields are protected by event_lock. * vqs[VSOCK_VQ_EVENT] must be accessed with event_lock held. */ @@ -83,21 +107,11 @@ static u32 virtio_transport_get_local_cid(void) return ret; } -static void -virtio_transport_send_pkt_work(struct work_struct *work) +static void virtio_transport_do_send_pkt(struct virtio_vsock *vsock, + struct virtqueue *vq, spinlock_t *lock, struct list_head *send_pkt_list, + bool *restart_rx) { - struct virtio_vsock *vsock = - container_of(work, struct virtio_vsock, send_pkt_work); - struct virtqueue *vq; bool added = false; - bool restart_rx = false; - - mutex_lock(&vsock->tx_lock); - - if (!vsock->tx_run) - goto out; - - vq = vsock->vqs[VSOCK_VQ_TX]; for (;;) { struct virtio_vsock_pkt *pkt; @@ -105,16 +119,16 @@ virtio_transport_send_pkt_work(struct work_struct *work) int ret, in_sg = 0, out_sg = 0; bool reply; - spin_lock_bh(&vsock->send_pkt_list_lock); - if (list_empty(&vsock->send_pkt_list)) { - spin_unlock_bh(&vsock->send_pkt_list_lock); + spin_lock_bh(lock); + if (list_empty(send_pkt_list)) { + spin_unlock_bh(lock); break; } - pkt = list_first_entry(&vsock->send_pkt_list, + pkt = list_first_entry(send_pkt_list, struct virtio_vsock_pkt, list); list_del_init(&pkt->list); - spin_unlock_bh(&vsock->send_pkt_list_lock); + spin_unlock_bh(lock); virtio_transport_deliver_tap_pkt(pkt); @@ -132,9 +146,9 @@ virtio_transport_send_pkt_work(struct work_struct *work) * the vq */ if (ret < 0) { - spin_lock_bh(&vsock->send_pkt_list_lock); - list_add(&pkt->list, &vsock->send_pkt_list); - spin_unlock_bh(&vsock->send_pkt_list_lock); + spin_lock_bh(lock); + list_add(&pkt->list, send_pkt_list); + spin_unlock_bh(lock); break; } @@ -146,7 +160,7 @@ virtio_transport_send_pkt_work(struct work_struct *work) /* Do we now have resources to resume rx processing? 
*/ if (val + 1 == virtqueue_get_vring_size(rx_vq)) - restart_rx = true; + *restart_rx = true; } added = true; @@ -154,7 +168,55 @@ virtio_transport_send_pkt_work(struct work_struct *work) if (added) virtqueue_kick(vq); +} +static int virtio_transport_do_send_dgram_pkt(struct virtio_vsock *vsock, + struct virtqueue *vq, struct virtio_vsock_pkt *pkt) +{ + struct scatterlist hdr, buf, *sgs[2]; + int ret, in_sg = 0, out_sg = 0; + + virtio_transport_deliver_tap_pkt(pkt); + + sg_init_one(&hdr, &pkt->hdr, sizeof(pkt->hdr)); + sgs[out_sg++] = &hdr; + if (pkt->buf) { + sg_init_one(&buf, pkt->buf, pkt->len); + sgs[out_sg++] = &buf; + } + + ret = virtqueue_add_sgs(vq, sgs, out_sg, in_sg, pkt, GFP_KERNEL); + /* Usually this means that there is no more space available in + * the vq + */ + if (ret < 0) { + virtio_transport_free_pkt(pkt); + return -ENOMEM; + } + + virtqueue_kick(vq); + + return pkt->len; +} + + +static void +virtio_transport_send_pkt_work(struct work_struct *work) +{ + struct virtio_vsock *vsock = + container_of(work, struct virtio_vsock, send_pkt_work); + struct virtqueue *vq; + bool restart_rx = false; + + mutex_lock(&vsock->tx_lock); + + if (!vsock->tx_run) + goto out; + + vq = vsock->vqs[VSOCK_VQ_TX]; + + virtio_transport_do_send_pkt(vsock, vq, &vsock->send_pkt_list_lock, + &vsock->send_pkt_list, &restart_rx); out: mutex_unlock(&vsock->tx_lock); @@ -163,11 +225,64 @@ virtio_transport_send_pkt_work(struct work_struct *work) } static int +virtio_transport_send_dgram_pkt(struct virtio_vsock_pkt *pkt) +{ + struct virtio_vsock *vsock; + int len = pkt->len; + struct virtqueue *vq; + + vsock = the_virtio_vsock_dgram; + + if (!vsock) { + virtio_transport_free_pkt(pkt); + return -ENODEV; + } + + if (!vsock->dgram_tx_run) { + virtio_transport_free_pkt(pkt); + return -ENODEV; + } + + if (!refcount_inc_not_zero(&vsock->active)) { + virtio_transport_free_pkt(pkt); + return -ENODEV; + } + + if (le64_to_cpu(pkt->hdr.dst_cid) == vsock->guest_cid) { + virtio_transport_free_pkt(pkt); + len = -ENODEV; + goto out_ref; + } + + /* send the pkt */ + mutex_lock(&vsock->dgram_tx_lock); + + if (!vsock->dgram_tx_run) + goto out_mutex; + + vq = vsock->vqs[VSOCK_VQ_DGRAM_TX]; + + len = virtio_transport_do_send_dgram_pkt(vsock, vq, pkt); + +out_mutex: + mutex_unlock(&vsock->dgram_tx_lock); + +out_ref: + if (!refcount_dec_not_one(&vsock->active)) + return -EFAULT; + + return len; +} + +static int virtio_transport_send_pkt(struct virtio_vsock_pkt *pkt) { struct virtio_vsock *vsock; int len = pkt->len; + if (pkt->hdr.type == VIRTIO_VSOCK_TYPE_DGRAM) + return virtio_transport_send_dgram_pkt(pkt); + rcu_read_lock(); vsock = rcu_dereference(the_virtio_vsock); if (!vsock) { @@ -243,7 +358,7 @@ virtio_transport_cancel_pkt(struct vsock_sock *vsk) return ret; } -static void virtio_vsock_rx_fill(struct virtio_vsock *vsock) +static void virtio_vsock_rx_fill(struct virtio_vsock *vsock, bool is_dgram) { int buf_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE; struct virtio_vsock_pkt *pkt; @@ -251,7 +366,10 @@ static void virtio_vsock_rx_fill(struct virtio_vsock *vsock) struct virtqueue *vq; int ret; - vq = vsock->vqs[VSOCK_VQ_RX]; + if (is_dgram) + vq = vsock->vqs[VSOCK_VQ_DGRAM_RX]; + else + vq = vsock->vqs[VSOCK_VQ_RX]; do { pkt = kzalloc(sizeof(*pkt), GFP_KERNEL); @@ -277,10 +395,19 @@ static void virtio_vsock_rx_fill(struct virtio_vsock *vsock) virtio_transport_free_pkt(pkt); break; } - vsock->rx_buf_nr++; + if (is_dgram) + vsock->dgram_rx_buf_nr++; + else + vsock->rx_buf_nr++; } while (vq->num_free); - if (vsock->rx_buf_nr > 
vsock->rx_buf_max_nr) - vsock->rx_buf_max_nr = vsock->rx_buf_nr; + if (is_dgram) { + if (vsock->dgram_rx_buf_nr > vsock->dgram_rx_buf_max_nr) + vsock->dgram_rx_buf_max_nr = vsock->dgram_rx_buf_nr; + } else { + if (vsock->rx_buf_nr > vsock->rx_buf_max_nr) + vsock->rx_buf_max_nr = vsock->rx_buf_nr; + } + virtqueue_kick(vq); } @@ -315,6 +442,34 @@ static void virtio_transport_tx_work(struct work_struct *work) queue_work(virtio_vsock_workqueue, &vsock->send_pkt_work); } +static void virtio_transport_dgram_tx_work(struct work_struct *work) +{ + struct virtio_vsock *vsock = + container_of(work, struct virtio_vsock, dgram_tx_work); + struct virtqueue *vq; + bool added = false; + + vq = vsock->vqs[VSOCK_VQ_DGRAM_TX]; + mutex_lock(&vsock->dgram_tx_lock); + + if (!vsock->dgram_tx_run) + goto out; + + do { + struct virtio_vsock_pkt *pkt; + unsigned int len; + + virtqueue_disable_cb(vq); + while ((pkt = virtqueue_get_buf(vq, &len)) != NULL) { + virtio_transport_free_pkt(pkt); + added = true; + } + } while (!virtqueue_enable_cb(vq)); + +out: + mutex_unlock(&vsock->dgram_tx_lock); +} + /* Is there space left for replies to rx packets? */ static bool virtio_transport_more_replies(struct virtio_vsock *vsock) { @@ -449,6 +604,11 @@ static void virtio_vsock_tx_done(struct virtqueue *vq) static void virtio_vsock_dgram_tx_done(struct virtqueue *vq) { + struct virtio_vsock *vsock = vq->vdev->priv; + + if (!vsock) + return; + queue_work(virtio_vsock_workqueue, &vsock->dgram_tx_work); } static void virtio_vsock_rx_done(struct virtqueue *vq) @@ -462,8 +622,12 @@ static void virtio_vsock_rx_done(struct virtqueue *vq) static void virtio_vsock_dgram_rx_done(struct virtqueue *vq) { -} + struct virtio_vsock *vsock = vq->vdev->priv; + if (!vsock) + return; + queue_work(virtio_vsock_workqueue, &vsock->dgram_rx_work); +} static struct virtio_transport virtio_transport = { .transport = { .module = THIS_MODULE, @@ -506,19 +670,9 @@ static struct virtio_transport virtio_transport = { .send_pkt = virtio_transport_send_pkt, }; -static void virtio_transport_rx_work(struct work_struct *work) +static void virtio_transport_do_rx_work(struct virtio_vsock *vsock, + struct virtqueue *vq, bool is_dgram) { - struct virtio_vsock *vsock = - container_of(work, struct virtio_vsock, rx_work); - struct virtqueue *vq; - - vq = vsock->vqs[VSOCK_VQ_RX]; - - mutex_lock(&vsock->rx_lock); - - if (!vsock->rx_run) - goto out; - do { virtqueue_disable_cb(vq); for (;;) { @@ -538,7 +692,10 @@ static void virtio_transport_rx_work(struct work_struct *work) break; } - vsock->rx_buf_nr--; + if (is_dgram) + vsock->dgram_rx_buf_nr--; + else + vsock->rx_buf_nr--; /* Drop short/long packets */ if (unlikely(len < sizeof(pkt->hdr) || @@ -554,11 +711,45 @@ static void virtio_transport_rx_work(struct work_struct *work) } while (!virtqueue_enable_cb(vq)); out: + return; +} + +static void virtio_transport_rx_work(struct work_struct *work) +{ + struct virtio_vsock *vsock = + container_of(work, struct virtio_vsock, rx_work); + struct virtqueue *vq; + + vq = vsock->vqs[VSOCK_VQ_RX]; + + mutex_lock(&vsock->rx_lock); + + if (vsock->rx_run) + virtio_transport_do_rx_work(vsock, vq, false); + if (vsock->rx_buf_nr < vsock->rx_buf_max_nr / 2) - virtio_vsock_rx_fill(vsock); + virtio_vsock_rx_fill(vsock, false); mutex_unlock(&vsock->rx_lock); } +static void virtio_transport_dgram_rx_work(struct work_struct *work) +{ + struct virtio_vsock *vsock = + container_of(work, struct virtio_vsock, dgram_rx_work); + struct virtqueue *vq; + + vq = vsock->vqs[VSOCK_VQ_DGRAM_RX]; + + 
mutex_lock(&vsock->dgram_rx_lock); + + if (vsock->dgram_rx_run) + virtio_transport_do_rx_work(vsock, vq, true); + + if (vsock->dgram_rx_buf_nr < vsock->dgram_rx_buf_max_nr / 2) + virtio_vsock_rx_fill(vsock, true); + mutex_unlock(&vsock->dgram_rx_lock); +} + static int virtio_vsock_probe(struct virtio_device *vdev) { vq_callback_t *callbacks[] = { @@ -642,8 +833,14 @@ static int virtio_vsock_probe(struct virtio_device *vdev) vsock->rx_buf_max_nr = 0; atomic_set(&vsock->queued_replies, 0); + vsock->dgram_rx_buf_nr = 0; + vsock->dgram_rx_buf_max_nr = 0; + atomic_set(&vsock->dgram_queued_replies, 0); + mutex_init(&vsock->tx_lock); mutex_init(&vsock->rx_lock); + mutex_init(&vsock->dgram_tx_lock); + mutex_init(&vsock->dgram_rx_lock); mutex_init(&vsock->event_lock); spin_lock_init(&vsock->send_pkt_list_lock); INIT_LIST_HEAD(&vsock->send_pkt_list); @@ -651,16 +848,27 @@ static int virtio_vsock_probe(struct virtio_device *vdev) INIT_WORK(&vsock->tx_work, virtio_transport_tx_work); INIT_WORK(&vsock->event_work, virtio_transport_event_work); INIT_WORK(&vsock->send_pkt_work, virtio_transport_send_pkt_work); + INIT_WORK(&vsock->dgram_rx_work, virtio_transport_dgram_rx_work); + INIT_WORK(&vsock->dgram_tx_work, virtio_transport_dgram_tx_work); mutex_lock(&vsock->tx_lock); vsock->tx_run = true; mutex_unlock(&vsock->tx_lock); + mutex_lock(&vsock->dgram_tx_lock); + vsock->dgram_tx_run = true; + mutex_unlock(&vsock->dgram_tx_lock); + mutex_lock(&vsock->rx_lock); - virtio_vsock_rx_fill(vsock); + virtio_vsock_rx_fill(vsock, false); vsock->rx_run = true; mutex_unlock(&vsock->rx_lock); + mutex_lock(&vsock->dgram_rx_lock); + virtio_vsock_rx_fill(vsock, true); + vsock->dgram_rx_run = true; + mutex_unlock(&vsock->dgram_rx_lock); + mutex_lock(&vsock->event_lock); virtio_vsock_event_fill(vsock); vsock->event_run = true; @@ -669,6 +877,9 @@ static int virtio_vsock_probe(struct virtio_device *vdev) vdev->priv = vsock; rcu_assign_pointer(the_virtio_vsock, vsock); + the_virtio_vsock_dgram = vsock; + refcount_set(&the_virtio_vsock_dgram->active, 1); + mutex_unlock(&the_virtio_vsock_mutex); return 0; @@ -699,14 +910,28 @@ static void virtio_vsock_remove(struct virtio_device *vdev) vsock->rx_run = false; mutex_unlock(&vsock->rx_lock); + mutex_lock(&vsock->dgram_rx_lock); + vsock->dgram_rx_run = false; + mutex_unlock(&vsock->dgram_rx_lock); + mutex_lock(&vsock->tx_lock); vsock->tx_run = false; mutex_unlock(&vsock->tx_lock); + mutex_lock(&vsock->dgram_tx_lock); + vsock->dgram_tx_run = false; + mutex_unlock(&vsock->dgram_tx_lock); + mutex_lock(&vsock->event_lock); vsock->event_run = false; mutex_unlock(&vsock->event_lock); + while (!refcount_dec_if_one(&the_virtio_vsock_dgram->active)) { + if (signal_pending(current)) + break; + msleep(5); + } + /* Flush all device writes and interrupts, device will not use any * more buffers. 
*/ @@ -717,11 +942,21 @@ static void virtio_vsock_remove(struct virtio_device *vdev) virtio_transport_free_pkt(pkt); mutex_unlock(&vsock->rx_lock); + mutex_lock(&vsock->dgram_rx_lock); + while ((pkt = virtqueue_detach_unused_buf(vsock->vqs[VSOCK_VQ_DGRAM_RX]))) + virtio_transport_free_pkt(pkt); + mutex_unlock(&vsock->dgram_rx_lock); + mutex_lock(&vsock->tx_lock); while ((pkt = virtqueue_detach_unused_buf(vsock->vqs[VSOCK_VQ_TX]))) virtio_transport_free_pkt(pkt); mutex_unlock(&vsock->tx_lock); + mutex_lock(&vsock->dgram_tx_lock); + while ((pkt = virtqueue_detach_unused_buf(vsock->vqs[VSOCK_VQ_DGRAM_TX]))) + virtio_transport_free_pkt(pkt); + mutex_unlock(&vsock->dgram_tx_lock); + spin_lock_bh(&vsock->send_pkt_list_lock); while (!list_empty(&vsock->send_pkt_list)) { pkt = list_first_entry(&vsock->send_pkt_list, @@ -739,6 +974,8 @@ static void virtio_vsock_remove(struct virtio_device *vdev) */ flush_work(&vsock->rx_work); flush_work(&vsock->tx_work); + flush_work(&vsock->dgram_rx_work); + flush_work(&vsock->dgram_tx_work); flush_work(&vsock->event_work); flush_work(&vsock->send_pkt_work); @@ -775,7 +1012,7 @@ static int __init virtio_vsock_init(void) return -ENOMEM; ret = vsock_core_register(&virtio_transport.transport, - VSOCK_TRANSPORT_F_G2H); + VSOCK_TRANSPORT_F_G2H | VSOCK_TRANSPORT_F_DGRAM); if (ret) goto out_wq; diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index 902cb6dd710b..9f041515b7f1 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -26,6 +26,8 @@ /* Threshold for detecting small packets to copy */ #define GOOD_COPY_LEN 128 +static s64 virtio_transport_dgram_has_data(struct vsock_sock *vsk); + static const struct virtio_transport * virtio_transport_get_ops(struct vsock_sock *vsk) { @@ -196,21 +198,28 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk, vvs = vsk->trans; /* we can send less than pkt_len bytes */ - if (pkt_len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE) - pkt_len = VIRTIO_VSOCK_MAX_PKT_BUF_SIZE; + if (pkt_len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE) { + if (info->type == VIRTIO_VSOCK_TYPE_STREAM) + pkt_len = VIRTIO_VSOCK_MAX_PKT_BUF_SIZE; + else + return 0; + } - /* virtio_transport_get_credit might return less than pkt_len credit */ - pkt_len = virtio_transport_get_credit(vvs, pkt_len); + if (info->type == VIRTIO_VSOCK_TYPE_STREAM) { + /* virtio_transport_get_credit might return less than pkt_len credit */ + pkt_len = virtio_transport_get_credit(vvs, pkt_len); - /* Do not send zero length OP_RW pkt */ - if (pkt_len == 0 && info->op == VIRTIO_VSOCK_OP_RW) - return pkt_len; + /* Do not send zero length OP_RW pkt */ + if (pkt_len == 0 && info->op == VIRTIO_VSOCK_OP_RW) + return pkt_len; + } pkt = virtio_transport_alloc_pkt(info, pkt_len, src_cid, src_port, dst_cid, dst_port); if (!pkt) { - virtio_transport_put_credit(vvs, pkt_len); + if (info->type == VIRTIO_VSOCK_TYPE_STREAM) + virtio_transport_put_credit(vvs, pkt_len); return -ENOMEM; } @@ -397,6 +406,58 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk, return err; } +static ssize_t +virtio_transport_dgram_do_dequeue(struct vsock_sock *vsk, + struct msghdr *msg, size_t len) +{ + struct virtio_vsock_sock *vvs = vsk->trans; + struct virtio_vsock_pkt *pkt; + size_t total = 0; + u32 free_space; + int err = -EFAULT; + + spin_lock_bh(&vvs->rx_lock); + if (total < len && !list_empty(&vvs->rx_queue)) { + pkt = list_first_entry(&vvs->rx_queue, + struct virtio_vsock_pkt, list); + + total = len; + if (total > pkt->len - 
pkt->off) + total = pkt->len - pkt->off; + else if (total < pkt->len - pkt->off) + msg->msg_flags |= MSG_TRUNC; + + /* sk_lock is held by caller so no one else can dequeue. + * Unlock rx_lock since memcpy_to_msg() may sleep. + */ + spin_unlock_bh(&vvs->rx_lock); + + err = memcpy_to_msg(msg, pkt->buf + pkt->off, total); + if (err) + return err; + + spin_lock_bh(&vvs->rx_lock); + + virtio_transport_dec_rx_pkt(vvs, pkt); + list_del(&pkt->list); + virtio_transport_free_pkt(pkt); + } + + free_space = vvs->buf_alloc - (vvs->fwd_cnt - vvs->last_fwd_cnt); + + spin_unlock_bh(&vvs->rx_lock); + + if (total > 0 && msg->msg_name) { + /* Provide the address of the sender. */ + DECLARE_SOCKADDR(struct sockaddr_vm *, vm_addr, msg->msg_name); + + vsock_addr_init(vm_addr, le64_to_cpu(pkt->hdr.src_cid), + le32_to_cpu(pkt->hdr.src_port)); + msg->msg_namelen = sizeof(*vm_addr); + } + return total; +} + ssize_t virtio_transport_stream_dequeue(struct vsock_sock *vsk, struct msghdr *msg, @@ -414,7 +475,66 @@ virtio_transport_dgram_dequeue(struct vsock_sock *vsk, struct msghdr *msg, size_t len, int flags) { - return -EOPNOTSUPP; + struct sock *sk; + size_t err = 0; + long timeout; + + DEFINE_WAIT(wait); + + sk = &vsk->sk; + err = 0; + + lock_sock(sk); + + if (flags & MSG_OOB || flags & MSG_ERRQUEUE || flags & MSG_PEEK) + return -EOPNOTSUPP; + + if (!len) + goto out; + + timeout = sock_rcvtimeo(sk, flags & MSG_DONTWAIT); + + while (1) { + s64 ready; + + prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE); + ready = virtio_transport_dgram_has_data(vsk); + + if (ready == 0) { + if (timeout == 0) { + err = -EAGAIN; + finish_wait(sk_sleep(sk), &wait); + break; + } + + release_sock(sk); + timeout = schedule_timeout(timeout); + lock_sock(sk); + + if (signal_pending(current)) { + err = sock_intr_errno(timeout); + finish_wait(sk_sleep(sk), &wait); + break; + } else if (timeout == 0) { + err = -EAGAIN; + finish_wait(sk_sleep(sk), &wait); + break; + } + } else { + finish_wait(sk_sleep(sk), &wait); + + if (ready < 0) { + err = -ENOMEM; + goto out; + } + + err = virtio_transport_dgram_do_dequeue(vsk, msg, len); + break; + } + } +out: + release_sock(sk); + return err; } EXPORT_SYMBOL_GPL(virtio_transport_dgram_dequeue); @@ -431,6 +551,11 @@ s64 virtio_transport_stream_has_data(struct vsock_sock *vsk) } EXPORT_SYMBOL_GPL(virtio_transport_stream_has_data); +static s64 virtio_transport_dgram_has_data(struct vsock_sock *vsk) +{ + return virtio_transport_stream_has_data(vsk); +} + static s64 virtio_transport_has_space(struct vsock_sock *vsk) { struct virtio_vsock_sock *vvs = vsk->trans; @@ -610,13 +735,15 @@ EXPORT_SYMBOL_GPL(virtio_transport_stream_allow); int virtio_transport_dgram_bind(struct vsock_sock *vsk, struct sockaddr_vm *addr) { - return -EOPNOTSUPP; + //use same stream bind for dgram + int ret = vsock_bind_stream(vsk, addr); + return ret; } EXPORT_SYMBOL_GPL(virtio_transport_dgram_bind); bool virtio_transport_dgram_allow(u32 cid, u32 port) { - return false; + return true; } EXPORT_SYMBOL_GPL(virtio_transport_dgram_allow); @@ -654,7 +781,17 @@ virtio_transport_dgram_enqueue(struct vsock_sock *vsk, struct msghdr *msg, size_t dgram_len) { - return -EOPNOTSUPP; + struct virtio_vsock_pkt_info info = { + .op = VIRTIO_VSOCK_OP_RW, + .type = VIRTIO_VSOCK_TYPE_DGRAM, + .msg = msg, + .pkt_len = dgram_len, + .vsk = vsk, + .remote_cid = remote_addr->svm_cid, + .remote_port = remote_addr->svm_port, + }; + + return virtio_transport_send_pkt_info(vsk, &info); } EXPORT_SYMBOL_GPL(virtio_transport_dgram_enqueue); @@ -729,7 +866,6 @@ 
static int virtio_transport_reset_no_sock(const struct virtio_transport *t, virtio_transport_free_pkt(reply); return -ENOTCONN; } - return t->send_pkt(reply); } @@ -925,7 +1061,8 @@ virtio_transport_recv_enqueue(struct vsock_sock *vsk, /* If there is space in the last packet queued, we copy the * new packet in its buffer. */ - if (pkt->len <= last_pkt->buf_len - last_pkt->len) { + if (pkt->len <= last_pkt->buf_len - last_pkt->len && + pkt->hdr.type == VIRTIO_VSOCK_TYPE_STREAM) { memcpy(last_pkt->buf + last_pkt->len, pkt->buf, pkt->len); last_pkt->len += pkt->len; @@ -949,6 +1086,12 @@ virtio_transport_recv_connected(struct sock *sk, struct vsock_sock *vsk = vsock_sk(sk); int err = 0; + if (le16_to_cpu(pkt->hdr.type == VIRTIO_VSOCK_TYPE_DGRAM)) { + virtio_transport_recv_enqueue(vsk, pkt); + sk->sk_data_ready(sk); + return err; + } + switch (le16_to_cpu(pkt->hdr.op)) { case VIRTIO_VSOCK_OP_RW: virtio_transport_recv_enqueue(vsk, pkt); @@ -1121,7 +1264,8 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, le32_to_cpu(pkt->hdr.buf_alloc), le32_to_cpu(pkt->hdr.fwd_cnt)); - if (le16_to_cpu(pkt->hdr.type) != VIRTIO_VSOCK_TYPE_STREAM) { + if (le16_to_cpu(pkt->hdr.type) != VIRTIO_VSOCK_TYPE_STREAM && + le16_to_cpu(pkt->hdr.type) != VIRTIO_VSOCK_TYPE_DGRAM) { (void)virtio_transport_reset_no_sock(t, pkt); goto free_pkt; } @@ -1150,11 +1294,16 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, goto free_pkt; } - space_available = virtio_transport_space_update(sk, pkt); - /* Update CID in case it has changed after a transport reset event */ vsk->local_addr.svm_cid = dst.svm_cid; + if (sk->sk_type == SOCK_DGRAM) { + virtio_transport_recv_connected(sk, pkt); + goto out; + } + + space_available = virtio_transport_space_update(sk, pkt); + if (space_available) sk->sk_write_space(sk); @@ -1180,6 +1329,7 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, break; } +out: release_sock(sk); /* Release refcnt obtained when we fetched this socket out of the From patchwork Wed Jun 9 23:24:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jiang Wang ." 
X-Patchwork-Id: 12311441 From: Jiang Wang To: sgarzare@redhat.com Cc: virtualization@lists.linux-foundation.org, stefanha@redhat.com, mst@redhat.com, arseny.krasnov@kaspersky.com, jhansen@vmware.comments, cong.wang@bytedance.com, duanxiongchun@bytedance.com, xieyongji@bytedance.com, chaiwen.cc@bytedance.com, Jason Wang , "David S.
Miller" , Jakub Kicinski , Steven Rostedt , Ingo Molnar , Colin Ian King , Andra Paraschiv , Norbert Slusarek , Jeff Vander Stoep , Alexander Popov , kvm@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [RFC v1 3/6] vhost/vsock: add support for vhost dgram. Date: Wed, 9 Jun 2021 23:24:55 +0000 Message-Id: <20210609232501.171257-4-jiang.wang@bytedance.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20210609232501.171257-1-jiang.wang@bytedance.com> References: <20210609232501.171257-1-jiang.wang@bytedance.com> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-State: RFC This patch supports dgram on vhost side, including tx and rx. The vhost send packets asynchronously. Signed-off-by: Jiang Wang --- drivers/vhost/vsock.c | 199 +++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 173 insertions(+), 26 deletions(-) diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index 81d064601093..d366463be6d4 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -28,7 +28,10 @@ * small pkts. */ #define VHOST_VSOCK_PKT_WEIGHT 256 +#define VHOST_VSOCK_DGRM_MAX_PENDING_PKT 128 +/* Max wait time in busy poll in microseconds */ +#define VHOST_VSOCK_BUSY_POLL_TIMEOUT 20 enum { VHOST_VSOCK_FEATURES = VHOST_FEATURES | (1ULL << VIRTIO_F_ACCESS_PLATFORM) | @@ -45,7 +48,7 @@ static DEFINE_READ_MOSTLY_HASHTABLE(vhost_vsock_hash, 8); struct vhost_vsock { struct vhost_dev dev; - struct vhost_virtqueue vqs[2]; + struct vhost_virtqueue vqs[4]; /* Link to global vhost_vsock_hash, writes use vhost_vsock_mutex */ struct hlist_node hash; @@ -54,6 +57,11 @@ struct vhost_vsock { spinlock_t send_pkt_list_lock; struct list_head send_pkt_list; /* host->guest pending packets */ + spinlock_t dgram_send_pkt_list_lock; + struct list_head dgram_send_pkt_list; /* host->guest pending packets */ + struct vhost_work dgram_send_pkt_work; + int dgram_used; /*pending packets to be send */ + atomic_t queued_replies; u32 guest_cid; @@ -90,10 +98,22 @@ static void vhost_transport_do_send_pkt(struct vhost_vsock *vsock, struct vhost_virtqueue *vq) { - struct vhost_virtqueue *tx_vq = &vsock->vqs[VSOCK_VQ_TX]; + struct vhost_virtqueue *tx_vq; int pkts = 0, total_len = 0; bool added = false; bool restart_tx = false; + spinlock_t *lock; + struct list_head *send_pkt_list; + + if (vq == &vsock->vqs[VSOCK_VQ_RX]) { + tx_vq = &vsock->vqs[VSOCK_VQ_TX]; + lock = &vsock->send_pkt_list_lock; + send_pkt_list = &vsock->send_pkt_list; + } else { + tx_vq = &vsock->vqs[VSOCK_VQ_DGRAM_TX]; + lock = &vsock->dgram_send_pkt_list_lock; + send_pkt_list = &vsock->dgram_send_pkt_list; + } mutex_lock(&vq->mutex); @@ -113,36 +133,48 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock, size_t nbytes; size_t iov_len, payload_len; int head; + bool is_dgram = false; - spin_lock_bh(&vsock->send_pkt_list_lock); - if (list_empty(&vsock->send_pkt_list)) { - spin_unlock_bh(&vsock->send_pkt_list_lock); + spin_lock_bh(lock); + if (list_empty(send_pkt_list)) { + spin_unlock_bh(lock); vhost_enable_notify(&vsock->dev, vq); break; } - pkt = list_first_entry(&vsock->send_pkt_list, + pkt = list_first_entry(send_pkt_list, struct virtio_vsock_pkt, list); list_del_init(&pkt->list); - spin_unlock_bh(&vsock->send_pkt_list_lock); + spin_unlock_bh(lock); + + if (pkt->hdr.type == VIRTIO_VSOCK_TYPE_DGRAM) + is_dgram = true; head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov), &out, &in, NULL, NULL); if (head < 0) { - spin_lock_bh(&vsock->send_pkt_list_lock); - list_add(&pkt->list, 
&vsock->send_pkt_list); - spin_unlock_bh(&vsock->send_pkt_list_lock); + spin_lock_bh(lock); + list_add(&pkt->list, send_pkt_list); + spin_unlock_bh(lock); break; } if (head == vq->num) { - spin_lock_bh(&vsock->send_pkt_list_lock); - list_add(&pkt->list, &vsock->send_pkt_list); - spin_unlock_bh(&vsock->send_pkt_list_lock); + if (is_dgram) { + virtio_transport_free_pkt(pkt); + vq_err(vq, "Dgram virtqueue is full!"); + spin_lock_bh(lock); + vsock->dgram_used--; + spin_unlock_bh(lock); + break; + } + spin_lock_bh(lock); + list_add(&pkt->list, send_pkt_list); + spin_unlock_bh(lock); /* We cannot finish yet if more buffers snuck in while - * re-enabling notify. - */ + * re-enabling notify. + */ if (unlikely(vhost_enable_notify(&vsock->dev, vq))) { vhost_disable_notify(&vsock->dev, vq); continue; @@ -153,6 +185,12 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock, if (out) { virtio_transport_free_pkt(pkt); vq_err(vq, "Expected 0 output buffers, got %u\n", out); + if (is_dgram) { + spin_lock_bh(lock); + vsock->dgram_used--; + spin_unlock_bh(lock); + } + break; } @@ -160,6 +198,18 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock, if (iov_len < sizeof(pkt->hdr)) { virtio_transport_free_pkt(pkt); vq_err(vq, "Buffer len [%zu] too small\n", iov_len); + if (is_dgram) { + spin_lock_bh(lock); + vsock->dgram_used--; + spin_unlock_bh(lock); + } + break; + } + + if (iov_len < pkt->len - pkt->off && + vq == &vsock->vqs[VSOCK_VQ_DGRAM_RX]) { + virtio_transport_free_pkt(pkt); + vq_err(vq, "Buffer len [%zu] too small for dgram\n", iov_len); break; } @@ -179,6 +229,11 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock, if (nbytes != sizeof(pkt->hdr)) { virtio_transport_free_pkt(pkt); vq_err(vq, "Faulted on copying pkt hdr\n"); + if (is_dgram) { + spin_lock_bh(lock); + vsock->dgram_used--; + spin_unlock_bh(lock); + } break; } @@ -204,16 +259,17 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock, /* If we didn't send all the payload we can requeue the packet * to send it with the next available buffer. */ - if (pkt->off < pkt->len) { + if ((pkt->off < pkt->len) + && (vq == &vsock->vqs[VSOCK_VQ_RX])) { /* We are queueing the same virtio_vsock_pkt to handle * the remaining bytes, and we want to deliver it * to monitoring devices in the next iteration. 
*/ pkt->tap_delivered = false; - spin_lock_bh(&vsock->send_pkt_list_lock); - list_add(&pkt->list, &vsock->send_pkt_list); - spin_unlock_bh(&vsock->send_pkt_list_lock); + spin_lock_bh(lock); + list_add(&pkt->list, send_pkt_list); + spin_unlock_bh(lock); } else { if (pkt->reply) { int val; @@ -228,6 +284,11 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock, } virtio_transport_free_pkt(pkt); + if (is_dgram) { + spin_lock_bh(lock); + vsock->dgram_used--; + spin_unlock_bh(lock); + } } } while(likely(!vhost_exceeds_weight(vq, ++pkts, total_len))); if (added) @@ -251,11 +312,25 @@ static void vhost_transport_send_pkt_work(struct vhost_work *work) vhost_transport_do_send_pkt(vsock, vq); } +static void vhost_transport_dgram_send_pkt_work(struct vhost_work *work) +{ + struct vhost_virtqueue *vq; + struct vhost_vsock *vsock; + + vsock = container_of(work, struct vhost_vsock, dgram_send_pkt_work); + vq = &vsock->vqs[VSOCK_VQ_DGRAM_RX]; + + vhost_transport_do_send_pkt(vsock, vq); +} + static int vhost_transport_send_pkt(struct virtio_vsock_pkt *pkt) { struct vhost_vsock *vsock; int len = pkt->len; + spinlock_t *lock; + struct list_head *send_pkt_list; + struct vhost_work *work; rcu_read_lock(); @@ -267,14 +342,38 @@ vhost_transport_send_pkt(struct virtio_vsock_pkt *pkt) return -ENODEV; } + if (pkt->hdr.type == VIRTIO_VSOCK_TYPE_STREAM) { + lock = &vsock->send_pkt_list_lock; + send_pkt_list = &vsock->send_pkt_list; + work = &vsock->send_pkt_work; + } else if (pkt->hdr.type == VIRTIO_VSOCK_TYPE_DGRAM) { + lock = &vsock->dgram_send_pkt_list_lock; + send_pkt_list = &vsock->dgram_send_pkt_list; + work = &vsock->dgram_send_pkt_work; + } else { + rcu_read_unlock(); + virtio_transport_free_pkt(pkt); + return -EINVAL; + } + + if (pkt->reply) atomic_inc(&vsock->queued_replies); - spin_lock_bh(&vsock->send_pkt_list_lock); - list_add_tail(&pkt->list, &vsock->send_pkt_list); - spin_unlock_bh(&vsock->send_pkt_list_lock); + spin_lock_bh(lock); + if (pkt->hdr.type == VIRTIO_VSOCK_TYPE_DGRAM) { + if (vsock->dgram_used == VHOST_VSOCK_DGRM_MAX_PENDING_PKT) + len = -ENOMEM; + else { + vsock->dgram_used++; + list_add_tail(&pkt->list, send_pkt_list); + } + } else + list_add_tail(&pkt->list, send_pkt_list); - vhost_work_queue(&vsock->dev, &vsock->send_pkt_work); + spin_unlock_bh(lock); + + vhost_work_queue(&vsock->dev, work); rcu_read_unlock(); return len; @@ -355,7 +454,8 @@ vhost_vsock_alloc_pkt(struct vhost_virtqueue *vq, return NULL; } - if (le16_to_cpu(pkt->hdr.type) == VIRTIO_VSOCK_TYPE_STREAM) + if (le16_to_cpu(pkt->hdr.type) == VIRTIO_VSOCK_TYPE_STREAM + || le16_to_cpu(pkt->hdr.type) == VIRTIO_VSOCK_TYPE_DGRAM) pkt->len = le32_to_cpu(pkt->hdr.len); /* No payload */ @@ -442,6 +542,18 @@ static struct virtio_transport vhost_transport = { .send_pkt = vhost_transport_send_pkt, }; +static inline unsigned long busy_clock(void) +{ + return local_clock() >> 10; +} + +static bool vhost_can_busy_poll(unsigned long endtime) +{ + return likely(!need_resched() && !time_after(busy_clock(), endtime) && + !signal_pending(current)); +} + + static void vhost_vsock_handle_tx_kick(struct vhost_work *work) { struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue, @@ -452,6 +564,8 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work) int head, pkts = 0, total_len = 0; unsigned int out, in; bool added = false; + unsigned long busyloop_timeout = VHOST_VSOCK_BUSY_POLL_TIMEOUT; + unsigned long endtime; mutex_lock(&vq->mutex); @@ -461,11 +575,14 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work 
*work) if (!vq_meta_prefetch(vq)) goto out; + endtime = busy_clock() + busyloop_timeout; vhost_disable_notify(&vsock->dev, vq); + preempt_disable(); do { u32 len; - if (!vhost_vsock_more_replies(vsock)) { + if (vq == &vsock->vqs[VSOCK_VQ_TX] + && !vhost_vsock_more_replies(vsock)) { /* Stop tx until the device processes already * pending replies. Leave tx virtqueue * callbacks disabled. @@ -479,6 +596,11 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work) break; if (head == vq->num) { + if (vhost_can_busy_poll(endtime)) { + cpu_relax(); + continue; + } + if (unlikely(vhost_enable_notify(&vsock->dev, vq))) { vhost_disable_notify(&vsock->dev, vq); continue; @@ -510,6 +632,7 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work) total_len += len; added = true; } while(likely(!vhost_exceeds_weight(vq, ++pkts, total_len))); + preempt_enable(); no_more_replies: if (added) @@ -565,6 +688,7 @@ static int vhost_vsock_start(struct vhost_vsock *vsock) * let's kick the send worker to send them. */ vhost_work_queue(&vsock->dev, &vsock->send_pkt_work); + vhost_work_queue(&vsock->dev, &vsock->dgram_send_pkt_work); mutex_unlock(&vsock->dev.mutex); return 0; @@ -639,8 +763,14 @@ static int vhost_vsock_dev_open(struct inode *inode, struct file *file) vqs[VSOCK_VQ_TX] = &vsock->vqs[VSOCK_VQ_TX]; vqs[VSOCK_VQ_RX] = &vsock->vqs[VSOCK_VQ_RX]; + vqs[VSOCK_VQ_DGRAM_TX] = &vsock->vqs[VSOCK_VQ_DGRAM_TX]; + vqs[VSOCK_VQ_DGRAM_RX] = &vsock->vqs[VSOCK_VQ_DGRAM_RX]; vsock->vqs[VSOCK_VQ_TX].handle_kick = vhost_vsock_handle_tx_kick; vsock->vqs[VSOCK_VQ_RX].handle_kick = vhost_vsock_handle_rx_kick; + vsock->vqs[VSOCK_VQ_DGRAM_TX].handle_kick = + vhost_vsock_handle_tx_kick; + vsock->vqs[VSOCK_VQ_DGRAM_RX].handle_kick = + vhost_vsock_handle_rx_kick; vhost_dev_init(&vsock->dev, vqs, ARRAY_SIZE(vsock->vqs), UIO_MAXIOV, VHOST_VSOCK_PKT_WEIGHT, @@ -650,6 +780,11 @@ static int vhost_vsock_dev_open(struct inode *inode, struct file *file) spin_lock_init(&vsock->send_pkt_list_lock); INIT_LIST_HEAD(&vsock->send_pkt_list); vhost_work_init(&vsock->send_pkt_work, vhost_transport_send_pkt_work); + spin_lock_init(&vsock->dgram_send_pkt_list_lock); + INIT_LIST_HEAD(&vsock->dgram_send_pkt_list); + vhost_work_init(&vsock->dgram_send_pkt_work, + vhost_transport_dgram_send_pkt_work); + return 0; out: @@ -665,6 +800,7 @@ static void vhost_vsock_flush(struct vhost_vsock *vsock) if (vsock->vqs[i].handle_kick) vhost_poll_flush(&vsock->vqs[i].poll); vhost_work_flush(&vsock->dev, &vsock->send_pkt_work); + vhost_work_flush(&vsock->dev, &vsock->dgram_send_pkt_work); } static void vhost_vsock_reset_orphans(struct sock *sk) @@ -724,6 +860,17 @@ static int vhost_vsock_dev_release(struct inode *inode, struct file *file) } spin_unlock_bh(&vsock->send_pkt_list_lock); + spin_lock_bh(&vsock->dgram_send_pkt_list_lock); + while (!list_empty(&vsock->dgram_send_pkt_list)) { + struct virtio_vsock_pkt *pkt; + + pkt = list_first_entry(&vsock->dgram_send_pkt_list, + struct virtio_vsock_pkt, list); + list_del_init(&pkt->list); + virtio_transport_free_pkt(pkt); + } + spin_unlock_bh(&vsock->dgram_send_pkt_list_lock); + vhost_dev_cleanup(&vsock->dev); kfree(vsock->dev.vqs); vhost_vsock_free(vsock); @@ -906,7 +1053,7 @@ static int __init vhost_vsock_init(void) int ret; ret = vsock_core_register(&vhost_transport.transport, - VSOCK_TRANSPORT_F_H2G); + VSOCK_TRANSPORT_F_H2G | VSOCK_TRANSPORT_F_DGRAM); if (ret < 0) return ret; return misc_register(&vhost_vsock_misc); From patchwork Wed Jun 9 23:24:56 2021 Content-Type: text/plain; charset="utf-8" 
MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jiang Wang ." X-Patchwork-Id: 12311443 From: Jiang Wang To: sgarzare@redhat.com Cc: virtualization@lists.linux-foundation.org, stefanha@redhat.com, mst@redhat.com, arseny.krasnov@kaspersky.com, jhansen@vmware.comments, cong.wang@bytedance.com, duanxiongchun@bytedance.com, xieyongji@bytedance.com, chaiwen.cc@bytedance.com, Jason Wang , "David S.
Miller" , Jakub Kicinski , Steven Rostedt , Ingo Molnar , Colin Ian King , Norbert Slusarek , Andra Paraschiv , Alexander Popov , kvm@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [RFC v1 4/6] vsock_test: add tests for vsock dgram Date: Wed, 9 Jun 2021 23:24:56 +0000 Message-Id: <20210609232501.171257-5-jiang.wang@bytedance.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20210609232501.171257-1-jiang.wang@bytedance.com> References: <20210609232501.171257-1-jiang.wang@bytedance.com> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-State: RFC Added test cases for vsock dgram types. Signed-off-by: Jiang Wang --- tools/testing/vsock/util.c | 105 +++++++++++++++++++++ tools/testing/vsock/util.h | 4 + tools/testing/vsock/vsock_test.c | 195 +++++++++++++++++++++++++++++++++++++++ 3 files changed, 304 insertions(+) diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c index 93cbd6f603f9..59e5301b5380 100644 --- a/tools/testing/vsock/util.c +++ b/tools/testing/vsock/util.c @@ -238,6 +238,57 @@ void send_byte(int fd, int expected_ret, int flags) } } +/* Transmit one byte and check the return value. + * + * expected_ret: + * <0 Negative errno (for testing errors) + * 0 End-of-file + * 1 Success + */ +void sendto_byte(int fd, const struct sockaddr *dest_addr, int len, int expected_ret, + int flags) +{ + const uint8_t byte = 'A'; + ssize_t nwritten; + + timeout_begin(TIMEOUT); + do { + nwritten = sendto(fd, &byte, sizeof(byte), flags, dest_addr, + len); + timeout_check("write"); + } while (nwritten < 0 && errno == EINTR); + timeout_end(); + + if (expected_ret < 0) { + if (nwritten != -1) { + fprintf(stderr, "bogus sendto(2) return value %zd\n", + nwritten); + exit(EXIT_FAILURE); + } + if (errno != -expected_ret) { + perror("write"); + exit(EXIT_FAILURE); + } + return; + } + + if (nwritten < 0) { + perror("write"); + exit(EXIT_FAILURE); + } + if (nwritten == 0) { + if (expected_ret == 0) + return; + + fprintf(stderr, "unexpected EOF while sending byte\n"); + exit(EXIT_FAILURE); + } + if (nwritten != sizeof(byte)) { + fprintf(stderr, "bogus sendto(2) return value %zd\n", nwritten); + exit(EXIT_FAILURE); + } +} + /* Receive one byte and check the return value. * * expected_ret: @@ -291,6 +342,60 @@ void recv_byte(int fd, int expected_ret, int flags) } } +/* Receive one byte and check the return value. + * + * expected_ret: + * <0 Negative errno (for testing errors) + * 0 End-of-file + * 1 Success + */ +void recvfrom_byte(int fd, struct sockaddr *src_addr, socklen_t *addrlen, + int expected_ret, int flags) +{ + uint8_t byte; + ssize_t nread; + + timeout_begin(TIMEOUT); + do { + nread = recvfrom(fd, &byte, sizeof(byte), flags, src_addr, addrlen); + timeout_check("read"); + } while (nread < 0 && errno == EINTR); + timeout_end(); + + if (expected_ret < 0) { + if (nread != -1) { + fprintf(stderr, "bogus recvfrom(2) return value %zd\n", + nread); + exit(EXIT_FAILURE); + } + if (errno != -expected_ret) { + perror("read"); + exit(EXIT_FAILURE); + } + return; + } + + if (nread < 0) { + perror("read"); + exit(EXIT_FAILURE); + } + if (nread == 0) { + if (expected_ret == 0) + return; + + fprintf(stderr, "unexpected EOF while receiving byte\n"); + exit(EXIT_FAILURE); + } + if (nread != sizeof(byte)) { + fprintf(stderr, "bogus recvfrom(2) return value %zd\n", nread); + exit(EXIT_FAILURE); + } + if (byte != 'A') { + fprintf(stderr, "unexpected byte read %c\n", byte); + exit(EXIT_FAILURE); + } +} + /* Run test cases. 
The program terminates if a failure occurs. */ void run_tests(const struct test_case *test_cases, const struct test_opts *opts) diff --git a/tools/testing/vsock/util.h b/tools/testing/vsock/util.h index e53dd09d26d9..cea1acd094c6 100644 --- a/tools/testing/vsock/util.h +++ b/tools/testing/vsock/util.h @@ -40,7 +40,11 @@ int vsock_stream_accept(unsigned int cid, unsigned int port, struct sockaddr_vm *clientaddrp); void vsock_wait_remote_close(int fd); void send_byte(int fd, int expected_ret, int flags); +void sendto_byte(int fd, const struct sockaddr *dest_addr, int len, int expected_ret, + int flags); void recv_byte(int fd, int expected_ret, int flags); +void recvfrom_byte(int fd, struct sockaddr *src_addr, socklen_t *addrlen, + int expected_ret, int flags); void run_tests(const struct test_case *test_cases, const struct test_opts *opts); void list_tests(const struct test_case *test_cases); diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c index 5a4fb80fa832..9dd9f004b7df 100644 --- a/tools/testing/vsock/vsock_test.c +++ b/tools/testing/vsock/vsock_test.c @@ -197,6 +197,115 @@ static void test_stream_server_close_server(const struct test_opts *opts) close(fd); } +static void test_dgram_sendto_client(const struct test_opts *opts) +{ + union { + struct sockaddr sa; + struct sockaddr_vm svm; + } addr = { + .svm = { + .svm_family = AF_VSOCK, + .svm_port = 1234, + .svm_cid = opts->peer_cid, + }, + }; + int fd; + + /* Wait for the server to be ready */ + control_expectln("BIND"); + + fd = socket(AF_VSOCK, SOCK_DGRAM, 0); + if (fd < 0) { + perror("socket"); + exit(EXIT_FAILURE); + } + + sendto_byte(fd, &addr.sa, sizeof(addr.svm), 1, 0); + + /* Notify the server that the client has finished */ + control_writeln("DONE"); + + close(fd); +} + +static void test_dgram_sendto_server(const struct test_opts *opts) +{ + union { + struct sockaddr sa; + struct sockaddr_vm svm; + } addr = { + .svm = { + .svm_family = AF_VSOCK, + .svm_port = 1234, + .svm_cid = VMADDR_CID_ANY, + }, + }; + int fd; + int len = sizeof(addr.sa); + + fd = socket(AF_VSOCK, SOCK_DGRAM, 0); + + if (bind(fd, &addr.sa, sizeof(addr.svm)) < 0) { + perror("bind"); + exit(EXIT_FAILURE); + } + + /* Notify the client that the server is ready */ + control_writeln("BIND"); + + recvfrom_byte(fd, &addr.sa, &len, 1, 0); + printf("got message from cid:%d, port %u ", addr.svm.svm_cid, + addr.svm.svm_port); + + /* Wait for the client to finish */ + control_expectln("DONE"); + + close(fd); +} + +static void test_dgram_connect_client(const struct test_opts *opts) +{ + union { + struct sockaddr sa; + struct sockaddr_vm svm; + } addr = { + .svm = { + .svm_family = AF_VSOCK, + .svm_port = 1234, + .svm_cid = opts->peer_cid, + }, + }; + int fd; + int ret; + + /* Wait for the server to be ready */ + control_expectln("BIND"); + + fd = socket(AF_VSOCK, SOCK_DGRAM, 0); + if (fd < 0) { + perror("bind"); + exit(EXIT_FAILURE); + } + + ret = connect(fd, &addr.sa, sizeof(addr.svm)); + if (ret < 0) { + perror("connect"); + exit(EXIT_FAILURE); + } + + send_byte(fd, 1, 0); + + /* Notify the server that the client has finished */ + control_writeln("DONE"); + + close(fd); +} + +static void test_dgram_connect_server(const struct test_opts *opts) +{ + test_dgram_sendto_server(opts); +} + /* With the standard socket sizes, VMCI is able to support about 100 * concurrent stream connections. 
*/ @@ -250,6 +359,77 @@ static void test_stream_multiconn_server(const struct test_opts *opts) close(fds[i]); } +static void test_dgram_multiconn_client(const struct test_opts *opts) +{ + int fds[MULTICONN_NFDS]; + int i; + union { + struct sockaddr sa; + struct sockaddr_vm svm; + } addr = { + .svm = { + .svm_family = AF_VSOCK, + .svm_port = 1234, + .svm_cid = opts->peer_cid, + }, + }; + + /* Wait for the server to be ready */ + control_expectln("BIND"); + + for (i = 0; i < MULTICONN_NFDS; i++) { + fds[i] = socket(AF_VSOCK, SOCK_DGRAM, 0); + if (fds[i] < 0) { + perror("socket"); + exit(EXIT_FAILURE); + } + } + + for (i = 0; i < MULTICONN_NFDS; i++) + sendto_byte(fds[i], &addr.sa, sizeof(addr.svm), 1, 0); + + /* Notify the server that the client has finished */ + control_writeln("DONE"); + + for (i = 0; i < MULTICONN_NFDS; i++) + close(fds[i]); +} + +static void test_dgram_multiconn_server(const struct test_opts *opts) +{ + union { + struct sockaddr sa; + struct sockaddr_vm svm; + } addr = { + .svm = { + .svm_family = AF_VSOCK, + .svm_port = 1234, + .svm_cid = VMADDR_CID_ANY, + }, + }; + int fd; + int len = sizeof(addr.sa); + int i; + + fd = socket(AF_VSOCK, SOCK_DGRAM, 0); + + if (bind(fd, &addr.sa, sizeof(addr.svm)) < 0) { + perror("bind"); + exit(EXIT_FAILURE); + } + + /* Notify the client that the server is ready */ + control_writeln("BIND"); + + for (i = 0; i < MULTICONN_NFDS; i++) + recvfrom_byte(fd, &addr.sa, &len, 1, 0); + + /* Wait for the client to finish */ + control_expectln("DONE"); + + close(fd); +} + static void test_stream_msg_peek_client(const struct test_opts *opts) { int fd; @@ -309,6 +489,21 @@ static struct test_case test_cases[] = { .run_client = test_stream_msg_peek_client, .run_server = test_stream_msg_peek_server, }, + { + .name = "SOCK_DGRAM client close", + .run_client = test_dgram_sendto_client, + .run_server = test_dgram_sendto_server, + }, + { + .name = "SOCK_DGRAM client connect", + .run_client = test_dgram_connect_client, + .run_server = test_dgram_connect_server, + }, + { + .name = "SOCK_DGRAM multiple connections", + .run_client = test_dgram_multiconn_client, + .run_server = test_dgram_multiconn_server, + }, {}, }; From patchwork Wed Jun 9 23:24:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jiang Wang ." 
X-Patchwork-Id: 12311439 From: Jiang Wang To: sgarzare@redhat.com Cc: virtualization@lists.linux-foundation.org, stefanha@redhat.com, mst@redhat.com, arseny.krasnov@kaspersky.com, jhansen@vmware.com, cong.wang@bytedance.com, duanxiongchun@bytedance.com, xieyongji@bytedance.com, chaiwen.cc@bytedance.com, Jason Wang , "David S.
Miller" , Jakub Kicinski , Steven Rostedt , Ingo Molnar , Andra Paraschiv , Norbert Slusarek , Colin Ian King , Lu Wei , Alexander Popov , kvm@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [RFC v1 5/6] vhost/vsock: add kconfig for vhost dgram support Date: Wed, 9 Jun 2021 23:24:57 +0000 Message-Id: <20210609232501.171257-6-jiang.wang@bytedance.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20210609232501.171257-1-jiang.wang@bytedance.com> References: <20210609232501.171257-1-jiang.wang@bytedance.com> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-State: RFC Also change number of vqs according to the config Signed-off-by: Jiang Wang --- drivers/vhost/Kconfig | 8 ++++++++ drivers/vhost/vsock.c | 11 ++++++++--- 2 files changed, 16 insertions(+), 3 deletions(-) diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig index 587fbae06182..d63fffee6007 100644 --- a/drivers/vhost/Kconfig +++ b/drivers/vhost/Kconfig @@ -61,6 +61,14 @@ config VHOST_VSOCK To compile this driver as a module, choose M here: the module will be called vhost_vsock. +config VHOST_VSOCK_DGRAM + bool "vhost vsock datagram sockets support" + depends on VHOST_VSOCK + default n + help + Enable vhost-vsock to support datagram types vsock. The QEMU + and the guest must support datagram types too to use it. + config VHOST_VDPA tristate "Vhost driver for vDPA-based backend" depends on EVENTFD diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index d366463be6d4..12ca1dc0268f 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -48,7 +48,11 @@ static DEFINE_READ_MOSTLY_HASHTABLE(vhost_vsock_hash, 8); struct vhost_vsock { struct vhost_dev dev; +#ifdef CONFIG_VHOST_VSOCK_DGRAM struct vhost_virtqueue vqs[4]; +#else + struct vhost_virtqueue vqs[2]; +#endif /* Link to global vhost_vsock_hash, writes use vhost_vsock_mutex */ struct hlist_node hash; @@ -763,15 +767,16 @@ static int vhost_vsock_dev_open(struct inode *inode, struct file *file) vqs[VSOCK_VQ_TX] = &vsock->vqs[VSOCK_VQ_TX]; vqs[VSOCK_VQ_RX] = &vsock->vqs[VSOCK_VQ_RX]; - vqs[VSOCK_VQ_DGRAM_TX] = &vsock->vqs[VSOCK_VQ_DGRAM_TX]; - vqs[VSOCK_VQ_DGRAM_RX] = &vsock->vqs[VSOCK_VQ_DGRAM_RX]; vsock->vqs[VSOCK_VQ_TX].handle_kick = vhost_vsock_handle_tx_kick; vsock->vqs[VSOCK_VQ_RX].handle_kick = vhost_vsock_handle_rx_kick; +#ifdef CONFIG_VHOST_VSOCK_DGRAM + vqs[VSOCK_VQ_DGRAM_TX] = &vsock->vqs[VSOCK_VQ_DGRAM_TX]; + vqs[VSOCK_VQ_DGRAM_RX] = &vsock->vqs[VSOCK_VQ_DGRAM_RX]; vsock->vqs[VSOCK_VQ_DGRAM_TX].handle_kick = vhost_vsock_handle_tx_kick; vsock->vqs[VSOCK_VQ_DGRAM_RX].handle_kick = vhost_vsock_handle_rx_kick; - +#endif vhost_dev_init(&vsock->dev, vqs, ARRAY_SIZE(vsock->vqs), UIO_MAXIOV, VHOST_VSOCK_PKT_WEIGHT, VHOST_VSOCK_WEIGHT, true, NULL); From patchwork Wed Jun 9 23:24:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jiang Wang ." 
X-Patchwork-Id: 12311445 X-Patchwork-Delegate: kuba@kernel.org From: Jiang Wang To: sgarzare@redhat.com Cc: virtualization@lists.linux-foundation.org, stefanha@redhat.com, mst@redhat.com, arseny.krasnov@kaspersky.com, jhansen@vmware.com, cong.wang@bytedance.com, duanxiongchun@bytedance.com, xieyongji@bytedance.com, chaiwen.cc@bytedance.com, Jason Wang , "David S.
Miller" , Jakub Kicinski , Steven Rostedt , Ingo Molnar , Colin Ian King , Norbert Slusarek , Andra Paraschiv , Lu Wei , Alexander Popov , kvm@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [RFC v1 6/6] virtio/vsock: add sysfs for rx buf len for dgram Date: Wed, 9 Jun 2021 23:24:58 +0000 Message-Id: <20210609232501.171257-7-jiang.wang@bytedance.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20210609232501.171257-1-jiang.wang@bytedance.com> References: <20210609232501.171257-1-jiang.wang@bytedance.com> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org X-Patchwork-State: RFC Make rx buf len configurable via sysfs Signed-off-by: Jiang Wang --- net/vmw_vsock/virtio_transport.c | 37 +++++++++++++++++++++++++++++++++++-- 1 file changed, 35 insertions(+), 2 deletions(-) diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index cf47aadb0c34..2e4dd9c48472 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -29,6 +29,14 @@ static struct virtio_vsock __rcu *the_virtio_vsock; static struct virtio_vsock *the_virtio_vsock_dgram; static DEFINE_MUTEX(the_virtio_vsock_mutex); /* protects the_virtio_vsock */ +static int rx_buf_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE; +static struct kobject *kobj_ref; +static ssize_t sysfs_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf); +static ssize_t sysfs_store(struct kobject *kobj, + struct kobj_attribute *attr, const char *buf, size_t count); +static struct kobj_attribute rxbuf_attr = __ATTR(rx_buf_value, 0660, sysfs_show, sysfs_store); + struct virtio_vsock { struct virtio_device *vdev; struct virtqueue **vqs; @@ -360,7 +368,7 @@ virtio_transport_cancel_pkt(struct vsock_sock *vsk) static void virtio_vsock_rx_fill(struct virtio_vsock *vsock, bool is_dgram) { - int buf_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE; + int buf_len = rx_buf_len; struct virtio_vsock_pkt *pkt; struct scatterlist hdr, buf, *sgs[2]; struct virtqueue *vq; @@ -1003,6 +1011,22 @@ static struct virtio_driver virtio_vsock_driver = { .remove = virtio_vsock_remove, }; +static ssize_t sysfs_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf) +{ + return sprintf(buf, "%d", rx_buf_len); +} + +static ssize_t sysfs_store(struct kobject *kobj, + struct kobj_attribute *attr, const char *buf, size_t count) +{ + if (kstrtou32(buf, 0, &rx_buf_len) < 0) + return -EINVAL; + if (rx_buf_len < 1024) + rx_buf_len = 1024; + return count; +} + static int __init virtio_vsock_init(void) { int ret; @@ -1020,8 +1044,17 @@ static int __init virtio_vsock_init(void) if (ret) goto out_vci; - return 0; + kobj_ref = kobject_create_and_add("vsock", kernel_kobj); + /*Creating sysfs file for etx_value*/ + ret = sysfs_create_file(kobj_ref, &rxbuf_attr.attr); + if (ret) + goto out_sysfs; + + return 0; +out_sysfs: + kobject_put(kobj_ref); + sysfs_remove_file(kernel_kobj, &rxbuf_attr.attr); out_vci: vsock_core_unregister(&virtio_transport.transport); out_wq: