From patchwork Thu Jan 16 17:24:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stefano Garzarella X-Patchwork-Id: 11337563 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 677EE1398 for ; Thu, 16 Jan 2020 18:33:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 31E7F2073A for ; Thu, 16 Jan 2020 18:33:47 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="MPa5EpIF" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2436770AbgAPSdp (ORCPT ); Thu, 16 Jan 2020 13:33:45 -0500 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:47514 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2387537AbgAPRYt (ORCPT ); Thu, 16 Jan 2020 12:24:49 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1579195488; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=+VGSKYYAvkNoF8+cprFtOeKETpEfAOiLdOWkQthdSJs=; b=MPa5EpIFiKBNMhzGSOiCBcHx2AWuH0CcnIXp0q9/uccj3wffJacmneGYv/hQ8B0cA2mPA5 pGAwUgZpR64Nnt82YvTUjDPcJJqWwa6gzxAr+rr3LO/CZVDm+jfc4pY2o426CYdnFVdl8S n9pp3GVzTZdWN8rRP+fZRj5hM0QVuC4= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-97-UUN55USqMMi3d6FObijG9Q-1; Thu, 16 Jan 2020 12:24:47 -0500 X-MC-Unique: UUN55USqMMi3d6FObijG9Q-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 0951A8010C7; Thu, 16 Jan 2020 17:24:45 +0000 (UTC) Received: from steredhat.redhat.com (ovpn-117-242.ams2.redhat.com [10.36.117.242]) by smtp.corp.redhat.com (Postfix) with ESMTP id 88F2F5C1D8; Thu, 16 Jan 2020 17:24:42 +0000 (UTC) From: Stefano Garzarella To: davem@davemloft.net, netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Jorgen Hansen , Jason Wang , kvm@vger.kernel.org, Stefan Hajnoczi , Stefano Garzarella , virtualization@lists.linux-foundation.org, linux-hyperv@vger.kernel.org, "Michael S. Tsirkin" , Dexuan Cui Subject: [PATCH net-next 1/3] vsock: add network namespace support Date: Thu, 16 Jan 2020 18:24:26 +0100 Message-Id: <20200116172428.311437-2-sgarzare@redhat.com> In-Reply-To: <20200116172428.311437-1-sgarzare@redhat.com> References: <20200116172428.311437-1-sgarzare@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This patch adds a check of the "net" assigned to a socket during the vsock_find_bound_socket() and vsock_find_connected_socket() to support network namespace, allowing to share the same address (cid, port) across different network namespaces. This patch adds 'netns' module param to enable this new feature (disabled by default), because it changes vsock's behavior with network namespaces and could break existing applications. G2H transports will use the default network namepsace (init_net). H2G transports can use different network namespace for different VMs. This patch uses default network namepsace (init_net) in all transports. Signed-off-by: Stefano Garzarella --- RFC -> v1 * added 'netns' module param * added 'vsock_net_eq()' to check the "net" assigned to a socket only when 'netns' support is enabled --- include/net/af_vsock.h | 7 +++-- net/vmw_vsock/af_vsock.c | 41 +++++++++++++++++++------ net/vmw_vsock/hyperv_transport.c | 5 +-- net/vmw_vsock/virtio_transport_common.c | 5 +-- net/vmw_vsock/vmci_transport.c | 5 +-- 5 files changed, 46 insertions(+), 17 deletions(-) diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h index b1c717286993..015913601fad 100644 --- a/include/net/af_vsock.h +++ b/include/net/af_vsock.h @@ -193,13 +193,16 @@ void vsock_enqueue_accept(struct sock *listener, struct sock *connected); void vsock_insert_connected(struct vsock_sock *vsk); void vsock_remove_bound(struct vsock_sock *vsk); void vsock_remove_connected(struct vsock_sock *vsk); -struct sock *vsock_find_bound_socket(struct sockaddr_vm *addr); +struct sock *vsock_find_bound_socket(struct sockaddr_vm *addr, struct net *net); struct sock *vsock_find_connected_socket(struct sockaddr_vm *src, - struct sockaddr_vm *dst); + struct sockaddr_vm *dst, + struct net *net); void vsock_remove_sock(struct vsock_sock *vsk); void vsock_for_each_connected_socket(void (*fn)(struct sock *sk)); int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk); bool vsock_find_cid(unsigned int cid); +bool vsock_net_eq(const struct net *net1, const struct net *net2); +struct net *vsock_default_net(void); /**** TAP ****/ diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 9c5b2a91baad..457ccd677756 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -140,6 +140,10 @@ static const struct vsock_transport *transport_dgram; static const struct vsock_transport *transport_local; static DEFINE_MUTEX(vsock_register_mutex); +static bool netns; +module_param(netns, bool, 0644); +MODULE_PARM_DESC(netns, "Enable network namespace support"); + /**** UTILS ****/ /* Each bound VSocket is stored in the bind hash table and each connected @@ -226,15 +230,18 @@ static void __vsock_remove_connected(struct vsock_sock *vsk) sock_put(&vsk->sk); } -static struct sock *__vsock_find_bound_socket(struct sockaddr_vm *addr) +static struct sock *__vsock_find_bound_socket(struct sockaddr_vm *addr, + struct net *net) { struct vsock_sock *vsk; list_for_each_entry(vsk, vsock_bound_sockets(addr), bound_table) { - if (vsock_addr_equals_addr(addr, &vsk->local_addr)) + if (vsock_addr_equals_addr(addr, &vsk->local_addr) && + vsock_net_eq(net, sock_net(sk_vsock(vsk)))) return sk_vsock(vsk); if (addr->svm_port == vsk->local_addr.svm_port && + vsock_net_eq(net, sock_net(sk_vsock(vsk))) && (vsk->local_addr.svm_cid == VMADDR_CID_ANY || addr->svm_cid == VMADDR_CID_ANY)) return sk_vsock(vsk); @@ -244,13 +251,15 @@ static struct sock *__vsock_find_bound_socket(struct sockaddr_vm *addr) } static struct sock *__vsock_find_connected_socket(struct sockaddr_vm *src, - struct sockaddr_vm *dst) + struct sockaddr_vm *dst, + struct net *net) { struct vsock_sock *vsk; list_for_each_entry(vsk, vsock_connected_sockets(src, dst), connected_table) { if (vsock_addr_equals_addr(src, &vsk->remote_addr) && + vsock_net_eq(net, sock_net(sk_vsock(vsk))) && dst->svm_port == vsk->local_addr.svm_port) { return sk_vsock(vsk); } @@ -295,12 +304,12 @@ void vsock_remove_connected(struct vsock_sock *vsk) } EXPORT_SYMBOL_GPL(vsock_remove_connected); -struct sock *vsock_find_bound_socket(struct sockaddr_vm *addr) +struct sock *vsock_find_bound_socket(struct sockaddr_vm *addr, struct net *net) { struct sock *sk; spin_lock_bh(&vsock_table_lock); - sk = __vsock_find_bound_socket(addr); + sk = __vsock_find_bound_socket(addr, net); if (sk) sock_hold(sk); @@ -311,12 +320,13 @@ struct sock *vsock_find_bound_socket(struct sockaddr_vm *addr) EXPORT_SYMBOL_GPL(vsock_find_bound_socket); struct sock *vsock_find_connected_socket(struct sockaddr_vm *src, - struct sockaddr_vm *dst) + struct sockaddr_vm *dst, + struct net *net) { struct sock *sk; spin_lock_bh(&vsock_table_lock); - sk = __vsock_find_connected_socket(src, dst); + sk = __vsock_find_connected_socket(src, dst, net); if (sk) sock_hold(sk); @@ -488,6 +498,18 @@ bool vsock_find_cid(unsigned int cid) } EXPORT_SYMBOL_GPL(vsock_find_cid); +bool vsock_net_eq(const struct net *net1, const struct net *net2) +{ + return !netns || net_eq(net1, net2); +} +EXPORT_SYMBOL_GPL(vsock_net_eq); + +struct net *vsock_default_net(void) +{ + return &init_net; +} +EXPORT_SYMBOL_GPL(vsock_default_net); + static struct sock *vsock_dequeue_accept(struct sock *listener) { struct vsock_sock *vlistener; @@ -586,6 +608,7 @@ static int __vsock_bind_stream(struct vsock_sock *vsk, { static u32 port; struct sockaddr_vm new_addr; + struct net *net = sock_net(sk_vsock(vsk)); if (!port) port = LAST_RESERVED_PORT + 1 + @@ -603,7 +626,7 @@ static int __vsock_bind_stream(struct vsock_sock *vsk, new_addr.svm_port = port++; - if (!__vsock_find_bound_socket(&new_addr)) { + if (!__vsock_find_bound_socket(&new_addr, net)) { found = true; break; } @@ -620,7 +643,7 @@ static int __vsock_bind_stream(struct vsock_sock *vsk, return -EACCES; } - if (__vsock_find_bound_socket(&new_addr)) + if (__vsock_find_bound_socket(&new_addr, net)) return -EADDRINUSE; } diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c index b3bdae74c243..237c53316d70 100644 --- a/net/vmw_vsock/hyperv_transport.c +++ b/net/vmw_vsock/hyperv_transport.c @@ -201,7 +201,8 @@ static void hvs_remote_addr_init(struct sockaddr_vm *remote, remote->svm_port = host_ephemeral_port++; - sk = vsock_find_connected_socket(remote, local); + sk = vsock_find_connected_socket(remote, local, + vsock_default_net()); if (!sk) { /* Found an available ephemeral port */ return; @@ -350,7 +351,7 @@ static void hvs_open_connection(struct vmbus_channel *chan) return; hvs_addr_init(&addr, conn_from_host ? if_type : if_instance); - sk = vsock_find_bound_socket(&addr); + sk = vsock_find_bound_socket(&addr, vsock_default_net()); if (!sk) return; diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index d9f0c9c5425a..cecdfd91ed00 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -1088,6 +1088,7 @@ virtio_transport_recv_listen(struct sock *sk, struct virtio_vsock_pkt *pkt, void virtio_transport_recv_pkt(struct virtio_transport *t, struct virtio_vsock_pkt *pkt) { + struct net *net = vsock_default_net(); struct sockaddr_vm src, dst; struct vsock_sock *vsk; struct sock *sk; @@ -1115,9 +1116,9 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, /* The socket must be in connected or bound table * otherwise send reset back */ - sk = vsock_find_connected_socket(&src, &dst); + sk = vsock_find_connected_socket(&src, &dst, net); if (!sk) { - sk = vsock_find_bound_socket(&dst); + sk = vsock_find_bound_socket(&dst, net); if (!sk) { (void)virtio_transport_reset_no_sock(t, pkt); goto free_pkt; diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c index 4b8b1150a738..3ad15d51b30b 100644 --- a/net/vmw_vsock/vmci_transport.c +++ b/net/vmw_vsock/vmci_transport.c @@ -669,6 +669,7 @@ static bool vmci_transport_stream_allow(u32 cid, u32 port) static int vmci_transport_recv_stream_cb(void *data, struct vmci_datagram *dg) { + struct net *net = vsock_default_net(); struct sock *sk; struct sockaddr_vm dst; struct sockaddr_vm src; @@ -702,9 +703,9 @@ static int vmci_transport_recv_stream_cb(void *data, struct vmci_datagram *dg) vsock_addr_init(&src, pkt->dg.src.context, pkt->src_port); vsock_addr_init(&dst, pkt->dg.dst.context, pkt->dst_port); - sk = vsock_find_connected_socket(&src, &dst); + sk = vsock_find_connected_socket(&src, &dst, net); if (!sk) { - sk = vsock_find_bound_socket(&dst); + sk = vsock_find_bound_socket(&dst, net); if (!sk) { /* We could not find a socket for this specified * address. If this packet is a RST, we just drop it. From patchwork Thu Jan 16 17:24:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stefano Garzarella X-Patchwork-Id: 11337561 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D4D5E138D for ; Thu, 16 Jan 2020 18:33:33 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id AAD772073A for ; Thu, 16 Jan 2020 18:33:33 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="LHtIFCAH" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2392006AbgAPSdc (ORCPT ); Thu, 16 Jan 2020 13:33:32 -0500 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:31775 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2403821AbgAPRYw (ORCPT ); Thu, 16 Jan 2020 12:24:52 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1579195491; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=cbf0JDACQ1z0Tg+uGTmFGjFNBjOD4cliWg8yJbi8qUQ=; b=LHtIFCAHg3/7qcTa8Sz08Q3rUJX4E4DGcN6dMo/xmJbK8VF1BudDd6fvU7D2yxrPipquNI GIa2Oqp8X/ZbBcOLrMrgbEMIPzM+7iU30tTHtmVlcKO/85W5c6Kq/ge3kJwNWPpYL+WDDn yTrwZXK5Liq5pFtOQBqzWXUCLXfVXck= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-278-bBiplUCvO5uGdLfoXn4TUQ-1; Thu, 16 Jan 2020 12:24:49 -0500 X-MC-Unique: bBiplUCvO5uGdLfoXn4TUQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id CFB70800D4C; Thu, 16 Jan 2020 17:24:47 +0000 (UTC) Received: from steredhat.redhat.com (ovpn-117-242.ams2.redhat.com [10.36.117.242]) by smtp.corp.redhat.com (Postfix) with ESMTP id 593AB5C1D8; Thu, 16 Jan 2020 17:24:45 +0000 (UTC) From: Stefano Garzarella To: davem@davemloft.net, netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Jorgen Hansen , Jason Wang , kvm@vger.kernel.org, Stefan Hajnoczi , Stefano Garzarella , virtualization@lists.linux-foundation.org, linux-hyperv@vger.kernel.org, "Michael S. Tsirkin" , Dexuan Cui Subject: [PATCH net-next 2/3] vsock/virtio_transport_common: handle netns of received packets Date: Thu, 16 Jan 2020 18:24:27 +0100 Message-Id: <20200116172428.311437-3-sgarzare@redhat.com> In-Reply-To: <20200116172428.311437-1-sgarzare@redhat.com> References: <20200116172428.311437-1-sgarzare@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This patch allows transports that use virtio_transport_common to specify the network namespace where a received packet is to be delivered. virtio_transport and vhost_transport, for now, use the default network namespace. vsock_loopback uses the same network namespace of the transmitter. Signed-off-by: Stefano Garzarella --- drivers/vhost/vsock.c | 1 + include/linux/virtio_vsock.h | 2 ++ net/vmw_vsock/virtio_transport.c | 2 ++ net/vmw_vsock/virtio_transport_common.c | 13 ++++++++++--- 4 files changed, 15 insertions(+), 3 deletions(-) diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index c2d7d57e98cf..f1d39939d5e4 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -474,6 +474,7 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work) continue; } + pkt->net = vsock_default_net(); len = pkt->len; /* Deliver to monitoring devices all received packets */ diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h index 71c81e0dc8f2..d4fc93e6e03e 100644 --- a/include/linux/virtio_vsock.h +++ b/include/linux/virtio_vsock.h @@ -43,6 +43,7 @@ struct virtio_vsock_pkt { struct list_head list; /* socket refcnt not held, only use for cancellation */ struct vsock_sock *vsk; + struct net *net; void *buf; u32 buf_len; u32 len; @@ -54,6 +55,7 @@ struct virtio_vsock_pkt_info { u32 remote_cid, remote_port; struct vsock_sock *vsk; struct msghdr *msg; + struct net *net; u32 pkt_len; u16 type; u16 op; diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index dfbaf6bd8b1c..fb03a1535c21 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -527,6 +527,8 @@ static void virtio_transport_rx_work(struct work_struct *work) } pkt->len = len - sizeof(pkt->hdr); + pkt->net = vsock_default_net(); + virtio_transport_deliver_tap_pkt(pkt); virtio_transport_recv_pkt(&virtio_transport, pkt); } diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index cecdfd91ed00..6402dea62e45 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -63,6 +63,7 @@ virtio_transport_alloc_pkt(struct virtio_vsock_pkt_info *info, pkt->hdr.len = cpu_to_le32(len); pkt->reply = info->reply; pkt->vsk = info->vsk; + pkt->net = info->net; if (info->msg && len > 0) { pkt->buf = kmalloc(len, GFP_KERNEL); @@ -273,6 +274,7 @@ static int virtio_transport_send_credit_update(struct vsock_sock *vsk, .op = VIRTIO_VSOCK_OP_CREDIT_UPDATE, .type = type, .vsk = vsk, + .net = sock_net(sk_vsock(vsk)), }; return virtio_transport_send_pkt_info(vsk, &info); @@ -622,6 +624,7 @@ int virtio_transport_connect(struct vsock_sock *vsk) .op = VIRTIO_VSOCK_OP_REQUEST, .type = VIRTIO_VSOCK_TYPE_STREAM, .vsk = vsk, + .net = sock_net(sk_vsock(vsk)), }; return virtio_transport_send_pkt_info(vsk, &info); @@ -638,6 +641,7 @@ int virtio_transport_shutdown(struct vsock_sock *vsk, int mode) (mode & SEND_SHUTDOWN ? VIRTIO_VSOCK_SHUTDOWN_SEND : 0), .vsk = vsk, + .net = sock_net(sk_vsock(vsk)), }; return virtio_transport_send_pkt_info(vsk, &info); @@ -665,6 +669,7 @@ virtio_transport_stream_enqueue(struct vsock_sock *vsk, .msg = msg, .pkt_len = len, .vsk = vsk, + .net = sock_net(sk_vsock(vsk)), }; return virtio_transport_send_pkt_info(vsk, &info); @@ -687,6 +692,7 @@ static int virtio_transport_reset(struct vsock_sock *vsk, .type = VIRTIO_VSOCK_TYPE_STREAM, .reply = !!pkt, .vsk = vsk, + .net = sock_net(sk_vsock(vsk)), }; /* Send RST only if the original pkt is not a RST pkt */ @@ -707,6 +713,7 @@ static int virtio_transport_reset_no_sock(const struct virtio_transport *t, .op = VIRTIO_VSOCK_OP_RST, .type = le16_to_cpu(pkt->hdr.type), .reply = true, + .net = pkt->net, }; /* Send RST only if the original pkt is not a RST pkt */ @@ -991,6 +998,7 @@ virtio_transport_send_response(struct vsock_sock *vsk, .remote_port = le32_to_cpu(pkt->hdr.src_port), .reply = true, .vsk = vsk, + .net = sock_net(sk_vsock(vsk)), }; return virtio_transport_send_pkt_info(vsk, &info); @@ -1088,7 +1096,6 @@ virtio_transport_recv_listen(struct sock *sk, struct virtio_vsock_pkt *pkt, void virtio_transport_recv_pkt(struct virtio_transport *t, struct virtio_vsock_pkt *pkt) { - struct net *net = vsock_default_net(); struct sockaddr_vm src, dst; struct vsock_sock *vsk; struct sock *sk; @@ -1116,9 +1123,9 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, /* The socket must be in connected or bound table * otherwise send reset back */ - sk = vsock_find_connected_socket(&src, &dst, net); + sk = vsock_find_connected_socket(&src, &dst, pkt->net); if (!sk) { - sk = vsock_find_bound_socket(&dst, net); + sk = vsock_find_bound_socket(&dst, pkt->net); if (!sk) { (void)virtio_transport_reset_no_sock(t, pkt); goto free_pkt; From patchwork Thu Jan 16 17:24:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stefano Garzarella X-Patchwork-Id: 11337555 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DD58F92A for ; Thu, 16 Jan 2020 18:32:35 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B26FA2073A for ; Thu, 16 Jan 2020 18:32:35 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="UCN+paU/" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391839AbgAPRY6 (ORCPT ); Thu, 16 Jan 2020 12:24:58 -0500 Received: from us-smtp-2.mimecast.com ([205.139.110.61]:23267 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2391820AbgAPRY4 (ORCPT ); Thu, 16 Jan 2020 12:24:56 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1579195495; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=rJXCaurapNQ3l0l2gppkZEC8B+0gvsSxNTKA5/PN+5k=; b=UCN+paU/wQyQ6BTwPFumvC9JIx0Ubm1/ikweKV4QQMKD1tHx8Vk2Mqc52IGh6rXmeRjEYt v663eqs+wmRTbXzDuq/3M0xbFsdFs/wOfL00SesM61/4zBhU9YbFfLdKen3jwgLujlovSp h14kqg7JFrAAjhEIsmWYfbRNhCmzGIQ= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-144-D6q33l1yOH-_YtBkUiFYyg-1; Thu, 16 Jan 2020 12:24:53 -0500 X-MC-Unique: D6q33l1yOH-_YtBkUiFYyg-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 7BD63477; Thu, 16 Jan 2020 17:24:52 +0000 (UTC) Received: from steredhat.redhat.com (ovpn-117-242.ams2.redhat.com [10.36.117.242]) by smtp.corp.redhat.com (Postfix) with ESMTP id 2A6405C1D8; Thu, 16 Jan 2020 17:24:48 +0000 (UTC) From: Stefano Garzarella To: davem@davemloft.net, netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Jorgen Hansen , Jason Wang , kvm@vger.kernel.org, Stefan Hajnoczi , Stefano Garzarella , virtualization@lists.linux-foundation.org, linux-hyperv@vger.kernel.org, "Michael S. Tsirkin" , Dexuan Cui Subject: [PATCH net-next 3/3] vhost/vsock: use netns of process that opens the vhost-vsock device Date: Thu, 16 Jan 2020 18:24:28 +0100 Message-Id: <20200116172428.311437-4-sgarzare@redhat.com> In-Reply-To: <20200116172428.311437-1-sgarzare@redhat.com> References: <20200116172428.311437-1-sgarzare@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This patch assigns the network namespace of the process that opened vhost-vsock device (e.g. VMM) to the packets coming from the guest, allowing only host sockets in the same network namespace to communicate with the guest. This patch also allows having different VMs, running in different network namespace, with the same CID. Signed-off-by: Stefano Garzarella --- RFC -> v1 * used 'vsock_net_eq()' insted of 'net_eq()' --- drivers/vhost/vsock.c | 30 +++++++++++++++++++++--------- 1 file changed, 21 insertions(+), 9 deletions(-) diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index f1d39939d5e4..8b0169105559 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -40,6 +40,7 @@ static DEFINE_READ_MOSTLY_HASHTABLE(vhost_vsock_hash, 8); struct vhost_vsock { struct vhost_dev dev; struct vhost_virtqueue vqs[2]; + struct net *net; /* Link to global vhost_vsock_hash, writes use vhost_vsock_mutex */ struct hlist_node hash; @@ -61,7 +62,7 @@ static u32 vhost_transport_get_local_cid(void) /* Callers that dereference the return value must hold vhost_vsock_mutex or the * RCU read lock. */ -static struct vhost_vsock *vhost_vsock_get(u32 guest_cid) +static struct vhost_vsock *vhost_vsock_get(u32 guest_cid, struct net *net) { struct vhost_vsock *vsock; @@ -72,7 +73,7 @@ static struct vhost_vsock *vhost_vsock_get(u32 guest_cid) if (other_cid == 0) continue; - if (other_cid == guest_cid) + if (other_cid == guest_cid && vsock_net_eq(net, vsock->net)) return vsock; } @@ -245,7 +246,7 @@ vhost_transport_send_pkt(struct virtio_vsock_pkt *pkt) rcu_read_lock(); /* Find the vhost_vsock according to guest context id */ - vsock = vhost_vsock_get(le64_to_cpu(pkt->hdr.dst_cid)); + vsock = vhost_vsock_get(le64_to_cpu(pkt->hdr.dst_cid), pkt->net); if (!vsock) { rcu_read_unlock(); virtio_transport_free_pkt(pkt); @@ -277,7 +278,8 @@ vhost_transport_cancel_pkt(struct vsock_sock *vsk) rcu_read_lock(); /* Find the vhost_vsock according to guest context id */ - vsock = vhost_vsock_get(vsk->remote_addr.svm_cid); + vsock = vhost_vsock_get(vsk->remote_addr.svm_cid, + sock_net(sk_vsock(vsk))); if (!vsock) goto out; @@ -474,7 +476,7 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work) continue; } - pkt->net = vsock_default_net(); + pkt->net = vsock->net; len = pkt->len; /* Deliver to monitoring devices all received packets */ @@ -608,7 +610,14 @@ static int vhost_vsock_dev_open(struct inode *inode, struct file *file) vqs = kmalloc_array(ARRAY_SIZE(vsock->vqs), sizeof(*vqs), GFP_KERNEL); if (!vqs) { ret = -ENOMEM; - goto out; + goto out_vsock; + } + + /* Derive the network namespace from the pid opening the device */ + vsock->net = get_net_ns_by_pid(current->pid); + if (IS_ERR(vsock->net)) { + ret = PTR_ERR(vsock->net); + goto out_vqs; } vsock->guest_cid = 0; /* no CID assigned yet */ @@ -630,7 +639,9 @@ static int vhost_vsock_dev_open(struct inode *inode, struct file *file) vhost_work_init(&vsock->send_pkt_work, vhost_transport_send_pkt_work); return 0; -out: +out_vqs: + kfree(vqs); +out_vsock: vhost_vsock_free(vsock); return ret; } @@ -655,7 +666,7 @@ static void vhost_vsock_reset_orphans(struct sock *sk) */ /* If the peer is still valid, no need to reset connection */ - if (vhost_vsock_get(vsk->remote_addr.svm_cid)) + if (vhost_vsock_get(vsk->remote_addr.svm_cid, sock_net(sk))) return; /* If the close timeout is pending, let it expire. This avoids races @@ -703,6 +714,7 @@ static int vhost_vsock_dev_release(struct inode *inode, struct file *file) spin_unlock_bh(&vsock->send_pkt_list_lock); vhost_dev_cleanup(&vsock->dev); + put_net(vsock->net); kfree(vsock->dev.vqs); vhost_vsock_free(vsock); return 0; @@ -729,7 +741,7 @@ static int vhost_vsock_set_cid(struct vhost_vsock *vsock, u64 guest_cid) /* Refuse if CID is already in use */ mutex_lock(&vhost_vsock_mutex); - other = vhost_vsock_get(guest_cid); + other = vhost_vsock_get(guest_cid, vsock->net); if (other && other != vsock) { mutex_unlock(&vhost_vsock_mutex); return -EADDRINUSE;