From patchwork Thu Dec 6 06:35:48 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yongji Xie
X-Patchwork-Id: 10715311
From: elohimes@gmail.com
X-Google-Original-From: xieyongji@baidu.com
To: mst@redhat.com, marcandre.lureau@redhat.com
Cc: nixun@baidu.com, qemu-devel@nongnu.org, lilin24@baidu.com,
 zhangyu31@baidu.com, chaiwen@baidu.com, Xie Yongji
Date: Thu, 6 Dec 2018 14:35:48 +0800
Message-Id: <20181206063552.6701-3-xieyongji@baidu.com>
In-Reply-To: <20181206063552.6701-1-xieyongji@baidu.com>
References: <20181206063552.6701-1-xieyongji@baidu.com>
X-Mailer: git-send-email 2.17.1
Subject: [Qemu-devel] [PATCH for-4.0 2/6] vhost-user: Add shared memory to record inflight I/O

From: Xie Yongji

This introduces a new message, VHOST_USER_SET_VRING_INFLIGHT, to
support offering shared memory to the backend, in which it records
its inflight I/O.

With this new message, the backend is able to restart without losing
I/O, which would otherwise cause an I/O hang on block devices.

Signed-off-by: Xie Yongji
Signed-off-by: Chai Wen
Signed-off-by: Zhang Yu
---
 hw/virtio/vhost-user.c            | 69 +++++++++++++++++++++++++++++++
 hw/virtio/vhost.c                 |  8 ++++
 include/hw/virtio/vhost-backend.h |  4 ++
 include/hw/virtio/vhost-user.h    |  8 ++++
 4 files changed, 89 insertions(+)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index e09bed0e4a..4c0e64891d 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -19,6 +19,7 @@
 #include "sysemu/kvm.h"
 #include "qemu/error-report.h"
 #include "qemu/sockets.h"
+#include "qemu/memfd.h"
 #include "sysemu/cryptodev.h"
 #include "migration/migration.h"
 #include "migration/postcopy-ram.h"
@@ -52,6 +53,7 @@ enum VhostUserProtocolFeature {
     VHOST_USER_PROTOCOL_F_CONFIG = 9,
     VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD = 10,
     VHOST_USER_PROTOCOL_F_HOST_NOTIFIER = 11,
+    VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD = 12,
     VHOST_USER_PROTOCOL_F_MAX
 };
 
@@ -89,6 +91,7 @@ typedef enum VhostUserRequest {
     VHOST_USER_POSTCOPY_ADVISE = 28,
     VHOST_USER_POSTCOPY_LISTEN = 29,
     VHOST_USER_POSTCOPY_END = 30,
+    VHOST_USER_SET_VRING_INFLIGHT = 31,
     VHOST_USER_MAX
 } VhostUserRequest;
 
@@ -147,6 +150,11 @@ typedef struct VhostUserVringArea {
     uint64_t offset;
 } VhostUserVringArea;
 
+typedef struct VhostUserVringInflight {
+    uint32_t size;
+    uint32_t idx;
+} VhostUserVringInflight;
+
 typedef struct {
     VhostUserRequest request;
 
@@ -169,6 +177,7 @@ typedef union {
         VhostUserConfig config;
         VhostUserCryptoSession session;
         VhostUserVringArea area;
+        VhostUserVringInflight inflight;
 } VhostUserPayload;
 
 typedef struct VhostUserMsg {
@@ -1739,6 +1748,58 @@ static bool vhost_user_mem_section_filter(struct vhost_dev *dev,
     return result;
 }
 
+static int vhost_user_set_vring_inflight(struct vhost_dev *dev, int idx)
+{
+    struct vhost_user *u = dev->opaque;
+
+    if (!virtio_has_feature(dev->protocol_features,
+                            VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD)) {
+        return 0;
+    }
+
+    if (!u->user->inflight[idx].addr) {
+        Error *err = NULL;
+
+        u->user->inflight[idx].size = qemu_real_host_page_size;
+        u->user->inflight[idx].addr = qemu_memfd_alloc("vhost-inflight",
+                                      u->user->inflight[idx].size,
+                                      F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL,
+                                      &u->user->inflight[idx].fd, &err);
+        if (err) {
+            error_report_err(err);
+            u->user->inflight[idx].addr = NULL;
+            return -1;
+        }
+    }
+
+    VhostUserMsg msg = {
+        .hdr.request = VHOST_USER_SET_VRING_INFLIGHT,
+        .hdr.flags = VHOST_USER_VERSION,
+        .payload.inflight.size = u->user->inflight[idx].size,
+        .payload.inflight.idx = idx,
+        .hdr.size = sizeof(msg.payload.inflight),
+    };
+
+    if (vhost_user_write(dev, &msg, &u->user->inflight[idx].fd, 1) < 0) {
+        return -1;
+    }
+
+    return 0;
+}
+
+void vhost_user_inflight_reset(VhostUserState *user)
+{
+    int i;
+
+    for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
+        if (!user->inflight[i].addr) {
+            continue;
+        }
+
+        memset(user->inflight[i].addr, 0, user->inflight[i].size);
+    }
+}
+
 VhostUserState *vhost_user_init(void)
 {
     VhostUserState *user = g_new0(struct VhostUserState, 1);
@@ -1756,6 +1817,13 @@ void vhost_user_cleanup(VhostUserState *user)
             munmap(user->notifier[i].addr, qemu_real_host_page_size);
             user->notifier[i].addr = NULL;
         }
+
+        if (user->inflight[i].addr) {
+            munmap(user->inflight[i].addr, user->inflight[i].size);
+            user->inflight[i].addr = NULL;
+            close(user->inflight[i].fd);
+            user->inflight[i].fd = -1;
+        }
     }
 }
 
@@ -1790,4 +1858,5 @@ const VhostOps user_ops = {
         .vhost_crypto_create_session = vhost_user_crypto_create_session,
         .vhost_crypto_close_session = vhost_user_crypto_close_session,
         .vhost_backend_mem_section_filter = vhost_user_mem_section_filter,
+        .vhost_set_vring_inflight = vhost_user_set_vring_inflight,
 };
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 569c4053ea..2ca7b4e841 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -973,6 +973,14 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
         return -errno;
     }
 
+    if (dev->vhost_ops->vhost_set_vring_inflight) {
+        r = dev->vhost_ops->vhost_set_vring_inflight(dev, vhost_vq_index);
+        if (r) {
+            VHOST_OPS_DEBUG("vhost_set_vring_inflight failed");
+            return -errno;
+        }
+    }
+
     state.num = virtio_queue_get_last_avail_idx(vdev, idx);
     r = dev->vhost_ops->vhost_set_vring_base(dev, &state);
     if (r) {
diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
index 81283ec50f..8110e09089 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -104,6 +104,9 @@ typedef int (*vhost_crypto_close_session_op)(struct vhost_dev *dev,
 typedef bool (*vhost_backend_mem_section_filter_op)(struct vhost_dev *dev,
                                                 MemoryRegionSection *section);
 
+typedef int (*vhost_set_vring_inflight_op)(struct vhost_dev *dev,
+                                           int idx);
+
 typedef struct VhostOps {
     VhostBackendType backend_type;
     vhost_backend_init vhost_backend_init;
@@ -142,6 +145,7 @@ typedef struct VhostOps {
     vhost_crypto_create_session_op vhost_crypto_create_session;
     vhost_crypto_close_session_op vhost_crypto_close_session;
     vhost_backend_mem_section_filter_op vhost_backend_mem_section_filter;
+    vhost_set_vring_inflight_op vhost_set_vring_inflight;
 } VhostOps;
 
 extern const VhostOps user_ops;
diff --git a/include/hw/virtio/vhost-user.h b/include/hw/virtio/vhost-user.h
index fd660393a0..ff13433153 100644
--- a/include/hw/virtio/vhost-user.h
+++ b/include/hw/virtio/vhost-user.h
@@ -17,11 +17,19 @@ typedef struct VhostUserHostNotifier {
     bool set;
 } VhostUserHostNotifier;
 
+typedef struct VhostUserInflight {
+    void *addr;
+    uint32_t size;
+    int fd;
+} VhostUserInflight;
+
 typedef struct VhostUserState {
     CharBackend *chr;
     VhostUserHostNotifier notifier[VIRTIO_QUEUE_MAX];
+    VhostUserInflight inflight[VIRTIO_QUEUE_MAX];
 } VhostUserState;
 
+void vhost_user_inflight_reset(VhostUserState *user);
 VhostUserState *vhost_user_init(void);
 void vhost_user_cleanup(VhostUserState *user);
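
For readers unfamiliar with sealed memfds, here is a minimal sketch of the kind of region this patch hands to the backend. It assumes Linux with a glibc that exposes memfd_create() (2.27 or later). It only illustrates the underlying system calls and is not QEMU's qemu_memfd_alloc() implementation. The names inflight_region, inflight_region_alloc(), inflight_region_map() and inflight_region_free() are hypothetical and appear nowhere in QEMU or in this series. What a backend actually stores in the region is not defined by this patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE           /* for memfd_create() and F_ADD_SEALS */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for VhostUserInflight; not a QEMU type. */
struct inflight_region {
    void *addr;
    uint32_t size;
    int fd;
};

/* Create a sealed, fixed-size memfd and map it shared. Returns 0 or -1. */
static int inflight_region_alloc(struct inflight_region *r, uint32_t size)
{
    void *addr;
    int fd;

    assert(r != NULL && size > 0);

    fd = memfd_create("vhost-inflight", MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd < 0) {
        return -1;
    }

    /* Size the file first; the seals added next forbid any later resize. */
    if (ftruncate(fd, size) < 0 ||
        fcntl(fd, F_ADD_SEALS,
              F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL) < 0) {
        close(fd);
        return -1;
    }

    addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        close(fd);
        return -1;
    }

    r->addr = addr;
    r->size = size;
    r->fd = fd;
    return 0;
}

/* What a backend might do with the fd and size it receives (assumption). */
static void *inflight_region_map(int fd, uint32_t size)
{
    void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    return addr == MAP_FAILED ? NULL : addr;
}

/* Mirror of the cleanup order used in vhost_user_cleanup() above. */
static void inflight_region_free(struct inflight_region *r)
{
    if (r->addr) {
        munmap(r->addr, r->size);
        r->addr = NULL;
        close(r->fd);
        r->fd = -1;
    }
}
```

Both mappings are MAP_SHARED views of the same file, so bytes written through one are visible through the other. That property is presumably what lets the recorded state outlive a backend process. The GROW and SHRINK seals keep the file at the size announced in the message.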