From patchwork Fri Nov 12 05:10:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 12616153 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F08B4C433F5 for ; Fri, 12 Nov 2021 05:42:26 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 69AB460D07 for ; Fri, 12 Nov 2021 05:42:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 69AB460D07 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:48286 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mlPKT-0000Uj-4E for qemu-devel@archiver.kernel.org; Fri, 12 Nov 2021 00:42:25 -0500 Received: from eggs.gnu.org ([209.51.188.92]:57966) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mlPHa-00067K-Gt for qemu-devel@nongnu.org; Fri, 12 Nov 2021 00:39:26 -0500 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:30883) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mlPHW-0005O4-5O for qemu-devel@nongnu.org; Fri, 12 Nov 2021 00:39:25 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1636695560; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: 
in-reply-to:in-reply-to:references:references; bh=L3GWBoGQqRAbvohkFfy5xKBuX91uf/+PZBXIeeOsce4=; b=EA3+5c5vAIX5M0rz+fekEGMLBlFjkjJ1QrmZ7SbF6Bt1HmICtTkhMvv2EovDCLduPJO0pe UqIOL/kHIwt5GtH6gpWki/2JzP47tmAqYfODNQF5ZA/AVJuaVjfjcEZEsOwEEvEb/MESc9 B4Ra6+wbnoYDkb3a5AkWEXlQ5N93kQI= Received: from mail-ua1-f71.google.com (mail-ua1-f71.google.com [209.85.222.71]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-401-igNFt-O3OW-rNHpTLUdg8w-1; Fri, 12 Nov 2021 00:39:19 -0500 X-MC-Unique: igNFt-O3OW-rNHpTLUdg8w-1 Received: by mail-ua1-f71.google.com with SMTP id n10-20020ab013ca000000b002cfd6ab0ba5so4201240uae.1 for ; Thu, 11 Nov 2021 21:39:19 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=L3GWBoGQqRAbvohkFfy5xKBuX91uf/+PZBXIeeOsce4=; b=bJ3waCnb//866AetI8L0ZeaekqVapY8T7NTse6gDZPtiRQDeEWLew2paBWPu4M1VlS yJ5dJFUuxG2C74bhbrcx2A+Xzs7bZgkzNScZRA/kKcBVLlXysz4Fya8okXs9sUMyTZ5L MX8uqR+7BjHHklSl+RegnIbo0gEu1+OOApn2WNFw5mfw5GBprjP2am4ubY+noShm1aDM TCQesqZ2vtIUTN35U9rNoK7HyvLjmlB2msLwnjBuKSgIS3HmpazCYT2UFXUBpyB0vPSD PTzQMadlio131JvwxlIZyxVdUkzDvDS6goc8F94cTtr/w9mOOVQgRtv0uk/eAKESwnfs CG0g== X-Gm-Message-State: AOAM5306dm3Zfp0SBdJQ80BZqB1EXH4FVDOm4hu4k/hDj7m0KrTf+r+8 dPbjPwXIFiKqPUqqtIxOY7mD/C/7KcyZ+2Z9QXSMMYm8tF3ecX5K4ZYIBmC0JOyqwgBCE+HM0g+ rgEEXAOlIQDBU24U= X-Received: by 2002:a67:fd90:: with SMTP id k16mr6246540vsq.39.1636695558462; Thu, 11 Nov 2021 21:39:18 -0800 (PST) X-Google-Smtp-Source: ABdhPJxpiCzWAYCrV/Lmq+II17XEeHwEBYYEMcfrB1eIWubkjR2mtP656Xg3UBFAUVXmkHKifbNxcA== X-Received: by 2002:a67:fd90:: with SMTP id k16mr6246517vsq.39.1636695558250; Thu, 11 Nov 2021 21:39:18 -0800 (PST) Received: from LeoBras.redhat.com ([2804:431:c7f0:7e14:3b94:fb27:f0ad:a824]) by smtp.gmail.com with ESMTPSA id r2sm1465280vsk.28.2021.11.11.21.39.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 11 Nov 2021 
21:39:17 -0800 (PST) From: Leonardo Bras To: =?utf-8?q?Daniel_P=2E_Berrang=C3=A9?= , Juan Quintela , "Dr. David Alan Gilbert" , Eric Blake , Markus Armbruster Subject: [PATCH v5 1/6] QIOChannel: Add io_writev_zerocopy & io_flush_zerocopy callbacks Date: Fri, 12 Nov 2021 02:10:36 -0300 Message-Id: <20211112051040.923746-2-leobras@redhat.com> X-Mailer: git-send-email 2.33.1 In-Reply-To: <20211112051040.923746-1-leobras@redhat.com> References: <20211112051040.923746-1-leobras@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=leobras@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=216.205.24.124; envelope-from=leobras@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -34 X-Spam_score: -3.5 X-Spam_bar: --- X-Spam_report: (-3.5 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.7, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Leonardo Bras , qemu-devel@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel"

Add io_writev_zerocopy and io_flush_zerocopy as optional callbacks to QIOChannelClass, allowing subclasses to implement zerocopy writes.

How to use them:
- Write data using qio_channel_writev_zerocopy(),
- Wait for write completion with qio_channel_flush_zerocopy().

Notes:
As some zerocopy implementations work asynchronously, it's recommended to keep the write buffer untouched until qio_channel_flush_zerocopy() returns, to avoid the risk of sending an updated buffer instead of the contents it had when the write was issued.

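To illustrate the note above, here is a toy sketch. It is not QEMU code: toy_channel, toy_write_zerocopy() and toy_flush() are made-up names, and the "channel" simply keeps a pointer to the caller's buffer and reads it at flush time, roughly the way an asynchronous zerocopy send might. Changing the buffer between the write and the flush changes what ends up on the wire.

```c
#include <stddef.h>
#include <string.h>

#define TOY_WIRE_MAX 16

/* Toy stand-in for a channel; every name here is hypothetical. */
struct toy_channel {
    const char *pending;        /* borrowed pointer, not a copy */
    size_t pending_len;
    char wire[TOY_WIRE_MAX];    /* what the "peer" would receive */
};

/* Queue a buffer without copying it. Returns 0 on success, -1 if too big. */
static int toy_write_zerocopy(struct toy_channel *c, const char *buf,
                              size_t len)
{
    if (len > TOY_WIRE_MAX) {
        return -1;
    }
    c->pending = buf;
    c->pending_len = len;
    return 0;
}

/* "Send" whatever the queued buffer contains at this moment. */
static void toy_flush(struct toy_channel *c)
{
    if (c->pending) {
        memcpy(c->wire, c->pending, c->pending_len);
        c->pending = NULL;
        c->pending_len = 0;
    }
}
```

If the caller overwrites the buffer after toy_write_zerocopy() but before toy_flush(), the overwritten contents are what gets "sent". Flushing first, as recommended, avoids that.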
As the new callbacks are optional, if a subclass does not implement them, then: - io_writev_zerocopy will return -1, - io_flush_zerocopy will return 0 without changing anything. Also, some functions like qio_channel_writev_full_all() were adapted to receive a flag parameter. That allows shared code between zerocopy and non-zerocopy writev. Signed-off-by: Leonardo Bras --- include/io/channel.h | 93 ++++++++++++++++++++++++++++++++++++++------ io/channel.c | 65 +++++++++++++++++++++++++------ 2 files changed, 135 insertions(+), 23 deletions(-) diff --git a/include/io/channel.h b/include/io/channel.h index 88988979f8..a19c09bb84 100644 --- a/include/io/channel.h +++ b/include/io/channel.h @@ -32,12 +32,15 @@ OBJECT_DECLARE_TYPE(QIOChannel, QIOChannelClass, #define QIO_CHANNEL_ERR_BLOCK -2 +#define QIO_CHANNEL_WRITE_FLAG_ZEROCOPY 0x1 + typedef enum QIOChannelFeature QIOChannelFeature; enum QIOChannelFeature { QIO_CHANNEL_FEATURE_FD_PASS, QIO_CHANNEL_FEATURE_SHUTDOWN, QIO_CHANNEL_FEATURE_LISTEN, + QIO_CHANNEL_FEATURE_WRITE_ZEROCOPY, }; @@ -136,6 +139,12 @@ struct QIOChannelClass { IOHandler *io_read, IOHandler *io_write, void *opaque); + ssize_t (*io_writev_zerocopy)(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + Error **errp); + int (*io_flush_zerocopy)(QIOChannel *ioc, + Error **errp); }; /* General I/O handling functions */ @@ -321,10 +330,11 @@ int qio_channel_readv_all(QIOChannel *ioc, /** - * qio_channel_writev_all: + * qio_channel_writev_all_flags: * @ioc: the channel object * @iov: the array of memory regions to write data from * @niov: the length of the @iov array + * @flags: write flags (QIO_CHANNEL_WRITE_FLAG_*) * @errp: pointer to a NULL-initialized error object * * Write data to the IO channel, reading it from the @@ -337,12 +347,23 @@ int qio_channel_readv_all(QIOChannel *ioc, * to be written, yielding from the current coroutine * if required. 
* + * If QIO_CHANNEL_WRITE_FLAG_ZEROCOPY is passed in flags, + * instead of waiting for all requested data to be written, + * this function will wait until it's all queued for writing. + * In this case, if the buffer gets changed between queueing and + * sending, the updated buffer will be sent. If this is not a + * desired behavior, it's suggested to call qio_channel_flush_zerocopy() + * before reusing the buffer. + * * Returns: 0 if all bytes were written, or -1 on error */ -int qio_channel_writev_all(QIOChannel *ioc, - const struct iovec *iov, - size_t niov, - Error **erp); +int qio_channel_writev_all_flags(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + int flags, + Error **errp); +#define qio_channel_writev_all(ioc, iov, niov, errp) \ + qio_channel_writev_all_flags(ioc, iov, niov, 0, errp) /** * qio_channel_readv: @@ -831,12 +852,13 @@ int qio_channel_readv_full_all(QIOChannel *ioc, Error **errp); /** - * qio_channel_writev_full_all: + * qio_channel_writev_full_all_flags: * @ioc: the channel object * @iov: the array of memory regions to write data from * @niov: the length of the @iov array * @fds: an array of file handles to send * @nfds: number of file handles in @fds + * @flags: write flags (QIO_CHANNEL_WRITE_FLAG_*) * @errp: pointer to a NULL-initialized error object * * @@ -846,13 +868,62 @@ int qio_channel_readv_full_all(QIOChannel *ioc, * to be written, yielding from the current coroutine * if required. * + * If QIO_CHANNEL_WRITE_FLAG_ZEROCOPY is passed in flags, + * instead of waiting for all requested data to be written, + * this function will wait until it's all queued for writing. + * In this case, if the buffer gets changed between queueing and + * sending, the updated buffer will be sent. If this is not a + * desired behavior, it's suggested to call qio_channel_flush_zerocopy() + * before reusing the buffer. 
+ * * Returns: 0 if all bytes were written, or -1 on error */ -int qio_channel_writev_full_all(QIOChannel *ioc, - const struct iovec *iov, - size_t niov, - int *fds, size_t nfds, - Error **errp); +int qio_channel_writev_full_all_flags(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + int *fds, size_t nfds, + int flags, Error **errp); +#define qio_channel_writev_full_all(ioc, iov, niov, fds, nfds, errp) \ + qio_channel_writev_full_all_flags(ioc, iov, niov, fds, nfds, 0, errp) + +/** + * qio_channel_writev_zerocopy: + * @ioc: the channel object + * @iov: the array of memory regions to write data from + * @niov: the length of the @iov array + * @errp: pointer to a NULL-initialized error object + * + * Behaves like qio_channel_writev_full_all_flags, but may write + * data asynchronously while avoiding unnecessary data copy. + * This function may return before any data is actually written, + * but should queue every buffer for writing. + * + * If at some point it's necessary to wait for all data to be + * written, use qio_channel_flush_zerocopy(). + * + * If zerocopy is not available, returns -1 and set errp. + */ + +ssize_t qio_channel_writev_zerocopy(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + Error **errp); + +/** + * qio_channel_flush_zerocopy: + * @ioc: the channel object + * @errp: pointer to a NULL-initialized error object + * + * Will block until every packet queued with + * qio_channel_writev_zerocopy() is sent, or return + * in case of any error. + * + * Returns -1 if any error is found, 0 otherwise. + * If not implemented, acts as a no-op, and returns 0. 
+ */ + +int qio_channel_flush_zerocopy(QIOChannel *ioc, + Error **errp); #endif /* QIO_CHANNEL_H */ diff --git a/io/channel.c b/io/channel.c index e8b019dc36..009da9b772 100644 --- a/io/channel.c +++ b/io/channel.c @@ -212,19 +212,21 @@ int qio_channel_readv_full_all(QIOChannel *ioc, return ret; } -int qio_channel_writev_all(QIOChannel *ioc, - const struct iovec *iov, - size_t niov, - Error **errp) +int qio_channel_writev_all_flags(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + int flags, + Error **errp) { - return qio_channel_writev_full_all(ioc, iov, niov, NULL, 0, errp); + return qio_channel_writev_full_all_flags(ioc, iov, niov, NULL, 0, flags, + errp); } -int qio_channel_writev_full_all(QIOChannel *ioc, - const struct iovec *iov, - size_t niov, - int *fds, size_t nfds, - Error **errp) +int qio_channel_writev_full_all_flags(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + int *fds, size_t nfds, + int flags, Error **errp) { int ret = -1; struct iovec *local_iov = g_new(struct iovec, niov); @@ -237,8 +239,15 @@ int qio_channel_writev_full_all(QIOChannel *ioc, while (nlocal_iov > 0) { ssize_t len; - len = qio_channel_writev_full(ioc, local_iov, nlocal_iov, fds, nfds, - errp); + + if (flags & QIO_CHANNEL_WRITE_FLAG_ZEROCOPY) { + assert(fds == NULL && nfds == 0); + len = qio_channel_writev_zerocopy(ioc, local_iov, nlocal_iov, errp); + } else { + len = qio_channel_writev_full(ioc, local_iov, nlocal_iov, fds, nfds, + errp); + } + if (len == QIO_CHANNEL_ERR_BLOCK) { if (qemu_in_coroutine()) { qio_channel_yield(ioc, G_IO_OUT); @@ -474,6 +483,38 @@ off_t qio_channel_io_seek(QIOChannel *ioc, } +ssize_t qio_channel_writev_zerocopy(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + Error **errp) +{ + QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc); + + if (!klass->io_writev_zerocopy || + !qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_WRITE_ZEROCOPY)) { + error_setg_errno(errp, EINVAL, + "Channel does not support zerocopy writev"); + 
return -1; + } + + return klass->io_writev_zerocopy(ioc, iov, niov, errp); +} + + +int qio_channel_flush_zerocopy(QIOChannel *ioc, + Error **errp) +{ + QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc); + + if (!klass->io_flush_zerocopy || + !qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_WRITE_ZEROCOPY)) { + return 0; + } + + return klass->io_flush_zerocopy(ioc, errp); +} + + static void qio_channel_restart_read(void *opaque) { QIOChannel *ioc = opaque; From patchwork Fri Nov 12 05:10:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 12616157 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CB5EDC433FE for ; Fri, 12 Nov 2021 05:42:30 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8534E60D07 for ; Fri, 12 Nov 2021 05:42:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 8534E60D07 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:48546 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mlPKX-0000hO-MM for qemu-devel@archiver.kernel.org; Fri, 12 Nov 2021 00:42:29 -0500 Received: from eggs.gnu.org ([209.51.188.92]:57986) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mlPHc-00069T-NK for qemu-devel@nongnu.org; Fri, 12 Nov 2021 00:39:28 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:52700) by eggs.gnu.org with esmtps 
(TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mlPHX-0005OL-Cx for qemu-devel@nongnu.org; Fri, 12 Nov 2021 00:39:27 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1636695562; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Bjrh0f64PpUFOitgFXl5DU47FbzO9rtRnmHn46eI6S8=; b=NCD4wOq4FFTdSYgoCIFWEkFEsV0Dv1LIa9BF7CY8ikH0isD+xrIy4FNw3oIdc+BQ62rh2q sN98pVqZ36O2RsT1kSQL31nTFbfAslRxZ2Ywkeo4I9Aoar91ypf0tJEdr/zpl0kCymeKqH k8nu2nKor43Zj9Pe016w9hy9RUTyLfA= Received: from mail-vk1-f199.google.com (mail-vk1-f199.google.com [209.85.221.199]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-488-bz-Hu6XrOG6MnI7cgiKLIQ-1; Fri, 12 Nov 2021 00:39:21 -0500 X-MC-Unique: bz-Hu6XrOG6MnI7cgiKLIQ-1 Received: by mail-vk1-f199.google.com with SMTP id y15-20020a1f7d0f000000b002f244d4c479so3843761vkc.9 for ; Thu, 11 Nov 2021 21:39:21 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Bjrh0f64PpUFOitgFXl5DU47FbzO9rtRnmHn46eI6S8=; b=NOKjTPYSmD90FaVovjcyNGVLFRby1OVRe71o3fAVydtiS5gzywyMLqSHqCjxqDmR74 mprKqvAW5NpSakwdCC8jHpMZFrNq+/WzK6UvXN1QkW8kO8/o2dpC3E88FGigXXv/HJvq fnLkhZRvoLwHQtYnvTgGgWZ6OCmbku6bUvX86Qqsdpc6lx0YElouy81mbxq+yHlsSqDD qwcdJLqETIfcLdg50cbC/HF11CVboaJpnCha8VCENJdCam476pIdbJ3DqlCGXLPR2/dD UYFV6ViNl0A9jvMhGHH0PWYP1GpiLHAa+mcBxQiKJUJHawfOF0Lb/LCC7U1Zp+x5RSaY p3sA== X-Gm-Message-State: AOAM533ZGUKW9sOS5UGNpe1fi2/pnA2JZQDCfKzBNS+A5hCnMRXYAJ66 frY5g6ywFJ1fqt3r7NK3rKbdzIYc4t3uGBGCggm8NMLArTWJ2FlvWrwLw2mYL3eglBy1pmVmw+0 kdw/iv3NatYaJa3w= X-Received: by 2002:ab0:35cd:: with SMTP id x13mr12718305uat.46.1636695560879; Thu, 11 Nov 2021 21:39:20 -0800 (PST) 
X-Google-Smtp-Source: ABdhPJx87JJEKafitmpodLYDyDNm1IO66kUATrg8sRtsSDKch3tKyOdtwYznZCQ846VyV493eP+s+A== X-Received: by 2002:ab0:35cd:: with SMTP id x13mr12718272uat.46.1636695560699; Thu, 11 Nov 2021 21:39:20 -0800 (PST) Received: from LeoBras.redhat.com ([2804:431:c7f0:7e14:3b94:fb27:f0ad:a824]) by smtp.gmail.com with ESMTPSA id r2sm1465280vsk.28.2021.11.11.21.39.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 11 Nov 2021 21:39:20 -0800 (PST) From: Leonardo Bras To: =?utf-8?q?Daniel_P=2E_Berrang=C3=A9?= , Juan Quintela , "Dr. David Alan Gilbert" , Eric Blake , Markus Armbruster Subject: [PATCH v5 2/6] QIOChannelSocket: Add flags parameter for writing Date: Fri, 12 Nov 2021 02:10:37 -0300 Message-Id: <20211112051040.923746-3-leobras@redhat.com> X-Mailer: git-send-email 2.33.1 In-Reply-To: <20211112051040.923746-1-leobras@redhat.com> References: <20211112051040.923746-1-leobras@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=leobras@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=170.10.133.124; envelope-from=leobras@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -34 X-Spam_score: -3.5 X-Spam_bar: --- X-Spam_report: (-3.5 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.7, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Leonardo Bras , qemu-devel@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel"

Change qio_channel_socket_writev() in order to accept flags, so it's possible to selectively make use of sendmsg() flags.

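The shape of this change can be pictured with the standalone sketch below. It uses plain POSIX sockets and no QEMU types, and send_iov_flags() and send_iov() are hypothetical names chosen for illustration: a helper takes an extra flags argument and hands it to sendmsg() unchanged, while the flag-less entry point keeps its signature and passes 0.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: sends an iovec array, forwarding `flags` to sendmsg(). */
static ssize_t send_iov_flags(int fd, const struct iovec *iov, size_t niov,
                              int flags)
{
    struct msghdr msg = { 0 };
    ssize_t ret;

    msg.msg_iov = (struct iovec *)iov;  /* sendmsg() only reads the array */
    msg.msg_iovlen = niov;

    /* Retry if interrupted by a signal; other errors go to the caller. */
    do {
        ret = sendmsg(fd, &msg, flags);
    } while (ret < 0 && errno == EINTR);

    return ret;
}

/* The original entry point becomes a thin wrapper that passes no flags. */
static ssize_t send_iov(int fd, const struct iovec *iov, size_t niov)
{
    return send_iov_flags(fd, iov, niov, 0);
}
```

Existing callers keep calling the flag-less function, and only code that needs a specific sendmsg() flag has to use the helper.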
qio_channel_socket_writev() contents were moved to a helper function qio_channel_socket_writev_flags() which accepts an extra argument for flags. (This argument is passed directly to sendmsg().)

Signed-off-by: Leonardo Bras
Reviewed-by: Daniel P. Berrangé
--- io/channel-socket.c | 26 +++++++++++++++++++------- 1 file changed, 19 insertions(+), 7 deletions(-) diff --git a/io/channel-socket.c b/io/channel-socket.c index 606ec97cf7..b57a27bf91 100644 --- a/io/channel-socket.c +++ b/io/channel-socket.c @@ -520,12 +520,13 @@ static ssize_t qio_channel_socket_readv(QIOChannel *ioc, return ret; } -static ssize_t qio_channel_socket_writev(QIOChannel *ioc, - const struct iovec *iov, - size_t niov, - int *fds, - size_t nfds, - Error **errp) +static ssize_t qio_channel_socket_writev_flags(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + int *fds, + size_t nfds, + int flags, + Error **errp) { QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc); ssize_t ret; @@ -558,7 +559,7 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc, } retry: - ret = sendmsg(sioc->fd, &msg, 0); + ret = sendmsg(sioc->fd, &msg, flags); if (ret <= 0) { if (errno == EAGAIN) { return QIO_CHANNEL_ERR_BLOCK; @@ -572,6 +573,17 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc, } return ret; } + +static ssize_t qio_channel_socket_writev(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + int *fds, + size_t nfds, + Error **errp) +{ + return qio_channel_socket_writev_flags(ioc, iov, niov, fds, nfds, 0, errp); +} + #else /* WIN32 */ static ssize_t qio_channel_socket_readv(QIOChannel *ioc, const struct iovec *iov, From patchwork Fri Nov 12 05:10:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 12616161 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org 
[198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D8D81C433EF for ; Fri, 12 Nov 2021 05:44:57 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7DAB060F70 for ; Fri, 12 Nov 2021 05:44:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 7DAB060F70 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:55258 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mlPMu-0005GF-J7 for qemu-devel@archiver.kernel.org; Fri, 12 Nov 2021 00:44:56 -0500 Received: from eggs.gnu.org ([209.51.188.92]:58020) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mlPHf-0006F5-2p for qemu-devel@nongnu.org; Fri, 12 Nov 2021 00:39:31 -0500 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:26850) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mlPHc-0005Ov-P3 for qemu-devel@nongnu.org; Fri, 12 Nov 2021 00:39:30 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1636695568; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=QEZ7KWVhVlN9ynTuHmuv37z/vXH4vHDkNj5XVpkctIQ=; b=PxJKFPdEBxW5eauukQfUDpaCtf+QP+0d3f09QRYTsXrkfFPU8G7zNwb0SbVs8NzBNU63VM q7FFz8gwsRVSoT8rX1wvc02I9S0SEuQL3+GIvtqLBY+QrjtaHV1AJeal0bmbYebHEg2NhO oy0F49IjgKP20RgXw0gb4LhQtQupEfw= Received: from mail-vk1-f200.google.com (mail-vk1-f200.google.com [209.85.221.200]) (Using TLS) by relay.mimecast.com with 
ESMTP id us-mta-507-zzlE4e1zPH-NL_I05Kg1og-1; Fri, 12 Nov 2021 00:39:24 -0500 X-MC-Unique: zzlE4e1zPH-NL_I05Kg1og-1 Received: by mail-vk1-f200.google.com with SMTP id q3-20020a056122116300b002faa0b9026fso3820260vko.18 for ; Thu, 11 Nov 2021 21:39:23 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=QEZ7KWVhVlN9ynTuHmuv37z/vXH4vHDkNj5XVpkctIQ=; b=Xuh5giifqNT6AL2Ugysh0jj3W5nfymnljqtKhsjoF0WGWBvv8cx4gpvFi/tmqz/iWM ru0yKKeLZhmI6HRGo3XPPLVzTa3HWKWQgxrCd/p2MlbKbXCQVgmRn30uwNbFok7D5JZc nxj/yS3FVTx+b2YH+d+7lbzy02nHo28pRDNZWT5OLN1XHvbsGkIxLiyTom596Hp+8PNJ kuFOIAQXSJWld3Um624bfMpzgKu32BTx11P46BjpBzU4AFINpaIuyMnz+UD4xwT4bvse O5CIBDtq05Q7icrwYFSPK7rUGf49fxdYDftBegxUf6ybS0ZmkQQPGL+v8wBacl7yVGUX 4KaQ== X-Gm-Message-State: AOAM533XT+M3v92PkQ3XPZCeT9VTihdJxE3U/m0/NtexBVQ95Bjvxfdt /Ss2b6jQKg9X01DWYiTm3k/rmmKjZ+7Gxu02+ra7rGLp3P493MUh/9ZG0COD+2EQ/LqEHVxC9gK lNK7WuAn3oZPQzFM= X-Received: by 2002:a05:6102:953:: with SMTP id a19mr6750973vsi.28.1636695563380; Thu, 11 Nov 2021 21:39:23 -0800 (PST) X-Google-Smtp-Source: ABdhPJwl1OkVaWtxGwMAD27GoGXlg2bZfIOG7C3pufeQtJKb2GBPlTIjrzus4EQ2zoQ1oA9WCRVygg== X-Received: by 2002:a05:6102:953:: with SMTP id a19mr6750946vsi.28.1636695563153; Thu, 11 Nov 2021 21:39:23 -0800 (PST) Received: from LeoBras.redhat.com ([2804:431:c7f0:7e14:3b94:fb27:f0ad:a824]) by smtp.gmail.com with ESMTPSA id r2sm1465280vsk.28.2021.11.11.21.39.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 11 Nov 2021 21:39:22 -0800 (PST) From: Leonardo Bras To: =?utf-8?q?Daniel_P=2E_Berrang=C3=A9?= , Juan Quintela , "Dr. 
David Alan Gilbert" , Eric Blake , Markus Armbruster Subject: [PATCH v5 3/6] QIOChannelSocket: Implement io_writev_zerocopy & io_flush_zerocopy for CONFIG_LINUX Date: Fri, 12 Nov 2021 02:10:38 -0300 Message-Id: <20211112051040.923746-4-leobras@redhat.com> X-Mailer: git-send-email 2.33.1 In-Reply-To: <20211112051040.923746-1-leobras@redhat.com> References: <20211112051040.923746-1-leobras@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=leobras@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=216.205.24.124; envelope-from=leobras@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -34 X-Spam_score: -3.5 X-Spam_bar: --- X-Spam_report: (-3.5 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.7, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Leonardo Bras , qemu-devel@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel"

For CONFIG_LINUX, implement the new optional callbacks io_writev_zerocopy and io_flush_zerocopy on QIOChannelSocket, but enable them only when the MSG_ZEROCOPY feature is available in the host kernel, which is checked in qio_channel_socket_connect_sync().

qio_channel_socket_flush_zerocopy() was implemented by counting how many times sendmsg(...,MSG_ZEROCOPY) was successfully called, and then reading the socket's error queue, in order to find how many of them finished sending. Flush will loop until those counters are the same, or until some error occurs.

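The counting described above can be sketched in isolation, without any socket I/O. The names below (zc_counters, zc_range_len(), zc_note_completion(), zc_flush_done()) are hypothetical and only mirror the idea. Each error-queue notification reports an inclusive range of completed sendmsg() calls in ee_info (low) and ee_data (high) of struct sock_extended_err, so one notification can account for several calls. Both fields are 32 bits wide, so the sketch assumes the range length should be computed with 32-bit unsigned arithmetic to stay correct if the kernel's counter wraps.

```c
#include <stdint.h>

/* Hypothetical bookkeeping, similar in spirit to zerocopy_queued/sent. */
struct zc_counters {
    uint64_t queued;  /* successful sendmsg(..., MSG_ZEROCOPY) calls */
    uint64_t sent;    /* calls reported as completed via the error queue */
};

/*
 * Number of sendmsg() calls covered by one notification whose inclusive
 * range is [lo, hi] (ee_info and ee_data). Unsigned 32-bit subtraction
 * keeps the result correct across a counter wrap.
 */
static uint32_t zc_range_len(uint32_t lo, uint32_t hi)
{
    return hi - lo + 1u;
}

/* Account for one notification read from the error queue. */
static void zc_note_completion(struct zc_counters *c, uint32_t lo, uint32_t hi)
{
    c->sent += zc_range_len(lo, hi);
}

/* Flush may stop once every queued call has been reported as completed. */
static int zc_flush_done(const struct zc_counters *c)
{
    return c->sent >= c->queued;
}
```

A flush loop built on this would keep reading notifications and calling zc_note_completion() until zc_flush_done() returns non-zero or an error is found.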
A new function qio_channel_socket_poll() was also created in order to avoid busy-looping recvmsg() in qio_channel_socket_flush_zerocopy() while waiting for updates in the socket's error queue.

Notes on using writev_zerocopy():
1: Buffer
- As MSG_ZEROCOPY tells the kernel to use the same user buffer to avoid copying, some caution is necessary to avoid overwriting any buffer before it's sent. If this happens, a newer version of the buffer may be sent instead.
- If this is a problem, it's recommended to call flush_zerocopy() before freeing or re-using the buffer.

2: Locked memory
- When using MSG_ZEROCOPY, the buffer memory will be locked after it is queued, and unlocked after it's sent.
- Depending on the size of each buffer, and how often it's sent, it may require a larger amount of locked memory than is usually available to a non-root user.
- If the required amount of locked memory is not available, writev_zerocopy will return an error, which can abort an operation like migration.
- Because of this, when user code wants to add zerocopy as a feature, it requires a mechanism to disable it, so it can still be accessible to less privileged users.

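As a possible complement to the locked-memory notes above, calling code could look at the process's locked-memory limit before offering zerocopy. The sketch below is only a heuristic under stated assumptions and is not part of this series: memlock_limit_allows() is a hypothetical helper, `needed` is whatever estimate the caller has for memory in flight, and the kernel's real accounting for MSG_ZEROCOPY may not match a simple comparison against RLIMIT_MEMLOCK.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/resource.h>

/*
 * Hypothetical check: returns true if the soft RLIMIT_MEMLOCK is unlimited
 * or at least `needed` bytes. A false result only suggests that zerocopy
 * writes are likely to fail for lack of lockable memory.
 */
static bool memlock_limit_allows(size_t needed)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0) {
        return false;   /* limit unknown: be conservative */
    }
    if (rl.rlim_cur == RLIM_INFINITY) {
        return true;
    }
    return (rlim_t)needed <= rl.rlim_cur;
}
```

Even with such a check, a switch to turn zerocopy off remains necessary, since the write can still fail at run time.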
Signed-off-by: Leonardo Bras --- include/io/channel-socket.h | 2 + include/io/channel.h | 1 + io/channel-socket.c | 150 +++++++++++++++++++++++++++++++++++- 3 files changed, 150 insertions(+), 3 deletions(-) diff --git a/include/io/channel-socket.h b/include/io/channel-socket.h index e747e63514..81d04baa4c 100644 --- a/include/io/channel-socket.h +++ b/include/io/channel-socket.h @@ -47,6 +47,8 @@ struct QIOChannelSocket { socklen_t localAddrLen; struct sockaddr_storage remoteAddr; socklen_t remoteAddrLen; + ssize_t zerocopy_queued; + ssize_t zerocopy_sent; }; diff --git a/include/io/channel.h b/include/io/channel.h index a19c09bb84..051fff4197 100644 --- a/include/io/channel.h +++ b/include/io/channel.h @@ -31,6 +31,7 @@ OBJECT_DECLARE_TYPE(QIOChannel, QIOChannelClass, #define QIO_CHANNEL_ERR_BLOCK -2 +#define QIO_CHANNEL_ERR_NOBUFS -3 #define QIO_CHANNEL_WRITE_FLAG_ZEROCOPY 0x1 diff --git a/io/channel-socket.c b/io/channel-socket.c index b57a27bf91..c724b849ad 100644 --- a/io/channel-socket.c +++ b/io/channel-socket.c @@ -26,6 +26,10 @@ #include "io/channel-watch.h" #include "trace.h" #include "qapi/clone-visitor.h" +#ifdef CONFIG_LINUX +#include +#include +#endif #define SOCKET_MAX_FDS 16 @@ -55,6 +59,8 @@ qio_channel_socket_new(void) sioc = QIO_CHANNEL_SOCKET(object_new(TYPE_QIO_CHANNEL_SOCKET)); sioc->fd = -1; + sioc->zerocopy_queued = 0; + sioc->zerocopy_sent = 0; ioc = QIO_CHANNEL(sioc); qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN); @@ -140,6 +146,7 @@ int qio_channel_socket_connect_sync(QIOChannelSocket *ioc, Error **errp) { int fd; + int ret, v = 1; trace_qio_channel_socket_connect_sync(ioc, addr); fd = socket_connect(addr, errp); @@ -154,6 +161,15 @@ int qio_channel_socket_connect_sync(QIOChannelSocket *ioc, return -1; } +#ifdef CONFIG_LINUX + ret = qemu_setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &v, sizeof(v)); + if (ret == 0) { + /* Zerocopy available on host */ + qio_channel_set_feature(QIO_CHANNEL(ioc), + 
QIO_CHANNEL_FEATURE_WRITE_ZEROCOPY); + } +#endif + return 0; } @@ -561,12 +577,15 @@ static ssize_t qio_channel_socket_writev_flags(QIOChannel *ioc, retry: ret = sendmsg(sioc->fd, &msg, flags); if (ret <= 0) { - if (errno == EAGAIN) { + switch (errno) { + case EAGAIN: return QIO_CHANNEL_ERR_BLOCK; - } - if (errno == EINTR) { + case EINTR: goto retry; + case ENOBUFS: + return QIO_CHANNEL_ERR_NOBUFS; } + error_setg_errno(errp, errno, "Unable to write to socket"); return -1; @@ -670,6 +689,127 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc, } #endif /* WIN32 */ + +#ifdef CONFIG_LINUX + +static int qio_channel_socket_poll(QIOChannelSocket *sioc, bool zerocopy, + Error **errp) +{ + struct pollfd pfd; + int ret; + + pfd.fd = sioc->fd; + pfd.events = 0; + + retry: + ret = poll(&pfd, 1, -1); + if (ret < 0) { + switch (errno) { + case EAGAIN: + case EINTR: + goto retry; + default: + error_setg_errno(errp, errno, + "Poll error"); + return ret; + } + } + + if (pfd.revents & (POLLHUP | POLLNVAL)) { + error_setg(errp, "Poll error: Invalid or disconnected fd"); + return -1; + } + + if (!zerocopy && (pfd.revents & POLLERR)) { + error_setg(errp, "Poll error: Errors present in errqueue"); + return -1; + } + + return ret; +} + +static ssize_t qio_channel_socket_writev_zerocopy(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + Error **errp) +{ + QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc); + ssize_t ret; + + ret = qio_channel_socket_writev_flags(ioc, iov, niov, NULL, 0, + MSG_ZEROCOPY, errp); + if (ret == QIO_CHANNEL_ERR_NOBUFS) { + error_setg_errno(errp, errno, + "Process can't lock enough memory for using MSG_ZEROCOPY"); + return -1; + } + + sioc->zerocopy_queued++; + return ret; +} + +static int qio_channel_socket_flush_zerocopy(QIOChannel *ioc, + Error **errp) +{ + QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc); + struct msghdr msg = {}; + struct sock_extended_err *serr; + struct cmsghdr *cm; + char control[CMSG_SPACE(sizeof(*serr))]; + int ret; + + 
msg.msg_control = control; + msg.msg_controllen = sizeof(control); + memset(control, 0, sizeof(control)); + + while (sioc->zerocopy_sent < sioc->zerocopy_queued) { + ret = recvmsg(sioc->fd, &msg, MSG_ERRQUEUE); + if (ret < 0) { + switch (errno) { + case EAGAIN: + /* Nothing on errqueue, wait until something is available */ + ret = qio_channel_socket_poll(sioc, true, errp); + if (ret < 0) { + return -1; + } + continue; + case EINTR: + continue; + default: + error_setg_errno(errp, errno, + "Unable to read errqueue"); + return -1; + } + } + + cm = CMSG_FIRSTHDR(&msg); + if (cm->cmsg_level != SOL_IP && + cm->cmsg_type != IP_RECVERR) { + error_setg_errno(errp, EPROTOTYPE, + "Wrong cmsg in errqueue"); + return -1; + } + + serr = (void *) CMSG_DATA(cm); + if (serr->ee_errno != SO_EE_ORIGIN_NONE) { + error_setg_errno(errp, serr->ee_errno, + "Error on socket"); + return -1; + } + if (serr->ee_origin != SO_EE_ORIGIN_ZEROCOPY) { + error_setg_errno(errp, serr->ee_origin, + "Error not from zerocopy"); + return -1; + } + + /* No errors, count successfully finished sendmsg()*/ + sioc->zerocopy_sent += serr->ee_data - serr->ee_info + 1; + } + return 0; +} + +#endif /* CONFIG_LINUX */ + static int qio_channel_socket_set_blocking(QIOChannel *ioc, bool enabled, @@ -799,6 +939,10 @@ static void qio_channel_socket_class_init(ObjectClass *klass, ioc_klass->io_set_delay = qio_channel_socket_set_delay; ioc_klass->io_create_watch = qio_channel_socket_create_watch; ioc_klass->io_set_aio_fd_handler = qio_channel_socket_set_aio_fd_handler; +#ifdef CONFIG_LINUX + ioc_klass->io_writev_zerocopy = qio_channel_socket_writev_zerocopy; + ioc_klass->io_flush_zerocopy = qio_channel_socket_flush_zerocopy; +#endif } static const TypeInfo qio_channel_socket_info = { From patchwork Fri Nov 12 05:10:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 12616163 Return-Path: X-Spam-Checker-Version: 
SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
From: Leonardo Bras
To: Daniel P. Berrangé , Juan Quintela , "Dr. 
David Alan Gilbert" , Eric Blake , Markus Armbruster
Subject: [PATCH v5 4/6] migration: Add zerocopy parameter for QMP/HMP for Linux
Date: Fri, 12 Nov 2021 02:10:39 -0300
Message-Id: <20211112051040.923746-5-leobras@redhat.com>
X-Mailer: git-send-email 2.33.1
In-Reply-To: <20211112051040.923746-1-leobras@redhat.com>
References: <20211112051040.923746-1-leobras@redhat.com>
MIME-Version: 1.0
Cc: Leonardo Bras , qemu-devel@nongnu.org

Add a property that allows zerocopy migration of memory pages, and also include a helper function, migrate_use_zerocopy(), to check whether it is enabled.

No code is introduced to actually do the migration, but it allows future implementations to enable or disable this feature.

On non-Linux builds this parameter is compiled out.

Signed-off-by: Leonardo Bras --- qapi/migration.json | 18 ++++++++++++++++++ migration/migration.h | 5 +++++ migration/migration.c | 32 ++++++++++++++++++++++++++++++++ migration/multifd.c | 17 +++++++++-------- migration/socket.c | 5 +++++ monitor/hmp-cmds.c | 6 ++++++ 6 files changed, 75 insertions(+), 8 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index bbfd48cf0b..9534c299d7 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -730,6 +730,11 @@ # will consume more CPU. # Defaults to 1. (Since 5.0) # +# @zerocopy: Controls behavior on sending memory pages on migration. +# When true, enables a zerocopy mechanism for sending memory +# pages, if host supports it. +# Defaults to false. (Since 6.2) +# # @block-bitmap-mapping: Maps block nodes and bitmaps on them to # aliases for the purpose of dirty bitmap migration. Such # aliases may for example be the corresponding names on the @@ -769,6 +774,7 @@ 'xbzrle-cache-size', 'max-postcopy-bandwidth', 'max-cpu-throttle', 'multifd-compression', 'multifd-zlib-level' ,'multifd-zstd-level', + { 'name': 'zerocopy', 'if' : 'CONFIG_LINUX'}, 'block-bitmap-mapping' ] } ## @@ -895,6 +901,11 @@ # will consume more CPU. # Defaults to 1. (Since 5.0) # +# @zerocopy: Controls behavior on sending memory pages on migration. +# When true, enables a zerocopy mechanism for sending memory +# pages, if host supports it. +# Defaults to false. (Since 6.2) +# # @block-bitmap-mapping: Maps block nodes and bitmaps on them to # aliases for the purpose of dirty bitmap migration. Such # aliases may for example be the corresponding names on the @@ -949,6 +960,7 @@ '*multifd-compression': 'MultiFDCompression', '*multifd-zlib-level': 'uint8', '*multifd-zstd-level': 'uint8', + '*zerocopy': { 'type': 'bool', 'if': 'CONFIG_LINUX' }, '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ] } } ## @@ -1095,6 +1107,11 @@ # will consume more CPU. # Defaults to 1. 
(Since 5.0) # +# @zerocopy: Controls behavior on sending memory pages on migration. +# When true, enables a zerocopy mechanism for sending memory +# pages, if host supports it. +# Defaults to false. (Since 6.2) +# # @block-bitmap-mapping: Maps block nodes and bitmaps on them to # aliases for the purpose of dirty bitmap migration. Such # aliases may for example be the corresponding names on the @@ -1147,6 +1164,7 @@ '*multifd-compression': 'MultiFDCompression', '*multifd-zlib-level': 'uint8', '*multifd-zstd-level': 'uint8', + '*zerocopy': { 'type': 'bool', 'if': 'CONFIG_LINUX' }, '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ] } } ## diff --git a/migration/migration.h b/migration/migration.h index 8130b703eb..e61ef81f26 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -339,6 +339,11 @@ MultiFDCompression migrate_multifd_compression(void); int migrate_multifd_zlib_level(void); int migrate_multifd_zstd_level(void); +#ifdef CONFIG_LINUX +int migrate_use_zerocopy(void); +#else +#define migrate_use_zerocopy() (0) +#endif int migrate_use_xbzrle(void); uint64_t migrate_xbzrle_cache_size(void); bool migrate_colo_enabled(void); diff --git a/migration/migration.c b/migration/migration.c index abaf6f9e3d..add3dabc56 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -886,6 +886,10 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) params->multifd_zlib_level = s->parameters.multifd_zlib_level; params->has_multifd_zstd_level = true; params->multifd_zstd_level = s->parameters.multifd_zstd_level; +#ifdef CONFIG_LINUX + params->has_zerocopy = true; + params->zerocopy = s->parameters.zerocopy; +#endif params->has_xbzrle_cache_size = true; params->xbzrle_cache_size = s->parameters.xbzrle_cache_size; params->has_max_postcopy_bandwidth = true; @@ -1538,6 +1542,11 @@ static void migrate_params_test_apply(MigrateSetParameters *params, if (params->has_multifd_compression) { dest->multifd_compression = params->multifd_compression; } 
+#ifdef CONFIG_LINUX + if (params->has_zerocopy) { + dest->zerocopy = params->zerocopy; + } +#endif if (params->has_xbzrle_cache_size) { dest->xbzrle_cache_size = params->xbzrle_cache_size; } @@ -1650,6 +1659,11 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) if (params->has_multifd_compression) { s->parameters.multifd_compression = params->multifd_compression; } +#ifdef CONFIG_LINUX + if (params->has_zerocopy) { + s->parameters.zerocopy = params->zerocopy; + } +#endif if (params->has_xbzrle_cache_size) { s->parameters.xbzrle_cache_size = params->xbzrle_cache_size; xbzrle_cache_resize(params->xbzrle_cache_size, errp); @@ -2540,6 +2554,17 @@ int migrate_multifd_zstd_level(void) return s->parameters.multifd_zstd_level; } +#ifdef CONFIG_LINUX +int migrate_use_zerocopy(void) +{ + MigrationState *s; + + s = migrate_get_current(); + + return s->parameters.zerocopy; +} +#endif + int migrate_use_xbzrle(void) { MigrationState *s; @@ -4190,6 +4215,10 @@ static Property migration_properties[] = { DEFINE_PROP_UINT8("multifd-zstd-level", MigrationState, parameters.multifd_zstd_level, DEFAULT_MIGRATE_MULTIFD_ZSTD_LEVEL), +#ifdef CONFIG_LINUX + DEFINE_PROP_BOOL("zerocopy", MigrationState, + parameters.zerocopy, false), +#endif DEFINE_PROP_SIZE("xbzrle-cache-size", MigrationState, parameters.xbzrle_cache_size, DEFAULT_MIGRATE_XBZRLE_CACHE_SIZE), @@ -4287,6 +4316,9 @@ static void migration_instance_init(Object *obj) params->has_multifd_compression = true; params->has_multifd_zlib_level = true; params->has_multifd_zstd_level = true; +#ifdef CONFIG_LINUX + params->has_zerocopy = true; +#endif params->has_xbzrle_cache_size = true; params->has_max_postcopy_bandwidth = true; params->has_max_cpu_throttle = true; diff --git a/migration/multifd.c b/migration/multifd.c index 7c9deb1921..ab8f0f97be 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -854,16 +854,17 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque) 
trace_multifd_new_send_channel_async(p->id); if (qio_task_propagate_error(task, &local_err)) { goto cleanup; - } else { - p->c = QIO_CHANNEL(sioc); - qio_channel_set_delay(p->c, false); - p->running = true; - if (!multifd_channel_connect(p, sioc, local_err)) { - goto cleanup; - } - return; } + p->c = QIO_CHANNEL(sioc); + qio_channel_set_delay(p->c, false); + p->running = true; + if (!multifd_channel_connect(p, sioc, local_err)) { + goto cleanup; + } + + return; + cleanup: multifd_new_send_channel_cleanup(p, sioc, local_err); } diff --git a/migration/socket.c b/migration/socket.c index 05705a32d8..e26e94aa0c 100644 --- a/migration/socket.c +++ b/migration/socket.c @@ -77,6 +77,11 @@ static void socket_outgoing_migration(QIOTask *task, } else { trace_migration_socket_outgoing_connected(data->hostname); } + + if (migrate_use_zerocopy()) { + error_setg(&err, "Zerocopy not available in migration"); + } + migration_channel_connect(data->s, sioc, data->hostname, err); object_unref(OBJECT(sioc)); } diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c index 9c91bf93e9..442679dcfa 100644 --- a/monitor/hmp-cmds.c +++ b/monitor/hmp-cmds.c @@ -1297,6 +1297,12 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) p->has_multifd_zstd_level = true; visit_type_uint8(v, param, &p->multifd_zstd_level, &err); break; +#ifdef CONFIG_LINUX + case MIGRATION_PARAMETER_ZEROCOPY: + p->has_zerocopy = true; + visit_type_bool(v, param, &p->zerocopy, &err); + break; +#endif case MIGRATION_PARAMETER_XBZRLE_CACHE_SIZE: p->has_xbzrle_cache_size = true; if (!visit_type_size(v, param, &cache_size, &err)) { From patchwork Fri Nov 12 05:10:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 12616165 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by 
smtp.lore.kernel.org (Postfix) with ESMTP id 262EAC433F5 for ; Fri, 12 Nov 2021 05:47:48 +0000 (UTC)
From: Leonardo Bras
To: Daniel P. Berrangé , Juan Quintela , "Dr. 
David Alan Gilbert" , Eric Blake , Markus Armbruster
Subject: [PATCH v5 5/6] migration: Add migrate_use_tls() helper
Date: Fri, 12 Nov 2021 02:10:40 -0300
Message-Id: <20211112051040.923746-6-leobras@redhat.com>
X-Mailer: git-send-email 2.33.1
In-Reply-To: <20211112051040.923746-1-leobras@redhat.com>
References: <20211112051040.923746-1-leobras@redhat.com>
MIME-Version: 1.0
Cc: Leonardo Bras , qemu-devel@nongnu.org

A lot of places check parameters.tls_creds in order to evaluate whether TLS is in use, and sometimes call migrate_get_current() just for that test.

Add a new helper function, migrate_use_tls(), in order to simplify testing for TLS usage.

Signed-off-by: Leonardo Bras Reviewed-by: Juan Quintela --- migration/migration.h | 1 + migration/channel.c | 6 +++--- migration/migration.c | 9 +++++++++ migration/multifd.c | 5 +---- 4 files changed, 14 insertions(+), 7 deletions(-) diff --git a/migration/migration.h b/migration/migration.h index e61ef81f26..9f38419312 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -344,6 +344,7 @@ int migrate_use_zerocopy(void); #else #define migrate_use_zerocopy() (0) #endif +int migrate_use_tls(void); int migrate_use_xbzrle(void); uint64_t migrate_xbzrle_cache_size(void); bool migrate_colo_enabled(void); diff --git a/migration/channel.c b/migration/channel.c index c4fc000a1a..1a45b75d29 100644 --- a/migration/channel.c +++ b/migration/channel.c @@ -32,16 +32,16 @@ */ void migration_channel_process_incoming(QIOChannel *ioc) { - MigrationState *s = migrate_get_current(); Error *local_err = NULL; trace_migration_set_incoming_channel( ioc, object_get_typename(OBJECT(ioc))); - if (s->parameters.tls_creds && - *s->parameters.tls_creds && + if (migrate_use_tls() && !object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_TLS)) { + MigrationState *s = migrate_get_current(); + migration_tls_channel_process_incoming(s, ioc, &local_err); } else { migration_ioc_register_yank(ioc); diff --git a/migration/migration.c b/migration/migration.c index add3dabc56..20ca99d726 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -2565,6 +2565,15 @@ int migrate_use_zerocopy(void) } #endif +int migrate_use_tls(void) +{ + MigrationState *s; + + s = migrate_get_current(); + + return s->parameters.tls_creds && *s->parameters.tls_creds; +} + int migrate_use_xbzrle(void) { MigrationState *s; diff --git a/migration/multifd.c b/migration/multifd.c index ab8f0f97be..3d9dc8cb58 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -794,14 +794,11 @@ static bool multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc, Error *error) { - MigrationState *s = 
migrate_get_current(); - trace_multifd_set_outgoing_channel( ioc, object_get_typename(OBJECT(ioc)), p->tls_hostname, error); if (!error) { - if (s->parameters.tls_creds && - *s->parameters.tls_creds && + if (migrate_use_tls() && !object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_TLS)) { multifd_tls_channel_connect(p, ioc, &error); From patchwork Fri Nov 12 05:10:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 12616159 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B7D58C433EF for ; Fri, 12 Nov 2021 05:42:39 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 50D8A60D43 for ; Fri, 12 Nov 2021 05:42:39 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 50D8A60D43 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:48840 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mlPKg-0000uP-Eu for qemu-devel@archiver.kernel.org; Fri, 12 Nov 2021 00:42:38 -0500 Received: from eggs.gnu.org ([209.51.188.92]:58048) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mlPHk-0006ML-VF for qemu-devel@nongnu.org; Fri, 12 Nov 2021 00:39:38 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:39521) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mlPHi-0005Pd-QQ for qemu-devel@nongnu.org; Fri, 12 Nov 2021 00:39:36 
-0500 X-Received: by 2002:a05:6122:894:: with SMTP id 
20mr19540918vkf.9.1636695570493; Thu, 11 Nov 2021 21:39:30 -0800 (PST)
From: Leonardo Bras
To: Daniel P. Berrangé , Juan Quintela , "Dr. David Alan Gilbert" , Eric Blake , Markus Armbruster
Subject: [PATCH v5 6/6] multifd: Implement zerocopy write in multifd migration (multifd-zerocopy)
Date: Fri, 12 Nov 2021 02:10:41 -0300
Message-Id: <20211112051040.923746-7-leobras@redhat.com>
X-Mailer: git-send-email 2.33.1
In-Reply-To: <20211112051040.923746-1-leobras@redhat.com>
References: <20211112051040.923746-1-leobras@redhat.com>
MIME-Version: 1.0
Cc: Leonardo Bras , qemu-devel@nongnu.org

Implement zerocopy in nocomp_send_write() by making use of the QIOChannel zerocopy interface.

Change multifd_send_sync_main() so it can distinguish each iteration sync from the setup and completion syncs, so that flush_zerocopy() can be called after each iteration to make sure all dirty pages are sent before a new iteration starts. Also make it return -1 if flush_zerocopy() fails, in order to cancel the migration and avoid resuming the guest on the target host before it has received all current RAM.

This works for RAM migration because RAM pages are not usually freed, and changing a page's content between async_send() and the actual sending of the buffer is not a problem: the change dirties the page and causes it to be re-sent in a later iteration anyway.

Given that a lot of locked memory may be needed in order to use multifd migration with zerocopy enabled, make it optional by creating a new migration parameter "zerocopy" in QAPI, so low-privileged users can still perform multifd migrations.

Signed-off-by: Leonardo Bras --- migration/multifd.h | 4 +++- migration/multifd.c | 37 ++++++++++++++++++++++++++++++++----- migration/ram.c | 29 ++++++++++++++++++++++------- migration/socket.c | 9 +++++++-- 4 files changed, 64 insertions(+), 15 deletions(-) diff --git a/migration/multifd.h b/migration/multifd.h index 15c50ca0b2..37941c1872 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -22,7 +22,7 @@ int multifd_load_cleanup(Error **errp); bool multifd_recv_all_channels_created(void); bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp); void multifd_recv_sync_main(void); -void multifd_send_sync_main(QEMUFile *f); +int multifd_send_sync_main(QEMUFile *f, bool sync); int multifd_queue_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset); /* Multifd Compression flags */ @@ -97,6 +97,8 @@ typedef struct { uint32_t packet_len; /* pointer to the packet */ MultiFDPacket_t *packet; + /* multifd flags for sending ram */ + int write_flags; /* multifd flags for each packet */ uint32_t flags; /* size of the next 
packet that contains pages */ diff --git a/migration/multifd.c b/migration/multifd.c index 3d9dc8cb58..816078df60 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -105,7 +105,8 @@ static int nocomp_send_prepare(MultiFDSendParams *p, uint32_t used, */ static int nocomp_send_write(MultiFDSendParams *p, uint32_t used, Error **errp) { - return qio_channel_writev_all(p->c, p->pages->iov, used, errp); + return qio_channel_writev_all_flags(p->c, p->pages->iov, used, + p->write_flags, errp); } /** @@ -578,19 +579,27 @@ void multifd_save_cleanup(void) multifd_send_state = NULL; } -void multifd_send_sync_main(QEMUFile *f) +int multifd_send_sync_main(QEMUFile *f, bool sync) { int i; + bool flush_zerocopy; if (!migrate_use_multifd()) { - return; + return 0; } if (multifd_send_state->pages->used) { if (multifd_send_pages(f) < 0) { error_report("%s: multifd_send_pages fail", __func__); - return; + return 0; } } + + /* + * When using zerocopy, it's necessary to flush after each iteration to make + * sure pages from earlier iterations don't end up replacing newer pages. 
+ */ + flush_zerocopy = sync && migrate_use_zerocopy(); + for (i = 0; i < migrate_multifd_channels(); i++) { MultiFDSendParams *p = &multifd_send_state->params[i]; @@ -601,7 +610,7 @@ void multifd_send_sync_main(QEMUFile *f) if (p->quit) { error_report("%s: channel %d has already quit", __func__, i); qemu_mutex_unlock(&p->mutex); - return; + return 0; } p->packet_num = multifd_send_state->packet_num++; @@ -612,6 +621,17 @@ void multifd_send_sync_main(QEMUFile *f) ram_counters.transferred += p->packet_len; qemu_mutex_unlock(&p->mutex); qemu_sem_post(&p->sem); + + if (flush_zerocopy) { + int ret; + Error *err = NULL; + + ret = qio_channel_flush_zerocopy(p->c, &err); + if (ret < 0) { + error_report_err(err); + return -1; + } + } } for (i = 0; i < migrate_multifd_channels(); i++) { MultiFDSendParams *p = &multifd_send_state->params[i]; @@ -620,6 +640,8 @@ void multifd_send_sync_main(QEMUFile *f) qemu_sem_wait(&p->sem_sync); } trace_multifd_send_sync_main(multifd_send_state->packet_num); + + return 0; } static void *multifd_send_thread(void *opaque) @@ -853,6 +875,10 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque) goto cleanup; } + if (migrate_use_zerocopy()) { + p->write_flags = QIO_CHANNEL_WRITE_FLAG_ZEROCOPY; + } + p->c = QIO_CHANNEL(sioc); qio_channel_set_delay(p->c, false); p->running = true; @@ -918,6 +944,7 @@ int multifd_save_setup(Error **errp) p->packet->version = cpu_to_be32(MULTIFD_VERSION); p->name = g_strdup_printf("multifdsend_%d", i); p->tls_hostname = g_strdup(s->hostname); + p->write_flags = 0; socket_send_channel_create(multifd_new_send_channel_async, p); } diff --git a/migration/ram.c b/migration/ram.c index 863035d235..0b3ddbffc1 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -2992,6 +2992,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque) { RAMState **rsp = opaque; RAMBlock *block; + int ret; if (compress_threads_save_setup()) { return -1; @@ -3026,7 +3027,11 @@ static int ram_save_setup(QEMUFile *f, void 
*opaque) ram_control_before_iterate(f, RAM_CONTROL_SETUP); ram_control_after_iterate(f, RAM_CONTROL_SETUP); - multifd_send_sync_main(f); + ret = multifd_send_sync_main(f, false); + if (ret < 0) { + return ret; + } + qemu_put_be64(f, RAM_SAVE_FLAG_EOS); qemu_fflush(f); @@ -3135,7 +3140,11 @@ static int ram_save_iterate(QEMUFile *f, void *opaque) out: if (ret >= 0 && migration_is_setup_or_active(migrate_get_current()->state)) { - multifd_send_sync_main(rs->f); + ret = multifd_send_sync_main(rs->f, true); + if (ret < 0) { + return ret; + } + qemu_put_be64(f, RAM_SAVE_FLAG_EOS); qemu_fflush(f); ram_counters.transferred += 8; @@ -3193,13 +3202,19 @@ static int ram_save_complete(QEMUFile *f, void *opaque) ram_control_after_iterate(f, RAM_CONTROL_FINISH); } - if (ret >= 0) { - multifd_send_sync_main(rs->f); - qemu_put_be64(f, RAM_SAVE_FLAG_EOS); - qemu_fflush(f); + if (ret < 0) { + return ret; } - return ret; + ret = multifd_send_sync_main(rs->f, false); + if (ret < 0) { + return ret; + } + + qemu_put_be64(f, RAM_SAVE_FLAG_EOS); + qemu_fflush(f); + + return 0; } static void ram_save_pending(QEMUFile *f, void *opaque, uint64_t max_size, diff --git a/migration/socket.c b/migration/socket.c index e26e94aa0c..8e40e0a3fd 100644 --- a/migration/socket.c +++ b/migration/socket.c @@ -78,8 +78,13 @@ static void socket_outgoing_migration(QIOTask *task, trace_migration_socket_outgoing_connected(data->hostname); } - if (migrate_use_zerocopy()) { - error_setg(&err, "Zerocopy not available in migration"); + if (migrate_use_zerocopy() && + (!migrate_use_multifd() || + !qio_channel_has_feature(sioc, QIO_CHANNEL_FEATURE_WRITE_ZEROCOPY) || + migrate_multifd_compression() != MULTIFD_COMPRESSION_NONE || + migrate_use_tls())) { + error_setg(&err, + "Zerocopy only available for non-compressed non-TLS multifd migration"); } migration_channel_connect(data->s, sioc, data->hostname, err);
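
The per-iteration flush relies on the completion accounting that qio_channel_socket_flush_zerocopy() does in patch 3/6, which may be easier to follow in isolation. The following is only a minimal sketch, assuming each errqueue notification acknowledges the inclusive id range [ee_info, ee_data], as the "ee_data - ee_info + 1" expression in that patch implies. struct zc_notification, zc_completed() and zc_flush() are hypothetical names used for illustration and are not part of this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the two struct sock_extended_err fields that
 * the flush loop reads: ee_info (first completed id) and ee_data (last).
 */
struct zc_notification {
    uint32_t ee_info;
    uint32_t ee_data;
};

/* Number of sendmsg() calls covered by one notification (inclusive range). */
uint32_t zc_completed(const struct zc_notification *n)
{
    /* Unsigned arithmetic stays correct if the 32-bit id wraps around. */
    return n->ee_data - n->ee_info + 1;
}

/*
 * Consume notifications until 'queued' sends are accounted for.
 * Returns how many notifications were consumed, or -1 if the list runs
 * out first (illustration only: this sketch does not wait for more).
 */
int zc_flush(const struct zc_notification *q, int nq, uint64_t queued)
{
    uint64_t sent = 0;
    int i = 0;

    assert(q != NULL || nq == 0);

    while (sent < queued) {
        if (i == nq) {
            return -1;
        }
        sent += zc_completed(&q[i]);
        i++;
    }
    return i;
}
```

In the patch itself an empty errqueue does not end the loop: EAGAIN leads to qio_channel_socket_poll() and another recvmsg(MSG_ERRQUEUE), so the flush only returns once zerocopy_sent has caught up with zerocopy_queued or an error is reported.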