From patchwork Tue Aug 31 11:02:37 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Leonardo Bras
X-Patchwork-Id: 12467003
From: Leonardo Bras
To: Marc-André Lureau, Paolo Bonzini, Elena Ufimtseva, Jagannathan Raman, John G Johnson, Daniel P. Berrangé, Juan Quintela, "Dr. David Alan Gilbert", Fam Zheng, Peter Xu
Cc: Leonardo Bras, qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: [PATCH v1 1/3] io: Enable write flags for QIOChannel
Date: Tue, 31 Aug 2021 08:02:37 -0300
Message-Id: <20210831110238.299458-2-leobras@redhat.com>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20210831110238.299458-1-leobras@redhat.com>
References: <20210831110238.299458-1-leobras@redhat.com>

Some syscalls used for writing, such as sendmsg(), accept flags that can modify their behavior, even allowing the usage of features such as MSG_ZEROCOPY.
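
As an illustration of the paragraph above (not part of this series), here is a minimal sketch of a write helper that forwards a caller-supplied flags value to sendmsg(). The names demo_writev_flags() and demo_writev() are hypothetical and only used for this example; they are not QEMU functions.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a QEMU function): gather-write an iovec
 * array to a socket, forwarding caller-supplied flags to sendmsg().
 */
ssize_t demo_writev_flags(int fd, const struct iovec *iov, size_t niov,
                          int flags)
{
    struct msghdr msg;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = (struct iovec *)iov;  /* sendmsg() does not modify it */
    msg.msg_iovlen = niov;

    return sendmsg(fd, &msg, flags);
}

/* Flag-less convenience form, so callers that need no flags pass none. */
#define demo_writev(fd, iov, niov) demo_writev_flags(fd, iov, niov, 0)
```

Passing MSG_DONTWAIT, for example, makes that single call non-blocking without changing the socket's mode, while the flag-less macro keeps existing call sites unchanged.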
Change the qio_channel_write*() interface to allow passing down flags, allowing a more flexible use of IOChannel. At first, its use is enabled for QIOChannelSocket, but it can easily be extended to any other QIOChannel implementation.

Signed-off-by: Leonardo Bras
---
chardev/char-io.c | 2 +- hw/remote/mpqemu-link.c | 2 +- include/io/channel.h | 56 ++++++++++++++++++++--------- io/channel-buffer.c | 1 + io/channel-command.c | 1 + io/channel-file.c | 1 + io/channel-socket.c | 4 ++- io/channel-tls.c | 1 + io/channel-websock.c | 1 + io/channel.c | 53 ++++++++++++++------------- migration/rdma.c | 1 + scsi/pr-manager-helper.c | 2 +- tests/unit/test-io-channel-socket.c | 1 + 13 files changed, 81 insertions(+), 45 deletions(-) diff --git a/chardev/char-io.c b/chardev/char-io.c index 8ced184160..4ea7b1ee2a 100644 --- a/chardev/char-io.c +++ b/chardev/char-io.c @@ -122,7 +122,7 @@ int io_channel_send_full(QIOChannel *ioc, ret = qio_channel_writev_full( ioc, &iov, 1, - fds, nfds, NULL); + fds, nfds, 0, NULL); if (ret == QIO_CHANNEL_ERR_BLOCK) { if (offset) { return offset; diff --git a/hw/remote/mpqemu-link.c b/hw/remote/mpqemu-link.c index 7e841820e5..0d13321ef0 100644 --- a/hw/remote/mpqemu-link.c +++ b/hw/remote/mpqemu-link.c @@ -69,7 +69,7 @@ bool mpqemu_msg_send(MPQemuMsg *msg, QIOChannel *ioc, Error **errp) } if (!qio_channel_writev_full_all(ioc, send, G_N_ELEMENTS(send), - fds, nfds, errp)) { + fds, nfds, 0, errp)) { ret = true; } else { trace_mpqemu_send_io_error(msg->cmd, msg->size, nfds); diff --git a/include/io/channel.h b/include/io/channel.h index 88988979f8..dada9ebaaf 100644 --- a/include/io/channel.h +++ b/include/io/channel.h @@ -104,6 +104,7 @@ struct QIOChannelClass { size_t niov, int *fds, size_t nfds, + int flags, Error **errp); ssize_t (*io_readv)(QIOChannel *ioc, const struct iovec *iov, @@ -260,6 +261,7 @@ ssize_t qio_channel_writev_full(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp); /** @@ -325,6 +327,7 @@ int
qio_channel_readv_all(QIOChannel *ioc, * @ioc: the channel object * @iov: the array of memory regions to write data from * @niov: the length of the @iov array + * @flags: optional sending flags * @errp: pointer to a NULL-initialized error object * * Write data to the IO channel, reading it from the @@ -339,10 +342,14 @@ int qio_channel_readv_all(QIOChannel *ioc, * * Returns: 0 if all bytes were written, or -1 on error */ -int qio_channel_writev_all(QIOChannel *ioc, - const struct iovec *iov, - size_t niov, - Error **erp); +int qio_channel_writev_all_flags(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + int flags, + Error **errp); + +#define qio_channel_writev_all(ioc, iov, niov, errp) \ + qio_channel_writev_all_flags(ioc, iov, niov, 0, errp) /** * qio_channel_readv: @@ -364,15 +371,21 @@ ssize_t qio_channel_readv(QIOChannel *ioc, * @ioc: the channel object * @iov: the array of memory regions to write data from * @niov: the length of the @iov array + * @flags: optional sending flags * @errp: pointer to a NULL-initialized error object * * Behaves as qio_channel_writev_full() but does not support * sending of file handles. */ -ssize_t qio_channel_writev(QIOChannel *ioc, - const struct iovec *iov, - size_t niov, - Error **errp); +ssize_t qio_channel_writev_flags(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + int flags, + Error **errp); + +#define qio_channel_writev(ioc, iov, niov, errp) \ + qio_channel_writev_flags(ioc, iov, niov, 0, errp) + /** * qio_channel_read: @@ -395,16 +408,21 @@ ssize_t qio_channel_read(QIOChannel *ioc, * @ioc: the channel object * @buf: the memory regions to send data from * @buflen: the length of @buf + * @flags: optional sending flags * @errp: pointer to a NULL-initialized error object * * Behaves as qio_channel_writev_full() but does not support * sending of file handles, and only supports writing from a * single memory region. 
*/ -ssize_t qio_channel_write(QIOChannel *ioc, - const char *buf, - size_t buflen, - Error **errp); +ssize_t qio_channel_write_flags(QIOChannel *ioc, + const char *buf, + size_t buflen, + int flags, + Error **errp); + +#define qio_channel_write(ioc, buf, buflen, errp) \ + qio_channel_write_flags(ioc, buf, buflen, 0, errp) /** * qio_channel_read_all_eof: @@ -453,6 +471,7 @@ int qio_channel_read_all(QIOChannel *ioc, * @ioc: the channel object * @buf: the memory region to write data into * @buflen: the number of bytes to @buf + * @flags: optional sending flags * @errp: pointer to a NULL-initialized error object * * Writes @buflen bytes from @buf, possibly blocking or (if the @@ -462,10 +481,14 @@ int qio_channel_read_all(QIOChannel *ioc, * * Returns: 0 if all bytes were written, or -1 on error */ -int qio_channel_write_all(QIOChannel *ioc, - const char *buf, - size_t buflen, - Error **errp); +int qio_channel_write_all_flags(QIOChannel *ioc, + const char *buf, + size_t buflen, + int flags, + Error **errp); + +#define qio_channel_write_all(ioc, buf, buflen, errp) \ + qio_channel_write_all_flags(ioc, buf, buflen, 0, errp) /** * qio_channel_set_blocking: @@ -853,6 +876,7 @@ int qio_channel_writev_full_all(QIOChannel *ioc, const struct iovec *iov, size_t niov, int *fds, size_t nfds, + int flags, Error **errp); #endif /* QIO_CHANNEL_H */ diff --git a/io/channel-buffer.c b/io/channel-buffer.c index baa4e2b089..bf52011be2 100644 --- a/io/channel-buffer.c +++ b/io/channel-buffer.c @@ -81,6 +81,7 @@ static ssize_t qio_channel_buffer_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelBuffer *bioc = QIO_CHANNEL_BUFFER(ioc); diff --git a/io/channel-command.c b/io/channel-command.c index b2a9e27138..5ff1691bad 100644 --- a/io/channel-command.c +++ b/io/channel-command.c @@ -258,6 +258,7 @@ static ssize_t qio_channel_command_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelCommand *cioc 
= QIO_CHANNEL_COMMAND(ioc); diff --git a/io/channel-file.c b/io/channel-file.c index c4bf799a80..348a48545e 100644 --- a/io/channel-file.c +++ b/io/channel-file.c @@ -114,6 +114,7 @@ static ssize_t qio_channel_file_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelFile *fioc = QIO_CHANNEL_FILE(ioc); diff --git a/io/channel-socket.c b/io/channel-socket.c index 606ec97cf7..e377e7303d 100644 --- a/io/channel-socket.c +++ b/io/channel-socket.c @@ -525,6 +525,7 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc); @@ -558,7 +559,7 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc, } retry: - ret = sendmsg(sioc->fd, &msg, 0); + ret = sendmsg(sioc->fd, &msg, flags); if (ret <= 0) { if (errno == EAGAIN) { return QIO_CHANNEL_ERR_BLOCK; @@ -620,6 +621,7 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc); diff --git a/io/channel-tls.c b/io/channel-tls.c index 2ae1b92fc0..4ce890a538 100644 --- a/io/channel-tls.c +++ b/io/channel-tls.c @@ -301,6 +301,7 @@ static ssize_t qio_channel_tls_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelTLS *tioc = QIO_CHANNEL_TLS(ioc); diff --git a/io/channel-websock.c b/io/channel-websock.c index 70889bb54d..035dd6075b 100644 --- a/io/channel-websock.c +++ b/io/channel-websock.c @@ -1127,6 +1127,7 @@ static ssize_t qio_channel_websock_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelWebsock *wioc = QIO_CHANNEL_WEBSOCK(ioc); diff --git a/io/channel.c b/io/channel.c index e8b019dc36..ee3cb83d4d 100644 --- a/io/channel.c +++ b/io/channel.c @@ -72,6 +72,7 @@ ssize_t qio_channel_writev_full(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int 
flags, Error **errp) { QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc); @@ -83,7 +84,7 @@ ssize_t qio_channel_writev_full(QIOChannel *ioc, return -1; } - return klass->io_writev(ioc, iov, niov, fds, nfds, errp); + return klass->io_writev(ioc, iov, niov, fds, nfds, flags, errp); } @@ -212,18 +213,20 @@ int qio_channel_readv_full_all(QIOChannel *ioc, return ret; } -int qio_channel_writev_all(QIOChannel *ioc, - const struct iovec *iov, - size_t niov, - Error **errp) +int qio_channel_writev_all_flags(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + int flags, + Error **errp) { - return qio_channel_writev_full_all(ioc, iov, niov, NULL, 0, errp); + return qio_channel_writev_full_all(ioc, iov, niov, NULL, 0, flags, errp); } int qio_channel_writev_full_all(QIOChannel *ioc, const struct iovec *iov, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { int ret = -1; @@ -238,7 +241,7 @@ int qio_channel_writev_full_all(QIOChannel *ioc, while (nlocal_iov > 0) { ssize_t len; len = qio_channel_writev_full(ioc, local_iov, nlocal_iov, fds, nfds, - errp); + flags, errp); if (len == QIO_CHANNEL_ERR_BLOCK) { if (qemu_in_coroutine()) { qio_channel_yield(ioc, G_IO_OUT); @@ -272,15 +275,15 @@ ssize_t qio_channel_readv(QIOChannel *ioc, } -ssize_t qio_channel_writev(QIOChannel *ioc, - const struct iovec *iov, - size_t niov, - Error **errp) +ssize_t qio_channel_writev_flags(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + int flags, + Error **errp) { - return qio_channel_writev_full(ioc, iov, niov, NULL, 0, errp); + return qio_channel_writev_full(ioc, iov, niov, NULL, 0, flags, errp); } - ssize_t qio_channel_read(QIOChannel *ioc, char *buf, size_t buflen, @@ -291,16 +294,16 @@ ssize_t qio_channel_read(QIOChannel *ioc, } -ssize_t qio_channel_write(QIOChannel *ioc, - const char *buf, - size_t buflen, - Error **errp) +ssize_t qio_channel_write_flags(QIOChannel *ioc, + const char *buf, + size_t buflen, + int flags, + Error **errp) { struct iovec iov = { 
.iov_base = (char *)buf, .iov_len = buflen }; - return qio_channel_writev_full(ioc, &iov, 1, NULL, 0, errp); + return qio_channel_writev_full(ioc, &iov, 1, NULL, 0, flags, errp); } - int qio_channel_read_all_eof(QIOChannel *ioc, char *buf, size_t buflen, @@ -321,16 +324,16 @@ int qio_channel_read_all(QIOChannel *ioc, } -int qio_channel_write_all(QIOChannel *ioc, - const char *buf, - size_t buflen, - Error **errp) +int qio_channel_write_all_flags(QIOChannel *ioc, + const char *buf, + size_t buflen, + int flags, + Error **errp) { struct iovec iov = { .iov_base = (char *)buf, .iov_len = buflen }; - return qio_channel_writev_all(ioc, &iov, 1, errp); + return qio_channel_writev_all_flags(ioc, &iov, 1, flags, errp); } - int qio_channel_set_blocking(QIOChannel *ioc, bool enabled, Error **errp) diff --git a/migration/rdma.c b/migration/rdma.c index 5c2d113aa9..bc0558b70e 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -2713,6 +2713,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc, size_t niov, int *fds, size_t nfds, + int flags, Error **errp) { QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc); diff --git a/scsi/pr-manager-helper.c b/scsi/pr-manager-helper.c index 451c7631b7..3be52a98d5 100644 --- a/scsi/pr-manager-helper.c +++ b/scsi/pr-manager-helper.c @@ -77,7 +77,7 @@ static int pr_manager_helper_write(PRManagerHelper *pr_mgr, iov.iov_base = (void *)buf; iov.iov_len = sz; n_written = qio_channel_writev_full(QIO_CHANNEL(pr_mgr->ioc), &iov, 1, - nfds ? &fd : NULL, nfds, errp); + nfds ? 
&fd : NULL, nfds, 0, errp); if (n_written <= 0) { assert(n_written != QIO_CHANNEL_ERR_BLOCK); diff --git a/tests/unit/test-io-channel-socket.c b/tests/unit/test-io-channel-socket.c index c49eec1f03..6713886d02 100644 --- a/tests/unit/test-io-channel-socket.c +++ b/tests/unit/test-io-channel-socket.c @@ -444,6 +444,7 @@ static void test_io_channel_unix_fd_pass(void) G_N_ELEMENTS(iosend), fdsend, G_N_ELEMENTS(fdsend), + 0, &error_abort); qio_channel_readv_full(dst,

From patchwork Tue Aug 31 11:02:38 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Leonardo Bras
X-Patchwork-Id: 12467009
From: Leonardo Bras
To: Marc-André Lureau, Paolo Bonzini, Elena Ufimtseva, Jagannathan Raman, John G Johnson, Daniel P. Berrangé, Juan Quintela, "Dr. David Alan Gilbert", Fam Zheng, Peter Xu
Cc: Leonardo Bras, qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: [PATCH v1 2/3] io: Add zerocopy and errqueue
Date: Tue, 31 Aug 2021 08:02:38 -0300
Message-Id: <20210831110238.299458-3-leobras@redhat.com>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20210831110238.299458-1-leobras@redhat.com>
References: <20210831110238.299458-1-leobras@redhat.com>

MSG_ZEROCOPY is a feature that enables copy avoidance in TCP/UDP socket send calls. It does so by avoiding copying user data into kernel buffers.

To make it work, three steps are needed:
1 - A setsockopt() system call, enabling SO_ZEROCOPY
2 - Passing down the MSG_ZEROCOPY flag for each send*() syscall
3 - Processing the socket's error queue, dealing with any error

Zerocopy has its costs, so it will only improve performance if the sending buffer is big (10KB, according to Linux docs).

Step 2 makes it possible to use the same socket to send data using both zerocopy and the default copying approach, so the application can choose what is best for each packet.
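
The three steps above can be sketched outside of QEMU roughly as follows. This is a minimal illustration under stated assumptions, not the code added by this patch: the demo_* names and the DEMO_ZEROCOPY_MIN_LEN cut-off are hypothetical, and the fallback values for SO_ZEROCOPY and MSG_ZEROCOPY are the Linux generic-ABI numbers, used only if the libc headers do not define them.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Assumption: Linux generic-ABI values, only used with older headers. */
#ifndef SO_ZEROCOPY
#define SO_ZEROCOPY 60
#endif
#ifndef MSG_ZEROCOPY
#define MSG_ZEROCOPY 0x4000000
#endif

/* Assumption: illustrative cut-off based on the 10KB figure above. */
#define DEMO_ZEROCOPY_MIN_LEN (10 * 1024)

/* Step 1: ask the kernel to allow zerocopy sends on this socket. */
bool demo_enable_zerocopy(int fd)
{
    int one = 1;

    return setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof(one)) == 0;
}

/* Step 2: choose the flag per send; small buffers keep the copying path. */
int demo_send_flags(size_t len, bool zerocopy_enabled)
{
    if (zerocopy_enabled && len >= DEMO_ZEROCOPY_MIN_LEN) {
        return MSG_ZEROCOPY;
    }
    return 0;
}

ssize_t demo_send(int fd, const void *buf, size_t len, bool zerocopy_enabled)
{
    return send(fd, buf, len, demo_send_flags(len, zerocopy_enabled));
}

/*
 * Step 3: drain the socket error queue.
 * Returns the number of queued messages read, or -1 on an unexpected error.
 */
int demo_drain_errqueue(int fd)
{
    char control[128];
    int count = 0;

    for (;;) {
        struct msghdr msg;

        memset(&msg, 0, sizeof(msg));
        msg.msg_control = control;
        msg.msg_controllen = sizeof(control);

        if (recvmsg(fd, &msg, MSG_ERRQUEUE) < 0) {
            if (errno == EINTR) {
                continue;
            }
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                return count;   /* nothing left on the error queue */
            }
            return -1;
        }
        count++;
    }
}
```

Note that with MSG_ZEROCOPY the caller should not modify a buffer until its completion notification has been read from the error queue, which is why step 3 cannot simply be skipped.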
To implement step 1, an optional set_zerocopy() interface was created in QIOChannel, allowing each caller to enable or disable it. Step 2 will be enabled by the caller at each qio_channel_write*() that would benefit from zerocopy. Step 3 is done with qio_channel_socket_errq_proc(), which runs after SOCKET_ERRQ_THRESH (16k) iovs have been sent, dealing with any error found.

Signed-off-by: Leonardo Bras
---
include/io/channel-socket.h | 2 + include/io/channel.h | 29 ++++++++++++++ io/channel-socket.c | 76 +++++++++++++++++++++++++++++++++++++ io/channel-tls.c | 11 ++++++ io/channel-websock.c | 9 +++++ io/channel.c | 11 ++++++ 6 files changed, 138 insertions(+) diff --git a/include/io/channel-socket.h b/include/io/channel-socket.h index e747e63514..09dffe059f 100644 --- a/include/io/channel-socket.h +++ b/include/io/channel-socket.h @@ -47,6 +47,8 @@ struct QIOChannelSocket { socklen_t localAddrLen; struct sockaddr_storage remoteAddr; socklen_t remoteAddrLen; + size_t errq_pending; + bool zerocopy_enabled; }; diff --git a/include/io/channel.h b/include/io/channel.h index dada9ebaaf..de10a78b10 100644 --- a/include/io/channel.h +++ b/include/io/channel.h @@ -137,6 +137,8 @@ struct QIOChannelClass { IOHandler *io_read, IOHandler *io_write, void *opaque); + void (*io_set_zerocopy)(QIOChannel *ioc, + bool enabled); }; /* General I/O handling functions */ @@ -570,6 +572,33 @@ int qio_channel_shutdown(QIOChannel *ioc, void qio_channel_set_delay(QIOChannel *ioc, bool enabled); +/** + * qio_channel_set_zerocopy: + * @ioc: the channel object + * @enabled: the new flag state + * + * Controls whether the underlying transport is + * permitted to use zerocopy to avoid copying the + * sending buffer in kernel. If @enabled is true, then the + * writes may avoid buffer copy in kernel. If @enabled + * is false, writes will cause the kernel to always + * copy the buffer contents before sending.
+ * + * In order to make a write with the zerocopy feature, + * it's also necessary to send each packet with the + * MSG_ZEROCOPY flag. With this, it's possible to + * select only writes that would benefit from the + * use of the zerocopy feature, i.e. the ones with larger + * buffers. + * + * This feature was added in Linux 4.14, so older + * versions will fail on enabling. This is not an + * issue, since it will fall back to the default copying + * approach. + */ +void qio_channel_set_zerocopy(QIOChannel *ioc, + bool enabled); + /** * qio_channel_set_cork: * @ioc: the channel object diff --git a/io/channel-socket.c b/io/channel-socket.c index e377e7303d..a69fec7315 100644 --- a/io/channel-socket.c +++ b/io/channel-socket.c @@ -26,8 +26,10 @@ #include "io/channel-watch.h" #include "trace.h" #include "qapi/clone-visitor.h" +#include #define SOCKET_MAX_FDS 16 +#define SOCKET_ERRQ_THRESH 16384 SocketAddress * qio_channel_socket_get_local_address(QIOChannelSocket *ioc, @@ -55,6 +57,8 @@ qio_channel_socket_new(void) sioc = QIO_CHANNEL_SOCKET(object_new(TYPE_QIO_CHANNEL_SOCKET)); sioc->fd = -1; + sioc->zerocopy_enabled = false; + sioc->errq_pending = 0; ioc = QIO_CHANNEL(sioc); qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN); @@ -520,6 +524,54 @@ static ssize_t qio_channel_socket_readv(QIOChannel *ioc, return ret; } +static void qio_channel_socket_errq_proc(QIOChannelSocket *sioc, + Error **errp) +{ + int fd = sioc->fd; + int ret; + struct msghdr msg = {}; + struct sock_extended_err *serr; + struct cmsghdr *cm; + + do { + ret = recvmsg(fd, &msg, MSG_ERRQUEUE); + if (ret <= 0) { + if (ret == 0 || errno == EAGAIN) { + /* Nothing on errqueue */ + sioc->errq_pending = 0; + break; + } + if (errno == EINTR) { + continue; + } + + error_setg_errno(errp, errno, + "Unable to read errqueue"); + break; + } + + cm = CMSG_FIRSTHDR(&msg); + if (cm->cmsg_level != SOL_IP && + cm->cmsg_type != IP_RECVERR) { + error_setg_errno(errp, EPROTOTYPE, + "Wrong cmsg in errqueue"); + break; + }
+ + serr = (void *) CMSG_DATA(cm); + if (serr->ee_errno != 0) { + error_setg_errno(errp, serr->ee_errno, + "Error on socket"); + break; + } + if (serr->ee_origin != SO_EE_ORIGIN_ZEROCOPY) { + error_setg_errno(errp, serr->ee_origin, + "Error not from zerocopy"); + break; + } + } while (true); +} + static ssize_t qio_channel_socket_writev(QIOChannel *ioc, const struct iovec *iov, size_t niov, @@ -571,6 +623,14 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc, "Unable to write to socket"); return -1; } + + if ((flags & MSG_ZEROCOPY) && sioc->zerocopy_enabled) { + sioc->errq_pending += niov; + if (sioc->errq_pending > SOCKET_ERRQ_THRESH) { + qio_channel_socket_errq_proc(sioc, errp); + } + } + return ret; } #else /* WIN32 */ @@ -689,6 +749,21 @@ qio_channel_socket_set_delay(QIOChannel *ioc, } +static void +qio_channel_socket_set_zerocopy(QIOChannel *ioc, + bool enabled) +{ + QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc); + int v = enabled ? 1 : 0; + int ret; + + ret = qemu_setsockopt(sioc->fd, SOL_SOCKET, SO_ZEROCOPY, &v, sizeof(v)); + if (ret >= 0) { + sioc->zerocopy_enabled = enabled; + } +} + + static void qio_channel_socket_set_cork(QIOChannel *ioc, bool enabled) @@ -789,6 +864,7 @@ static void qio_channel_socket_class_init(ObjectClass *klass, ioc_klass->io_set_delay = qio_channel_socket_set_delay; ioc_klass->io_create_watch = qio_channel_socket_create_watch; ioc_klass->io_set_aio_fd_handler = qio_channel_socket_set_aio_fd_handler; + ioc_klass->io_set_zerocopy = qio_channel_socket_set_zerocopy; } static const TypeInfo qio_channel_socket_info = { diff --git a/io/channel-tls.c b/io/channel-tls.c index 4ce890a538..bf44b0f7b0 100644 --- a/io/channel-tls.c +++ b/io/channel-tls.c @@ -350,6 +350,16 @@ static void qio_channel_tls_set_delay(QIOChannel *ioc, qio_channel_set_delay(tioc->master, enabled); } + +static void qio_channel_tls_set_zerocopy(QIOChannel *ioc, + bool enabled) +{ + QIOChannelTLS *tioc = QIO_CHANNEL_TLS(ioc); + +
qio_channel_set_zerocopy(tioc->master, enabled); +} + + static void qio_channel_tls_set_cork(QIOChannel *ioc, bool enabled) { @@ -416,6 +426,7 @@ static void qio_channel_tls_class_init(ObjectClass *klass, ioc_klass->io_shutdown = qio_channel_tls_shutdown; ioc_klass->io_create_watch = qio_channel_tls_create_watch; ioc_klass->io_set_aio_fd_handler = qio_channel_tls_set_aio_fd_handler; + ioc_klass->io_set_zerocopy = qio_channel_tls_set_zerocopy; } static const TypeInfo qio_channel_tls_info = { diff --git a/io/channel-websock.c b/io/channel-websock.c index 035dd6075b..4e9491966b 100644 --- a/io/channel-websock.c +++ b/io/channel-websock.c @@ -1194,6 +1194,14 @@ static void qio_channel_websock_set_delay(QIOChannel *ioc, qio_channel_set_delay(tioc->master, enabled); } +static void qio_channel_websock_set_zerocopy(QIOChannel *ioc, + bool enabled) +{ + QIOChannelWebsock *tioc = QIO_CHANNEL_WEBSOCK(ioc); + + qio_channel_set_zerocopy(tioc->master, enabled); +} + static void qio_channel_websock_set_cork(QIOChannel *ioc, bool enabled) { @@ -1318,6 +1326,7 @@ static void qio_channel_websock_class_init(ObjectClass *klass, ioc_klass->io_close = qio_channel_websock_close; ioc_klass->io_shutdown = qio_channel_websock_shutdown; ioc_klass->io_create_watch = qio_channel_websock_create_watch; + ioc_klass->io_set_zerocopy = qio_channel_websock_set_zerocopy; } static const TypeInfo qio_channel_websock_info = { diff --git a/io/channel.c b/io/channel.c index ee3cb83d4d..476440e8b2 100644 --- a/io/channel.c +++ b/io/channel.c @@ -450,6 +450,17 @@ void qio_channel_set_delay(QIOChannel *ioc, } +void qio_channel_set_zerocopy(QIOChannel *ioc, + bool enabled) +{ + QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc); + + if (klass->io_set_zerocopy) { + klass->io_set_zerocopy(ioc, enabled); + } +} + + void qio_channel_set_cork(QIOChannel *ioc, bool enabled) { From patchwork Tue Aug 31 11:02:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 12467001 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.1 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2FCF7C432BE for ; Tue, 31 Aug 2021 11:04:54 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id AC06960FD8 for ; Tue, 31 Aug 2021 11:04:53 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org AC06960FD8 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:35578 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mL1ZU-0001Ix-Sr for qemu-devel@archiver.kernel.org; Tue, 31 Aug 2021 07:04:52 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42822) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mL1Xl-0007aq-Qw for qemu-devel@nongnu.org; Tue, 31 Aug 2021 07:03:05 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:49856) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mL1Xi-00066y-Ml for qemu-devel@nongnu.org; Tue, 31 Aug 2021 07:03:04 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1630407781; 
h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=p9rSUW+5mIJo1lA0bnwvxvKcuIMBAj1Pcz7IyE4l0jw=; b=ZO4lLgHT44wYAmbZMsts4n2LYdj6DA3OwOU2ONvO730a1MwzA70ZfG/91vgPxE92Idu3p5 RKe10pm3WYwZq820D0U1JmWRxzqVzfQNjjWWqUKDfaNheicjA2IE+88CyGGOKvVe54iLji vaPPAG0Filoqg7ebu8EyPo3C2ChTU6c= Received: from mail-qk1-f199.google.com (mail-qk1-f199.google.com [209.85.222.199]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-415-HO-MfMR1M3KbvcOq9IOaGQ-1; Tue, 31 Aug 2021 07:03:00 -0400 X-MC-Unique: HO-MfMR1M3KbvcOq9IOaGQ-1 Received: by mail-qk1-f199.google.com with SMTP id h135-20020a379e8d000000b003f64b0f4865so1927891qke.12 for ; Tue, 31 Aug 2021 04:03:00 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=p9rSUW+5mIJo1lA0bnwvxvKcuIMBAj1Pcz7IyE4l0jw=; b=C0T78B/Wx84uN0ZbNAnosl9LlbMazoJXS1nee5JF6ez/2h/odWajyubUYGg6LHLMZL e0tJiKTUI4Rkd8GyMCw0FXZXOwZRNJjV7KluoFA3DOQXoSrxaIAat5F/DCOBDyqux4V1 kAmXEXGQhreTHrlWAVAmz5F+PWjEVOQECQh68mjNWQKkEH74bcFBltNJgbOztj31X7SE WXr2hN8gM9+7sC0LePsKOzhlnkYCHGn5NbupRn0qdK2XtPRRngYnXifKSYO38n7uA7HU C+yTKvFzOveAxA79Testb6CB5C1gm0S6kqVyYbdkAU+IM4fLBJRf8xR3dHU52+N4/b3Z ZkVA== X-Gm-Message-State: AOAM5303ZFbjHI8XxT1GATDW03BiumvA/GCKySNbNh98RDUhP8FUArhu zSuyvpM6287lUhhs2xIfJqQb6WpbdrUG5CDm3Bb0ngMfAYWFuZ5yFkxdG9Uxz5Hu/VCa3+nPFVT LGaVaUtf7A0PbKjk= X-Received: by 2002:ae9:df07:: with SMTP id t7mr2373497qkf.95.1630407780006; Tue, 31 Aug 2021 04:03:00 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy55gEMfNxL9DZMfGwWVHLuMA91xOVcCUV2MRQG6EAN/0L9opZbMvfI+XtxQVgkYAVOngyTxA== X-Received: by 2002:ae9:df07:: with SMTP id t7mr2373483qkf.95.1630407779843; Tue, 31 Aug 2021 04:02:59 -0700 (PDT) Received: from LeoBras.redhat.com 
([2804:431:c7f1:e948:8e69:9cd6:5512:12f4]) by smtp.gmail.com with ESMTPSA id b25sm13315536qka.23.2021.08.31.04.02.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 04:02:59 -0700 (PDT) From: Leonardo Bras To: =?utf-8?q?Marc-Andr=C3=A9_Lureau?= , Paolo Bonzini , Elena Ufimtseva , Jagannathan Raman , John G Johnson , =?utf-8?q?Daniel_P=2E_Berrang?= =?utf-8?q?=C3=A9?= , Juan Quintela , "Dr. David Alan Gilbert" , Fam Zheng , Peter Xu Subject: [PATCH v1 3/3] migration: multifd: Enable zerocopy Date: Tue, 31 Aug 2021 08:02:39 -0300 Message-Id: <20210831110238.299458-4-leobras@redhat.com> X-Mailer: git-send-email 2.33.0 In-Reply-To: <20210831110238.299458-1-leobras@redhat.com> References: <20210831110238.299458-1-leobras@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=leobras@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=216.205.24.124; envelope-from=leobras@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -31 X-Spam_score: -3.2 X-Spam_bar: --- X-Spam_report: (-3.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.391, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Leonardo Bras , qemu-devel@nongnu.org, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Call qio_channel_set_zerocopy(true) at the start of every multifd thread. Change the send_write() interface of multifd, allowing it to pass down flags for qio_channel_write*(). 
Pass down MSG_ZEROCOPY flag for sending memory pages, while keeping the other data being sent at the default copying approach. Signed-off-by: Leonardo Bras --- migration/multifd-zlib.c | 7 ++++--- migration/multifd-zstd.c | 7 ++++--- migration/multifd.c | 9 ++++++--- migration/multifd.h | 3 ++- 4 files changed, 16 insertions(+), 10 deletions(-) diff --git a/migration/multifd-zlib.c b/migration/multifd-zlib.c index ab4ba75d75..d8cce1810a 100644 --- a/migration/multifd-zlib.c +++ b/migration/multifd-zlib.c @@ -160,12 +160,13 @@ static int zlib_send_prepare(MultiFDSendParams *p, uint32_t used, Error **errp) * @used: number of pages used * @errp: pointer to an error */ -static int zlib_send_write(MultiFDSendParams *p, uint32_t used, Error **errp) +static int zlib_send_write(MultiFDSendParams *p, uint32_t used, int flags, + Error **errp) { struct zlib_data *z = p->data; - return qio_channel_write_all(p->c, (void *)z->zbuff, p->next_packet_size, - errp); + return qio_channel_write_all_flags(p->c, (void *)z->zbuff, + p->next_packet_size, flags, errp); } /** diff --git a/migration/multifd-zstd.c b/migration/multifd-zstd.c index 693bddf8c9..fa063fd33e 100644 --- a/migration/multifd-zstd.c +++ b/migration/multifd-zstd.c @@ -171,12 +171,13 @@ static int zstd_send_prepare(MultiFDSendParams *p, uint32_t used, Error **errp) * @used: number of pages used * @errp: pointer to an error */ -static int zstd_send_write(MultiFDSendParams *p, uint32_t used, Error **errp) +static int zstd_send_write(MultiFDSendParams *p, uint32_t used, int flags, + Error **errp) { struct zstd_data *z = p->data; - return qio_channel_write_all(p->c, (void *)z->zbuff, p->next_packet_size, - errp); + return qio_channel_write_all_flags(p->c, (void *)z->zbuff, + p->next_packet_size, flags, errp); } /** diff --git a/migration/multifd.c b/migration/multifd.c index 377da78f5b..097621c12c 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -103,9 +103,10 @@ static int nocomp_send_prepare(MultiFDSendParams 
*p, uint32_t used, * @used: number of pages used * @errp: pointer to an error */ -static int nocomp_send_write(MultiFDSendParams *p, uint32_t used, Error **errp) +static int nocomp_send_write(MultiFDSendParams *p, uint32_t used, int flags, + Error **errp) { - return qio_channel_writev_all(p->c, p->pages->iov, used, errp); + return qio_channel_writev_all_flags(p->c, p->pages->iov, used, flags, errp); } /** @@ -675,7 +676,8 @@ static void *multifd_send_thread(void *opaque) } if (used) { - ret = multifd_send_state->ops->send_write(p, used, &local_err); + ret = multifd_send_state->ops->send_write(p, used, MSG_ZEROCOPY, + &local_err); if (ret != 0) { break; } @@ -815,6 +817,7 @@ static bool multifd_channel_connect(MultiFDSendParams *p, } else { /* update for tls qio channel */ p->c = ioc; + qio_channel_set_zerocopy(ioc, true); qemu_thread_create(&p->thread, p->name, multifd_send_thread, p, QEMU_THREAD_JOINABLE); } diff --git a/migration/multifd.h b/migration/multifd.h index 8d6751f5ed..7243ba4185 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -157,7 +157,8 @@ typedef struct { /* Prepare the send packet */ int (*send_prepare)(MultiFDSendParams *p, uint32_t used, Error **errp); /* Write the send packet */ - int (*send_write)(MultiFDSendParams *p, uint32_t used, Error **errp); + int (*send_write)(MultiFDSendParams *p, uint32_t used, int flags, + Error **errp); /* Setup for receiving side */ int (*recv_setup)(MultiFDRecvParams *p, Error **errp); /* Cleanup for receiving side */