From patchwork Fri Apr 19 11:08:39 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13636231
From: Pavel Begunkov
To: io-uring@vger.kernel.org, netdev@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com, "David S . Miller",
 Jakub Kicinski, David Ahern, Eric Dumazet, Willem de Bruijn,
 Jason Wang, Wei Liu, Paul Durrant, xen-devel@lists.xenproject.org,
 "Michael S . Tsirkin", virtualization@lists.linux.dev,
 kvm@vger.kernel.org
Subject: [PATCH io_uring-next/net-next v2 1/4] net: extend ubuf_info callback to ops structure
Date: Fri, 19 Apr 2024 12:08:39 +0100
X-Mailer: git-send-email 2.44.0

We'll need to associate additional callbacks with ubuf_info, so
introduce a structure holding ubuf_info callbacks. Apart from the
smarter io_uring notification management introduced in the next
patches, it can be used to generalise msg_zerocopy_put_abort() and also
to store ->sg_from_iter, which is currently passed in struct msghdr.
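
As a rough userspace illustration of the ops-table pattern described
above (not kernel code), the sketch below models a ubuf_info whose
completion is dispatched through a const ops structure, and shows how a
type check can compare the ops pointer instead of the callback pointer.
All demo_* names are hypothetical, the sk_buff argument is omitted, and
a plain int stands in for refcount_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ubuf_info;

/* Same shape as the ops table in the patch, minus the sk_buff argument. */
struct ubuf_info_ops {
	void (*complete)(struct ubuf_info *uarg, bool zerocopy_success);
};

struct ubuf_info {
	const struct ubuf_info_ops *ops;
	int refcnt;		/* stand-in for refcount_t (assumption) */
};

/* Hypothetical owner that embeds a ubuf_info. */
struct demo_notif {
	struct ubuf_info uarg;
	int completions;
	bool last_success;
};

static void demo_complete(struct ubuf_info *uarg, bool zerocopy_success)
{
	/* Open-coded container_of(): recover the owner from its member. */
	struct demo_notif *n = (struct demo_notif *)
		((char *)uarg - offsetof(struct demo_notif, uarg));

	assert(uarg->refcnt > 0);
	if (--uarg->refcnt == 0) {
		n->completions++;
		n->last_success = zerocopy_success;
	}
}

static const struct ubuf_info_ops demo_ubuf_ops = {
	.complete = demo_complete,
};

static void demo_notif_init(struct demo_notif *n)
{
	n->uarg.ops = &demo_ubuf_ops;
	n->uarg.refcnt = 1;
	n->completions = 0;
	n->last_success = false;
}

/* Analogue of net_zcopy_put(): drop one reference via the ops table. */
static void demo_zcopy_put(struct ubuf_info *uarg)
{
	if (uarg)
		uarg->ops->complete(uarg, true);
}

/* Identify the owner type by ops pointer rather than callback pointer. */
static bool demo_is_demo_uarg(const struct ubuf_info *uarg)
{
	return uarg->ops == &demo_ubuf_ops;
}
```

Because the table is const and shared, adding another callback later
only touches the table definitions, not every site that initialises a
ubuf_info.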

Reviewed-by: Jens Axboe
Reviewed-by: David Ahern
Signed-off-by: Pavel Begunkov
Reviewed-by: Willem de Bruijn
---
 drivers/net/tap.c                   |  2 +-
 drivers/net/tun.c                   |  2 +-
 drivers/net/xen-netback/common.h    |  5 ++---
 drivers/net/xen-netback/interface.c |  2 +-
 drivers/net/xen-netback/netback.c   | 11 ++++++++---
 drivers/vhost/net.c                 |  8 ++++++--
 include/linux/skbuff.h              | 19 +++++++++++--------
 io_uring/notif.c                    |  8 ++++++--
 net/core/skbuff.c                   | 16 ++++++++++------
 9 files changed, 46 insertions(+), 27 deletions(-)

diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index 9f0495e8df4d..bfdd3875fe86 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -754,7 +754,7 @@ static ssize_t tap_get_user(struct tap_queue *q, void *msg_control,
 		skb_zcopy_init(skb, msg_control);
 	} else if (msg_control) {
 		struct ubuf_info *uarg = msg_control;
-		uarg->callback(NULL, uarg, false);
+		uarg->ops->complete(NULL, uarg, false);
 	}

 	dev_queue_xmit(skb);
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 0b3f21cba552..b7401d990680 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1906,7 +1906,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 		skb_zcopy_init(skb, msg_control);
 	} else if (msg_control) {
 		struct ubuf_info *uarg = msg_control;
-		uarg->callback(NULL, uarg, false);
+		uarg->ops->complete(NULL, uarg, false);
 	}

 	skb_reset_network_header(skb);
diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 1fcbd83f7ff2..17421da139f2 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -390,9 +390,8 @@ bool xenvif_rx_queue_tail(struct xenvif_queue *queue, struct sk_buff *skb);

 void xenvif_carrier_on(struct xenvif *vif);

-/* Callback from stack when TX packet can be released */
-void xenvif_zerocopy_callback(struct sk_buff *skb, struct ubuf_info *ubuf,
-			      bool zerocopy_success);
+/* Callbacks from stack when TX packet can be released */
+extern const struct ubuf_info_ops xenvif_ubuf_ops;

 static inline pending_ring_idx_t nr_pending_reqs(struct xenvif_queue *queue)
 {
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 7cff90aa8d24..65db5f14465f 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -593,7 +593,7 @@ int xenvif_init_queue(struct xenvif_queue *queue)

 	for (i = 0; i < MAX_PENDING_REQS; i++) {
 		queue->pending_tx_info[i].callback_struct = (struct ubuf_info_msgzc)
-			{ { .callback = xenvif_zerocopy_callback },
+			{ { .ops = &xenvif_ubuf_ops },
 			  { { .ctx = NULL,
 			      .desc = i } } };
 		queue->grant_tx_handle[i] = NETBACK_INVALID_HANDLE;
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 48254fc07d64..5836995d6774 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1157,7 +1157,7 @@ static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *s
 	uarg = skb_shinfo(skb)->destructor_arg;
 	/* increase inflight counter to offset decrement in callback */
 	atomic_inc(&queue->inflight_packets);
-	uarg->callback(NULL, uarg, true);
+	uarg->ops->complete(NULL, uarg, true);
 	skb_shinfo(skb)->destructor_arg = NULL;

 	/* Fill the skb with the new (local) frags. */
@@ -1279,8 +1279,9 @@ static int xenvif_tx_submit(struct xenvif_queue *queue)
 	return work_done;
 }

-void xenvif_zerocopy_callback(struct sk_buff *skb, struct ubuf_info *ubuf_base,
-			      bool zerocopy_success)
+static void xenvif_zerocopy_callback(struct sk_buff *skb,
+				     struct ubuf_info *ubuf_base,
+				     bool zerocopy_success)
 {
 	unsigned long flags;
 	pending_ring_idx_t index;
@@ -1313,6 +1314,10 @@ void xenvif_zerocopy_callback(struct sk_buff *skb, struct ubuf_info *ubuf_base,
 	xenvif_skb_zerocopy_complete(queue);
 }

+const struct ubuf_info_ops xenvif_ubuf_ops = {
+	.complete = xenvif_zerocopy_callback,
+};
+
 static inline void xenvif_tx_dealloc_action(struct xenvif_queue *queue)
 {
 	struct gnttab_unmap_grant_ref *gop;
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index c64ded183f8d..f16279351db5 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -380,7 +380,7 @@ static void vhost_zerocopy_signal_used(struct vhost_net *net,
 	}
 }

-static void vhost_zerocopy_callback(struct sk_buff *skb,
+static void vhost_zerocopy_complete(struct sk_buff *skb,
 				    struct ubuf_info *ubuf_base, bool success)
 {
 	struct ubuf_info_msgzc *ubuf = uarg_to_msgzc(ubuf_base);
@@ -408,6 +408,10 @@ static void vhost_zerocopy_callback(struct sk_buff *skb,
 	rcu_read_unlock_bh();
 }

+static const struct ubuf_info_ops vhost_ubuf_ops = {
+	.complete = vhost_zerocopy_complete,
+};
+
 static inline unsigned long busy_clock(void)
 {
 	return local_clock() >> 10;
@@ -879,7 +883,7 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
 			vq->heads[nvq->upend_idx].len = VHOST_DMA_IN_PROGRESS;
 			ubuf->ctx = nvq->ubufs;
 			ubuf->desc = nvq->upend_idx;
-			ubuf->ubuf.callback = vhost_zerocopy_callback;
+			ubuf->ubuf.ops = &vhost_ubuf_ops;
 			ubuf->ubuf.flags = SKBFL_ZEROCOPY_FRAG;
 			refcount_set(&ubuf->ubuf.refcnt, 1);
 			msg.msg_control = &ctl;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4072a7ee3859..a44954264746 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -527,6 +527,11 @@ enum {
 #define SKBFL_ALL_ZEROCOPY	(SKBFL_ZEROCOPY_FRAG | SKBFL_PURE_ZEROCOPY | \
 				 SKBFL_DONT_ORPHAN | SKBFL_MANAGED_FRAG_REFS)

+struct ubuf_info_ops {
+	void (*complete)(struct sk_buff *, struct ubuf_info *,
+			 bool zerocopy_success);
+};
+
 /*
  * The callback notifies userspace to release buffers when skb DMA is done in
  * lower device, the skb last reference should be 0 when calling this.
@@ -536,8 +541,7 @@ enum {
  * The desc field is used to track userspace buffer index.
  */
 struct ubuf_info {
-	void (*callback)(struct sk_buff *, struct ubuf_info *,
-			 bool zerocopy_success);
+	const struct ubuf_info_ops *ops;
 	refcount_t refcnt;
 	u8 flags;
 };
@@ -1671,14 +1675,13 @@ static inline void skb_set_end_offset(struct sk_buff *skb, unsigned int offset)
 }
 #endif

+extern const struct ubuf_info_ops msg_zerocopy_ubuf_ops;
+
 struct ubuf_info *msg_zerocopy_realloc(struct sock *sk, size_t size,
 				       struct ubuf_info *uarg);

 void msg_zerocopy_put_abort(struct ubuf_info *uarg, bool have_uref);

-void msg_zerocopy_callback(struct sk_buff *skb, struct ubuf_info *uarg,
-			   bool success);
-
 int __zerocopy_sg_from_iter(struct msghdr *msg, struct sock *sk,
 			    struct sk_buff *skb, struct iov_iter *from,
 			    size_t length);
@@ -1766,13 +1769,13 @@ static inline void *skb_zcopy_get_nouarg(struct sk_buff *skb)
 static inline void net_zcopy_put(struct ubuf_info *uarg)
 {
 	if (uarg)
-		uarg->callback(NULL, uarg, true);
+		uarg->ops->complete(NULL, uarg, true);
 }

 static inline void net_zcopy_put_abort(struct ubuf_info *uarg, bool have_uref)
 {
 	if (uarg) {
-		if (uarg->callback == msg_zerocopy_callback)
+		if (uarg->ops == &msg_zerocopy_ubuf_ops)
 			msg_zerocopy_put_abort(uarg, have_uref);
 		else if (have_uref)
 			net_zcopy_put(uarg);
@@ -1786,7 +1789,7 @@ static inline void skb_zcopy_clear(struct sk_buff *skb, bool zerocopy_success)

 	if (uarg) {
 		if (!skb_zcopy_is_nouarg(skb))
-			uarg->callback(skb, uarg, zerocopy_success);
+			uarg->ops->complete(skb, uarg, zerocopy_success);

 		skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
 	}
diff --git a/io_uring/notif.c b/io_uring/notif.c
index 3485437b207d..53532d78a947 100644
--- a/io_uring/notif.c
+++ b/io_uring/notif.c
@@ -23,7 +23,7 @@ void io_notif_tw_complete(struct io_kiocb *notif, struct io_tw_state *ts)
 	io_req_task_complete(notif, ts);
 }

-static void io_tx_ubuf_callback(struct sk_buff *skb, struct ubuf_info *uarg,
+static void io_tx_ubuf_complete(struct sk_buff *skb, struct ubuf_info *uarg,
 				bool success)
 {
 	struct io_notif_data *nd = container_of(uarg, struct io_notif_data, uarg);
@@ -43,6 +43,10 @@ static void io_tx_ubuf_callback(struct sk_buff *skb, struct ubuf_info *uarg,
 	__io_req_task_work_add(notif, IOU_F_TWQ_LAZY_WAKE);
 }

+static const struct ubuf_info_ops io_ubuf_ops = {
+	.complete = io_tx_ubuf_complete,
+};
+
 struct io_kiocb *io_alloc_notif(struct io_ring_ctx *ctx)
 	__must_hold(&ctx->uring_lock)
 {
@@ -62,7 +66,7 @@ struct io_kiocb *io_alloc_notif(struct io_ring_ctx *ctx)
 	nd->zc_report = false;
 	nd->account_pages = 0;
 	nd->uarg.flags = IO_NOTIF_UBUF_FLAGS;
-	nd->uarg.callback = io_tx_ubuf_callback;
+	nd->uarg.ops = &io_ubuf_ops;
 	refcount_set(&nd->uarg.refcnt, 1);
 	return notif;
 }
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 37c858dc11a6..0f4cc759824b 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1652,7 +1652,7 @@ static struct ubuf_info *msg_zerocopy_alloc(struct sock *sk, size_t size)
 		return NULL;
 	}

-	uarg->ubuf.callback = msg_zerocopy_callback;
+	uarg->ubuf.ops = &msg_zerocopy_ubuf_ops;
 	uarg->id = ((u32)atomic_inc_return(&sk->sk_zckey)) - 1;
 	uarg->len = 1;
 	uarg->bytelen = size;
@@ -1678,7 +1678,7 @@ struct ubuf_info *msg_zerocopy_realloc(struct sock *sk, size_t size,
 		u32 bytelen, next;

 		/* there might be non MSG_ZEROCOPY users */
-		if (uarg->callback != msg_zerocopy_callback)
+		if (uarg->ops != &msg_zerocopy_ubuf_ops)
 			return NULL;

 		/* realloc only when socket is locked (TCP, UDP cork),
@@ -1789,8 +1789,8 @@ static void __msg_zerocopy_callback(struct ubuf_info_msgzc *uarg)
 	sock_put(sk);
 }

-void msg_zerocopy_callback(struct sk_buff *skb, struct ubuf_info *uarg,
-			   bool success)
+static void msg_zerocopy_complete(struct sk_buff *skb, struct ubuf_info *uarg,
+				  bool success)
 {
 	struct ubuf_info_msgzc *uarg_zc = uarg_to_msgzc(uarg);

@@ -1799,7 +1799,6 @@ void msg_zerocopy_callback(struct sk_buff *skb, struct ubuf_info *uarg,
 	if (refcount_dec_and_test(&uarg->refcnt))
 		__msg_zerocopy_callback(uarg_zc);
 }
-EXPORT_SYMBOL_GPL(msg_zerocopy_callback);

 void msg_zerocopy_put_abort(struct ubuf_info *uarg, bool have_uref)
 {
@@ -1809,10 +1808,15 @@ void msg_zerocopy_put_abort(struct ubuf_info *uarg, bool have_uref)
 	uarg_to_msgzc(uarg)->len--;

 	if (have_uref)
-		msg_zerocopy_callback(NULL, uarg, true);
+		msg_zerocopy_complete(NULL, uarg, true);
 }
 EXPORT_SYMBOL_GPL(msg_zerocopy_put_abort);

+const struct ubuf_info_ops msg_zerocopy_ubuf_ops = {
+	.complete = msg_zerocopy_complete,
+};
+EXPORT_SYMBOL_GPL(msg_zerocopy_ubuf_ops);
+
 int skb_zerocopy_iter_stream(struct sock *sk, struct sk_buff *skb,
 			     struct msghdr *msg, int len,
 			     struct ubuf_info *uarg)

From patchwork Fri Apr 19 11:08:40 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13636229
From: Pavel Begunkov
To: io-uring@vger.kernel.org, netdev@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com, "David S . Miller",
 Jakub Kicinski, David Ahern, Eric Dumazet, Willem de Bruijn,
 Jason Wang, Wei Liu, Paul Durrant, xen-devel@lists.xenproject.org,
 "Michael S . Tsirkin", virtualization@lists.linux.dev,
 kvm@vger.kernel.org
Subject: [PATCH io_uring-next/net-next v2 2/4] net: add callback for setting a ubuf_info to skb
Date: Fri, 19 Apr 2024 12:08:40 +0100
X-Mailer: git-send-email 2.44.0

At the moment an skb can only have one ubuf_info associated with it,
which might be a performance problem for zerocopy sends in cases like
TCP via io_uring. Add a callback for assigning ubuf_info to skb; this
way we can implement smarter assignment later, such as linking
ubuf_info together.

Note, it's an optional callback, which should be compatible with
skb_zcopy_set(). That's because the net stack might potentially decide
to clone an skb and take another reference to ubuf_info whenever it
wishes. Also, a correct implementation should always be able to bind to
an skb without prior ubuf_info, otherwise we could end up in a
situation where the send would not be able to progress.

Reviewed-by: Jens Axboe
Reviewed-by: David Ahern
Signed-off-by: Pavel Begunkov
Reviewed-by: Willem de Bruijn
---
 include/linux/skbuff.h |  2 ++
 net/core/skbuff.c      | 20 ++++++++++++++------
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index a44954264746..f76825e5b92a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -530,6 +530,8 @@ enum {
 struct ubuf_info_ops {
 	void (*complete)(struct sk_buff *, struct ubuf_info *,
 			 bool zerocopy_success);
+	/* has to be compatible with skb_zcopy_set() */
+	int (*link_skb)(struct sk_buff *skb, struct ubuf_info *uarg);
 };

 /*
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 0f4cc759824b..0c8b82750000 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1824,11 +1824,18 @@ int skb_zerocopy_iter_stream(struct sock *sk, struct sk_buff *skb,
 	struct ubuf_info *orig_uarg = skb_zcopy(skb);
 	int err, orig_len = skb->len;

-	/* An skb can only point to one uarg. This edge case happens when
-	 * TCP appends to an skb, but zerocopy_realloc triggered a new alloc.
-	 */
-	if (orig_uarg && uarg != orig_uarg)
-		return -EEXIST;
+	if (uarg->ops->link_skb) {
+		err = uarg->ops->link_skb(skb, uarg);
+		if (err)
+			return err;
+	} else {
+		/* An skb can only point to one uarg. This edge case happens
+		 * when TCP appends to an skb, but zerocopy_realloc triggered
+		 * a new alloc.
+		 */
+		if (orig_uarg && uarg != orig_uarg)
+			return -EEXIST;
+	}

 	err = __zerocopy_sg_from_iter(msg, sk, skb, &msg->msg_iter, len);
 	if (err == -EFAULT || (err == -EMSGSIZE && skb->len == orig_len)) {
@@ -1842,7 +1849,8 @@ int skb_zerocopy_iter_stream(struct sock *sk, struct sk_buff *skb,
 		return err;
 	}

-	skb_zcopy_set(skb, uarg, NULL);
+	if (!uarg->ops->link_skb)
+		skb_zcopy_set(skb, uarg, NULL);
 	return skb->len - orig_len;
 }
 EXPORT_SYMBOL_GPL(skb_zerocopy_iter_stream);

From patchwork Fri Apr 19 11:08:41 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13636227
From: Pavel Begunkov
To: io-uring@vger.kernel.org, netdev@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com, "David S . Miller",
 Jakub Kicinski, David Ahern, Eric Dumazet, Willem de Bruijn,
 Jason Wang, Wei Liu, Paul Durrant, xen-devel@lists.xenproject.org,
 "Michael S . Tsirkin", virtualization@lists.linux.dev,
 kvm@vger.kernel.org
Subject: [PATCH io_uring-next/net-next v2 3/4] io_uring/notif: simplify io_notif_flush()
Date: Fri, 19 Apr 2024 12:08:41 +0100
Message-ID: <19e41652c16718b946a5c80d2ad409df7682e47e.1713369317.git.asml.silence@gmail.com>
X-Mailer: git-send-email 2.44.0

io_notif_flush() is partially duplicating io_tx_ubuf_complete(), so
instead of duplicating it, make the flush call io_tx_ubuf_complete().
Reviewed-by: Jens Axboe Signed-off-by: Pavel Begunkov --- io_uring/notif.c | 6 +++--- io_uring/notif.h | 9 +++------ 2 files changed, 6 insertions(+), 9 deletions(-) diff --git a/io_uring/notif.c b/io_uring/notif.c index 53532d78a947..26680176335f 100644 --- a/io_uring/notif.c +++ b/io_uring/notif.c @@ -9,7 +9,7 @@ #include "notif.h" #include "rsrc.h" -void io_notif_tw_complete(struct io_kiocb *notif, struct io_tw_state *ts) +static void io_notif_tw_complete(struct io_kiocb *notif, struct io_tw_state *ts) { struct io_notif_data *nd = io_notif_to_data(notif); @@ -23,8 +23,8 @@ void io_notif_tw_complete(struct io_kiocb *notif, struct io_tw_state *ts) io_req_task_complete(notif, ts); } -static void io_tx_ubuf_complete(struct sk_buff *skb, struct ubuf_info *uarg, - bool success) +void io_tx_ubuf_complete(struct sk_buff *skb, struct ubuf_info *uarg, + bool success) { struct io_notif_data *nd = container_of(uarg, struct io_notif_data, uarg); struct io_kiocb *notif = cmd_to_io_kiocb(nd); diff --git a/io_uring/notif.h b/io_uring/notif.h index 2e25a2fc77d1..2cf9ff6abd7a 100644 --- a/io_uring/notif.h +++ b/io_uring/notif.h @@ -21,7 +21,8 @@ struct io_notif_data { }; struct io_kiocb *io_alloc_notif(struct io_ring_ctx *ctx); -void io_notif_tw_complete(struct io_kiocb *notif, struct io_tw_state *ts); +void io_tx_ubuf_complete(struct sk_buff *skb, struct ubuf_info *uarg, + bool success); static inline struct io_notif_data *io_notif_to_data(struct io_kiocb *notif) { @@ -33,11 +34,7 @@ static inline void io_notif_flush(struct io_kiocb *notif) { struct io_notif_data *nd = io_notif_to_data(notif); - /* drop slot's master ref */ - if (refcount_dec_and_test(&nd->uarg.refcnt)) { - notif->io_task_work.func = io_notif_tw_complete; - __io_req_task_work_add(notif, IOU_F_TWQ_LAZY_WAKE); - } + io_tx_ubuf_complete(NULL, &nd->uarg, true); } static inline int io_notif_account_mem(struct io_kiocb *notif, unsigned len) From patchwork Fri Apr 19 11:08:42 2024 Content-Type: text/plain; 
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Begunkov X-Patchwork-Id: 13636228 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2752BC04FF6 for ; Fri, 19 Apr 2024 11:08:58 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.709006.1108298 (Exim 4.92) (envelope-from ) id 1rxm6t-00028B-EJ; Fri, 19 Apr 2024 11:08:51 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 709006.1108298; Fri, 19 Apr 2024 11:08:51 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1rxm6t-00027v-BA; Fri, 19 Apr 2024 11:08:51 +0000 Received: by outflank-mailman (input) for mailman id 709006; Fri, 19 Apr 2024 11:08:50 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1rxm6r-0001FM-Vv for xen-devel@lists.xenproject.org; Fri, 19 Apr 2024 11:08:49 +0000 Received: from mail-ej1-x634.google.com (mail-ej1-x634.google.com [2a00:1450:4864:20::634]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 320ebd81-fe3d-11ee-94a3-07e782e9044d; Fri, 19 Apr 2024 13:08:48 +0200 (CEST) Received: by mail-ej1-x634.google.com with SMTP id a640c23a62f3a-a557044f2ddso194696766b.2 for ; Fri, 19 Apr 2024 04:08:48 -0700 (PDT) Received: from 127.0.0.1localhost ([163.114.131.193]) by smtp.gmail.com with ESMTPSA id z13-20020a17090655cd00b00a4739efd7cesm2082525ejp.60.2024.04.19.04.08.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 19 Apr 2024 04:08:46 
-0700 (PDT) X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 320ebd81-fe3d-11ee-94a3-07e782e9044d DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1713524928; x=1714129728; darn=lists.xenproject.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=2fP3T1HRg4APmOPCr1i20fy3OAFGb9fSrnxRLKsDwOU=; b=dl/zco0HRhXxCfHY4UGGo3p42hfzch0Upk/TLHosmE9B4Tozp4jo96slNEXjdZCs2x Mhzylds1tViEFQ6HTOWhAvxU+7ROVvJEhcgDC21vlgXXRL978up3dwwF2jMoRA1J6rzd S/iwW5kmpXjIpNrtWjoBPr7AsnQuTrKmVnchXttWOsHOYSQfcb+kiDzi8evO53rqKLu7 FiXHn5cRMKHIbMTP0puMdpbfOAtC+UUICi2G2roWuui/d047Z2ZIrfUkLJj8ApJRo24l H8X8l1aqwn3FaZEmRL6dkRhGcG+CGTISxJqc0GN2sq8r9mz8mnHIcW7KohgkaYkvLs4p bQjQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1713524928; x=1714129728; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=2fP3T1HRg4APmOPCr1i20fy3OAFGb9fSrnxRLKsDwOU=; b=HWVdrbLKOvUIPaVKeZ1FsjXTL3vf1sZzuW2P9mrIOnA5xcfTLVFaIJcHzTpBcve68e lEBUP6Byk251yXw3uVFt222iXpTfx52U5kqTynXbzjLkeSD9qRjJrq6DWf1bb1KQusI5 KzNLmpxy8ugam/7BKXbkoid6sSCj2bHg5ynn+x7F44fv2Mct4Sg/DAhB5d/ST8hZrt3n Unjv9hEbQEd+oijI8+kOJDJZD3VotbFfUnCjBjhWEdbR0mrAfF+vBEJEOfUhj1L44Kme cARzwH5sl+Hz9UKmXjRfhdDX/WCTyqL7tDPhQbPfCmJzbX7fmJI5udYzuNdYCartRzHy 0Mvw== X-Forwarded-Encrypted: i=1; AJvYcCVPhXVeXRanv3igThDTKFPYzJo3sPLhwawlQfHXcbJxZ5Na9yMteM3zKjCCm9RyEqTp83EGc/gBc1q5Yrz2Y2KRv3dLBqXrrI7mM+BsqWA= X-Gm-Message-State: AOJu0Yw74d24wc0h8jkSrbTny/r4hAMJVjnQhzLFMZPpuhv06IrHnMHi 8vb00TeTP2XVzYPAEU6tKjfONvFmaGGirEJRz5OiC0c46xMxXbtBKe9uLA== X-Google-Smtp-Source: 
AGHT+IFy2QsdrmLfOtT9EYc1USivrPyf0Bwi5GEdqAq0Fpn53sZdiNwWQaN42/Mf/SYmqQzPfik1yw==
X-Received: by 2002:a17:906:a206:b0:a52:2c00:9850 with SMTP id r6-20020a170906a20600b00a522c009850mr1380280ejy.59.1713524927726; Fri, 19 Apr 2024 04:08:47 -0700 (PDT)
From: Pavel Begunkov
To: io-uring@vger.kernel.org, netdev@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com, "David S . Miller",
 Jakub Kicinski, David Ahern, Eric Dumazet, Willem de Bruijn,
 Jason Wang, Wei Liu, Paul Durrant, xen-devel@lists.xenproject.org,
 "Michael S . Tsirkin", virtualization@lists.linux.dev, kvm@vger.kernel.org
Subject: [PATCH io_uring-next/net-next v2 4/4] io_uring/notif: implement notification stacking
Date: Fri, 19 Apr 2024 12:08:42 +0100
Message-ID:
X-Mailer: git-send-email 2.44.0
In-Reply-To:
References:
MIME-Version: 1.0

The network stack allows only one ubuf_info per skb, and unlike
MSG_ZEROCOPY, each io_uring zerocopy send will carry a separate
ubuf_info. That means that send requests can't reuse a previously
allocated skb and need to get one or more new ones. That's fine
for large sends, but otherwise it would spam the stack with lots of skbs
carrying just a little data each.

To help with that, implement linking notifications (i.e. io_uring wrappers
around ubuf_info) into a list. Each is refcounted by skbs and the stack
as usual. Additionally, all non-head entries keep a reference to the
head, which they put down when their refcount hits 0. When the head has
no more users, it'll efficiently put all notifications in a batch.

As mentioned previously about ->io_link_skb, the callback implementation
always allows binding to an skb without a ubuf_info.

Reviewed-by: Jens Axboe
Signed-off-by: Pavel Begunkov
---
 io_uring/notif.c | 71 +++++++++++++++++++++++++++++++++++++++++++-----
 io_uring/notif.h |  3 ++
 2 files changed, 67 insertions(+), 7 deletions(-)

diff --git a/io_uring/notif.c b/io_uring/notif.c
index 26680176335f..d58cdc01e691 100644
--- a/io_uring/notif.c
+++ b/io_uring/notif.c
@@ -9,18 +9,28 @@
 #include "notif.h"
 #include "rsrc.h"
 
+static const struct ubuf_info_ops io_ubuf_ops;
+
 static void io_notif_tw_complete(struct io_kiocb *notif, struct io_tw_state *ts)
 {
 	struct io_notif_data *nd = io_notif_to_data(notif);
 
-	if (unlikely(nd->zc_report) && (nd->zc_copied || !nd->zc_used))
-		notif->cqe.res |= IORING_NOTIF_USAGE_ZC_COPIED;
+	do {
+		notif = cmd_to_io_kiocb(nd);
 
-	if (nd->account_pages && notif->ctx->user) {
-		__io_unaccount_mem(notif->ctx->user, nd->account_pages);
-		nd->account_pages = 0;
-	}
-	io_req_task_complete(notif, ts);
+		lockdep_assert(refcount_read(&nd->uarg.refcnt) == 0);
+
+		if (unlikely(nd->zc_report) && (nd->zc_copied || !nd->zc_used))
+			notif->cqe.res |= IORING_NOTIF_USAGE_ZC_COPIED;
+
+		if (nd->account_pages && notif->ctx->user) {
+			__io_unaccount_mem(notif->ctx->user, nd->account_pages);
+			nd->account_pages = 0;
+		}
+
+		nd = nd->next;
+		io_req_task_complete(notif, ts);
+	} while (nd);
 }
 
 void io_tx_ubuf_complete(struct sk_buff *skb, struct ubuf_info *uarg,
@@ -39,12 +49,56 @@ void io_tx_ubuf_complete(struct sk_buff *skb, struct ubuf_info *uarg,
 	if (!refcount_dec_and_test(&uarg->refcnt))
 		return;
 
+	if (nd->head != nd) {
+		io_tx_ubuf_complete(skb, &nd->head->uarg, success);
+		return;
+	}
 	notif->io_task_work.func = io_notif_tw_complete;
 	__io_req_task_work_add(notif, IOU_F_TWQ_LAZY_WAKE);
 }
 
+static int io_link_skb(struct sk_buff *skb, struct ubuf_info *uarg)
+{
+	struct io_notif_data *nd, *prev_nd;
+	struct io_kiocb *prev_notif, *notif;
+	struct ubuf_info *prev_uarg = skb_zcopy(skb);
+
+	nd = container_of(uarg, struct io_notif_data, uarg);
+	notif = cmd_to_io_kiocb(nd);
+
+	if (!prev_uarg) {
+		net_zcopy_get(&nd->uarg);
+		skb_zcopy_init(skb, &nd->uarg);
+		return 0;
+	}
+	/* handle it separately as we can't link a notif to itself */
+	if (unlikely(prev_uarg == &nd->uarg))
+		return 0;
+	/* we can't join two links together, just request a fresh skb */
+	if (unlikely(nd->head != nd || nd->next))
+		return -EEXIST;
+	/* don't mix zc providers */
+	if (unlikely(prev_uarg->ops != &io_ubuf_ops))
+		return -EEXIST;
+
+	prev_nd = container_of(prev_uarg, struct io_notif_data, uarg);
+	prev_notif = cmd_to_io_kiocb(prev_nd);
+
+	/* make sure all notifications can be finished in the same task_work */
+	if (unlikely(notif->ctx != prev_notif->ctx ||
+		     notif->task != prev_notif->task))
+		return -EEXIST;
+
+	nd->head = prev_nd->head;
+	nd->next = prev_nd->next;
+	prev_nd->next = nd;
+	net_zcopy_get(&nd->head->uarg);
+	return 0;
+}
+
 static const struct ubuf_info_ops io_ubuf_ops = {
 	.complete = io_tx_ubuf_complete,
+	.link_skb = io_link_skb,
 };
 
 struct io_kiocb *io_alloc_notif(struct io_ring_ctx *ctx)
@@ -65,6 +119,9 @@ struct io_kiocb *io_alloc_notif(struct io_ring_ctx *ctx)
 	nd = io_notif_to_data(notif);
 	nd->zc_report = false;
 	nd->account_pages = 0;
+	nd->next = NULL;
+	nd->head = nd;
+
 	nd->uarg.flags = IO_NOTIF_UBUF_FLAGS;
 	nd->uarg.ops = &io_ubuf_ops;
 	refcount_set(&nd->uarg.refcnt, 1);
diff --git a/io_uring/notif.h b/io_uring/notif.h
index 2cf9ff6abd7a..f3589cfef4a9 100644
--- a/io_uring/notif.h
+++ b/io_uring/notif.h
@@ -14,6 +14,9 @@ struct io_notif_data {
 	struct file		*file;
 	struct ubuf_info	uarg;
 
+	struct io_notif_data	*next;
+	struct io_notif_data	*head;
+
 	unsigned		account_pages;
 	bool			zc_report;
 	bool			zc_used;
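
The refcounting scheme from the commit message can be tried out in isolation with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct notif, notif_init(), notif_link(), notif_put() and notif_complete_all() are hypothetical names made up for illustration, a plain int stands in for refcount_t, and the skb handling, task_work and ctx/task checks are left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for io_notif_data; not the kernel structure. */
struct notif {
	int refcnt;		/* plain int instead of refcount_t */
	struct notif *next;	/* next notification in the stack */
	struct notif *head;	/* head of the stack, self if unlinked */
	int completed;		/* set once the batch completion ran */
};

void notif_init(struct notif *n)
{
	n->refcnt = 1;		/* reference held by the owner */
	n->next = NULL;
	n->head = n;
	n->completed = 0;
}

/* Link n right after prev; n takes one reference on the shared head. */
int notif_link(struct notif *prev, struct notif *n)
{
	if (prev == n)			/* can't link a notif to itself */
		return 0;
	if (n->head != n || n->next)	/* n is already part of a stack */
		return -EEXIST;

	n->head = prev->head;
	n->next = prev->next;
	prev->next = n;
	n->head->refcnt++;
	return 0;
}

/* Complete every entry of the stack in one pass, starting at the head. */
void notif_complete_all(struct notif *head)
{
	struct notif *n = head;

	while (n) {
		struct notif *next = n->next;	/* read before completing */

		assert(n->refcnt == 0);
		n->completed = 1;
		n = next;
	}
}

/* Drop one reference; non-head entries forward the final put to the head. */
void notif_put(struct notif *n)
{
	if (--n->refcnt)
		return;
	if (n->head != n) {
		notif_put(n->head);
		return;
	}
	notif_complete_all(n);
}
```

In this model, linking b and c behind a leaves a with three references (its own plus one per linked entry). Nothing is completed until the last of a, b and c has been put, at which point the whole list is walked once from the head.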