From patchwork Mon Feb 19 21:25:25 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13563112
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 1/3] io_uring/net: unify how recvmsg and sendmsg copy in the msghdr
Date: Mon, 19 Feb 2024 14:25:25 -0700
Message-ID: <20240219212748.3826830-2-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240219212748.3826830-1-axboe@kernel.dk>
References: <20240219212748.3826830-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

For recvmsg, we roll our own since we support buffer selections. This
isn't the case for sendmsg right now, but in preparation for doing so,
make the recvmsg copy helpers generic so we can call them from the
sendmsg side as well.

Signed-off-by: Jens Axboe
---
 io_uring/net.c | 275 +++++++++++++++++++++++++------------------------
 1 file changed, 140 insertions(+), 135 deletions(-)

diff --git a/io_uring/net.c b/io_uring/net.c
index 161622029147..fcbaeb7cc045 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -204,16 +204,148 @@ static int io_setup_async_msg(struct io_kiocb *req,
 	return -EAGAIN;
 }
 
+static bool io_recvmsg_multishot_overflow(struct io_async_msghdr *iomsg)
+{
+	int hdr;
+
+	if (iomsg->namelen < 0)
+		return true;
+	if (check_add_overflow((int)sizeof(struct io_uring_recvmsg_out),
+			       iomsg->namelen, &hdr))
+		return true;
+	if (check_add_overflow(hdr, (int)iomsg->controllen, &hdr))
+		return true;
+
+	return false;
+}
+
+#ifdef CONFIG_COMPAT
+static int __io_compat_msg_copy_hdr(struct io_kiocb *req,
+				    struct io_async_msghdr *iomsg,
+				    struct sockaddr __user **addr, int ddir)
+{
+	struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
+	struct compat_msghdr msg;
+	struct compat_iovec __user *uiov;
+	int ret;
+
+	if (copy_from_user(&msg, sr->umsg_compat, sizeof(msg)))
+		return -EFAULT;
+
+	ret = __get_compat_msghdr(&iomsg->msg, &msg, addr);
+	if (ret)
+		return ret;
+
+	uiov = compat_ptr(msg.msg_iov);
+	if (req->flags & REQ_F_BUFFER_SELECT) {
+		compat_ssize_t clen;
+
+		iomsg->free_iov = NULL;
+		if (msg.msg_iovlen == 0) {
+			sr->len = 0;
+		} else if (msg.msg_iovlen > 1) {
+			return -EINVAL;
+		} else {
+			if (!access_ok(uiov, sizeof(*uiov)))
+				return -EFAULT;
+			if (__get_user(clen, &uiov->iov_len))
+				return -EFAULT;
+			if (clen < 0)
+				return -EINVAL;
+			sr->len = clen;
+		}
+
+		if (ddir == ITER_DEST && req->flags & REQ_F_APOLL_MULTISHOT) {
+			iomsg->namelen = msg.msg_namelen;
+			iomsg->controllen = msg.msg_controllen;
+			if (io_recvmsg_multishot_overflow(iomsg))
+				return -EOVERFLOW;
+		}
+	} else {
+		iomsg->free_iov = iomsg->fast_iov;
+		ret = __import_iovec(ddir, (struct iovec __user *)uiov,
+				     msg.msg_iovlen, UIO_FASTIOV,
+				     &iomsg->free_iov, &iomsg->msg.msg_iter,
+				     true);
+		if (ret < 0)
+			return ret;
+	}
+
+	return 0;
+}
+#endif
+
+static int __io_msg_copy_hdr(struct io_kiocb *req, struct io_async_msghdr *iomsg,
+			     struct sockaddr __user **addr, int ddir)
+{
+	struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
+	struct user_msghdr msg;
+	int ret;
+
+	if (copy_from_user(&msg, sr->umsg, sizeof(*sr->umsg)))
+		return -EFAULT;
+
+	ret = __copy_msghdr(&iomsg->msg, &msg, addr);
+	if (ret)
+		return ret;
+
+	if (req->flags & REQ_F_BUFFER_SELECT) {
+		if (msg.msg_iovlen == 0) {
+			sr->len = iomsg->fast_iov[0].iov_len = 0;
+			iomsg->fast_iov[0].iov_base = NULL;
+			iomsg->free_iov = NULL;
+		} else if (msg.msg_iovlen > 1) {
+			return -EINVAL;
+		} else {
+			if (copy_from_user(iomsg->fast_iov, msg.msg_iov,
+					   sizeof(*msg.msg_iov)))
+				return -EFAULT;
+			sr->len = iomsg->fast_iov[0].iov_len;
+			iomsg->free_iov = NULL;
+		}
+
+		if (ddir == ITER_DEST && req->flags & REQ_F_APOLL_MULTISHOT) {
+			iomsg->namelen = msg.msg_namelen;
+			iomsg->controllen = msg.msg_controllen;
+			if (io_recvmsg_multishot_overflow(iomsg))
+				return -EOVERFLOW;
+		}
+	} else {
+		iomsg->free_iov = iomsg->fast_iov;
+		ret = __import_iovec(ddir, msg.msg_iov, msg.msg_iovlen,
+				     UIO_FASTIOV, &iomsg->free_iov,
+				     &iomsg->msg.msg_iter, false);
+		if (ret > 0)
+			ret = 0;
+	}
+
+	return ret;
+}
+
+static int io_msg_copy_hdr(struct io_kiocb *req, struct io_async_msghdr *iomsg,
+			   struct sockaddr __user **addr, int ddir)
+{
+	iomsg->msg.msg_name = &iomsg->addr;
+	iomsg->msg.msg_iter.nr_segs = 0;
+
+#ifdef CONFIG_COMPAT
+	if (req->ctx->compat)
+		return __io_compat_msg_copy_hdr(req, iomsg, addr, ddir);
+#endif
+
+	return __io_msg_copy_hdr(req, iomsg, addr, ddir);
+}
+
 static int io_sendmsg_copy_hdr(struct io_kiocb *req,
 			       struct io_async_msghdr *iomsg)
 {
 	struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
 	int ret;
 
-	iomsg->msg.msg_name = &iomsg->addr;
-	iomsg->free_iov = iomsg->fast_iov;
-	ret = sendmsg_copy_msghdr(&iomsg->msg, sr->umsg, sr->msg_flags,
-					&iomsg->free_iov);
+	ret = io_msg_copy_hdr(req, iomsg, NULL, ITER_SOURCE);
+	if (ret)
+		return ret;
+
 	/* save msg_control as sys_sendmsg() overwrites it */
 	sr->msg_control = iomsg->msg.msg_control_user;
 	return ret;
@@ -435,142 +567,15 @@ int io_send(struct io_kiocb *req, unsigned int issue_flags)
 	return IOU_OK;
 }
 
-static bool io_recvmsg_multishot_overflow(struct io_async_msghdr *iomsg)
-{
-	int hdr;
-
-	if (iomsg->namelen < 0)
-		return true;
-	if (check_add_overflow((int)sizeof(struct io_uring_recvmsg_out),
-			       iomsg->namelen, &hdr))
-		return true;
-	if (check_add_overflow(hdr, (int)iomsg->controllen, &hdr))
-		return true;
-
-	return false;
-}
-
-static int __io_recvmsg_copy_hdr(struct io_kiocb *req,
-				 struct io_async_msghdr *iomsg)
-{
-	struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
-	struct user_msghdr msg;
-	int ret;
-
-	if (copy_from_user(&msg, sr->umsg, sizeof(*sr->umsg)))
-		return -EFAULT;
-
-	ret = __copy_msghdr(&iomsg->msg, &msg, &iomsg->uaddr);
-	if (ret)
-		return ret;
-
-	if (req->flags & REQ_F_BUFFER_SELECT) {
-		if (msg.msg_iovlen == 0) {
-			sr->len = iomsg->fast_iov[0].iov_len = 0;
-			iomsg->fast_iov[0].iov_base = NULL;
-			iomsg->free_iov = NULL;
-		} else if (msg.msg_iovlen > 1) {
-			return -EINVAL;
-		} else {
-			if (copy_from_user(iomsg->fast_iov, msg.msg_iov, sizeof(*msg.msg_iov)))
-				return -EFAULT;
-			sr->len = iomsg->fast_iov[0].iov_len;
-			iomsg->free_iov = NULL;
-		}
-
-		if (req->flags & REQ_F_APOLL_MULTISHOT) {
-			iomsg->namelen = msg.msg_namelen;
-			iomsg->controllen = msg.msg_controllen;
-			if (io_recvmsg_multishot_overflow(iomsg))
-				return -EOVERFLOW;
-		}
-	} else {
-		iomsg->free_iov = iomsg->fast_iov;
-		ret = __import_iovec(ITER_DEST, msg.msg_iov, msg.msg_iovlen, UIO_FASTIOV,
-				     &iomsg->free_iov, &iomsg->msg.msg_iter,
-				     false);
-		if (ret > 0)
-			ret = 0;
-	}
-
-	return ret;
-}
-
-#ifdef CONFIG_COMPAT
-static int __io_compat_recvmsg_copy_hdr(struct io_kiocb *req,
-					struct io_async_msghdr *iomsg)
-{
-	struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
-	struct compat_msghdr msg;
-	struct compat_iovec __user *uiov;
-	int ret;
-
-	if (copy_from_user(&msg, sr->umsg_compat, sizeof(msg)))
-		return -EFAULT;
-
-	ret = __get_compat_msghdr(&iomsg->msg, &msg, &iomsg->uaddr);
-	if (ret)
-		return ret;
-
-	uiov = compat_ptr(msg.msg_iov);
-	if (req->flags & REQ_F_BUFFER_SELECT) {
-		compat_ssize_t clen;
-
-		iomsg->free_iov = NULL;
-		if (msg.msg_iovlen == 0) {
-			sr->len = 0;
-		} else if (msg.msg_iovlen > 1) {
-			return -EINVAL;
-		} else {
-			if (!access_ok(uiov, sizeof(*uiov)))
-				return -EFAULT;
-			if (__get_user(clen, &uiov->iov_len))
-				return -EFAULT;
-			if (clen < 0)
-				return -EINVAL;
-			sr->len = clen;
-		}
-
-		if (req->flags & REQ_F_APOLL_MULTISHOT) {
-			iomsg->namelen = msg.msg_namelen;
-			iomsg->controllen = msg.msg_controllen;
-			if (io_recvmsg_multishot_overflow(iomsg))
-				return -EOVERFLOW;
-		}
-	} else {
-		iomsg->free_iov = iomsg->fast_iov;
-		ret = __import_iovec(ITER_DEST, (struct iovec __user *)uiov, msg.msg_iovlen,
-				     UIO_FASTIOV, &iomsg->free_iov,
-				     &iomsg->msg.msg_iter, true);
-		if (ret < 0)
-			return ret;
-	}
-
-	return 0;
-}
-#endif
-
-static int io_recvmsg_copy_hdr(struct io_kiocb *req,
-			       struct io_async_msghdr *iomsg)
-{
-	iomsg->msg.msg_name = &iomsg->addr;
-	iomsg->msg.msg_iter.nr_segs = 0;
-
-#ifdef CONFIG_COMPAT
-	if (req->ctx->compat)
-		return __io_compat_recvmsg_copy_hdr(req, iomsg);
-#endif
-
-	return __io_recvmsg_copy_hdr(req, iomsg);
-}
-
 int io_recvmsg_prep_async(struct io_kiocb *req)
 {
+	struct io_async_msghdr *iomsg;
 	int ret;
 
 	if (!io_msg_alloc_async_prep(req))
 		return -ENOMEM;
-	ret = io_recvmsg_copy_hdr(req, req->async_data);
+	iomsg = req->async_data;
+	ret = io_msg_copy_hdr(req, iomsg, &iomsg->uaddr, ITER_DEST);
 	if (!ret)
 		req->flags |= REQ_F_NEED_CLEANUP;
 	return ret;
@@ -793,7 +798,7 @@ int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags)
 	if (req_has_async_data(req)) {
 		kmsg = req->async_data;
 	} else {
-		ret = io_recvmsg_copy_hdr(req, &iomsg);
+		ret = io_msg_copy_hdr(req, &iomsg, &iomsg.uaddr, ITER_DEST);
 		if (ret)
 			return ret;
 		kmsg = &iomsg;

From patchwork Mon Feb 19 21:25:26 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13563113
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 2/3] io_uring/net: add provided buffer support for IORING_OP_SEND
Date: Mon, 19 Feb 2024 14:25:26 -0700
Message-ID: <20240219212748.3826830-3-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240219212748.3826830-1-axboe@kernel.dk>
References: <20240219212748.3826830-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

It's pretty trivial to wire up provided buffer support for the send
side, just like we do on the receive side. This enables setting up a
buffer ring that an application can use to push pending sends to, and
then have a send pick a buffer from that ring.

One of the challenges with async IO and networking sends is that you can
get into reordering conditions if you have more than one inflight at the
same time. Consider the following scenario where everything is fine:

1) App queues sendA for socket1
2) App queues sendB for socket1
3) App does io_uring_submit()
4) sendA is issued, completes successfully, posts CQE
5) sendB is issued, completes successfully, posts CQE

All is fine. Requests are always issued in-order, and both complete
inline as most sends do.

However, if we're flooding socket1 with sends, the following could
also result from the same sequence:

1) App queues sendA for socket1
2) App queues sendB for socket1
3) App does io_uring_submit()
4) sendA is issued, socket1 is full, poll is armed for retry
5) Space frees up in socket1, this triggers sendA retry via task_work
6) sendB is issued, completes successfully, posts CQE
7) sendA is retried, completes successfully, posts CQE

Now we've sent sendB before sendA, which can make things unhappy. If
both sendA and sendB had been using provided buffers, then it would look
as follows instead:

1) App queues dataA for sendA, queues sendA for socket1
2) App queues dataB for sendB, queues sendB for socket1
3) App does io_uring_submit()
4) sendA is issued, socket1 is full, poll is armed for retry
5) Space frees up in socket1, this triggers sendA retry via task_work
6) sendB is issued, picks first buffer (dataA), completes successfully,
   posts CQE (which says "I sent dataA")
7) sendA is retried, picks first buffer (dataB), completes successfully,
   posts CQE (which says "I sent dataB")

Now we've sent the data in order, and everybody is happy.

It's worth noting that this also opens the door for supporting multishot
sends, as provided buffers would be a prerequisite for that. Those can
trigger either when new buffers are added to the outgoing ring, or (if
stalled due to lack of space) when space frees up in the socket.

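The ordering argument above can be checked with a small userspace model.
This is only a sketch under stated assumptions: it does not use io_uring
or any kernel API, the names buf_ring, ring_push(), ring_pop(),
simulate() and sends_in_order() are hypothetical and exist purely for
illustration, and the "socket1 is full" stall is modeled by simply
letting sendB complete before sendA.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RING_SIZE 8

/* Hypothetical FIFO standing in for a provided buffer ring. */
struct buf_ring {
	const char *bufs[RING_SIZE];
	size_t head;	/* next buffer to hand out */
	size_t tail;	/* next free slot */
};

void ring_push(struct buf_ring *r, const char *data)
{
	assert(r->tail - r->head < RING_SIZE);	/* model only: no overflow */
	r->bufs[r->tail++ % RING_SIZE] = data;
}

/* Return the oldest queued buffer, or NULL if the ring is empty. */
const char *ring_pop(struct buf_ring *r)
{
	if (r->head == r->tail)
		return NULL;
	return r->bufs[r->head++ % RING_SIZE];
}

/*
 * Model two sends on one socket where sendA stalls and sendB completes
 * first. The data is written to out[] in the order it hits the wire.
 *
 * use_ring == 0: each send owns its data (sendA has dataA, sendB has dataB).
 * use_ring != 0: each send picks the oldest ring buffer when it completes.
 */
void simulate(int use_ring, const char *out[2])
{
	struct buf_ring ring = { .head = 0, .tail = 0 };
	/* Completion order: sendB (index 1) first, then sendA (index 0). */
	const int completion_order[2] = { 1, 0 };
	const char *owned[2] = { "dataA", "dataB" };
	int i;

	ring_push(&ring, "dataA");
	ring_push(&ring, "dataB");

	for (i = 0; i < 2; i++) {
		if (use_ring)
			out[i] = ring_pop(&ring);
		else
			out[i] = owned[completion_order[i]];
	}
}

/* Return 1 if the wire order is dataA then dataB, 0 otherwise. */
int sends_in_order(int use_ring)
{
	const char *out[2];

	simulate(use_ring, out);
	return strcmp(out[0], "dataA") == 0 && strcmp(out[1], "dataB") == 0;
}
```

In this model, sends_in_order(0) reports the reordered case (dataB on
the wire before dataA), while sends_in_order(1) reports in-order data
even though sendB still completes first, because the data is bound to a
send at completion time rather than at submission time.
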
Signed-off-by: Jens Axboe
---
 include/uapi/linux/io_uring.h |  1 +
 io_uring/io_uring.c           |  3 ++-
 io_uring/net.c                | 19 ++++++++++++++++---
 io_uring/opdef.c              |  1 +
 4 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 7bd10201a02b..74c3afac9c63 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -522,6 +522,7 @@ struct io_uring_params {
 #define IORING_FEAT_CQE_SKIP		(1U << 11)
 #define IORING_FEAT_LINKED_FILE		(1U << 12)
 #define IORING_FEAT_REG_REG_RING	(1U << 13)
+#define IORING_FEAT_SEND_BUFS		(1U << 14)
 
 /*
  * io_uring_register(2) opcodes and arguments
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index cf2f514b7cc0..f6332fc56bed 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3962,7 +3962,8 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
 			IORING_FEAT_POLL_32BITS | IORING_FEAT_SQPOLL_NONFIXED |
 			IORING_FEAT_EXT_ARG | IORING_FEAT_NATIVE_WORKERS |
 			IORING_FEAT_RSRC_TAGS | IORING_FEAT_CQE_SKIP |
-			IORING_FEAT_LINKED_FILE | IORING_FEAT_REG_REG_RING;
+			IORING_FEAT_LINKED_FILE | IORING_FEAT_REG_REG_RING |
+			IORING_FEAT_SEND_BUFS;
 
 	if (copy_to_user(params, p, sizeof(*p))) {
 		ret = -EFAULT;
diff --git a/io_uring/net.c b/io_uring/net.c
index fcbaeb7cc045..10b6d8caf4da 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -489,7 +489,8 @@ int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags)
 		ret += sr->done_io;
 	else if (sr->done_io)
 		ret = sr->done_io;
-	io_req_set_res(req, ret, 0);
+	cflags = io_put_kbuf(req, issue_flags);
+	io_req_set_res(req, ret, cflags);
 	return IOU_OK;
 }
 
@@ -497,8 +498,10 @@ int io_send(struct io_kiocb *req, unsigned int issue_flags)
 {
 	struct sockaddr_storage __address;
 	struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
-	struct msghdr msg;
+	size_t len = sr->len;
+	unsigned int cflags;
 	struct socket *sock;
+	struct msghdr msg;
 	unsigned flags;
 	int min_ret = 0;
 	int ret;
@@ -531,7 +534,17 @@ int io_send(struct io_kiocb *req, unsigned int issue_flags)
 	if (unlikely(!sock))
 		return -ENOTSOCK;
 
-	ret = import_ubuf(ITER_SOURCE, sr->buf, sr->len, &msg.msg_iter);
+	if (io_do_buffer_select(req)) {
+		void __user *buf;
+
+		buf = io_buffer_select(req, &len, issue_flags);
+		if (!buf)
+			return -ENOBUFS;
+		sr->buf = buf;
+		sr->len = len;
+	}
+
+	ret = import_ubuf(ITER_SOURCE, sr->buf, len, &msg.msg_iter);
 	if (unlikely(ret))
 		return ret;
 
diff --git a/io_uring/opdef.c b/io_uring/opdef.c
index 9c080aadc5a6..88fbe5cfd379 100644
--- a/io_uring/opdef.c
+++ b/io_uring/opdef.c
@@ -273,6 +273,7 @@ const struct io_issue_def io_issue_defs[] = {
 		.audit_skip		= 1,
 		.ioprio			= 1,
 		.manual_alloc		= 1,
+		.buffer_select		= 1,
 #if defined(CONFIG_NET)
 		.prep			= io_sendmsg_prep,
 		.issue			= io_send,

From patchwork Mon Feb 19 21:25:27 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13563114
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 3/3] io_uring/net: add provided buffer support for IORING_OP_SENDMSG
Date: Mon, 19 Feb 2024 14:25:27 -0700
Message-ID: <20240219212748.3826830-4-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240219212748.3826830-1-axboe@kernel.dk>
References: <20240219212748.3826830-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

Add provided buffer support for sendmsg as well. See the previous
commit, which added it to IORING_OP_SEND, for a longer explanation of
why this makes sense.

Signed-off-by: Jens Axboe
---
 io_uring/net.c   | 15 ++++++++++++++-
 io_uring/opdef.c |  1 +
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/io_uring/net.c b/io_uring/net.c
index 10b6d8caf4da..30afb394efd7 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -436,6 +436,7 @@ int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags)
 	struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
 	struct io_async_msghdr iomsg, *kmsg;
 	struct socket *sock;
+	unsigned int cflags;
 	unsigned flags;
 	int min_ret = 0;
 	int ret;
@@ -458,6 +459,17 @@ int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags)
 	    (sr->flags & IORING_RECVSEND_POLL_FIRST))
 		return io_setup_async_msg(req, kmsg, issue_flags);
 
+	if (io_do_buffer_select(req)) {
+		void __user *buf;
+		size_t len = sr->len;
+
+		buf = io_buffer_select(req, &len, issue_flags);
+		if (!buf)
+			return -ENOBUFS;
+
+		iov_iter_ubuf(&kmsg->msg.msg_iter, ITER_SOURCE, buf, len);
+	}
+
 	flags = sr->msg_flags;
 	if (issue_flags & IO_URING_F_NONBLOCK)
 		flags |= MSG_DONTWAIT;
@@ -576,7 +588,8 @@ int io_send(struct io_kiocb *req, unsigned int issue_flags)
 		ret += sr->done_io;
 	else if (sr->done_io)
 		ret = sr->done_io;
-	io_req_set_res(req, ret, 0);
+	cflags = io_put_kbuf(req, issue_flags);
+	io_req_set_res(req, ret, cflags);
 	return IOU_OK;
 }
 
diff --git a/io_uring/opdef.c b/io_uring/opdef.c
index 88fbe5cfd379..1f6b09e61ef8 100644
--- a/io_uring/opdef.c
+++ b/io_uring/opdef.c
@@ -139,6 +139,7 @@ const struct io_issue_def io_issue_defs[] = {
 		.pollout		= 1,
 		.ioprio			= 1,
 		.manual_alloc		= 1,
+		.buffer_select		= 1,
 #if defined(CONFIG_NET)
 		.prep			= io_sendmsg_prep,
 		.issue			= io_sendmsg,