From patchwork Tue Aug 15 17:31:30 2023
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH 01/16] io_uring: improve cqe !tracing hot path
Date: Tue, 15 Aug 2023 18:31:30 +0100
Message-ID: <130dd5980d00ad88912362a33bfddb09cf53bb3c.1692119257.git.asml.silence@gmail.com>

While looking at io_fill_cqe_req()'s asm I stumbled upon our trace
points turning into the chunk below:

	trace_io_uring_complete(req->ctx, req, req->cqe.user_data,
				req->cqe.res, req->cqe.flags,
				req->extra1, req->extra2);

io_uring/io_uring.c:898:   trace_io_uring_complete(req->ctx, req, req->cqe.user_data,
	movq	232(%rbx), %rdi	# req_44(D)->big_cqe.extra2, _5
	movq	224(%rbx), %rdx	# req_44(D)->big_cqe.extra1, _6
	movl	84(%rbx), %r9d	# req_44(D)->cqe.D.81184.flags, _7
	movl	80(%rbx), %r8d	# req_44(D)->cqe.res, _8
	movq	72(%rbx), %rcx	# req_44(D)->cqe.user_data, _9
	movq	88(%rbx), %rsi	# req_44(D)->ctx, _10
./arch/x86/include/asm/jump_label.h:27:   asm_volatile_goto("1:"
	1:jmp .L1772	# objtool NOPs this
	...

It does a jump_label for the actual tracing, but those six moves stay
in the hottest io_uring path regardless. As an optimisation, add a
trace_io_uring_complete_enabled() check, which also uses jump labels
and tricks the compiler into behaving. It removes the junk without
changing anything else in the hot path.

Note: apparently I'm not the only one to notice this, and people are
already working around it. We should remove the check once the problem
is solved generically, or rework the tracing.

Signed-off-by: Pavel Begunkov
---
 io_uring/io_uring.h | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 3e6ff3cd9a24..465598223386 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -145,10 +145,11 @@ static inline bool io_fill_cqe_req(struct io_ring_ctx *ctx, struct io_kiocb *req
 	if (unlikely(!cqe))
 		return false;
 
-	trace_io_uring_complete(req->ctx, req, req->cqe.user_data,
-				req->cqe.res, req->cqe.flags,
-				(req->flags & REQ_F_CQE32_INIT) ? req->extra1 : 0,
-				(req->flags & REQ_F_CQE32_INIT) ? req->extra2 : 0);
+	if (trace_io_uring_complete_enabled())
+		trace_io_uring_complete(req->ctx, req, req->cqe.user_data,
+					req->cqe.res, req->cqe.flags,
+					(req->flags & REQ_F_CQE32_INIT) ? req->extra1 : 0,
+					(req->flags & REQ_F_CQE32_INIT) ?
req->extra2 : 0);
 
 	memcpy(cqe, &req->cqe, sizeof(*cqe));
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=kqrlD+pi9UzvW3ZlyidGj+8pb9+oKgMml6JbnDHNb4M=; b=Kl6ZujPsfagpBNTwMn5wn4Rqhv+EPrzwSH8hjLMVArNs2uksGDKZaBeTaa4vYm+A2K 9rjb08q2tT4qrP6QzG44kCGi1Ij040bYaIO7SwOdlKPYFfd1o+OIIn/WDX5wo7N8+oGw cPcSt2Bg4JH9pp4OxSm3e1K/NUdSfDIh4c9gQe8/8WBxRw8abQjbL4r5Yas6yyi8Ues1 SGBl23MF5RYpLurU8DGl0SxcPYVez+eRcw0PEGzpRb7cCCcC14HGgd+/+nt3UlqtuPXO Cz4HQvb923uCIGZjzZtTzMp3b6SB2aVrxd2Airl1oppxwht+Q9+GzJmpOKOBkveQYr04 8hLg== X-Gm-Message-State: AOJu0Yw+AW6EkN6KCsGFhujJrJ+3TQI2VK6IgEaZmK9JTA6SYrr+iq5t 2FYXO798e1U1gH9FwpUj2dn1xVZefLU= X-Google-Smtp-Source: AGHT+IHrU3/yqt1LOrErz0gfYYdyHnHZrYcW0lij1h2t3/GgTm0tQVaLnjWNjzKxKEHs+wV9J/HuFw== X-Received: by 2002:a17:906:5358:b0:993:e809:b9ff with SMTP id j24-20020a170906535800b00993e809b9ffmr10736047ejo.21.1692120796727; Tue, 15 Aug 2023 10:33:16 -0700 (PDT) Received: from 127.com ([2620:10d:c092:600::2:6d35]) by smtp.gmail.com with ESMTPSA id kk9-20020a170907766900b0099cc36c4681sm7269878ejc.157.2023.08.15.10.33.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 15 Aug 2023 10:33:15 -0700 (PDT) From: Pavel Begunkov To: io-uring@vger.kernel.org Cc: Jens Axboe , asml.silence@gmail.com Subject: [PATCH 02/16] io_uring: cqe init hardening Date: Tue, 15 Aug 2023 18:31:31 +0100 Message-ID: <731ecc625e6e67900ebe8c821b3d3647850e0bea.1692119257.git.asml.silence@gmail.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org io_kiocb::cqe stores the completion info which we'll memcpy to userspace, and we rely on callbacks and other later steps to populate it with right values. We have never had problems with that, but it would still be safer to zero it on allocation. 
Signed-off-by: Pavel Begunkov
---
 io_uring/io_uring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index e189158ebbdd..4d27655be3a6 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1056,7 +1056,7 @@ static void io_preinit_req(struct io_kiocb *req, struct io_ring_ctx *ctx)
 	req->link = NULL;
 	req->async_data = NULL;
 	/* not necessary, but safer to zero */
-	req->cqe.res = 0;
+	memset(&req->cqe, 0, sizeof(req->cqe));
 }
 
 static void io_flush_cached_locked_reqs(struct io_ring_ctx *ctx,
From patchwork Tue Aug 15 17:31:32 2023
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH 03/16] io_uring: simplify big_cqe handling
Date: Tue, 15 Aug 2023 18:31:32 +0100
Message-ID:
<5dcfd5797c3788d0228ac0d6bc3c154a4e382ee9.1692119257.git.asml.silence@gmail.com>

Don't keep the big_cqe bits of req in a union with hash_node; find a
separate space for them. It's a bit safer, and if we keep big_cqe
always initialised we can also get rid of the ugly REQ_F_CQE32_INIT
handling.

Signed-off-by: Pavel Begunkov
---
 include/linux/io_uring_types.h | 16 ++++++----------
 io_uring/io_uring.c            |  8 +++-----
 io_uring/io_uring.h            | 15 +++------------
 io_uring/uring_cmd.c           |  5 ++---
 4 files changed, 14 insertions(+), 30 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index f04ce513fadb..9795eda529f7 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -409,7 +409,6 @@ enum {
 	REQ_F_SINGLE_POLL_BIT,
 	REQ_F_DOUBLE_POLL_BIT,
 	REQ_F_PARTIAL_IO_BIT,
-	REQ_F_CQE32_INIT_BIT,
 	REQ_F_APOLL_MULTISHOT_BIT,
 	REQ_F_CLEAR_POLLIN_BIT,
 	REQ_F_HASH_LOCKED_BIT,
@@ -479,8 +478,6 @@ enum {
 	REQ_F_PARTIAL_IO	= BIT(REQ_F_PARTIAL_IO_BIT),
 	/* fast poll multishot mode */
 	REQ_F_APOLL_MULTISHOT	= BIT(REQ_F_APOLL_MULTISHOT_BIT),
-	/* ->extra1 and ->extra2 are initialised */
-	REQ_F_CQE32_INIT	= BIT(REQ_F_CQE32_INIT_BIT),
 	/* recvmsg special flag, clear EPOLLIN */
 	REQ_F_CLEAR_POLLIN	= BIT(REQ_F_CLEAR_POLLIN_BIT),
 	/* hashed into ->cancel_hash_locked, protected by ->uring_lock */
@@ -579,13 +576,7 @@ struct io_kiocb {
 	struct io_task_work io_task_work;
 	unsigned nr_tw;
 	/* for polled requests, i.e.
IORING_OP_POLL_ADD and async armed poll */
-	union {
-		struct hlist_node hash_node;
-		struct {
-			u64 extra1;
-			u64 extra2;
-		};
-	};
+	struct hlist_node hash_node;
 	/* internal polling, see IORING_FEAT_FAST_POLL */
 	struct async_poll *apoll;
 	/* opcode allocated if it needs to store data for async defer */
@@ -595,6 +586,11 @@ struct io_kiocb {
 	/* custom credentials, valid IFF REQ_F_CREDS is set */
 	const struct cred *creds;
 	struct io_wq_work work;
+
+	struct {
+		u64 extra1;
+		u64 extra2;
+	} big_cqe;
 };
 
 struct io_overflow_cqe {

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 4d27655be3a6..20b46e64cc07 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -807,13 +807,10 @@ static bool io_cqring_event_overflow(struct io_ring_ctx *ctx, u64 user_data,
 
 void io_req_cqe_overflow(struct io_kiocb *req)
 {
-	if (!(req->flags & REQ_F_CQE32_INIT)) {
-		req->extra1 = 0;
-		req->extra2 = 0;
-	}
 	io_cqring_event_overflow(req->ctx, req->cqe.user_data,
 				req->cqe.res, req->cqe.flags,
-				req->extra1, req->extra2);
+				req->big_cqe.extra1, req->big_cqe.extra2);
+	memset(&req->big_cqe, 0, sizeof(req->big_cqe));
 }
 
 /*
@@ -1057,6 +1054,7 @@ static void io_preinit_req(struct io_kiocb *req, struct io_ring_ctx *ctx)
 	req->async_data = NULL;
 	/* not necessary, but safer to zero */
 	memset(&req->cqe, 0, sizeof(req->cqe));
+	memset(&req->big_cqe, 0, sizeof(req->big_cqe));
 }
 
 static void io_flush_cached_locked_reqs(struct io_ring_ctx *ctx,

diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 465598223386..9b5dfb6ef484 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -148,21 +148,12 @@ static inline bool io_fill_cqe_req(struct io_ring_ctx *ctx, struct io_kiocb *req
 	if (trace_io_uring_complete_enabled())
 		trace_io_uring_complete(req->ctx, req, req->cqe.user_data,
 					req->cqe.res, req->cqe.flags,
-					(req->flags & REQ_F_CQE32_INIT) ? req->extra1 : 0,
-					(req->flags & REQ_F_CQE32_INIT) ?
req->extra2 : 0);
+					req->big_cqe.extra1, req->big_cqe.extra2);
 
 	memcpy(cqe, &req->cqe, sizeof(*cqe));
-
 	if (ctx->flags & IORING_SETUP_CQE32) {
-		u64 extra1 = 0, extra2 = 0;
-
-		if (req->flags & REQ_F_CQE32_INIT) {
-			extra1 = req->extra1;
-			extra2 = req->extra2;
-		}
-
-		WRITE_ONCE(cqe->big_cqe[0], extra1);
-		WRITE_ONCE(cqe->big_cqe[1], extra2);
+		memcpy(cqe->big_cqe, &req->big_cqe, sizeof(*cqe));
+		memset(&req->big_cqe, 0, sizeof(req->big_cqe));
 	}
 	return true;
 }

diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 8e7a03c1b20e..537795fddc87 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -43,9 +43,8 @@ EXPORT_SYMBOL_GPL(io_uring_cmd_do_in_task_lazy);
 static inline void io_req_set_cqe32_extra(struct io_kiocb *req,
 					  u64 extra1, u64 extra2)
 {
-	req->extra1 = extra1;
-	req->extra2 = extra2;
-	req->flags |= REQ_F_CQE32_INIT;
+	req->big_cqe.extra1 = extra1;
+	req->big_cqe.extra2 = extra2;
 }
 
 /*
From patchwork Tue Aug 15 17:31:33 2023
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH 04/16] io_uring: refactor __io_get_cqe()
Date: Tue, 15 Aug 2023 18:31:33 +0100
Message-ID: <988404c49503827bdf705a14fed5d9c3e95383af.1692119257.git.asml.silence@gmail.com>

Make __io_get_cqe() simpler by not grabbing the cqe from the refilled
cache, but letting io_get_cqe() do it for us. That's cleaner and
removes some duplication.

Signed-off-by: Pavel Begunkov
---
 io_uring/io_uring.c | 13 ++++---------
 io_uring/io_uring.h | 23 ++++++++++++-----------
 2 files changed, 16 insertions(+), 20 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 20b46e64cc07..623d41755714 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -818,7 +818,7 @@ void io_req_cqe_overflow(struct io_kiocb *req)
  * control dependency is enough as we're using WRITE_ONCE to
  * fill the cq entry
  */
-struct io_uring_cqe *__io_get_cqe(struct io_ring_ctx *ctx, bool overflow)
+bool io_cqe_cache_refill(struct io_ring_ctx *ctx, bool overflow)
 {
 	struct io_rings *rings = ctx->rings;
 	unsigned int off = ctx->cached_cq_tail & (ctx->cq_entries - 1);
@@ -830,7 +830,7 @@ struct io_uring_cqe *__io_get_cqe(struct io_ring_ctx *ctx, bool overflow)
 	 * Force overflow the completion.
 	 */
 	if (!overflow && (ctx->check_cq & BIT(IO_CHECK_CQ_OVERFLOW_BIT)))
-		return NULL;
+		return false;
 
 	/* userspace may cheat modifying the tail, be safe and do min */
 	queued = min(__io_cqring_events(ctx), ctx->cq_entries);
@@ -838,7 +838,7 @@
 	/* we need a contiguous range, limit based on the current array offset */
 	len = min(free, ctx->cq_entries - off);
 	if (!len)
-		return NULL;
+		return false;
 
 	if (ctx->flags & IORING_SETUP_CQE32) {
 		off <<= 1;
@@ -847,12 +847,7 @@
 
 	ctx->cqe_cached = &rings->cqes[off];
 	ctx->cqe_sentinel = ctx->cqe_cached + len;
-
-	ctx->cached_cq_tail++;
-	ctx->cqe_cached++;
-	if (ctx->flags & IORING_SETUP_CQE32)
-		ctx->cqe_cached++;
-	return &rings->cqes[off];
+	return true;
 }
 
 static bool io_fill_cqe_aux(struct io_ring_ctx *ctx, u64 user_data, s32 res,

diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 9b5dfb6ef484..9c80d20fe18f 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -38,7 +38,7 @@ enum {
 	IOU_STOP_MULTISHOT = -ECANCELED,
 };
 
-struct io_uring_cqe *__io_get_cqe(struct io_ring_ctx *ctx, bool overflow);
+bool io_cqe_cache_refill(struct io_ring_ctx *ctx, bool overflow);
 void io_req_cqe_overflow(struct io_kiocb *req);
 int io_run_task_work_sig(struct io_ring_ctx *ctx);
 void io_req_defer_failed(struct io_kiocb *req, s32 res);
@@ -112,19 +112,20 @@ static inline void io_req_task_work_add(struct io_kiocb *req)
 static inline struct io_uring_cqe *io_get_cqe_overflow(struct io_ring_ctx *ctx,
						       bool overflow)
 {
-	io_lockdep_assert_cq_locked(ctx);
+	struct io_uring_cqe *cqe;
 
-	if (likely(ctx->cqe_cached < ctx->cqe_sentinel)) {
-		struct io_uring_cqe *cqe = ctx->cqe_cached;
+	io_lockdep_assert_cq_locked(ctx);
 
-		ctx->cached_cq_tail++;
-		ctx->cqe_cached++;
-		if (ctx->flags & IORING_SETUP_CQE32)
-			ctx->cqe_cached++;
-		return cqe;
+	if (unlikely(ctx->cqe_cached >= ctx->cqe_sentinel)) {
+		if (unlikely(!io_cqe_cache_refill(ctx, overflow)))
+			return NULL;
 	}
-
-	return __io_get_cqe(ctx, overflow);
+	cqe = ctx->cqe_cached;
+	ctx->cached_cq_tail++;
+	ctx->cqe_cached++;
+	if (ctx->flags & IORING_SETUP_CQE32)
+		ctx->cqe_cached++;
+	return cqe;
 }
 
 static inline struct io_uring_cqe *io_get_cqe(struct io_ring_ctx *ctx)
From patchwork Tue Aug 15 17:31:34 2023
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH 05/16] io_uring: optimise extra io_get_cqe null check
Date: Tue, 15 Aug 2023 18:31:34 +0100

If the cached cqe check passes in io_get_cqe*(), it already means that
the cqe we return is valid and non-NULL; however, the compiler is
unable to optimise away null checks like the one in
io_fill_cqe_req(). Do a bit of trickery: return a success/fail boolean
from io_get_cqe*() and store the cqe in an out parameter. That makes
the compiler do the right thing, erasing the null check together with
the introduced indirection.

Signed-off-by: Pavel Begunkov
---
 io_uring/io_uring.c |  7 +++----
 io_uring/io_uring.h | 20 +++++++++-----------
 2 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 623d41755714..e5378dc7aa19 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -683,10 +683,10 @@ static void __io_cqring_overflow_flush(struct io_ring_ctx *ctx)
 
 	io_cq_lock(ctx);
 	while (!list_empty(&ctx->cq_overflow_list)) {
-		struct io_uring_cqe *cqe = io_get_cqe_overflow(ctx, true);
+		struct io_uring_cqe *cqe;
 		struct io_overflow_cqe *ocqe;
 
-		if (!cqe)
+		if (!io_get_cqe_overflow(ctx, &cqe, true))
 			break;
 		ocqe = list_first_entry(&ctx->cq_overflow_list,
 					struct io_overflow_cqe, list);
@@ -862,8 +862,7 @@ static bool io_fill_cqe_aux(struct io_ring_ctx *ctx, u64 user_data, s32 res,
 	 * submission (by quite a lot). Increment the overflow count in
 	 * the ring.
 	 */
-	cqe = io_get_cqe(ctx);
-	if (likely(cqe)) {
+	if (likely(io_get_cqe(ctx, &cqe))) {
 		trace_io_uring_complete(ctx, NULL, user_data, res, cflags, 0, 0);
 
 		WRITE_ONCE(cqe->user_data, user_data);

diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 9c80d20fe18f..2960e35b32a5 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -109,28 +109,27 @@ static inline void io_req_task_work_add(struct io_kiocb *req)
 #define io_for_each_link(pos, head) \
 	for (pos = (head); pos; pos = pos->link)
 
-static inline struct io_uring_cqe *io_get_cqe_overflow(struct io_ring_ctx *ctx,
-						       bool overflow)
+static inline bool io_get_cqe_overflow(struct io_ring_ctx *ctx,
+				       struct io_uring_cqe **ret,
+				       bool overflow)
 {
-	struct io_uring_cqe *cqe;
-
 	io_lockdep_assert_cq_locked(ctx);
 
 	if (unlikely(ctx->cqe_cached >= ctx->cqe_sentinel)) {
 		if (unlikely(!io_cqe_cache_refill(ctx, overflow)))
-			return NULL;
+			return false;
 	}
-	cqe = ctx->cqe_cached;
+	*ret = ctx->cqe_cached;
 	ctx->cached_cq_tail++;
 	ctx->cqe_cached++;
 	if (ctx->flags & IORING_SETUP_CQE32)
 		ctx->cqe_cached++;
-	return cqe;
+	return true;
 }
 
-static inline struct io_uring_cqe *io_get_cqe(struct io_ring_ctx *ctx)
+static inline bool io_get_cqe(struct io_ring_ctx *ctx, struct io_uring_cqe **ret)
 {
-	return io_get_cqe_overflow(ctx, false);
+	return io_get_cqe_overflow(ctx, ret, false);
 }
 
 static inline bool io_fill_cqe_req(struct io_ring_ctx *ctx, struct io_kiocb *req)
@@ -142,8 +141,7 @@ static inline bool io_fill_cqe_req(struct io_ring_ctx *ctx, struct io_kiocb *req
 	 * submission (by quite a lot). Increment the overflow count in
 	 * the ring.
 	 */
-	cqe = io_get_cqe(ctx);
-	if (unlikely(!cqe))
+	if (unlikely(!io_get_cqe(ctx, &cqe)))
 		return false;
 
 	if (trace_io_uring_complete_enabled())
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH 06/16] io_uring: reorder cqring_flush and wakeups
Date: Tue, 15 Aug 2023 18:31:35 +0100

Unlike in the past, io_commit_cqring_flush() doesn't do anything that may need io_cqring_wake() to be issued afterwards; all requests it completes will go via task_work. Do io_commit_cqring_flush() after io_cqring_wake() to clean up __io_cq_unlock_post().
Signed-off-by: Pavel Begunkov
---
 io_uring/io_uring.c | 14 +++-----------
 io_uring/rw.c       |  2 +-
 2 files changed, 4 insertions(+), 12 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index e5378dc7aa19..8d27d2a2e893 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -629,19 +629,11 @@ static inline void io_cq_lock(struct io_ring_ctx *ctx)
 
 static inline void __io_cq_unlock_post(struct io_ring_ctx *ctx)
 {
 	io_commit_cqring(ctx);
-
-	if (ctx->task_complete) {
-		/*
-		 * ->task_complete implies that only current might be waiting
-		 * for CQEs, and obviously, we currently don't. No one is
-		 * waiting, wakeups are futile, skip them.
-		 */
-		io_commit_cqring_flush(ctx);
-	} else {
+	if (!ctx->task_complete) {
 		spin_unlock(&ctx->completion_lock);
-		io_commit_cqring_flush(ctx);
 		io_cqring_wake(ctx);
 	}
+	io_commit_cqring_flush(ctx);
 }
 
@@ -649,8 +641,8 @@ static void io_cq_unlock_post(struct io_ring_ctx *ctx)
 {
 	io_commit_cqring(ctx);
 	spin_unlock(&ctx->completion_lock);
-	io_commit_cqring_flush(ctx);
 	io_cqring_wake(ctx);
+	io_commit_cqring_flush(ctx);
 }
 
 /* Returns true if there are no backlogged entries after the flush */
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 9b51afdae505..20140d3505f1 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -985,9 +985,9 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags)
 
 static void io_cqring_ev_posted_iopoll(struct io_ring_ctx *ctx)
 {
-	io_commit_cqring_flush(ctx);
 	if (ctx->flags & IORING_SETUP_SQPOLL)
 		io_cqring_wake(ctx);
+	io_commit_cqring_flush(ctx);
 }
 
 void io_rw_fail(struct io_kiocb *req)

From patchwork Tue Aug 15 17:31:36 2023
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13354011
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH 07/16] io_uring: merge iopoll and normal completion paths
Date: Tue, 15 Aug 2023 18:31:36 +0100

io_do_iopoll() and io_submit_flush_completions() are pretty similar: both fill CQEs and then free a list of requests. Don't duplicate that; make iopoll use __io_submit_flush_completions(), which also helps with inlining and other optimisations.

For that, we need to first find all completed iopoll requests, splice them from the iopoll list, and then pass the result down. This adds one extra list traversal, which should be fine as the requests will stay hot in cache.

CQ locking is already conditional; introduce ->lockless_cq and skip locking for IOPOLL as it's protected by ->uring_lock. We also add a wakeup optimisation for IOPOLL to __io_cq_unlock_post(), so it works just like io_cqring_ev_posted_iopoll().
Signed-off-by: Pavel Begunkov
---
 include/linux/io_uring_types.h |  1 +
 io_uring/io_uring.c            | 18 ++++++++++++------
 io_uring/io_uring.h            |  2 +-
 io_uring/rw.c                  | 24 +++++------------------
 4 files changed, 19 insertions(+), 26 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 9795eda529f7..c0c03d8059df 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -205,6 +205,7 @@ struct io_ring_ctx {
 		unsigned int		has_evfd: 1;
 		/* all CQEs should be posted only by the submitter task */
 		unsigned int		task_complete: 1;
+		unsigned int		lockless_cq: 1;
 		unsigned int		syscall_iopoll: 1;
 		unsigned int		poll_activated: 1;
 		unsigned int		drain_disabled: 1;
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 8d27d2a2e893..204c6a31c5d1 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -147,7 +147,6 @@ static bool io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
 					 bool cancel_all);
 
 static void io_queue_sqe(struct io_kiocb *req);
-static void __io_submit_flush_completions(struct io_ring_ctx *ctx);
 
 struct kmem_cache *req_cachep;
 
@@ -616,7 +615,7 @@ void __io_commit_cqring_flush(struct io_ring_ctx *ctx)
 
 static inline void __io_cq_lock(struct io_ring_ctx *ctx)
 {
-	if (!ctx->task_complete)
+	if (!ctx->lockless_cq)
 		spin_lock(&ctx->completion_lock);
 }
 
@@ -630,8 +629,11 @@ static inline void __io_cq_unlock_post(struct io_ring_ctx *ctx)
 {
 	io_commit_cqring(ctx);
 	if (!ctx->task_complete) {
-		spin_unlock(&ctx->completion_lock);
-		io_cqring_wake(ctx);
+		if (!ctx->lockless_cq)
+			spin_unlock(&ctx->completion_lock);
+		/* IOPOLL rings only need to wake up if it's also SQPOLL */
+		if (!ctx->syscall_iopoll)
+			io_cqring_wake(ctx);
 	}
 	io_commit_cqring_flush(ctx);
 }
@@ -1485,7 +1487,8 @@ void io_queue_next(struct io_kiocb *req)
 		io_req_task_queue(nxt);
 }
 
-void io_free_batch_list(struct io_ring_ctx *ctx, struct io_wq_work_node *node)
+static void io_free_batch_list(struct io_ring_ctx *ctx,
+			       struct io_wq_work_node *node)
 	__must_hold(&ctx->uring_lock)
 {
 	do {
@@ -1522,7 +1525,7 @@ void io_free_batch_list(struct io_ring_ctx *ctx, struct io_wq_work_node *node)
 	} while (node);
 }
 
-static void __io_submit_flush_completions(struct io_ring_ctx *ctx)
+void __io_submit_flush_completions(struct io_ring_ctx *ctx)
 	__must_hold(&ctx->uring_lock)
 {
 	struct io_submit_state *state = &ctx->submit_state;
@@ -3836,6 +3839,9 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
 	    !(ctx->flags & IORING_SETUP_SQPOLL))
 		ctx->task_complete = true;
 
+	if (ctx->task_complete || (ctx->flags & IORING_SETUP_IOPOLL))
+		ctx->lockless_cq = true;
+
 	/*
 	 * lazy poll_wq activation relies on ->task_complete for synchronisation
 	 * purposes, see io_activate_pollwq()
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 2960e35b32a5..07fd185064d2 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -72,7 +72,7 @@ int io_ring_add_registered_file(struct io_uring_task *tctx, struct file *file,
 int io_poll_issue(struct io_kiocb *req, struct io_tw_state *ts);
 int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr);
 int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin);
-void io_free_batch_list(struct io_ring_ctx *ctx, struct io_wq_work_node *node);
+void __io_submit_flush_completions(struct io_ring_ctx *ctx);
 int io_req_prep_async(struct io_kiocb *req);
 struct io_wq_work *io_wq_free_work(struct io_wq_work *work);
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 20140d3505f1..0a1e515f0510 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -983,13 +983,6 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags)
 	return ret;
 }
 
-static void io_cqring_ev_posted_iopoll(struct io_ring_ctx *ctx)
-{
-	if (ctx->flags & IORING_SETUP_SQPOLL)
-		io_cqring_wake(ctx);
-	io_commit_cqring_flush(ctx);
-}
-
 void io_rw_fail(struct io_kiocb *req)
 {
 	int res;
@@ -1060,24 +1053,17 @@ int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
 		if (!smp_load_acquire(&req->iopoll_completed))
 			break;
 		nr_events++;
-		if (unlikely(req->flags & REQ_F_CQE_SKIP))
-			continue;
-
 		req->cqe.flags = io_put_kbuf(req, 0);
-		if (unlikely(!io_fill_cqe_req(ctx, req))) {
-			spin_lock(&ctx->completion_lock);
-			io_req_cqe_overflow(req);
-			spin_unlock(&ctx->completion_lock);
-		}
 	}
-
 	if (unlikely(!nr_events))
 		return 0;
-	io_commit_cqring(ctx);
-	io_cqring_ev_posted_iopoll(ctx);
 	pos = start ? start->next : ctx->iopoll_list.first;
 	wq_list_cut(&ctx->iopoll_list, prev, start);
-	io_free_batch_list(ctx, pos);
+
+	if (WARN_ON_ONCE(!wq_list_empty(&ctx->submit_state.compl_reqs)))
+		return 0;
+	ctx->submit_state.compl_reqs.first = pos;
+	__io_submit_flush_completions(ctx);
 	return nr_events;
 }

From patchwork Tue Aug 15 17:31:37 2023
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13354014
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH 08/16] io_uring: compact SQ/CQ heads/tails
Date: Tue, 15 Aug 2023 18:31:37 +0100
Message-ID: <5e3fade0f17f0357684536d77bc75e0028f2b62e.1692119257.git.asml.silence@gmail.com>

Queue heads and tails are cache line aligned. That makes sq and cq take 4 cache lines, or 5 if we include the rest of struct io_rings (e.g. sq_flags is frequently accessed). Since modern io_uring is mostly single threaded, it doesn't make much sense to spread them like that; it wastes space and puts additional pressure on caches. Put them all into a single line.

Signed-off-by: Pavel Begunkov
---
 include/linux/io_uring_types.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index c0c03d8059df..608a8e80e881 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -69,8 +69,8 @@ struct io_uring_task {
 };
 
 struct io_uring {
-	u32 head ____cacheline_aligned_in_smp;
-	u32 tail ____cacheline_aligned_in_smp;
+	u32 head;
+	u32 tail;
 };
 
 /*

From patchwork Tue Aug 15 17:31:38 2023
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13354007
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH 09/16] io_uring: add option to remove SQ indirection
Date: Tue, 15 Aug 2023 18:31:38 +0100
Message-ID: <0672f81d64ffe9f91835e240d8fef7a72bd895ec.1692119257.git.asml.silence@gmail.com>

Not many are aware, but the io_uring submission queue has two levels. The first level usually appears as sq_array and stores indexes into the actual SQ. To my knowledge, no one has ever seriously used it, nor does liburing expose it to users. Add IORING_SETUP_NO_SQARRAY; when set, we don't bother creating and using the sq_array, and the SQ heads/tails will point directly into the SQ. This improves the memory footprint, in terms of both allocations and cache usage, and should also make io_get_sqe() less branchy in the end.

Signed-off-by: Pavel Begunkov
---
 include/uapi/linux/io_uring.h |  5 ++++
 io_uring/io_uring.c           | 52 +++++++++++++++++++++--------------
 2 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 9fc7195f25df..f669b1ed33be 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -185,6 +185,11 @@ enum {
  */
 #define IORING_SETUP_REGISTERED_FD_ONLY	(1U << 15)
 
+/*
+ * Disables the SQ index array.
+ */
+#define IORING_SETUP_NO_SQARRAY		(1U << 16)
+
 enum io_uring_op {
 	IORING_OP_NOP,
 	IORING_OP_READV,
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 204c6a31c5d1..ac6d1687ba6c 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2339,8 +2339,21 @@ static void io_commit_sqring(struct io_ring_ctx *ctx)
  */
 static bool io_get_sqe(struct io_ring_ctx *ctx, const struct io_uring_sqe **sqe)
 {
-	unsigned head, mask = ctx->sq_entries - 1;
-	unsigned sq_idx = ctx->cached_sq_head++ & mask;
+	unsigned mask = ctx->sq_entries - 1;
+	unsigned head = ctx->cached_sq_head++ & mask;
+
+	if (!(ctx->flags & IORING_SETUP_NO_SQARRAY)) {
+		head = READ_ONCE(ctx->sq_array[head]);
+		if (unlikely(head >= ctx->sq_entries)) {
+			/* drop invalid entries */
+			spin_lock(&ctx->completion_lock);
+			ctx->cq_extra--;
+			spin_unlock(&ctx->completion_lock);
+			WRITE_ONCE(ctx->rings->sq_dropped,
+				   READ_ONCE(ctx->rings->sq_dropped) + 1);
+			return false;
+		}
+	}
 
 	/*
 	 * The cached sq head (or cq tail) serves two purposes:
@@ -2350,22 +2363,12 @@ static bool io_get_sqe(struct io_ring_ctx *ctx, const struct io_uring_sqe **sqe)
 	 * 2) allows the kernel side to track the head on its own, even
 	 * though the application is the one updating it.
 	 */
-	head = READ_ONCE(ctx->sq_array[sq_idx]);
-	if (likely(head < ctx->sq_entries)) {
-		/* double index for 128-byte SQEs, twice as long */
-		if (ctx->flags & IORING_SETUP_SQE128)
-			head <<= 1;
-		*sqe = &ctx->sq_sqes[head];
-		return true;
-	}
-
-	/* drop invalid entries */
-	spin_lock(&ctx->completion_lock);
-	ctx->cq_extra--;
-	spin_unlock(&ctx->completion_lock);
-	WRITE_ONCE(ctx->rings->sq_dropped,
-		   READ_ONCE(ctx->rings->sq_dropped) + 1);
-	return false;
+	/* double index for 128-byte SQEs, twice as long */
+	if (ctx->flags & IORING_SETUP_SQE128)
+		head <<= 1;
+	*sqe = &ctx->sq_sqes[head];
+	return true;
 }
 
 int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr)
@@ -2734,6 +2737,12 @@ static unsigned long rings_size(struct io_ring_ctx *ctx, unsigned int sq_entries
 		return SIZE_MAX;
 #endif
 
+	if (ctx->flags & IORING_SETUP_NO_SQARRAY) {
+		if (sq_offset)
+			*sq_offset = SIZE_MAX;
+		return off;
+	}
+
 	if (sq_offset)
 		*sq_offset = off;
 
@@ -3710,7 +3719,8 @@ static __cold int io_allocate_scq_urings(struct io_ring_ctx *ctx,
 		return PTR_ERR(rings);
 
 	ctx->rings = rings;
-	ctx->sq_array = (u32 *)((char *)rings + sq_array_offset);
+	if (!(ctx->flags & IORING_SETUP_NO_SQARRAY))
+		ctx->sq_array = (u32 *)((char *)rings + sq_array_offset);
 	rings->sq_ring_mask = p->sq_entries - 1;
 	rings->cq_ring_mask = p->cq_entries - 1;
 	rings->sq_ring_entries = p->sq_entries;
@@ -3921,7 +3931,8 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
 	p->sq_off.ring_entries = offsetof(struct io_rings, sq_ring_entries);
 	p->sq_off.flags = offsetof(struct io_rings, sq_flags);
 	p->sq_off.dropped = offsetof(struct io_rings, sq_dropped);
-	p->sq_off.array = (char *)ctx->sq_array - (char *)ctx->rings;
+	if (!(ctx->flags & IORING_SETUP_NO_SQARRAY))
+		p->sq_off.array = (char *)ctx->sq_array - (char *)ctx->rings;
 	p->sq_off.resv1 = 0;
 	if (!(ctx->flags & IORING_SETUP_NO_MMAP))
 		p->sq_off.user_addr = 0;
@@ -4010,7 +4021,8 @@ static long io_uring_setup(u32 entries, struct io_uring_params __user *params)
 			IORING_SETUP_COOP_TASKRUN | IORING_SETUP_TASKRUN_FLAG |
 			IORING_SETUP_SQE128 | IORING_SETUP_CQE32 |
 			IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN |
-			IORING_SETUP_NO_MMAP | IORING_SETUP_REGISTERED_FD_ONLY))
+			IORING_SETUP_NO_MMAP | IORING_SETUP_REGISTERED_FD_ONLY |
+			IORING_SETUP_NO_SQARRAY))
 		return -EINVAL;
 
 	return io_uring_create(entries, &p, params);

From patchwork Tue Aug 15 17:31:39 2023
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13354013
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH 10/16] io_uring: static_key for !IORING_SETUP_NO_SQARRAY
Date: Tue, 15 Aug 2023 18:31:39 +0100
Message-ID: <9c166012c57091af1c23fdc33594e7197c43d66e.1692119257.git.asml.silence@gmail.com>

At some point IORING_SETUP_NO_SQARRAY should become the default, so add a static_key to optimise out the chunk of io_get_sqe() dealing with sq_arrays.

Signed-off-by: Pavel Begunkov
---
 io_uring/io_uring.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index ac6d1687ba6c..c39606740c73 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -72,6 +72,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #define CREATE_TRACE_POINTS
@@ -148,6 +149,8 @@ static bool io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
 
 static void io_queue_sqe(struct io_kiocb *req);
 
+static __read_mostly DEFINE_STATIC_KEY_FALSE(io_key_has_sqarray);
+
 struct kmem_cache *req_cachep;
 
@@ -2342,7 +2345,8 @@ static bool io_get_sqe(struct io_ring_ctx *ctx, const struct io_uring_sqe **sqe)
 	unsigned mask = ctx->sq_entries - 1;
 	unsigned head = ctx->cached_sq_head++ & mask;
 
-	if (!(ctx->flags & IORING_SETUP_NO_SQARRAY)) {
+	if (static_branch_unlikely(&io_key_has_sqarray) &&
+	    (!(ctx->flags & IORING_SETUP_NO_SQARRAY))) {
 		head = READ_ONCE(ctx->sq_array[head]);
 		if (unlikely(head >= ctx->sq_entries)) {
 			/* drop invalid entries */
@@ -2871,6 +2875,9 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx)
 #endif
 	WARN_ON_ONCE(!list_empty(&ctx->ltimeout_list));
 
+	if (!(ctx->flags & IORING_SETUP_NO_SQARRAY))
+		static_branch_dec(&io_key_has_sqarray);
+
 	io_alloc_cache_free(&ctx->rsrc_node_cache, io_rsrc_node_cache_free);
 	if (ctx->mm_account) {
 		mmdrop(ctx->mm_account);
@@ -3844,6 +3851,9 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
 	if (!ctx)
 		return -ENOMEM;
 
+	if (!(ctx->flags & IORING_SETUP_NO_SQARRAY))
+		static_branch_inc(&io_key_has_sqarray);
+
 	if ((ctx->flags & IORING_SETUP_DEFER_TASKRUN) &&
 	    !(ctx->flags & IORING_SETUP_IOPOLL) &&
 	    !(ctx->flags & IORING_SETUP_SQPOLL))

From patchwork Tue Aug 15 17:31:40 2023
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13354003
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH 11/16] io_uring: move non aligned field to the end
Date: Tue, 15 Aug 2023 18:31:40 +0100

Move the non-cache-aligned fields down in io_ring_ctx. This shouldn't change anything, but it makes further refactoring easier.
Signed-off-by: Pavel Begunkov --- include/linux/io_uring_types.h | 36 +++++++++++++++++----------------- 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index 608a8e80e881..ad87d6074fb2 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -270,24 +270,6 @@ struct io_ring_ctx { struct io_alloc_cache netmsg_cache; } ____cacheline_aligned_in_smp; - /* IRQ completion list, under ->completion_lock */ - struct io_wq_work_list locked_free_list; - unsigned int locked_free_nr; - - const struct cred *sq_creds; /* cred used for __io_sq_thread() */ - struct io_sq_data *sq_data; /* if using sq thread polling */ - - struct wait_queue_head sqo_sq_wait; - struct list_head sqd_list; - - unsigned long check_cq; - - unsigned int file_alloc_start; - unsigned int file_alloc_end; - - struct xarray personalities; - u32 pers_next; - struct { /* * We cache a range of free CQEs we can use, once exhausted it @@ -332,6 +314,24 @@ struct io_ring_ctx { unsigned cq_last_tm_flush; } ____cacheline_aligned_in_smp; + /* IRQ completion list, under ->completion_lock */ + struct io_wq_work_list locked_free_list; + unsigned int locked_free_nr; + + const struct cred *sq_creds; /* cred used for __io_sq_thread() */ + struct io_sq_data *sq_data; /* if using sq thread polling */ + + struct wait_queue_head sqo_sq_wait; + struct list_head sqd_list; + + unsigned long check_cq; + + unsigned int file_alloc_start; + unsigned int file_alloc_end; + + struct xarray personalities; + u32 pers_next; + /* Keep this last, we don't need it for the fast path */ struct wait_queue_head poll_wq; struct io_restriction restrictions; From patchwork Tue Aug 15 17:31:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Begunkov X-Patchwork-Id: 13354012 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH 12/16] io_uring: banish non-hot data to end of io_ring_ctx
Date: Tue, 15 Aug 2023 18:31:41 +0100

Let's move all the slow-path, setup/init and similar fields to the end of io_ring_ctx; that makes later ctx reorganisation easier. This includes the page arrays used only on teardown, the CQ overflow list, the old provided buffer caches, and the poll hashes used by io-wq.
Signed-off-by: Pavel Begunkov --- include/linux/io_uring_types.h | 37 +++++++++++++++++----------------- 1 file changed, 19 insertions(+), 18 deletions(-) diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index ad87d6074fb2..72e609752323 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -211,20 +211,11 @@ struct io_ring_ctx { unsigned int drain_disabled: 1; unsigned int compat: 1; - enum task_work_notify_mode notify_method; + struct task_struct *submitter_task; + struct io_rings *rings; + struct percpu_ref refs; - /* - * If IORING_SETUP_NO_MMAP is used, then the below holds - * the gup'ed pages for the two rings, and the sqes. - */ - unsigned short n_ring_pages; - unsigned short n_sqe_pages; - struct page **ring_pages; - struct page **sqe_pages; - - struct io_rings *rings; - struct task_struct *submitter_task; - struct percpu_ref refs; + enum task_work_notify_mode notify_method; } ____cacheline_aligned_in_smp; /* submission data */ @@ -262,10 +253,8 @@ struct io_ring_ctx { struct io_buffer_list *io_bl; struct xarray io_bl_xa; - struct list_head io_buffers_cache; struct io_hash_table cancel_table_locked; - struct list_head cq_overflow_list; struct io_alloc_cache apoll_cache; struct io_alloc_cache netmsg_cache; } ____cacheline_aligned_in_smp; @@ -298,11 +287,8 @@ struct io_ring_ctx { * manipulate the list, hence no extra locking is needed there. 
*/ struct io_wq_work_list iopoll_list; - struct io_hash_table cancel_table; struct llist_head work_llist; - - struct list_head io_buffers_comp; } ____cacheline_aligned_in_smp; /* timeouts */ @@ -318,6 +304,10 @@ struct io_ring_ctx { struct io_wq_work_list locked_free_list; unsigned int locked_free_nr; + struct list_head io_buffers_comp; + struct list_head cq_overflow_list; + struct io_hash_table cancel_table; + const struct cred *sq_creds; /* cred used for __io_sq_thread() */ struct io_sq_data *sq_data; /* if using sq thread polling */ @@ -332,6 +322,8 @@ struct io_ring_ctx { struct xarray personalities; u32 pers_next; + struct list_head io_buffers_cache; + /* Keep this last, we don't need it for the fast path */ struct wait_queue_head poll_wq; struct io_restriction restrictions; @@ -375,6 +367,15 @@ struct io_ring_ctx { unsigned sq_thread_idle; /* protected by ->completion_lock */ unsigned evfd_last_cq_tail; + + /* + * If IORING_SETUP_NO_MMAP is used, then the below holds + * the gup'ed pages for the two rings, and the sqes. 
+ */ + unsigned short n_ring_pages; + unsigned short n_sqe_pages; + struct page **ring_pages; + struct page **sqe_pages; }; struct io_tw_state { From patchwork Tue Aug 15 17:31:42 2023
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH 13/16] io_uring: separate task_work/waiting cache line
Date: Tue, 15 Aug 2023 18:31:42 +0100
Message-ID: <37c3b2f2563587f531dbc9cb4cf3ad6da1e33a94.1692119257.git.asml.silence@gmail.com>

task_works are typically queued up from IRQ/softirq context, potentially by a random CPU, as in the networking case. Batch the ctx fields that bounce between CPUs like this into a separate cache line. We also move ->cq_timeouts there because waiters have to read and check it.
In the future, we can also conditionally hide ->cq_timeouts from the CQ wait path, as it is not a particularly useful rudiment. Signed-off-by: Pavel Begunkov --- include/linux/io_uring_types.h | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index 72e609752323..5de5dffe29df 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -270,15 +270,25 @@ struct io_ring_ctx { unsigned cached_cq_tail; unsigned cq_entries; struct io_ev_fd __rcu *io_ev_fd; - struct wait_queue_head cq_wait; unsigned cq_extra; } ____cacheline_aligned_in_smp; + /* + * task_work and async notification delivery cacheline. Expected to + * regularly bounce b/w CPUs. + */ + struct { + struct llist_head work_llist; + unsigned long check_cq; + atomic_t cq_wait_nr; + atomic_t cq_timeouts; + struct wait_queue_head cq_wait; + } ____cacheline_aligned_in_smp; + struct { spinlock_t completion_lock; bool poll_multi_queue; - atomic_t cq_wait_nr; /* * ->iopoll_list is protected by the ctx->uring_lock for @@ -287,14 +297,11 @@ struct io_ring_ctx { * manipulate the list, hence no extra locking is needed there.
*/ struct io_wq_work_list iopoll_list; - - struct llist_head work_llist; } ____cacheline_aligned_in_smp; /* timeouts */ struct { spinlock_t timeout_lock; - atomic_t cq_timeouts; struct list_head timeout_list; struct list_head ltimeout_list; unsigned cq_last_tm_flush; @@ -314,8 +321,6 @@ struct io_ring_ctx { struct wait_queue_head sqo_sq_wait; struct list_head sqd_list; - unsigned long check_cq; - unsigned int file_alloc_start; unsigned int file_alloc_end; From patchwork Tue Aug 15 17:31:43 2023
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH 14/16] io_uring: move multishot cqe cache in ctx
Date: Tue, 15 Aug 2023 18:31:43 +0100
Message-ID: <4dae5652bc608b131ef5d79a3cb1f671e16193e1.1692119257.git.asml.silence@gmail.com>
We cache multishot CQEs in submit_state.cqes before flushing them to the CQ. It's a 16-entry cache totalling 256 bytes, sitting in the middle of the io_submit_state structure. Move it out of there; that should help CPU caching of the submission state and shouldn't affect the cached CQEs. Signed-off-by: Pavel Begunkov --- include/linux/io_uring_types.h | 3 ++- io_uring/io_uring.c | 6 +++--- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index 5de5dffe29df..01bdbc223edd 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -176,7 +176,6 @@ struct io_submit_state { unsigned short submit_nr; unsigned int cqes_count; struct blk_plug plug; - struct io_uring_cqe cqes[16]; }; struct io_ev_fd { @@ -307,6 +306,8 @@ struct io_ring_ctx { unsigned cq_last_tm_flush; } ____cacheline_aligned_in_smp; + struct io_uring_cqe completion_cqes[16]; + /* IRQ completion list, under ->completion_lock */ struct io_wq_work_list locked_free_list; unsigned int locked_free_nr; diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index c39606740c73..5b13d22d1b76 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -883,7 +883,7 @@ static void __io_flush_post_cqes(struct io_ring_ctx *ctx) lockdep_assert_held(&ctx->uring_lock); for (i = 0; i < state->cqes_count; i++) { - struct io_uring_cqe *cqe = &state->cqes[i]; + struct io_uring_cqe *cqe = &ctx->completion_cqes[i]; if (!io_fill_cqe_aux(ctx, cqe->user_data, cqe->res, cqe->flags)) { if (ctx->task_complete) { @@ -934,7 +934,7 @@ bool io_fill_cqe_req_aux(struct io_kiocb *req, bool defer, s32 res, u32 cflags) lockdep_assert_held(&ctx->uring_lock); - if (ctx->submit_state.cqes_count == ARRAY_SIZE(ctx->submit_state.cqes)) { + if (ctx->submit_state.cqes_count == ARRAY_SIZE(ctx->completion_cqes)) { __io_cq_lock(ctx); __io_flush_post_cqes(ctx); /* no need to flush - flush is deferred */ @@ -948,7 +948,7 @@
bool io_fill_cqe_req_aux(struct io_kiocb *req, bool defer, s32 res, u32 cflags) if (test_bit(IO_CHECK_CQ_OVERFLOW_BIT, &ctx->check_cq)) return false; - cqe = &ctx->submit_state.cqes[ctx->submit_state.cqes_count++]; + cqe = &ctx->completion_cqes[ctx->submit_state.cqes_count++]; cqe->user_data = user_data; cqe->res = res; cqe->flags = cflags; From patchwork Tue Aug 15 17:31:44 2023
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH 15/16] io_uring: move iopoll ctx fields around
Date: Tue, 15 Aug 2023 18:31:44 +0100
Message-ID: <31634a99be3201292182ce2adde05dac1c664b53.1692119257.git.asml.silence@gmail.com>

Move poll_multi_queue and
iopoll_list to the submission side cache line; it doesn't make much sense to keep them separate, and this is a better place for them in general. Signed-off-by: Pavel Begunkov --- include/linux/io_uring_types.h | 25 +++++++++++-------------- 1 file changed, 11 insertions(+), 14 deletions(-) diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index 01bdbc223edd..13d19b9be9f4 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -256,6 +256,15 @@ struct io_ring_ctx { struct io_hash_table cancel_table_locked; struct io_alloc_cache apoll_cache; struct io_alloc_cache netmsg_cache; + + /* + * ->iopoll_list is protected by the ctx->uring_lock for + * io_uring instances that don't use IORING_SETUP_SQPOLL. + * For SQPOLL, only the single threaded io_sq_thread() will + * manipulate the list, hence no extra locking is needed there. + */ + struct io_wq_work_list iopoll_list; + bool poll_multi_queue; } ____cacheline_aligned_in_smp; struct { @@ -284,20 +293,6 @@ struct io_ring_ctx { struct wait_queue_head cq_wait; } ____cacheline_aligned_in_smp; - struct { - spinlock_t completion_lock; - - bool poll_multi_queue; - - /* - * ->iopoll_list is protected by the ctx->uring_lock for - * io_uring instances that don't use IORING_SETUP_SQPOLL. - * For SQPOLL, only the single threaded io_sq_thread() will - * manipulate the list, hence no extra locking is needed there.
- */ - struct io_wq_work_list iopoll_list; - } ____cacheline_aligned_in_smp; - /* timeouts */ struct { spinlock_t timeout_lock; @@ -308,6 +303,8 @@ struct io_ring_ctx { struct io_uring_cqe completion_cqes[16]; + spinlock_t completion_lock; + /* IRQ completion list, under ->completion_lock */ struct io_wq_work_list locked_free_list; unsigned int locked_free_nr; From patchwork Tue Aug 15 17:31:45 2023
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH 16/16] io_uring: force inline io_fill_cqe_req
Date: Tue, 15 Aug 2023 18:31:45 +0100
Message-ID: <2365692588dc7410db42e68d92960ee7dc2bf2f9.1692119258.git.asml.silence@gmail.com>

There are only 2 callers of
io_fill_cqe_req left, and one of them is extremely hot. Force inline the function. Signed-off-by: Pavel Begunkov --- io_uring/io_uring.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h index 07fd185064d2..547c30582fb8 100644 --- a/io_uring/io_uring.h +++ b/io_uring/io_uring.h @@ -132,7 +132,8 @@ static inline bool io_get_cqe(struct io_ring_ctx *ctx, struct io_uring_cqe **ret return io_get_cqe_overflow(ctx, ret, false); } -static inline bool io_fill_cqe_req(struct io_ring_ctx *ctx, struct io_kiocb *req) +static __always_inline bool io_fill_cqe_req(struct io_ring_ctx *ctx, + struct io_kiocb *req) { struct io_uring_cqe *cqe;