io_uring/sqpoll: close race on waiting for sqring entries

Message ID	78b04485-ad25-448a-88d4-1649f446883c@kernel.dk (mailing list archive)
State	New
Headers
Received: from mail-io1-f48.google.com (mail-io1-f48.google.com [209.85.166.48])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0227D1CB9F6
	for <io-uring@vger.kernel.org>; Tue, 15 Oct 2024 15:07:10 +0000 (UTC)
Message-ID: <78b04485-ad25-448a-88d4-1649f446883c@kernel.dk>
Date: Tue, 15 Oct 2024 09:07:07 -0600
Precedence: bulk
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Content-Language: en-US
To: io-uring <io-uring@vger.kernel.org>
From: Jens Axboe <axboe@kernel.dk>
Subject: [PATCH] io_uring/sqpoll: close race on waiting for sqring entries
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Series	io_uring/sqpoll: close race on waiting for sqring entries

Commit Message

Jens Axboe Oct. 15, 2024, 3:07 p.m. UTC

When an application uses SQPOLL and io_uring_get_sqe() fails to return
an SQE, it must wait for the SQPOLL thread to consume SQ ring entries.
It can do so by calling io_uring_enter(2) with the
IORING_ENTER_SQ_WAIT flag. In liburing, this is generally done with
io_uring_sqring_wait(). There's a natural expectation that once this
call returns, a new SQE can be retrieved, filled out, and submitted.
However, the kernel uses the cached sq head to determine if the SQ ring
is full or not. If the SQPOLL thread is currently in the process of
submitting SQEs, it may have updated the cached sq head, but not yet
committed it to the SQ ring. Hence the kernel may find that the SQ ring
is no longer full, and return successfully to the application. If the
SQPOLL thread hasn't yet committed the new head to the SQ ring by the
time the application returns to userspace and attempts to get a new
SQE, that attempt will fail.

Fix this by having io_sqring_full() always use the user-visible SQ ring
head, rather than the internally cached one.
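
The race window can be illustrated with a small userspace model. This is
only a sketch under stated assumptions, not kernel code: struct toy_sq
and every toy_*() helper are hypothetical names invented for the example,
and the model is single-threaded, so it shows the ordering problem rather
than real concurrency. It compares a fullness check based on the private
cached head with one based on the published head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model of an SQ ring; not the kernel's data layout. */
struct toy_sq {
	uint32_t head;        /* published head, visible to the application */
	uint32_t tail;        /* published tail, written by the application */
	uint32_t cached_head; /* consumer-private head, may run ahead of head */
	uint32_t entries;     /* ring size */
};

/* Check in the style of the old code: uses the consumer-private head. */
static bool toy_full_cached(const struct toy_sq *sq)
{
	return sq->tail - sq->cached_head == sq->entries;
}

/* Check in the style of the new code: uses the published head. */
static bool toy_full_visible(const struct toy_sq *sq)
{
	return sq->tail - sq->head == sq->entries;
}

/* Consumer grabs n entries without publishing the new head yet. */
static void toy_grab(struct toy_sq *sq, uint32_t n)
{
	sq->cached_head += n;
}

/* Consumer publishes its progress to the shared ring. */
static void toy_commit(struct toy_sq *sq)
{
	sq->head = sq->cached_head;
}

/* Free slots as the application computes them, from the published head. */
static uint32_t toy_user_space_left(const struct toy_sq *sq)
{
	return sq->entries - (sq->tail - sq->head);
}

/* Walk through the window between grabbing entries and committing them. */
static void toy_demo(void)
{
	struct toy_sq sq = { .head = 0, .tail = 8, .cached_head = 0, .entries = 8 };

	toy_grab(&sq, 4);
	/* The cached check reports "not full", so a waiter would be woken... */
	assert(!toy_full_cached(&sq));
	/* ...but the application still sees no free slot. */
	assert(toy_user_space_left(&sq) == 0);
	/* The published-head check agrees with what the application sees. */
	assert(toy_full_visible(&sq));

	toy_commit(&sq);
	assert(!toy_full_visible(&sq));
	assert(toy_user_space_left(&sq) == 4);
}
```

Calling toy_demo() steps through the window: between toy_grab() and
toy_commit(), only the published-head check matches the application's
view of the ring, which is the property the fix relies on.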

Cc: stable@vger.kernel.org # 5.10+
Link: https://github.com/axboe/liburing/discussions/1267
Signed-off-by: Jens Axboe <axboe@kernel.dk>

---

diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 9d70b2cf7b1e..913dbcebe5c9 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -284,7 +284,14 @@  static inline bool io_sqring_full(struct io_ring_ctx *ctx)
 {
 	struct io_rings *r = ctx->rings;
 
-	return READ_ONCE(r->sq.tail) - ctx->cached_sq_head == ctx->sq_entries;
+	/*
+	 * SQPOLL must use the actual sqring head, as using the cached_sq_head
+	 * is race prone if the SQPOLL thread has grabbed entries but not yet
+	 * committed them to the ring. For !SQPOLL, this doesn't matter, but
+	 * since this helper is just used for SQPOLL sqring waits (or POLLOUT),
+	 * just read the actual sqring head unconditionally.
+	 */
+	return READ_ONCE(r->sq.tail) - READ_ONCE(r->sq.head) == ctx->sq_entries;
 }
 
 static inline unsigned int io_sqring_entries(struct io_ring_ctx *ctx)
