From patchwork Tue Feb 4 19:46:35 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13959706
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 01/11] eventpoll: abstract out main epoll reaper into a function
Date: Tue, 4 Feb 2025 12:46:35 -0700
Message-ID: <20250204194814.393112-2-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250204194814.393112-1-axboe@kernel.dk>
References: <20250204194814.393112-1-axboe@kernel.dk>
X-Mailing-List: io-uring@vger.kernel.org

Add epoll_wait(), which takes a struct file and the number of events etc
to reap. This can then be called by do_epoll_wait(), and used by io_uring
as well.

No intended functional changes in this patch.

Signed-off-by: Jens Axboe
---
 fs/eventpoll.c            | 31 ++++++++++++++++++-------------
 include/linux/eventpoll.h |  4 ++++
 2 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 7c0980db77b3..73b639caed3d 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2445,12 +2445,8 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, int, op, int, fd,
 	return do_epoll_ctl(epfd, op, fd, &epds, false);
 }
 
-/*
- * Implement the event wait interface for the eventpoll file. It is the kernel
- * part of the user space epoll_wait(2).
- */
-static int do_epoll_wait(int epfd, struct epoll_event __user *events,
-			 int maxevents, struct timespec64 *to)
+int epoll_wait(struct file *file, struct epoll_event __user *events,
+	       int maxevents, struct timespec64 *to)
 {
 	struct eventpoll *ep;
 
@@ -2462,28 +2458,37 @@ static int do_epoll_wait(int epfd, struct epoll_event __user *events,
 	if (!access_ok(events, maxevents * sizeof(struct epoll_event)))
 		return -EFAULT;
 
-	/* Get the "struct file *" for the eventpoll file */
-	CLASS(fd, f)(epfd);
-	if (fd_empty(f))
-		return -EBADF;
-
 	/*
 	 * We have to check that the file structure underneath the fd
 	 * the user passed to us _is_ an eventpoll file.
 	 */
-	if (!is_file_epoll(fd_file(f)))
+	if (!is_file_epoll(file))
 		return -EINVAL;
 
 	/*
 	 * At this point it is safe to assume that the "private_data" contains
 	 * our own data structure.
 	 */
-	ep = fd_file(f)->private_data;
+	ep = file->private_data;
 
 	/* Time to fish for events ... */
 	return ep_poll(ep, events, maxevents, to);
 }
 
+/*
+ * Implement the event wait interface for the eventpoll file. It is the kernel
+ * part of the user space epoll_wait(2).
+ */
+static int do_epoll_wait(int epfd, struct epoll_event __user *events,
+			 int maxevents, struct timespec64 *to)
+{
+	/* Get the "struct file *" for the eventpoll file */
+	CLASS(fd, f)(epfd);
+	if (!fd_empty(f))
+		return epoll_wait(fd_file(f), events, maxevents, to);
+	return -EBADF;
+}
+
 SYSCALL_DEFINE4(epoll_wait, int, epfd, struct epoll_event __user *, events,
 		int, maxevents, int, timeout)
 {
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index 0c0d00fcd131..f37fea931c44 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -25,6 +25,10 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd, unsigned long t
 /* Used to release the epoll bits inside the "struct file" */
 void eventpoll_release_file(struct file *file);
 
+/* Use to reap events */
+int epoll_wait(struct file *file, struct epoll_event __user *events,
+	       int maxevents, struct timespec64 *to);
+
 /*
  * This is called from inside fs/file_table.c:__fput() to unlink files
  * from the eventpoll interface. We need to have this facility to cleanup

From patchwork Tue Feb 4 19:46:36 2025
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13959707
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 02/11] eventpoll: add helper to remove wait entry from wait queue head
Date: Tue, 4 Feb 2025 12:46:36 -0700
Message-ID: <20250204194814.393112-3-axboe@kernel.dk>
In-Reply-To: <20250204194814.393112-1-axboe@kernel.dk>
References: <20250204194814.393112-1-axboe@kernel.dk>

__epoll_wait_remove() is the core helper, it kills a given
wait_queue_entry from the eventpoll wait_queue_head. Use it internally,
and provide an overall helper, epoll_wait_remove(), which takes a struct
file and provides the same functionality.

Signed-off-by: Jens Axboe
---
 fs/eventpoll.c            | 58 +++++++++++++++++++++++++--------------
 include/linux/eventpoll.h |  3 ++
 2 files changed, 40 insertions(+), 21 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 73b639caed3d..01edbee5c766 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1980,6 +1980,42 @@ static int ep_autoremove_wake_function(struct wait_queue_entry *wq_entry,
 	return ret;
 }
 
+static int __epoll_wait_remove(struct eventpoll *ep,
+			       struct wait_queue_entry *wait, int timed_out)
+{
+	int eavail;
+
+	/*
+	 * We were woken up, thus go and try to harvest some events. If timed
+	 * out and still on the wait queue, recheck eavail carefully under
+	 * lock, below.
+	 */
+	eavail = 1;
+
+	if (!list_empty_careful(&wait->entry)) {
+		write_lock_irq(&ep->lock);
+		/*
+		 * If the thread timed out and is not on the wait queue, it
+		 * means that the thread was woken up after its timeout expired
+		 * before it could reacquire the lock. Thus, when wait.entry is
+		 * empty, it needs to harvest events.
+		 */
+		if (timed_out)
+			eavail = list_empty(&wait->entry);
+		__remove_wait_queue(&ep->wq, wait);
+		write_unlock_irq(&ep->lock);
+	}
+
+	return eavail;
+}
+
+int epoll_wait_remove(struct file *file, struct wait_queue_entry *wait)
+{
+	if (is_file_epoll(file))
+		return __epoll_wait_remove(file->private_data, wait, false);
+	return -EINVAL;
+}
+
 /**
  * ep_poll - Retrieves ready events, and delivers them to the caller-supplied
  *           event buffer.
@@ -2100,27 +2136,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
 							      HRTIMER_MODE_ABS);
 		__set_current_state(TASK_RUNNING);
 
-		/*
-		 * We were woken up, thus go and try to harvest some events.
-		 * If timed out and still on the wait queue, recheck eavail
-		 * carefully under lock, below.
-		 */
-		eavail = 1;
-
-		if (!list_empty_careful(&wait.entry)) {
-			write_lock_irq(&ep->lock);
-			/*
-			 * If the thread timed out and is not on the wait queue,
-			 * it means that the thread was woken up after its
-			 * timeout expired before it could reacquire the lock.
-			 * Thus, when wait.entry is empty, it needs to harvest
-			 * events.
-			 */
-			if (timed_out)
-				eavail = list_empty(&wait.entry);
-			__remove_wait_queue(&ep->wq, &wait);
-			write_unlock_irq(&ep->lock);
-		}
+		eavail = __epoll_wait_remove(ep, &wait, timed_out);
 	}
 }
 
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index f37fea931c44..1301fc74aca0 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -29,6 +29,9 @@ void eventpoll_release_file(struct file *file);
 int epoll_wait(struct file *file, struct epoll_event __user *events,
 	       int maxevents, struct timespec64 *to);
 
+/* Remove wait entry */
+int epoll_wait_remove(struct file *file, struct wait_queue_entry *wait);
+
 /*
  * This is called from inside fs/file_table.c:__fput() to unlink files
  * from the eventpoll interface. We need to have this facility to cleanup

From patchwork Tue Feb 4 19:46:37 2025
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13959708
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 03/11] eventpoll: abstract out ep_try_send_events() helper
Date: Tue, 4 Feb 2025 12:46:37 -0700
Message-ID: <20250204194814.393112-4-axboe@kernel.dk>
In-Reply-To: <20250204194814.393112-1-axboe@kernel.dk>
References: <20250204194814.393112-1-axboe@kernel.dk>

In preparation for reusing this helper in another epoll setup helper,
abstract it out.

Signed-off-by: Jens Axboe
---
 fs/eventpoll.c | 28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 01edbee5c766..3cbd290503c7 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2016,6 +2016,22 @@ int epoll_wait_remove(struct file *file, struct wait_queue_entry *wait)
 	return -EINVAL;
 }
 
+static int ep_try_send_events(struct eventpoll *ep,
+			      struct epoll_event __user *events, int maxevents)
+{
+	int res;
+
+	/*
+	 * Try to transfer events to user space. In case we get 0 events and
+	 * there's still timeout left over, we go trying again in search of
+	 * more luck.
+	 */
+	res = ep_send_events(ep, events, maxevents);
+	if (res > 0)
+		ep_suspend_napi_irqs(ep);
+	return res;
+}
+
 /**
  * ep_poll - Retrieves ready events, and delivers them to the caller-supplied
  *           event buffer.
@@ -2067,17 +2083,9 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
 
 	while (1) {
 		if (eavail) {
-			/*
-			 * Try to transfer events to user space. In case we get
-			 * 0 events and there's still timeout left over, we go
-			 * trying again in search of more luck.
-			 */
-			res = ep_send_events(ep, events, maxevents);
-			if (res) {
-				if (res > 0)
-					ep_suspend_napi_irqs(ep);
+			res = ep_try_send_events(ep, events, maxevents);
+			if (res)
 				return res;
-			}
 		}
 
 		if (timed_out)

From patchwork Tue Feb 4 19:46:38 2025
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13959709
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 04/11] eventpoll: add struct wait_queue_entry argument to epoll_wait()
Date: Tue, 4 Feb 2025 12:46:38 -0700
Message-ID: <20250204194814.393112-5-axboe@kernel.dk>
In-Reply-To: <20250204194814.393112-1-axboe@kernel.dk>
References: <20250204194814.393112-1-axboe@kernel.dk>

In preparation for allowing an outside caller to add itself to the epoll
waitqueue, pass in a struct wait_queue_entry. Unused in its current form,
but will be utilized shortly.

No intended functional changes in this patch.

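For readers who want the calling convention spelled out, here is a small userspace model of what an optional wait entry argument can mean: a NULL entry keeps the old path, while a caller-supplied entry lets the caller be queued and notified by callback rather than blocked. This is only a sketch under stated assumptions. Every `model_*` name, the `MODEL_QUEUED` value, and the one-slot "queue" are hypothetical stand-ins, not kernel types or the series' implementation.

```c
/*
 * Userspace model only. The model_* names are hypothetical and are not
 * kernel types or functions; they merely mirror the shape of the call.
 */
#include <assert.h>
#include <stddef.h>

/* Arbitrary stand-in for "entry queued, completion arrives via callback". */
#define MODEL_QUEUED (-1000)

struct model_wait_entry {
	void (*func)(struct model_wait_entry *self);	/* wake callback, may be NULL */
	int queued;					/* 1 while on the queue */
};

struct model_epoll {
	int ready;				/* events ready to be reaped */
	struct model_wait_entry *waiter;	/* one-slot stand-in for a wait queue */
};

/*
 * wait == NULL: legacy caller. A real blocking path would sleep here; the
 * model simply reports what is ready right now (possibly 0).
 * wait != NULL: never sleeps; either reaps events or queues the entry.
 */
int model_epoll_wait(struct model_epoll *ep, int maxevents,
		     struct model_wait_entry *wait)
{
	assert(ep != NULL && maxevents > 0);

	if (ep->ready > 0) {
		int n = ep->ready < maxevents ? ep->ready : maxevents;

		ep->ready -= n;
		return n;
	}
	if (!wait)
		return 0;

	/* Queue the caller's entry once; repeated calls stay queued. */
	if (!wait->queued) {
		wait->queued = 1;
		ep->waiter = wait;
	}
	return MODEL_QUEUED;
}

/* Producer side: make one event ready and wake the queued entry, if any. */
void model_post_event(struct model_epoll *ep)
{
	struct model_wait_entry *w = ep->waiter;

	ep->ready++;
	if (w) {
		ep->waiter = NULL;
		w->queued = 0;
		if (w->func)
			w->func(w);
	}
}
```

In this model, a caller that passes an entry and gets `MODEL_QUEUED` back simply returns to its own event loop; once `model_post_event()` runs, the entry is dequeued, the callback fires, and a second `model_epoll_wait()` call reaps the event.
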
Signed-off-by: Jens Axboe
---
 fs/eventpoll.c            | 5 +++--
 include/linux/eventpoll.h | 3 ++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 3cbd290503c7..ecaa5591f4be 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2470,7 +2470,8 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, int, op, int, fd,
 }
 
 int epoll_wait(struct file *file, struct epoll_event __user *events,
-	       int maxevents, struct timespec64 *to)
+	       int maxevents, struct timespec64 *to,
+	       struct wait_queue_entry *wait)
 {
 	struct eventpoll *ep;
 
@@ -2509,7 +2510,7 @@ static int do_epoll_wait(int epfd, struct epoll_event __user *events,
 	/* Get the "struct file *" for the eventpoll file */
 	CLASS(fd, f)(epfd);
 	if (!fd_empty(f))
-		return epoll_wait(fd_file(f), events, maxevents, to);
+		return epoll_wait(fd_file(f), events, maxevents, to, NULL);
 	return -EBADF;
 }
 
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index 1301fc74aca0..24f9344df5a3 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -27,7 +27,8 @@ void eventpoll_release_file(struct file *file);
 
 /* Use to reap events */
 int epoll_wait(struct file *file, struct epoll_event __user *events,
-	       int maxevents, struct timespec64 *to);
+	       int maxevents, struct timespec64 *to,
+	       struct wait_queue_entry *wait);
 
 /* Remove wait entry */
 int epoll_wait_remove(struct file *file, struct wait_queue_entry *wait);

From patchwork Tue Feb 4 19:46:39 2025
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13959710
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 05/11] eventpoll: add ep_poll_queue() loop
Date: Tue, 4 Feb 2025 12:46:39 -0700
Message-ID: <20250204194814.393112-6-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250204194814.393112-1-axboe@kernel.dk>
References: <20250204194814.393112-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org
MIME-Version: 1.0

If a wait_queue_entry is passed in to epoll_wait(), then utilize this
new helper for reaping events and/or adding to the epoll waitqueue
rather than calling the potentially sleeping ep_poll(). It works like
ep_poll(), except it doesn't block - it either returns the events that
are already available, or it adds the specified entry to the struct
eventpoll waitqueue to get a callback when events are triggered. It
returns -EIOCBQUEUED for that case.

Signed-off-by: Jens Axboe
---
 fs/eventpoll.c | 37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index ecaa5591f4be..a8be0c7110e4 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2032,6 +2032,39 @@ static int ep_try_send_events(struct eventpoll *ep,
 	return res;
 }
 
+static int ep_poll_queue(struct eventpoll *ep,
+			 struct epoll_event __user *events, int maxevents,
+			 struct wait_queue_entry *wait)
+{
+	int res, eavail;
+
+	/* See ep_poll() for commentary */
+	eavail = ep_events_available(ep);
+	while (1) {
+		if (eavail) {
+			res = ep_try_send_events(ep, events, maxevents);
+			if (res)
+				return res;
+		}
+
+		eavail = ep_busy_loop(ep, true);
+		if (eavail)
+			continue;
+
+		if (!list_empty_careful(&wait->entry))
+			return -EIOCBQUEUED;
+
+		write_lock_irq(&ep->lock);
+		eavail = ep_events_available(ep);
+		if (!eavail)
+			__add_wait_queue_exclusive(&ep->wq, wait);
+		write_unlock_irq(&ep->lock);
+
+		if (!eavail)
+			return -EIOCBQUEUED;
+	}
+}
+
 /**
  * ep_poll - Retrieves ready events, and delivers them to the caller-supplied
  *           event buffer.
@@ -2497,7 +2530,9 @@ int epoll_wait(struct file *file, struct epoll_event __user *events,
 	ep = file->private_data;
 
 	/* Time to fish for events ... */
-	return ep_poll(ep, events, maxevents, to);
+	if (!wait)
+		return ep_poll(ep, events, maxevents, to);
+	return ep_poll_queue(ep, events, maxevents, wait);
 }
 
 /*

From patchwork Tue Feb 4 19:46:40 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13959711
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 06/11] io_uring/epoll: remove CONFIG_EPOLL guards
Date: Tue, 4 Feb 2025 12:46:40 -0700
Message-ID: <20250204194814.393112-7-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250204194814.393112-1-axboe@kernel.dk>
References: <20250204194814.393112-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org
MIME-Version: 1.0

Just have the Makefile add the object if epoll is enabled, then it's
not necessary to guard the entire epoll.c file inside a CONFIG_EPOLL
ifdef.

Signed-off-by: Jens Axboe
---
 io_uring/Makefile | 9 +++++----
 io_uring/epoll.c  | 2 --
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/io_uring/Makefile b/io_uring/Makefile
index d695b60dba4f..7114a6dbd439 100644
--- a/io_uring/Makefile
+++ b/io_uring/Makefile
@@ -11,9 +11,10 @@ obj-$(CONFIG_IO_URING)		+= io_uring.o opdef.o kbuf.o rsrc.o notif.o \
 					eventfd.o uring_cmd.o openclose.o \
 					sqpoll.o xattr.o nop.o fs.o splice.o \
 					sync.o msg_ring.o advise.o openclose.o \
-					epoll.o statx.o timeout.o fdinfo.o \
-					cancel.o waitid.o register.o \
-					truncate.o memmap.o alloc_cache.o
+					statx.o timeout.o fdinfo.o cancel.o \
+					waitid.o register.o truncate.o \
+					memmap.o alloc_cache.o
 obj-$(CONFIG_IO_WQ)		+= io-wq.o
 obj-$(CONFIG_FUTEX)		+= futex.o
-obj-$(CONFIG_NET_RX_BUSY_POLL) += napi.o
+obj-$(CONFIG_EPOLL)		+= epoll.o
+obj-$(CONFIG_NET_RX_BUSY_POLL)	+= napi.o
diff --git a/io_uring/epoll.c b/io_uring/epoll.c
index 89bff2068a19..7848d9cc073d 100644
--- a/io_uring/epoll.c
+++ b/io_uring/epoll.c
@@ -12,7 +12,6 @@
 #include "io_uring.h"
 #include "epoll.h"
 
-#if defined(CONFIG_EPOLL)
 struct io_epoll {
 	struct file			*file;
 	int				epfd;
@@ -58,4 +57,3 @@ int io_epoll_ctl(struct io_kiocb *req, unsigned int issue_flags)
 	io_req_set_res(req, ret, 0);
 	return IOU_OK;
 }
-#endif

From patchwork Tue Feb 4 19:46:41 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13959712
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 07/11] io_uring/poll: pull ownership handling into poll.h
Date: Tue, 4 Feb 2025 12:46:41 -0700
Message-ID: <20250204194814.393112-8-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250204194814.393112-1-axboe@kernel.dk>
References: <20250204194814.393112-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org
MIME-Version: 1.0

In preparation for using it from somewhere else. Rather than try and
duplicate the functionality, just make it generically available to
io_uring opcodes.

Note: it would have to be used carefully, as it cannot be used by
opcodes that can trigger poll logic.

Signed-off-by: Jens Axboe
---
 io_uring/poll.c | 30 +-----------------------------
 io_uring/poll.h | 31 +++++++++++++++++++++++++++++++
 2 files changed, 32 insertions(+), 29 deletions(-)

diff --git a/io_uring/poll.c b/io_uring/poll.c
index bb1c0cd4f809..5e44ac562491 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -41,16 +41,6 @@ struct io_poll_table {
 	__poll_t result_mask;
 };
 
-#define IO_POLL_CANCEL_FLAG	BIT(31)
-#define IO_POLL_RETRY_FLAG	BIT(30)
-#define IO_POLL_REF_MASK	GENMASK(29, 0)
-
-/*
- * We usually have 1-2 refs taken, 128 is more than enough and we want to
- * maximise the margin between this amount and the moment when it overflows.
- */
-#define IO_POLL_REF_BIAS	128
-
 #define IO_WQE_F_DOUBLE		1
 
 static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
@@ -70,7 +60,7 @@ static inline bool wqe_is_double(struct wait_queue_entry *wqe)
 	return priv & IO_WQE_F_DOUBLE;
 }
 
-static bool io_poll_get_ownership_slowpath(struct io_kiocb *req)
+bool io_poll_get_ownership_slowpath(struct io_kiocb *req)
 {
 	int v;
 
@@ -85,24 +75,6 @@ static bool io_poll_get_ownership_slowpath(struct io_kiocb *req)
 	return !(atomic_fetch_inc(&req->poll_refs) & IO_POLL_REF_MASK);
 }
 
-/*
- * If refs part of ->poll_refs (see IO_POLL_REF_MASK) is 0, it's free. We can
- * bump it and acquire ownership. It's disallowed to modify requests while not
- * owning it, that prevents from races for enqueueing task_work's and b/w
- * arming poll and wakeups.
- */
-static inline bool io_poll_get_ownership(struct io_kiocb *req)
-{
-	if (unlikely(atomic_read(&req->poll_refs) >= IO_POLL_REF_BIAS))
-		return io_poll_get_ownership_slowpath(req);
-	return !(atomic_fetch_inc(&req->poll_refs) & IO_POLL_REF_MASK);
-}
-
-static void io_poll_mark_cancelled(struct io_kiocb *req)
-{
-	atomic_or(IO_POLL_CANCEL_FLAG, &req->poll_refs);
-}
-
 static struct io_poll *io_poll_get_double(struct io_kiocb *req)
 {
 	/* pure poll stashes this in ->async_data, poll driven retry elsewhere */
diff --git a/io_uring/poll.h b/io_uring/poll.h
index 04ede93113dc..2f416cd3be13 100644
--- a/io_uring/poll.h
+++ b/io_uring/poll.h
@@ -21,6 +21,18 @@ struct async_poll {
 	struct io_poll		*double_poll;
 };
 
+#define IO_POLL_CANCEL_FLAG	BIT(31)
+#define IO_POLL_RETRY_FLAG	BIT(30)
+#define IO_POLL_REF_MASK	GENMASK(29, 0)
+
+bool io_poll_get_ownership_slowpath(struct io_kiocb *req);
+
+/*
+ * We usually have 1-2 refs taken, 128 is more than enough and we want to
+ * maximise the margin between this amount and the moment when it overflows.
+ */
+#define IO_POLL_REF_BIAS	128
+
 /*
  * Must only be called inside issue_flags & IO_URING_F_MULTISHOT, or
  * potentially other cases where we already "own" this poll request.
@@ -30,6 +42,25 @@ static inline void io_poll_multishot_retry(struct io_kiocb *req)
 	atomic_inc(&req->poll_refs);
 }
 
+/*
+ * If refs part of ->poll_refs (see IO_POLL_REF_MASK) is 0, it's free. We can
+ * bump it and acquire ownership. It's disallowed to modify requests while not
+ * owning it, that prevents from races for enqueueing task_work's and b/w
+ * arming poll and wakeups.
+ */
+static inline bool io_poll_get_ownership(struct io_kiocb *req)
+{
+	if (unlikely(atomic_read(&req->poll_refs) >= IO_POLL_REF_BIAS))
+		return io_poll_get_ownership_slowpath(req);
+	return !(atomic_fetch_inc(&req->poll_refs) & IO_POLL_REF_MASK);
+}
+
+static inline void io_poll_mark_cancelled(struct io_kiocb *req)
+{
+	atomic_or(IO_POLL_CANCEL_FLAG, &req->poll_refs);
+}
+
+
 int io_poll_add_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
 int io_poll_add(struct io_kiocb *req, unsigned int issue_flags);

From patchwork Tue Feb 4 19:46:42 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13959713
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 08/11] io_uring/poll: add IO_POLL_FINISH_FLAG
Date: Tue, 4 Feb 2025 12:46:42 -0700
Message-ID: <20250204194814.393112-9-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250204194814.393112-1-axboe@kernel.dk>
References: <20250204194814.393112-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org
MIME-Version: 1.0

Use the same value as the retry flag. It doesn't need to be unique, it
just needs to be available if a user of the poll ownership mechanism
wishes to use it for something other than a retry condition.
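
As a rough illustration of why aliasing the flag is harmless, the sketch
below models the poll_refs bit layout in userspace C11. It is not kernel
code: struct model_req, model_get_ownership() and model_mark_finished()
are hypothetical names, BIT()/GENMASK() are spelled out by hand, and only
the ownership fast path is modelled (no IO_POLL_REF_BIAS slow path).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Same bit layout as io_uring/poll.h, with BIT()/GENMASK() written out */
#define IO_POLL_CANCEL_FLAG	(1u << 31)
#define IO_POLL_RETRY_FLAG	(1u << 30)
#define IO_POLL_FINISH_FLAG	IO_POLL_RETRY_FLAG
#define IO_POLL_REF_MASK	((1u << 30) - 1)	/* GENMASK(29, 0) */

/* The flag bits live above the reference count and never overlap it */
static_assert((IO_POLL_REF_MASK &
	       (IO_POLL_CANCEL_FLAG | IO_POLL_FINISH_FLAG)) == 0,
	      "flag bits must not overlap the refcount bits");

/* Hypothetical stand-in for the poll_refs member of struct io_kiocb */
struct model_req {
	atomic_uint poll_refs;
};

/* Fast path only: whoever bumps the ref part from 0 becomes the owner */
static bool model_get_ownership(struct model_req *req)
{
	return !(atomic_fetch_add(&req->poll_refs, 1) & IO_POLL_REF_MASK);
}

/* Hypothetical helper: set the finish flag without touching the refs */
static void model_mark_finished(struct model_req *req)
{
	atomic_fetch_or(&req->poll_refs, IO_POLL_FINISH_FLAG);
}
```

Since an ownership user that never retries has no other meaning for bit
30, reusing it keeps the full 30-bit reference range intact.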

Signed-off-by: Jens Axboe
---
 io_uring/poll.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/io_uring/poll.h b/io_uring/poll.h
index 2f416cd3be13..97d14b2b2751 100644
--- a/io_uring/poll.h
+++ b/io_uring/poll.h
@@ -23,6 +23,7 @@ struct async_poll {
 
 #define IO_POLL_CANCEL_FLAG	BIT(31)
 #define IO_POLL_RETRY_FLAG	BIT(30)
+#define IO_POLL_FINISH_FLAG	IO_POLL_RETRY_FLAG
 #define IO_POLL_REF_MASK	GENMASK(29, 0)
 
 bool io_poll_get_ownership_slowpath(struct io_kiocb *req);

From patchwork Tue Feb 4 19:46:43 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13959714
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 09/11] io_uring/epoll: add support for IORING_OP_EPOLL_WAIT
Date: Tue, 4 Feb 2025 12:46:43 -0700
Message-ID: <20250204194814.393112-10-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250204194814.393112-1-axboe@kernel.dk>
References: <20250204194814.393112-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org
MIME-Version: 1.0

For existing epoll event loops that can't fully convert to io_uring,
the usual approach is to add the io_uring fd to the epoll instance and
use epoll_wait() to wait on both "legacy" and io_uring events. While
this works, it isn't optimal as:

1) epoll_wait() is pretty limited in what it can do. It does not support
   partial reaping of events, or waiting on a batch of events.

2) When an io_uring ring is added to an epoll instance, it activates the
   io_uring "I'm being polled" logic which slows things down.
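
For reference, the legacy pattern described above looks roughly like the
userspace sketch below. It is only an illustration under stated
assumptions: ring_fd is an eventfd standing in for a real io_uring fd (so
the example needs no ring setup), and watch_fd(), reap_ready() and
legacy_loop_demo() are hypothetical helper names, not part of any API.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Add fd to the epoll set, asking for readability notifications */
static int watch_fd(int epfd, int fd)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };

	return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Non-blocking reap of whatever is ready right now */
static int reap_ready(int epfd, struct epoll_event *evs, int maxevents)
{
	assert(maxevents > 0);	/* epoll_wait() fails with EINVAL otherwise */
	return epoll_wait(epfd, evs, maxevents, 0);
}

/*
 * Hypothetical demo: ring_fd is an eventfd standing in for the io_uring fd.
 * Returns the number of ready events seen by one epoll_wait() call, or -1.
 */
static int legacy_loop_demo(void)
{
	struct epoll_event evs[8];
	uint64_t one = 1;
	int epfd, legacy_fd, ring_fd, n = -1;

	epfd = epoll_create1(0);
	legacy_fd = eventfd(0, EFD_NONBLOCK);
	ring_fd = eventfd(0, EFD_NONBLOCK);
	if (epfd < 0 || legacy_fd < 0 || ring_fd < 0)
		goto out;
	if (watch_fd(epfd, legacy_fd) || watch_fd(epfd, ring_fd))
		goto out;

	/* Make both sources readable, then wait on them together */
	if (write(legacy_fd, &one, sizeof(one)) != (ssize_t)sizeof(one) ||
	    write(ring_fd, &one, sizeof(one)) != (ssize_t)sizeof(one))
		goto out;
	n = reap_ready(epfd, evs, 8);
out:
	if (ring_fd >= 0)
		close(ring_fd);
	if (legacy_fd >= 0)
		close(legacy_fd);
	if (epfd >= 0)
		close(epfd);
	return n;
}
```

In this pattern every wakeup funnels through epoll_wait(), so the loop
cannot rely on io_uring's own batching or timeout handling when waiting.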

Rather than use this approach, with EPOLL_WAIT support added to
io_uring, event loops can use the normal io_uring wait logic for
everything, as long as an epoll wait request has been armed with
io_uring.

Note that IORING_OP_EPOLL_WAIT does NOT take a timeout value, as this
is an async request. Waiting on io_uring events in general has various
timeout parameters, and those are the ones that should be used when
waiting on any kind of request. If events are immediately available for
reaping, then this opcode will return those immediately. If none are
available, then it will post an async completion when they become
available.

cqe->res will contain either an error code (< 0 value) for a malformed
request, invalid epoll instance, etc., or a positive result indicating
how many events were reaped.

IORING_OP_EPOLL_WAIT requests may be canceled using the normal io_uring
cancelation infrastructure. The poll logic for managing ownership is
adopted to guard the epoll side too.

Signed-off-by: Jens Axboe
---
 include/linux/io_uring_types.h |   4 +
 include/uapi/linux/io_uring.h  |   1 +
 io_uring/cancel.c              |   5 +
 io_uring/epoll.c               | 169 +++++++++++++++++++++++++++++++++
 io_uring/epoll.h               |  22 +++++
 io_uring/io_uring.c            |   5 +
 io_uring/opdef.c               |  14 +++
 7 files changed, 220 insertions(+)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 3def525a1da3..ee56992d31d5 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -370,6 +370,10 @@ struct io_ring_ctx {
 	struct io_alloc_cache	futex_cache;
 #endif
 
+#ifdef CONFIG_EPOLL
+	struct hlist_head	epoll_list;
+#endif
+
 	const struct cred	*sq_creds;	/* cred used for __io_sq_thread() */
 	struct io_sq_data	*sq_data;	/* if using sq thread polling */
 
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index e11c82638527..a559e1e1544a 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -278,6 +278,7 @@ enum io_uring_op {
 	IORING_OP_FTRUNCATE,
 	IORING_OP_BIND,
 	IORING_OP_LISTEN,
+	IORING_OP_EPOLL_WAIT,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
diff --git a/io_uring/cancel.c b/io_uring/cancel.c
index 484193567839..9cebd0145cb4 100644
--- a/io_uring/cancel.c
+++ b/io_uring/cancel.c
@@ -17,6 +17,7 @@
 #include "timeout.h"
 #include "waitid.h"
 #include "futex.h"
+#include "epoll.h"
 #include "cancel.h"
 
 struct io_cancel {
@@ -128,6 +129,10 @@ int io_try_cancel(struct io_uring_task *tctx, struct io_cancel_data *cd,
 	if (ret != -ENOENT)
 		return ret;
 
+	ret = io_epoll_wait_cancel(ctx, cd, issue_flags);
+	if (ret != -ENOENT)
+		return ret;
+
 	spin_lock(&ctx->completion_lock);
 	if (!(cd->flags & IORING_ASYNC_CANCEL_FD))
 		ret = io_timeout_cancel(ctx, cd);
diff --git a/io_uring/epoll.c b/io_uring/epoll.c
index 7848d9cc073d..5a47f0cce647 100644
--- a/io_uring/epoll.c
+++ b/io_uring/epoll.c
@@ -11,6 +11,7 @@
 
 #include "io_uring.h"
 #include "epoll.h"
+#include "poll.h"
 
 struct io_epoll {
 	struct file			*file;
@@ -20,6 +21,13 @@ struct io_epoll {
 	struct epoll_event		event;
 };
 
+struct io_epoll_wait {
+	struct file			*file;
+	int				maxevents;
+	struct epoll_event __user	*events;
+	struct wait_queue_entry		wait;
+};
+
 int io_epoll_ctl_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_epoll *epoll = io_kiocb_to_cmd(req, struct io_epoll);
@@ -57,3 +65,164 @@ int io_epoll_ctl(struct io_kiocb *req, unsigned int issue_flags)
 	io_req_set_res(req, ret, 0);
 	return IOU_OK;
 }
+
+static void __io_epoll_finish(struct io_kiocb *req, int res)
+{
+	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);
+
+	lockdep_assert_held(&req->ctx->uring_lock);
+
+	epoll_wait_remove(req->file, &iew->wait);
+	hlist_del_init(&req->hash_node);
+	io_req_set_res(req, res, 0);
+	req->io_task_work.func = io_req_task_complete;
+	io_req_task_work_add(req);
+}
+
+static void __io_epoll_cancel(struct io_kiocb *req)
+{
+	__io_epoll_finish(req, -ECANCELED);
+}
+
+static void __io_epoll_wait_cancel(struct io_kiocb *req)
+{
+	io_poll_mark_cancelled(req);
+	if (io_poll_get_ownership(req))
+		__io_epoll_cancel(req);
+}
+
+bool io_epoll_wait_remove_all(struct io_ring_ctx *ctx, struct io_uring_task *tctx,
+			      bool cancel_all)
+{
+	struct hlist_node *tmp;
+	struct io_kiocb *req;
+	bool found = false;
+
+	lockdep_assert_held(&ctx->uring_lock);
+
+	hlist_for_each_entry_safe(req, tmp, &ctx->epoll_list, hash_node) {
+		if (!io_match_task_safe(req, tctx, cancel_all))
+			continue;
+		__io_epoll_wait_cancel(req);
+		found = true;
+	}
+
+	return found;
+}
+
+int io_epoll_wait_cancel(struct io_ring_ctx *ctx, struct io_cancel_data *cd,
+			 unsigned int issue_flags)
+{
+	struct hlist_node *tmp;
+	struct io_kiocb *req;
+	int nr = 0;
+
+	io_ring_submit_lock(ctx, issue_flags);
+	hlist_for_each_entry_safe(req, tmp, &ctx->epoll_list, hash_node) {
+		if (!io_cancel_req_match(req, cd))
+			continue;
+		__io_epoll_wait_cancel(req);
+		nr++;
+	}
+	io_ring_submit_unlock(ctx, issue_flags);
+	return nr ?: -ENOENT;
+}
+
+static void io_epoll_retry(struct io_kiocb *req, struct io_tw_state *ts)
+{
+	int v;
+
+	do {
+		v = atomic_read(&req->poll_refs);
+		if (unlikely(v != 1)) {
+			if (WARN_ON_ONCE(!(v & IO_POLL_REF_MASK)))
+				return;
+			if (v & IO_POLL_CANCEL_FLAG) {
+				__io_epoll_cancel(req);
+				return;
+			}
+			if (v & IO_POLL_FINISH_FLAG)
+				return;
+		}
+		v &= IO_POLL_REF_MASK;
+	} while (atomic_sub_return(v, &req->poll_refs) & IO_POLL_REF_MASK);
+
+	io_req_task_submit(req, ts);
+}
+
+static int io_epoll_execute(struct io_kiocb *req)
+{
+	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);
+
+	list_del_init_careful(&iew->wait.entry);
+	if (io_poll_get_ownership(req)) {
+		req->io_task_work.func = io_epoll_retry;
+		io_req_task_work_add(req);
+	}
+
+	return 1;
+}
+
+static __cold int io_epoll_pollfree_wake(struct io_kiocb *req)
+{
+	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);
+
+	io_poll_mark_cancelled(req);
+	list_del_init_careful(&iew->wait.entry);
+	io_epoll_execute(req);
+	return 1;
+}
+
+static int io_epoll_wait_fn(struct wait_queue_entry *wait, unsigned mode,
+			    int sync, void *key)
+{
+	struct io_kiocb *req = wait->private;
+	__poll_t mask = key_to_poll(key);
+
+	if (unlikely(mask & POLLFREE))
+		return io_epoll_pollfree_wake(req);
+
+	return io_epoll_execute(req);
+}
+
+int io_epoll_wait_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);
+
+	if (sqe->off || sqe->rw_flags || sqe->buf_index || sqe->splice_fd_in)
+		return -EINVAL;
+
+	iew->maxevents = READ_ONCE(sqe->len);
+	iew->events = u64_to_user_ptr(READ_ONCE(sqe->addr));
+
+	iew->wait.flags = 0;
+	iew->wait.private = req;
+	iew->wait.func = io_epoll_wait_fn;
+	INIT_LIST_HEAD(&iew->wait.entry);
+	INIT_HLIST_NODE(&req->hash_node);
+	atomic_set(&req->poll_refs, 0);
+	return 0;
+}
+
+int io_epoll_wait(struct io_kiocb *req, unsigned int issue_flags)
+{
+	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);
+	struct io_ring_ctx *ctx = req->ctx;
+	int ret;
+
+	io_ring_submit_lock(ctx, issue_flags);
+
+	ret = epoll_wait(req->file, iew->events, iew->maxevents, NULL, &iew->wait);
+	if (ret == -EIOCBQUEUED) {
+		if (hlist_unhashed(&req->hash_node))
+			hlist_add_head(&req->hash_node, &ctx->epoll_list);
+		io_ring_submit_unlock(ctx, issue_flags);
+		return IOU_ISSUE_SKIP_COMPLETE;
+	} else if (ret < 0) {
+		req_set_fail(req);
+	}
+	hlist_del_init(&req->hash_node);
+	io_ring_submit_unlock(ctx, issue_flags);
+	io_req_set_res(req, ret, 0);
+	return IOU_OK;
+}
diff --git a/io_uring/epoll.h b/io_uring/epoll.h
index 870cce11ba98..296940d89063 100644
--- a/io_uring/epoll.h
+++ b/io_uring/epoll.h
@@ -1,6 +1,28 @@
 // SPDX-License-Identifier: GPL-2.0

+#include "cancel.h"
+
 #if defined(CONFIG_EPOLL)
+int io_epoll_wait_cancel(struct io_ring_ctx *ctx, struct io_cancel_data *cd,
+			 unsigned int issue_flags);
+bool io_epoll_wait_remove_all(struct io_ring_ctx *ctx, struct io_uring_task *tctx,
+			      bool cancel_all);
+
 int io_epoll_ctl_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
 int io_epoll_ctl(struct io_kiocb *req, unsigned int issue_flags);
+int io_epoll_wait_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
+int io_epoll_wait(struct io_kiocb *req, unsigned int issue_flags);
+#else
+static inline bool io_epoll_wait_remove_all(struct io_ring_ctx *ctx,
+					    struct io_uring_task *tctx,
+					    bool cancel_all)
+{
+	return false;
+}
+static inline int io_epoll_wait_cancel(struct io_ring_ctx *ctx,
+				       struct io_cancel_data *cd,
+				       unsigned int issue_flags)
+{
+	return 0;
+}
 #endif
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index e34a92c73a5d..78375981907d 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -93,6 +93,7 @@
 #include "notif.h"
 #include "waitid.h"
 #include "futex.h"
+#include "epoll.h"
 #include "napi.h"
 #include "uring_cmd.h"
 #include "msg_ring.h"
@@ -358,6 +359,9 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
 	INIT_HLIST_HEAD(&ctx->waitid_list);
 #ifdef CONFIG_FUTEX
 	INIT_HLIST_HEAD(&ctx->futex_list);
+#endif
+#ifdef CONFIG_EPOLL
+	INIT_HLIST_HEAD(&ctx->epoll_list);
 #endif
 	INIT_DELAYED_WORK(&ctx->fallback_work, io_fallback_req_func);
 	INIT_WQ_LIST(&ctx->submit_state.compl_reqs);
@@ -3084,6 +3088,7 @@ static __cold bool io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
 	ret |= io_poll_remove_all(ctx, tctx, cancel_all);
 	ret |= io_waitid_remove_all(ctx, tctx, cancel_all);
 	ret |= io_futex_remove_all(ctx, tctx, cancel_all);
+	ret |= io_epoll_wait_remove_all(ctx, tctx, cancel_all);
 	ret |= io_uring_try_cancel_uring_cmd(ctx, tctx, cancel_all);
 	mutex_unlock(&ctx->uring_lock);
 	ret |= io_kill_timeouts(ctx, tctx, cancel_all);
diff --git a/io_uring/opdef.c b/io_uring/opdef.c
index e8baef4e5146..44553a657476 100644
--- a/io_uring/opdef.c
+++ b/io_uring/opdef.c
@@ -514,6 +514,17 @@ const struct io_issue_def io_issue_defs[] = {
 		.async_size = sizeof(struct io_async_msghdr),
 #else
 		.prep = io_eopnotsupp_prep,
+#endif
+	},
+	[IORING_OP_EPOLL_WAIT] = {
+		.needs_file = 1,
+		.unbound_nonreg_file = 1,
+		.audit_skip = 1,
+#if defined(CONFIG_EPOLL)
+		.prep = io_epoll_wait_prep,
+		.issue = io_epoll_wait,
+#else
+		.prep = io_eopnotsupp_prep,
 #endif
 	},
 };
@@ -745,6 +756,9 @@ const struct io_cold_def io_cold_defs[] = {
 	[IORING_OP_LISTEN] = {
 		.name = "LISTEN",
 	},
+	[IORING_OP_EPOLL_WAIT] = {
+		.name = "EPOLL_WAIT",
+	},
 };

 const char *io_uring_get_opcode(u8 opcode)

From patchwork Tue Feb 4 19:46:44 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13959715
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 10/11] io_uring/epoll: add support for provided buffers
Date: Tue, 4 Feb 2025 12:46:44 -0700
Message-ID: <20250204194814.393112-11-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250204194814.393112-1-axboe@kernel.dk>
References: <20250204194814.393112-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org
MIME-Version: 1.0

This will be a prerequisite for adding multishot support, but can be
used with single shot support as well. It works like any other request
that supports provided buffers: set addr to NULL, set sqe->buf_group,
and set IOSQE_BUFFER_SELECT in sqe->flags. Epoll wait will then pick a
buffer from that group and store the events there.
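
A minimal userspace sketch of the SQE setup described above, for illustration only. It assumes an installed `<linux/io_uring.h>` that provides `struct io_uring_sqe` and `IOSQE_BUFFER_SELECT`. The helper name `prep_epoll_wait_bufsel` is hypothetical, and the opcode is passed in by the caller because installed headers may predate `IORING_OP_EPOLL_WAIT`. Registering the buffer group itself is not shown.

```c
#include <assert.h>
#include <string.h>
#include <linux/io_uring.h>

/*
 * Hypothetical helper, not part of this series or of liburing.
 * Fills an SQE for an epoll wait that takes its event array from a
 * provided buffer group instead of a caller-supplied pointer.
 */
static void prep_epoll_wait_bufsel(struct io_uring_sqe *sqe, __u8 opcode,
				   int epfd, unsigned int maxevents,
				   __u16 buf_group)
{
	assert(sqe != NULL);

	memset(sqe, 0, sizeof(*sqe));
	sqe->opcode = opcode;		/* IORING_OP_EPOLL_WAIT on a patched kernel */
	sqe->fd = epfd;			/* the epoll instance to wait on */
	sqe->addr = 0;			/* must stay NULL when a buffer is selected */
	sqe->len = maxevents;		/* upper bound on events per completion */
	sqe->buf_group = buf_group;	/* group the kernel picks a buffer from */
	sqe->flags |= IOSQE_BUFFER_SELECT;
}
```

With the prep changes in this patch, a non-NULL addr combined with IOSQE_BUFFER_SELECT is rejected with -EINVAL, which is why the sketch clears the whole SQE first.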

Signed-off-by: Jens Axboe
---
 io_uring/epoll.c | 31 +++++++++++++++++++++++++++----
 io_uring/opdef.c |  1 +
 2 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/io_uring/epoll.c b/io_uring/epoll.c
index 5a47f0cce647..134112e7a505 100644
--- a/io_uring/epoll.c
+++ b/io_uring/epoll.c
@@ -10,6 +10,7 @@
 #include

 #include "io_uring.h"
+#include "kbuf.h"
 #include "epoll.h"
 #include "poll.h"

@@ -189,11 +190,13 @@ int io_epoll_wait_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);

-	if (sqe->off || sqe->rw_flags || sqe->buf_index || sqe->splice_fd_in)
+	if (sqe->off || sqe->rw_flags || sqe->splice_fd_in)
 		return -EINVAL;

 	iew->maxevents = READ_ONCE(sqe->len);
 	iew->events = u64_to_user_ptr(READ_ONCE(sqe->addr));
+	if (req->flags & REQ_F_BUFFER_SELECT && iew->events)
+		return -EINVAL;

 	iew->wait.flags = 0;
 	iew->wait.private = req;
@@ -207,22 +210,42 @@ int io_epoll_wait_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 int io_epoll_wait(struct io_kiocb *req, unsigned int issue_flags)
 {
 	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);
+	struct epoll_event __user *evs = iew->events;
 	struct io_ring_ctx *ctx = req->ctx;
+	int maxevents = iew->maxevents;
+	unsigned int cflags = 0;
 	int ret;

 	io_ring_submit_lock(ctx, issue_flags);

-	ret = epoll_wait(req->file, iew->events, iew->maxevents, NULL, &iew->wait);
+	if (io_do_buffer_select(req)) {
+		size_t len = iew->maxevents * sizeof(*evs);
+
+		evs = io_buffer_select(req, &len, 0);
+		if (!evs) {
+			ret = -ENOBUFS;
+			goto err;
+		}
+		maxevents = len / sizeof(*evs);
+	}
+
+	ret = epoll_wait(req->file, evs, maxevents, NULL, &iew->wait);
 	if (ret == -EIOCBQUEUED) {
+		io_kbuf_recycle(req, 0);
 		if (hlist_unhashed(&req->hash_node))
 			hlist_add_head(&req->hash_node, &ctx->epoll_list);
 		io_ring_submit_unlock(ctx, issue_flags);
 		return IOU_ISSUE_SKIP_COMPLETE;
-	} else if (ret < 0) {
+	} else if (ret > 0) {
+		cflags = io_put_kbuf(req, ret * sizeof(*evs), 0);
+	} else if (!ret) {
+		io_kbuf_recycle(req, 0);
+	} else {
+err:
 		req_set_fail(req);
 	}
 	hlist_del_init(&req->hash_node);
 	io_ring_submit_unlock(ctx, issue_flags);
-	io_req_set_res(req, ret, 0);
+	io_req_set_res(req, ret, cflags);
 	return IOU_OK;
 }
diff --git a/io_uring/opdef.c b/io_uring/opdef.c
index 44553a657476..04ff2b438531 100644
--- a/io_uring/opdef.c
+++ b/io_uring/opdef.c
@@ -520,6 +520,7 @@ const struct io_issue_def io_issue_defs[] = {
 		.needs_file = 1,
 		.unbound_nonreg_file = 1,
 		.audit_skip = 1,
+		.buffer_select = 1,
 #if defined(CONFIG_EPOLL)
 		.prep = io_epoll_wait_prep,
 		.issue = io_epoll_wait,

From patchwork Tue Feb 4 19:46:45 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13959716
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 11/11] io_uring/epoll: add multishot support for IORING_OP_EPOLL_WAIT
Date: Tue, 4 Feb 2025 12:46:45 -0700
Message-ID: <20250204194814.393112-12-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250204194814.393112-1-axboe@kernel.dk>
References: <20250204194814.393112-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org
MIME-Version: 1.0

As with other multishot requests, submitting a multishot epoll wait
request will keep it re-armed post the initial trigger. This allows
multiple epoll wait completions per request submitted, every time
events are available.
If more completions are expected for this epoll wait request, then
IORING_CQE_F_MORE will be set in the posted cqe->flags.

For multishot, the request remains on the epoll callback waitqueue
head. This means that epoll doesn't need to juggle the ep->lock
writelock (and disable/enable IRQs) for each invocation of the
reaping loop. That should translate into nice efficiency gains.

Enable it by setting IORING_EPOLL_WAIT_MULTISHOT in the sqe->epoll_flags
member. It must be used with provided buffers.

Signed-off-by: Jens Axboe
---
 include/uapi/linux/io_uring.h |  6 +++++
 io_uring/epoll.c              | 46 ++++++++++++++++++++++++++++-------
 2 files changed, 43 insertions(+), 9 deletions(-)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index a559e1e1544a..93f504b6d4ec 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -73,6 +73,7 @@ struct io_uring_sqe {
 		__u32 futex_flags;
 		__u32 install_fd_flags;
 		__u32 nop_flags;
+		__u32 epoll_flags;
 	};
 	__u64 user_data; /* data to be passed back at completion time */
 	/* pack this to avoid bogus arm OABI complaints */
@@ -405,6 +406,11 @@ enum io_uring_op {
 #define IORING_ACCEPT_DONTWAIT (1U << 1)
 #define IORING_ACCEPT_POLL_FIRST (1U << 2)

+/*
+ * epoll_wait flags, stored in sqe->epoll_flags
+ */
+#define IORING_EPOLL_WAIT_MULTISHOT (1U << 0)
+
 /*
  * IORING_OP_MSG_RING command types, stored in sqe->addr
  */
diff --git a/io_uring/epoll.c b/io_uring/epoll.c
index 134112e7a505..2474f2e069ef 100644
--- a/io_uring/epoll.c
+++ b/io_uring/epoll.c
@@ -25,6 +25,7 @@ struct io_epoll {
 struct io_epoll_wait {
 	struct file *file;
 	int maxevents;
+	int flags;
 	struct epoll_event __user *events;
 	struct wait_queue_entry wait;
 };
@@ -151,11 +152,12 @@ static void io_epoll_retry(struct io_kiocb *req, struct io_tw_state *ts)
 	io_req_task_submit(req, ts);
 }

-static int io_epoll_execute(struct io_kiocb *req)
+static int io_epoll_execute(struct io_kiocb *req, __poll_t mask)
 {
 	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);

-	list_del_init_careful(&iew->wait.entry);
+	if (mask & EPOLL_URING_WAKE || !(req->flags & REQ_F_APOLL_MULTISHOT))
+		list_del_init_careful(&iew->wait.entry);
 	if (io_poll_get_ownership(req)) {
 		req->io_task_work.func = io_epoll_retry;
 		io_req_task_work_add(req);
@@ -164,13 +166,13 @@ static int io_epoll_execute(struct io_kiocb *req)
 	return 1;
 }

-static __cold int io_epoll_pollfree_wake(struct io_kiocb *req)
+static __cold int io_epoll_pollfree_wake(struct io_kiocb *req, __poll_t mask)
 {
 	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);

 	io_poll_mark_cancelled(req);
 	list_del_init_careful(&iew->wait.entry);
-	io_epoll_execute(req);
+	io_epoll_execute(req, mask);
 	return 1;
 }

@@ -181,20 +183,28 @@ static int io_epoll_wait_fn(struct wait_queue_entry *wait, unsigned mode,
 	__poll_t mask = key_to_poll(key);

 	if (unlikely(mask & POLLFREE))
-		return io_epoll_pollfree_wake(req);
+		return io_epoll_pollfree_wake(req, mask);

-	return io_epoll_execute(req);
+	return io_epoll_execute(req, mask);
 }

 int io_epoll_wait_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);

-	if (sqe->off || sqe->rw_flags || sqe->splice_fd_in)
+	if (sqe->off || sqe->splice_fd_in)
 		return -EINVAL;

 	iew->maxevents = READ_ONCE(sqe->len);
 	iew->events = u64_to_user_ptr(READ_ONCE(sqe->addr));
+	iew->flags = READ_ONCE(sqe->epoll_flags);
+	if (iew->flags & ~IORING_EPOLL_WAIT_MULTISHOT) {
+		return -EINVAL;
+	} else if (iew->flags & IORING_EPOLL_WAIT_MULTISHOT) {
+		if (!(req->flags & REQ_F_BUFFER_SELECT))
+			return -EINVAL;
+		req->flags |= REQ_F_APOLL_MULTISHOT;
+	}
 	if (req->flags & REQ_F_BUFFER_SELECT && iew->events)
 		return -EINVAL;

@@ -217,7 +227,7 @@ int io_epoll_wait(struct io_kiocb *req, unsigned int issue_flags)
 	int ret;

 	io_ring_submit_lock(ctx, issue_flags);
-
+retry:
 	if (io_do_buffer_select(req)) {
 		size_t len = iew->maxevents * sizeof(*evs);

@@ -238,14 +248,32 @@ int io_epoll_wait(struct io_kiocb *req, unsigned int issue_flags)
 		return IOU_ISSUE_SKIP_COMPLETE;
 	} else if (ret > 0) {
 		cflags = io_put_kbuf(req, ret * sizeof(*evs), 0);
+		if (req->flags & REQ_F_BL_EMPTY)
+			goto stop_multi;
+		if (req->flags & REQ_F_APOLL_MULTISHOT) {
+			if (io_req_post_cqe(req, ret, cflags | IORING_CQE_F_MORE))
+				goto retry;
+			goto stop_multi;
+		}
 	} else if (!ret) {
 		io_kbuf_recycle(req, 0);
 	} else {
 err:
 		req_set_fail(req);
+		if (req->flags & REQ_F_APOLL_MULTISHOT) {
+stop_multi:
+			atomic_or(IO_POLL_FINISH_FLAG, &req->poll_refs);
+			io_poll_multishot_retry(req);
+			if (!list_empty_careful(&iew->wait.entry))
+				epoll_wait_remove(req->file, &iew->wait);
+			req->flags &= ~REQ_F_APOLL_MULTISHOT;
+		}
 	}
-	hlist_del_init(&req->hash_node);
+	if (!(req->flags & REQ_F_APOLL_MULTISHOT))
+		hlist_del_init(&req->hash_node);
 	io_ring_submit_unlock(ctx, issue_flags);
 	io_req_set_res(req, ret, cflags);
+	if (issue_flags & IO_URING_F_MULTISHOT)
+		return IOU_STOP_MULTISHOT;
 	return IOU_OK;
 }