From patchwork Tue Jan 28 13:39:20 2025
X-Patchwork-Submitter: Max Kellermann
X-Patchwork-Id: 13952564
From: Max Kellermann
To: axboe@kernel.dk, asml.silence@gmail.com, io-uring@vger.kernel.org,
	linux-kernel@vger.kernel.org
Cc: Max Kellermann
Subject: [PATCH 1/8] io_uring/io-wq: eliminate redundant io_work_get_acct() calls
Date: Tue, 28 Jan 2025 14:39:20 +0100
Message-ID: <20250128133927.3989681-2-max.kellermann@ionos.com>
In-Reply-To: <20250128133927.3989681-1-max.kellermann@ionos.com>
References: <20250128133927.3989681-1-max.kellermann@ionos.com>

Instead of calling io_work_get_acct() again, pass the already-known
`acct` pointer to io_wq_insert_work() and io_wq_remove_pending(). The
atomic access in io_work_get_acct() was performed while holding
`acct->lock`; optimizing it away reduces lock contention a bit.
Signed-off-by: Max Kellermann
---
 io_uring/io-wq.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
index 5d0928f37471..6d26f6f068af 100644
--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
@@ -903,9 +903,8 @@ static void io_run_cancel(struct io_wq_work *work, struct io_wq *wq)
 	} while (work);
 }
 
-static void io_wq_insert_work(struct io_wq *wq, struct io_wq_work *work)
+static void io_wq_insert_work(struct io_wq *wq, struct io_wq_acct *acct, struct io_wq_work *work)
 {
-	struct io_wq_acct *acct = io_work_get_acct(wq, work);
 	unsigned int hash;
 	struct io_wq_work *tail;
 
@@ -951,7 +950,7 @@ void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work)
 	}
 
 	raw_spin_lock(&acct->lock);
-	io_wq_insert_work(wq, work);
+	io_wq_insert_work(wq, acct, work);
 	clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
 	raw_spin_unlock(&acct->lock);
 
@@ -1021,10 +1020,10 @@ static bool io_wq_worker_cancel(struct io_worker *worker, void *data)
 }
 
 static inline void io_wq_remove_pending(struct io_wq *wq,
+					struct io_wq_acct *acct,
 					struct io_wq_work *work,
 					struct io_wq_work_node *prev)
 {
-	struct io_wq_acct *acct = io_work_get_acct(wq, work);
 	unsigned int hash = io_get_work_hash(work);
 	struct io_wq_work *prev_work = NULL;
 
@@ -1051,7 +1050,7 @@ static bool io_acct_cancel_pending_work(struct io_wq *wq,
 		work = container_of(node, struct io_wq_work, list);
 		if (!match->fn(work, match->data))
 			continue;
-		io_wq_remove_pending(wq, work, prev);
+		io_wq_remove_pending(wq, acct, work, prev);
 		raw_spin_unlock(&acct->lock);
 		io_run_cancel(work, wq);
 		match->nr_pending++;

From patchwork Tue Jan 28 13:39:21 2025
X-Patchwork-Submitter: Max Kellermann
X-Patchwork-Id: 13952566
From: Max Kellermann
To: axboe@kernel.dk, asml.silence@gmail.com, io-uring@vger.kernel.org,
	linux-kernel@vger.kernel.org
Cc: Max Kellermann
Subject: [PATCH 2/8] io_uring/io-wq: add io_worker.acct pointer
Date: Tue, 28 Jan 2025 14:39:21 +0100
Message-ID: <20250128133927.3989681-3-max.kellermann@ionos.com>
In-Reply-To: <20250128133927.3989681-1-max.kellermann@ionos.com>
References: <20250128133927.3989681-1-max.kellermann@ionos.com>

This replaces the `IO_WORKER_F_BOUND` flag. None of the code that
checks this flag actually cares whether the worker is "bound"; the
flag is only used to derive the `io_wq_acct` pointer. At the cost of
one extra pointer field, we can eliminate that fragile pointer
arithmetic. In turn, the `create_index` and `index` fields are no
longer needed.
Signed-off-by: Max Kellermann
---
 io_uring/io-wq.c | 23 ++++++++---------------
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
index 6d26f6f068af..197352ef78c7 100644
--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
@@ -30,7 +30,6 @@ enum {
 	IO_WORKER_F_UP		= 0,	/* up and active */
 	IO_WORKER_F_RUNNING	= 1,	/* account as running */
 	IO_WORKER_F_FREE	= 2,	/* worker on free list */
-	IO_WORKER_F_BOUND	= 3,	/* is doing bounded work */
 };
 
 enum {
@@ -46,12 +45,12 @@ enum {
  */
 struct io_worker {
 	refcount_t ref;
-	int create_index;
 	unsigned long flags;
 	struct hlist_nulls_node nulls_node;
 	struct list_head all_list;
 	struct task_struct *task;
 	struct io_wq *wq;
+	struct io_wq_acct *acct;
 
 	struct io_wq_work *cur_work;
 	raw_spinlock_t lock;
@@ -79,7 +78,6 @@ struct io_worker {
 struct io_wq_acct {
 	unsigned nr_workers;
 	unsigned max_workers;
-	int index;
 	atomic_t nr_running;
 	raw_spinlock_t lock;
 	struct io_wq_work_list work_list;
@@ -135,7 +133,7 @@ struct io_cb_cancel_data {
 	bool cancel_all;
 };
 
-static bool create_io_worker(struct io_wq *wq, int index);
+static bool create_io_worker(struct io_wq *wq, struct io_wq_acct *acct);
 static void io_wq_dec_running(struct io_worker *worker);
 static bool io_acct_cancel_pending_work(struct io_wq *wq,
 					struct io_wq_acct *acct,
@@ -167,7 +165,7 @@ static inline struct io_wq_acct *io_work_get_acct(struct io_wq *wq,
 
 static inline struct io_wq_acct *io_wq_get_acct(struct io_worker *worker)
 {
-	return io_get_acct(worker->wq, test_bit(IO_WORKER_F_BOUND, &worker->flags));
+	return worker->acct;
 }
 
 static void io_worker_ref_put(struct io_wq *wq)
@@ -323,7 +321,7 @@ static bool io_wq_create_worker(struct io_wq *wq, struct io_wq_acct *acct)
 	raw_spin_unlock(&wq->lock);
 	atomic_inc(&acct->nr_running);
 	atomic_inc(&wq->worker_refs);
-	return create_io_worker(wq, acct->index);
+	return create_io_worker(wq, acct);
 }
 
 static void io_wq_inc_running(struct io_worker *worker)
@@ -343,7 +341,7 @@ static void create_worker_cb(struct callback_head *cb)
 	worker = container_of(cb, struct io_worker, create_work);
 	wq = worker->wq;
-	acct = &wq->acct[worker->create_index];
+	acct = worker->acct;
 	raw_spin_lock(&wq->lock);
 	if (acct->nr_workers < acct->max_workers) {
@@ -352,7 +350,7 @@ static void create_worker_cb(struct callback_head *cb)
 	}
 	raw_spin_unlock(&wq->lock);
 	if (do_create) {
-		create_io_worker(wq, worker->create_index);
+		create_io_worker(wq, acct);
 	} else {
 		atomic_dec(&acct->nr_running);
 		io_worker_ref_put(wq);
@@ -384,7 +382,6 @@ static bool io_queue_worker_create(struct io_worker *worker,
 	atomic_inc(&wq->worker_refs);
 
 	init_task_work(&worker->create_work, func);
-	worker->create_index = acct->index;
 	if (!task_work_add(wq->task, &worker->create_work, TWA_SIGNAL)) {
 		/*
 		 * EXIT may have been set after checking it above, check after
@@ -821,9 +818,8 @@ static void io_workqueue_create(struct work_struct *work)
 		kfree(worker);
 }
 
-static bool create_io_worker(struct io_wq *wq, int index)
+static bool create_io_worker(struct io_wq *wq, struct io_wq_acct *acct)
 {
-	struct io_wq_acct *acct = &wq->acct[index];
 	struct io_worker *worker;
 	struct task_struct *tsk;
 
@@ -842,12 +838,10 @@ static bool create_io_worker(struct io_wq *wq, int index)
 
 	refcount_set(&worker->ref, 1);
 	worker->wq = wq;
+	worker->acct = acct;
 	raw_spin_lock_init(&worker->lock);
 	init_completion(&worker->ref_done);
 
-	if (index == IO_WQ_ACCT_BOUND)
-		set_bit(IO_WORKER_F_BOUND, &worker->flags);
-
 	tsk = create_io_thread(io_wq_worker, worker, NUMA_NO_NODE);
 	if (!IS_ERR(tsk)) {
 		io_init_new_worker(wq, worker, tsk);
@@ -1176,7 +1170,6 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
 	for (i = 0; i < IO_WQ_ACCT_NR; i++) {
 		struct io_wq_acct *acct = &wq->acct[i];
 
-		acct->index = i;
 		atomic_set(&acct->nr_running, 0);
 		INIT_WQ_LIST(&acct->work_list);
 		raw_spin_lock_init(&acct->lock);

From patchwork Tue Jan 28 13:39:22 2025
X-Patchwork-Submitter: Max Kellermann
X-Patchwork-Id: 13952568
From: Max Kellermann
To: axboe@kernel.dk, asml.silence@gmail.com, io-uring@vger.kernel.org,
	linux-kernel@vger.kernel.org
Cc: Max Kellermann
Subject: [PATCH 3/8] io_uring/io-wq: move worker lists to struct io_wq_acct
Date: Tue, 28 Jan 2025 14:39:22 +0100
Message-ID: <20250128133927.3989681-4-max.kellermann@ionos.com>
In-Reply-To: <20250128133927.3989681-1-max.kellermann@ionos.com>
References: <20250128133927.3989681-1-max.kellermann@ionos.com>

Have separate linked lists for bounded and unbounded workers. This
way, io_acct_activate_free_worker() sees only workers relevant to it
and does not need to skip irrelevant ones, which speeds up the linked
list traversal (under acct->lock).

The `io_wq.lock` field is moved to `io_wq_acct.workers_lock`. It did
not actually protect "access to elements below" (not all of them,
anyway); it only protected access to the worker lists. Having two
locks instead of one reduces contention on this lock.
Signed-off-by: Max Kellermann --- io_uring/io-wq.c | 162 ++++++++++++++++++++++++++++------------------- 1 file changed, 96 insertions(+), 66 deletions(-) diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c index 197352ef78c7..dfdd45ebe4bb 100644 --- a/io_uring/io-wq.c +++ b/io_uring/io-wq.c @@ -76,9 +76,27 @@ struct io_worker { #define IO_WQ_NR_HASH_BUCKETS (1u << IO_WQ_HASH_ORDER) struct io_wq_acct { + /** + * Protects access to the worker lists. + */ + raw_spinlock_t workers_lock; + unsigned nr_workers; unsigned max_workers; atomic_t nr_running; + + /** + * The list of free workers. Protected by #workers_lock + * (write) and RCU (read). + */ + struct hlist_nulls_head free_list; + + /** + * The list of all workers. Protected by #workers_lock + * (write) and RCU (read). + */ + struct list_head all_list; + raw_spinlock_t lock; struct io_wq_work_list work_list; unsigned long flags; @@ -110,12 +128,6 @@ struct io_wq { struct io_wq_acct acct[IO_WQ_ACCT_NR]; - /* lock protects access to elements below */ - raw_spinlock_t lock; - - struct hlist_nulls_head free_list; - struct list_head all_list; - struct wait_queue_entry wait; struct io_wq_work *hash_tail[IO_WQ_NR_HASH_BUCKETS]; @@ -190,9 +202,9 @@ static void io_worker_cancel_cb(struct io_worker *worker) struct io_wq *wq = worker->wq; atomic_dec(&acct->nr_running); - raw_spin_lock(&wq->lock); + raw_spin_lock(&acct->workers_lock); acct->nr_workers--; - raw_spin_unlock(&wq->lock); + raw_spin_unlock(&acct->workers_lock); io_worker_ref_put(wq); clear_bit_unlock(0, &worker->create_state); io_worker_release(worker); @@ -211,6 +223,7 @@ static bool io_task_worker_match(struct callback_head *cb, void *data) static void io_worker_exit(struct io_worker *worker) { struct io_wq *wq = worker->wq; + struct io_wq_acct *acct = io_wq_get_acct(worker); while (1) { struct callback_head *cb = task_work_cancel_match(wq->task, @@ -224,11 +237,11 @@ static void io_worker_exit(struct io_worker *worker) io_worker_release(worker); 
wait_for_completion(&worker->ref_done); - raw_spin_lock(&wq->lock); + raw_spin_lock(&acct->workers_lock); if (test_bit(IO_WORKER_F_FREE, &worker->flags)) hlist_nulls_del_rcu(&worker->nulls_node); list_del_rcu(&worker->all_list); - raw_spin_unlock(&wq->lock); + raw_spin_unlock(&acct->workers_lock); io_wq_dec_running(worker); /* * this worker is a goner, clear ->worker_private to avoid any @@ -267,8 +280,7 @@ static inline bool io_acct_run_queue(struct io_wq_acct *acct) * Check head of free list for an available worker. If one isn't available, * caller must create one. */ -static bool io_wq_activate_free_worker(struct io_wq *wq, - struct io_wq_acct *acct) +static bool io_acct_activate_free_worker(struct io_wq_acct *acct) __must_hold(RCU) { struct hlist_nulls_node *n; @@ -279,13 +291,9 @@ static bool io_wq_activate_free_worker(struct io_wq *wq, * activate. If a given worker is on the free_list but in the process * of exiting, keep trying. */ - hlist_nulls_for_each_entry_rcu(worker, n, &wq->free_list, nulls_node) { + hlist_nulls_for_each_entry_rcu(worker, n, &acct->free_list, nulls_node) { if (!io_worker_get(worker)) continue; - if (io_wq_get_acct(worker) != acct) { - io_worker_release(worker); - continue; - } /* * If the worker is already running, it's either already * starting work or finishing work. 
In either case, if it does @@ -312,13 +320,13 @@ static bool io_wq_create_worker(struct io_wq *wq, struct io_wq_acct *acct) if (unlikely(!acct->max_workers)) pr_warn_once("io-wq is not configured for unbound workers"); - raw_spin_lock(&wq->lock); + raw_spin_lock(&acct->workers_lock); if (acct->nr_workers >= acct->max_workers) { - raw_spin_unlock(&wq->lock); + raw_spin_unlock(&acct->workers_lock); return true; } acct->nr_workers++; - raw_spin_unlock(&wq->lock); + raw_spin_unlock(&acct->workers_lock); atomic_inc(&acct->nr_running); atomic_inc(&wq->worker_refs); return create_io_worker(wq, acct); @@ -342,13 +350,13 @@ static void create_worker_cb(struct callback_head *cb) worker = container_of(cb, struct io_worker, create_work); wq = worker->wq; acct = worker->acct; - raw_spin_lock(&wq->lock); + raw_spin_lock(&acct->workers_lock); if (acct->nr_workers < acct->max_workers) { acct->nr_workers++; do_create = true; } - raw_spin_unlock(&wq->lock); + raw_spin_unlock(&acct->workers_lock); if (do_create) { create_io_worker(wq, acct); } else { @@ -427,25 +435,25 @@ static void io_wq_dec_running(struct io_worker *worker) * Worker will start processing some work. Move it to the busy list, if * it's currently on the freelist */ -static void __io_worker_busy(struct io_wq *wq, struct io_worker *worker) +static void __io_worker_busy(struct io_wq_acct *acct, struct io_worker *worker) { if (test_bit(IO_WORKER_F_FREE, &worker->flags)) { clear_bit(IO_WORKER_F_FREE, &worker->flags); - raw_spin_lock(&wq->lock); + raw_spin_lock(&acct->workers_lock); hlist_nulls_del_init_rcu(&worker->nulls_node); - raw_spin_unlock(&wq->lock); + raw_spin_unlock(&acct->workers_lock); } } /* * No work, worker going to sleep. Move to freelist. 
*/ -static void __io_worker_idle(struct io_wq *wq, struct io_worker *worker) - __must_hold(wq->lock) +static void __io_worker_idle(struct io_wq_acct *acct, struct io_worker *worker) + __must_hold(acct->workers_lock) { if (!test_bit(IO_WORKER_F_FREE, &worker->flags)) { set_bit(IO_WORKER_F_FREE, &worker->flags); - hlist_nulls_add_head_rcu(&worker->nulls_node, &wq->free_list); + hlist_nulls_add_head_rcu(&worker->nulls_node, &acct->free_list); } } @@ -580,7 +588,7 @@ static void io_worker_handle_work(struct io_wq_acct *acct, if (!work) break; - __io_worker_busy(wq, worker); + __io_worker_busy(acct, worker); io_assign_current_work(worker, work); __set_current_state(TASK_RUNNING); @@ -651,20 +659,20 @@ static int io_wq_worker(void *data) while (io_acct_run_queue(acct)) io_worker_handle_work(acct, worker); - raw_spin_lock(&wq->lock); + raw_spin_lock(&acct->workers_lock); /* * Last sleep timed out. Exit if we're not the last worker, * or if someone modified our affinity. */ if (last_timeout && (exit_mask || acct->nr_workers > 1)) { acct->nr_workers--; - raw_spin_unlock(&wq->lock); + raw_spin_unlock(&acct->workers_lock); __set_current_state(TASK_RUNNING); break; } last_timeout = false; - __io_worker_idle(wq, worker); - raw_spin_unlock(&wq->lock); + __io_worker_idle(acct, worker); + raw_spin_unlock(&acct->workers_lock); if (io_run_task_work()) continue; ret = schedule_timeout(WORKER_IDLE_TIMEOUT); @@ -725,18 +733,18 @@ void io_wq_worker_sleeping(struct task_struct *tsk) io_wq_dec_running(worker); } -static void io_init_new_worker(struct io_wq *wq, struct io_worker *worker, +static void io_init_new_worker(struct io_wq *wq, struct io_wq_acct *acct, struct io_worker *worker, struct task_struct *tsk) { tsk->worker_private = worker; worker->task = tsk; set_cpus_allowed_ptr(tsk, wq->cpu_mask); - raw_spin_lock(&wq->lock); - hlist_nulls_add_head_rcu(&worker->nulls_node, &wq->free_list); - list_add_tail_rcu(&worker->all_list, &wq->all_list); + raw_spin_lock(&acct->workers_lock); + 
 	hlist_nulls_add_head_rcu(&worker->nulls_node, &acct->free_list);
+	list_add_tail_rcu(&worker->all_list, &acct->all_list);
 	set_bit(IO_WORKER_F_FREE, &worker->flags);
-	raw_spin_unlock(&wq->lock);
+	raw_spin_unlock(&acct->workers_lock);
 	wake_up_new_task(tsk);
 }

@@ -772,20 +780,20 @@ static void create_worker_cont(struct callback_head *cb)
 	struct io_worker *worker;
 	struct task_struct *tsk;
 	struct io_wq *wq;
+	struct io_wq_acct *acct;

 	worker = container_of(cb, struct io_worker, create_work);
 	clear_bit_unlock(0, &worker->create_state);
 	wq = worker->wq;
+	acct = io_wq_get_acct(worker);
 	tsk = create_io_thread(io_wq_worker, worker, NUMA_NO_NODE);
 	if (!IS_ERR(tsk)) {
-		io_init_new_worker(wq, worker, tsk);
+		io_init_new_worker(wq, acct, worker, tsk);
 		io_worker_release(worker);
 		return;
 	} else if (!io_should_retry_thread(worker, PTR_ERR(tsk))) {
-		struct io_wq_acct *acct = io_wq_get_acct(worker);
-
 		atomic_dec(&acct->nr_running);
-		raw_spin_lock(&wq->lock);
+		raw_spin_lock(&acct->workers_lock);
 		acct->nr_workers--;
 		if (!acct->nr_workers) {
 			struct io_cb_cancel_data match = {
@@ -793,11 +801,11 @@ static void create_worker_cont(struct callback_head *cb)
 				.cancel_all	= true,
 			};

-			raw_spin_unlock(&wq->lock);
+			raw_spin_unlock(&acct->workers_lock);
 			while (io_acct_cancel_pending_work(wq, acct, &match))
 				;
 		} else {
-			raw_spin_unlock(&wq->lock);
+			raw_spin_unlock(&acct->workers_lock);
 		}
 		io_worker_ref_put(wq);
 		kfree(worker);
@@ -829,9 +837,9 @@ static bool create_io_worker(struct io_wq *wq, struct io_wq_acct *acct)
 	if (!worker) {
 fail:
 		atomic_dec(&acct->nr_running);
-		raw_spin_lock(&wq->lock);
+		raw_spin_lock(&acct->workers_lock);
 		acct->nr_workers--;
-		raw_spin_unlock(&wq->lock);
+		raw_spin_unlock(&acct->workers_lock);
 		io_worker_ref_put(wq);
 		return false;
 	}
@@ -844,7 +852,7 @@ static bool create_io_worker(struct io_wq *wq, struct io_wq_acct *acct)

 	tsk = create_io_thread(io_wq_worker, worker, NUMA_NO_NODE);
 	if (!IS_ERR(tsk)) {
-		io_init_new_worker(wq, worker, tsk);
+		io_init_new_worker(wq, acct, worker, tsk);
 	} else if (!io_should_retry_thread(worker, PTR_ERR(tsk))) {
 		kfree(worker);
 		goto fail;
@@ -860,14 +868,14 @@ static bool create_io_worker(struct io_wq *wq, struct io_wq_acct *acct)
  * Iterate the passed in list and call the specific function for each
  * worker that isn't exiting
  */
-static bool io_wq_for_each_worker(struct io_wq *wq,
-				  bool (*func)(struct io_worker *, void *),
-				  void *data)
+static bool io_acct_for_each_worker(struct io_wq_acct *acct,
+				    bool (*func)(struct io_worker *, void *),
+				    void *data)
 {
 	struct io_worker *worker;
 	bool ret = false;

-	list_for_each_entry_rcu(worker, &wq->all_list, all_list) {
+	list_for_each_entry_rcu(worker, &acct->all_list, all_list) {
 		if (io_worker_get(worker)) {
 			/* no task if node is/was offline */
 			if (worker->task)
@@ -881,6 +889,18 @@ static bool io_wq_for_each_worker(struct io_wq *wq,
 	return ret;
 }

+static bool io_wq_for_each_worker(struct io_wq *wq,
+				  bool (*func)(struct io_worker *, void *),
+				  void *data)
+{
+	for (int i = 0; i < IO_WQ_ACCT_NR; i++) {
+		if (!io_acct_for_each_worker(&wq->acct[i], func, data))
+			return false;
+	}
+
+	return true;
+}
+
 static bool io_wq_worker_wake(struct io_worker *worker, void *data)
 {
 	__set_notify_signal(worker->task);
@@ -949,7 +969,7 @@ void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work)
 	raw_spin_unlock(&acct->lock);

 	rcu_read_lock();
-	do_create = !io_wq_activate_free_worker(wq, acct);
+	do_create = !io_acct_activate_free_worker(acct);
 	rcu_read_unlock();

 	if (do_create && ((work_flags & IO_WQ_WORK_CONCURRENT) ||
@@ -960,12 +980,12 @@ void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work)
 		if (likely(did_create))
 			return;

-		raw_spin_lock(&wq->lock);
+		raw_spin_lock(&acct->workers_lock);
 		if (acct->nr_workers) {
-			raw_spin_unlock(&wq->lock);
+			raw_spin_unlock(&acct->workers_lock);
 			return;
 		}
-		raw_spin_unlock(&wq->lock);
+		raw_spin_unlock(&acct->workers_lock);

 		/* fatal condition, failed to create the first worker */
 		io_acct_cancel_pending_work(wq, acct, &match);
@@ -1072,11 +1092,22 @@ static void io_wq_cancel_pending_work(struct io_wq *wq,
 	}
 }

+static void io_acct_cancel_running_work(struct io_wq_acct *acct,
+					struct io_cb_cancel_data *match)
+{
+	raw_spin_lock(&acct->workers_lock);
+	io_acct_for_each_worker(acct, io_wq_worker_cancel, match);
+	raw_spin_unlock(&acct->workers_lock);
+}
+
 static void io_wq_cancel_running_work(struct io_wq *wq,
 				      struct io_cb_cancel_data *match)
 {
 	rcu_read_lock();
+
+	for (int i = 0; i < IO_WQ_ACCT_NR; i++)
+		io_acct_cancel_running_work(&wq->acct[i], match);
+
 	rcu_read_unlock();
 }

@@ -1099,16 +1130,14 @@ enum io_wq_cancel io_wq_cancel_cb(struct io_wq *wq, work_cancel_fn *cancel,
 	 * as an indication that we attempt to signal cancellation. The
 	 * completion will run normally in this case.
 	 *
-	 * Do both of these while holding the wq->lock, to ensure that
+	 * Do both of these while holding the acct->workers_lock, to ensure that
 	 * we'll find a work item regardless of state.
 	 */
 	io_wq_cancel_pending_work(wq, &match);
 	if (match.nr_pending && !match.cancel_all)
 		return IO_WQ_CANCEL_OK;

-	raw_spin_lock(&wq->lock);
 	io_wq_cancel_running_work(wq, &match);
-	raw_spin_unlock(&wq->lock);
 	if (match.nr_running && !match.cancel_all)
 		return IO_WQ_CANCEL_RUNNING;

@@ -1132,7 +1161,7 @@ static int io_wq_hash_wake(struct wait_queue_entry *wait, unsigned mode,
 		struct io_wq_acct *acct = &wq->acct[i];

 		if (test_and_clear_bit(IO_ACCT_STALLED_BIT, &acct->flags))
-			io_wq_activate_free_worker(wq, acct);
+			io_acct_activate_free_worker(acct);
 	}
 	rcu_read_unlock();
 	return 1;
@@ -1171,14 +1200,15 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
 		struct io_wq_acct *acct = &wq->acct[i];

 		atomic_set(&acct->nr_running, 0);
+
+		raw_spin_lock_init(&acct->workers_lock);
+		INIT_HLIST_NULLS_HEAD(&acct->free_list, 0);
+		INIT_LIST_HEAD(&acct->all_list);
+
 		INIT_WQ_LIST(&acct->work_list);
 		raw_spin_lock_init(&acct->lock);
 	}

-	raw_spin_lock_init(&wq->lock);
-	INIT_HLIST_NULLS_HEAD(&wq->free_list, 0);
-	INIT_LIST_HEAD(&wq->all_list);
-
 	wq->task = get_task_struct(data->task);
 	atomic_set(&wq->worker_refs, 1);
 	init_completion(&wq->worker_done);
@@ -1364,14 +1394,14 @@ int io_wq_max_workers(struct io_wq *wq, int *new_count)

 	rcu_read_lock();
-	raw_spin_lock(&wq->lock);
 	for (i = 0; i < IO_WQ_ACCT_NR; i++) {
 		acct = &wq->acct[i];
+		raw_spin_lock(&acct->workers_lock);
 		prev[i] = max_t(int, acct->max_workers, prev[i]);
 		if (new_count[i])
 			acct->max_workers = new_count[i];
+		raw_spin_unlock(&acct->workers_lock);
 	}
-	raw_spin_unlock(&wq->lock);
 	rcu_read_unlock();

 	for (i = 0; i < IO_WQ_ACCT_NR; i++)

From patchwork Tue Jan 28 13:39:23 2025
X-Patchwork-Submitter: Max Kellermann
X-Patchwork-Id: 13952567
From: Max Kellermann
To: axboe@kernel.dk, asml.silence@gmail.com, io-uring@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Max Kellermann
Subject: [PATCH 4/8] io_uring/io-wq: cache work->flags in variable
Date: Tue, 28 Jan 2025 14:39:23 +0100
Message-ID: <20250128133927.3989681-5-max.kellermann@ionos.com>
In-Reply-To: <20250128133927.3989681-1-max.kellermann@ionos.com>
References: <20250128133927.3989681-1-max.kellermann@ionos.com>

This eliminates several redundant atomic reads and therefore reduces
the duration the surrounding spinlocks are held.

In several io_uring benchmarks, this reduced the CPU time spent in
queued_spin_lock_slowpath() considerably:

io_uring benchmark with a flood of `IORING_OP_NOP` and `IOSQE_ASYNC`:

    38.86%  -1.49%  [kernel.kallsyms]  [k] queued_spin_lock_slowpath
     6.75%  +0.36%  [kernel.kallsyms]  [k] io_worker_handle_work
     2.60%  +0.19%  [kernel.kallsyms]  [k] io_nop
     3.92%  +0.18%  [kernel.kallsyms]  [k] io_req_task_complete
     6.34%  -0.18%  [kernel.kallsyms]  [k] io_wq_submit_work

HTTP server, static file:

    42.79%  -2.77%  [kernel.kallsyms]  [k] queued_spin_lock_slowpath
     2.08%  +0.23%  [kernel.kallsyms]  [k] io_wq_submit_work
     1.19%  +0.20%  [kernel.kallsyms]  [k] amd_iommu_iotlb_sync_map
     1.46%  +0.15%  [kernel.kallsyms]  [k] ep_poll_callback
     1.80%  +0.15%  [kernel.kallsyms]  [k] io_worker_handle_work

HTTP server, PHP:

    35.03%  -1.80%  [kernel.kallsyms]  [k] queued_spin_lock_slowpath
     0.84%  +0.21%  [kernel.kallsyms]  [k] amd_iommu_iotlb_sync_map
     1.39%  +0.12%  [kernel.kallsyms]  [k] _copy_to_iter
     0.21%  +0.10%  [kernel.kallsyms]  [k] update_sd_lb_stats
Signed-off-by: Max Kellermann
---
 io_uring/io-wq.c | 33 +++++++++++++++++++++------------
 io_uring/io-wq.h |  7 ++++++-
 2 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
index dfdd45ebe4bb..ba9974e6f521 100644
--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
@@ -170,9 +170,9 @@ static inline struct io_wq_acct *io_get_acct(struct io_wq *wq, bool bound)
 }

 static inline struct io_wq_acct *io_work_get_acct(struct io_wq *wq,
-						  struct io_wq_work *work)
+						  unsigned int work_flags)
 {
-	return io_get_acct(wq, !(atomic_read(&work->flags) & IO_WQ_WORK_UNBOUND));
+	return io_get_acct(wq, !(work_flags & IO_WQ_WORK_UNBOUND));
 }

 static inline struct io_wq_acct *io_wq_get_acct(struct io_worker *worker)
@@ -457,9 +457,14 @@ static void __io_worker_idle(struct io_wq_acct *acct, struct io_worker *worker)
 	}
 }

+static inline unsigned int __io_get_work_hash(unsigned int work_flags)
+{
+	return work_flags >> IO_WQ_HASH_SHIFT;
+}
+
 static inline unsigned int io_get_work_hash(struct io_wq_work *work)
 {
-	return atomic_read(&work->flags) >> IO_WQ_HASH_SHIFT;
+	return __io_get_work_hash(atomic_read(&work->flags));
 }

 static bool io_wait_on_hash(struct io_wq *wq, unsigned int hash)
@@ -489,17 +494,19 @@ static struct io_wq_work *io_get_next_work(struct io_wq_acct *acct,
 	struct io_wq *wq = worker->wq;

 	wq_list_for_each(node, prev, &acct->work_list) {
+		unsigned int work_flags;
 		unsigned int hash;

 		work = container_of(node, struct io_wq_work, list);

 		/* not hashed, can run anytime */
-		if (!io_wq_is_hashed(work)) {
+		work_flags = atomic_read(&work->flags);
+		if (!__io_wq_is_hashed(work_flags)) {
 			wq_list_del(&acct->work_list, node, prev);
 			return work;
 		}

-		hash = io_get_work_hash(work);
+		hash = __io_get_work_hash(work_flags);
 		/* all items with this hash lie in [work, tail] */
 		tail = wq->hash_tail[hash];

@@ -596,12 +603,13 @@ static void io_worker_handle_work(struct io_wq_acct *acct,
 	/* handle a whole dependent link */
 	do {
 		struct io_wq_work *next_hashed, *linked;
-		unsigned int hash = io_get_work_hash(work);
+		unsigned int work_flags = atomic_read(&work->flags);
+		unsigned int hash = __io_get_work_hash(work_flags);

 		next_hashed = wq_next_work(work);

 		if (do_kill &&
-		    (atomic_read(&work->flags) & IO_WQ_WORK_UNBOUND))
+		    (work_flags & IO_WQ_WORK_UNBOUND))
 			atomic_or(IO_WQ_WORK_CANCEL, &work->flags);
 		wq->do_work(work);
 		io_assign_current_work(worker, NULL);
@@ -917,18 +925,19 @@ static void io_run_cancel(struct io_wq_work *work, struct io_wq *wq)
 	} while (work);
 }

-static void io_wq_insert_work(struct io_wq *wq, struct io_wq_acct *acct, struct io_wq_work *work)
+static void io_wq_insert_work(struct io_wq *wq, struct io_wq_acct *acct,
+			      struct io_wq_work *work, unsigned int work_flags)
 {
 	unsigned int hash;
 	struct io_wq_work *tail;

-	if (!io_wq_is_hashed(work)) {
+	if (!__io_wq_is_hashed(work_flags)) {
 append:
 		wq_list_add_tail(&work->list, &acct->work_list);
 		return;
 	}

-	hash = io_get_work_hash(work);
+	hash = __io_get_work_hash(work_flags);
 	tail = wq->hash_tail[hash];
 	wq->hash_tail[hash] = work;
 	if (!tail)
@@ -944,8 +953,8 @@ static bool io_wq_work_match_item(struct io_wq_work *work, void *data)

 void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work)
 {
-	struct io_wq_acct *acct = io_work_get_acct(wq, work);
 	unsigned int work_flags = atomic_read(&work->flags);
+	struct io_wq_acct *acct = io_work_get_acct(wq, work_flags);
 	struct io_cb_cancel_data match = {
 		.fn		= io_wq_work_match_item,
 		.data		= work,
@@ -964,7 +973,7 @@ void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work)
 	}

 	raw_spin_lock(&acct->lock);
-	io_wq_insert_work(wq, acct, work);
+	io_wq_insert_work(wq, acct, work, work_flags);
 	clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
 	raw_spin_unlock(&acct->lock);

diff --git a/io_uring/io-wq.h b/io_uring/io-wq.h
index b3b004a7b625..d4fb2940e435 100644
--- a/io_uring/io-wq.h
+++ b/io_uring/io-wq.h
@@ -54,9 +54,14 @@ int io_wq_cpu_affinity(struct io_uring_task *tctx, cpumask_var_t mask);
 int io_wq_max_workers(struct
io_wq *wq, int *new_count);
 bool io_wq_worker_stopped(void);

+static inline bool __io_wq_is_hashed(unsigned int work_flags)
+{
+	return work_flags & IO_WQ_WORK_HASHED;
+}
+
 static inline bool io_wq_is_hashed(struct io_wq_work *work)
 {
-	return atomic_read(&work->flags) & IO_WQ_WORK_HASHED;
+	return __io_wq_is_hashed(atomic_read(&work->flags));
 }

 typedef bool (work_cancel_fn)(struct io_wq_work *, void *);

From patchwork Tue Jan 28 13:39:24 2025
X-Patchwork-Submitter: Max Kellermann
X-Patchwork-Id: 13952569
From: Max Kellermann
To: axboe@kernel.dk, asml.silence@gmail.com, io-uring@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Max Kellermann
Subject: [PATCH 5/8] io_uring/io-wq: do not use bogus hash value
Date: Tue, 28 Jan 2025 14:39:24 +0100
Message-ID: <20250128133927.3989681-6-max.kellermann@ionos.com>
In-Reply-To: <20250128133927.3989681-1-max.kellermann@ionos.com>
References: <20250128133927.3989681-1-max.kellermann@ionos.com>

Previously, the `hash` variable was initialized with `-1` and only
updated by io_get_next_work() if the current work was hashed. Commit
60cf46ae6054 ("io-wq: hash dependent work") changed this to always
call io_get_work_hash() even if the work was not hashed. This caused
the `hash != -1U` check to always be true, adding some overhead for
the `hash->wait` code.
This patch fixes the regression by checking the `IO_WQ_WORK_HASHED`
flag.

Perf diff for a flood of `IORING_OP_NOP` with `IOSQE_ASYNC`:

    38.55%  -1.57%  [kernel.kallsyms]  [k] queued_spin_lock_slowpath
     6.86%  -0.72%  [kernel.kallsyms]  [k] io_worker_handle_work
     0.10%  +0.67%  [kernel.kallsyms]  [k] put_prev_entity
     1.96%  +0.59%  [kernel.kallsyms]  [k] io_nop_prep
     3.31%  -0.51%  [kernel.kallsyms]  [k] try_to_wake_up
     7.18%  -0.47%  [kernel.kallsyms]  [k] io_wq_free_work

Fixes: 60cf46ae6054 ("io-wq: hash dependent work")
Cc: Pavel Begunkov
Signed-off-by: Max Kellermann
---
 io_uring/io-wq.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
index ba9974e6f521..6e31f312b61a 100644
--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
@@ -604,7 +604,9 @@ static void io_worker_handle_work(struct io_wq_acct *acct,
 	do {
 		struct io_wq_work *next_hashed, *linked;
 		unsigned int work_flags = atomic_read(&work->flags);
-		unsigned int hash = __io_get_work_hash(work_flags);
+		unsigned int hash = __io_wq_is_hashed(work_flags) ?
+			__io_get_work_hash(work_flags) : -1U;

 		next_hashed = wq_next_work(work);

From patchwork Tue Jan 28 13:39:25 2025
X-Patchwork-Submitter: Max Kellermann
X-Patchwork-Id: 13952570
From: Max Kellermann
To: axboe@kernel.dk, asml.silence@gmail.com, io-uring@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Max Kellermann
Subject: [PATCH 6/8] io_uring/io-wq: pass io_wq to io_get_next_work()
Date: Tue, 28 Jan 2025 14:39:25 +0100
Message-ID: <20250128133927.3989681-7-max.kellermann@ionos.com>
In-Reply-To: <20250128133927.3989681-1-max.kellermann@ionos.com>
References: <20250128133927.3989681-1-max.kellermann@ionos.com>

The only caller has already determined this pointer, so let's skip
the redundant dereference.
Signed-off-by: Max Kellermann
---
 io_uring/io-wq.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
index 6e31f312b61a..f7d328feb722 100644
--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
@@ -485,13 +485,12 @@ static bool io_wait_on_hash(struct io_wq *wq, unsigned int hash)
 }

 static struct io_wq_work *io_get_next_work(struct io_wq_acct *acct,
-					   struct io_worker *worker)
+					   struct io_wq *wq)
 	__must_hold(acct->lock)
 {
 	struct io_wq_work_node *node, *prev;
 	struct io_wq_work *work, *tail;
 	unsigned int stall_hash = -1U;
-	struct io_wq *wq = worker->wq;

 	wq_list_for_each(node, prev, &acct->work_list) {
 		unsigned int work_flags;
@@ -576,7 +575,7 @@ static void io_worker_handle_work(struct io_wq_acct *acct,
 		 * can't make progress, any work completion or insertion will
 		 * clear the stalled flag.
 		 */
-		work = io_get_next_work(acct, worker);
+		work = io_get_next_work(acct, wq);
 		if (work) {
 			/*
 			 * Make sure cancelation can find this, even before

From patchwork Tue Jan 28 13:39:26 2025
X-Patchwork-Submitter: Max Kellermann
X-Patchwork-Id: 13952571
From: Max Kellermann
To: axboe@kernel.dk, asml.silence@gmail.com, io-uring@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Max Kellermann
Subject: [PATCH 7/8] io_uring: cache io_kiocb->flags in variable
Date: Tue, 28 Jan 2025 14:39:26 +0100
Message-ID: <20250128133927.3989681-8-max.kellermann@ionos.com>
In-Reply-To: <20250128133927.3989681-1-max.kellermann@ionos.com>
References: <20250128133927.3989681-1-max.kellermann@ionos.com>

This eliminates several redundant reads, some of which probably
cannot be optimized away by the compiler.

Signed-off-by: Max Kellermann
---
 io_uring/io_uring.c | 59 +++++++++++++++++++++++++++------------------
 1 file changed, 35 insertions(+), 24 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 7bfbc7c22367..137c2066c5a3 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -391,28 +391,30 @@ static bool req_need_defer(struct io_kiocb *req, u32 seq)

 static void io_clean_op(struct io_kiocb *req)
 {
-	if (req->flags & REQ_F_BUFFER_SELECTED) {
+	const unsigned int req_flags = req->flags;
+
+	if (req_flags & REQ_F_BUFFER_SELECTED) {
 		spin_lock(&req->ctx->completion_lock);
 		io_kbuf_drop(req);
 		spin_unlock(&req->ctx->completion_lock);
 	}

-	if (req->flags & REQ_F_NEED_CLEANUP) {
+	if (req_flags & REQ_F_NEED_CLEANUP) {
 		const struct io_cold_def *def = &io_cold_defs[req->opcode];

 		if (def->cleanup)
 			def->cleanup(req);
 	}
-	if ((req->flags & REQ_F_POLLED) && req->apoll) {
+	if ((req_flags & REQ_F_POLLED) && req->apoll) {
 		kfree(req->apoll->double_poll);
 		kfree(req->apoll);
 		req->apoll = NULL;
 	}
-	if (req->flags & REQ_F_INFLIGHT)
+	if (req_flags &
REQ_F_INFLIGHT) atomic_dec(&req->tctx->inflight_tracked); - if (req->flags & REQ_F_CREDS) + if (req_flags & REQ_F_CREDS) put_cred(req->creds); - if (req->flags & REQ_F_ASYNC_DATA) { + if (req_flags & REQ_F_ASYNC_DATA) { kfree(req->async_data); req->async_data = NULL; } @@ -453,31 +455,37 @@ static noinline void __io_arm_ltimeout(struct io_kiocb *req) io_queue_linked_timeout(__io_prep_linked_timeout(req)); } -static inline void io_arm_ltimeout(struct io_kiocb *req) +static inline void _io_arm_ltimeout(struct io_kiocb *req, unsigned int req_flags) { - if (unlikely(req->flags & REQ_F_ARM_LTIMEOUT)) + if (unlikely(req_flags & REQ_F_ARM_LTIMEOUT)) __io_arm_ltimeout(req); } +static inline void io_arm_ltimeout(struct io_kiocb *req) +{ + _io_arm_ltimeout(req, req->flags); +} + static void io_prep_async_work(struct io_kiocb *req) { + unsigned int req_flags = req->flags; const struct io_issue_def *def = &io_issue_defs[req->opcode]; struct io_ring_ctx *ctx = req->ctx; - if (!(req->flags & REQ_F_CREDS)) { - req->flags |= REQ_F_CREDS; + if (!(req_flags & REQ_F_CREDS)) { + req_flags = req->flags |= REQ_F_CREDS; req->creds = get_current_cred(); } req->work.list.next = NULL; atomic_set(&req->work.flags, 0); - if (req->flags & REQ_F_FORCE_ASYNC) + if (req_flags & REQ_F_FORCE_ASYNC) atomic_or(IO_WQ_WORK_CONCURRENT, &req->work.flags); - if (req->file && !(req->flags & REQ_F_FIXED_FILE)) - req->flags |= io_file_get_flags(req->file); + if (req->file && !(req_flags & REQ_F_FIXED_FILE)) + req_flags = req->flags |= io_file_get_flags(req->file); - if (req->file && (req->flags & REQ_F_ISREG)) { + if (req->file && (req_flags & REQ_F_ISREG)) { bool should_hash = def->hash_reg_file; /* don't serialize this request if the fs doesn't need it */ @@ -1703,13 +1711,14 @@ static __cold void io_drain_req(struct io_kiocb *req) spin_unlock(&ctx->completion_lock); } -static bool io_assign_file(struct io_kiocb *req, const struct io_issue_def *def, +static bool io_assign_file(struct io_kiocb *req, 
unsigned int req_flags, + const struct io_issue_def *def, unsigned int issue_flags) { if (req->file || !def->needs_file) return true; - if (req->flags & REQ_F_FIXED_FILE) + if (req_flags & REQ_F_FIXED_FILE) req->file = io_file_get_fixed(req, req->cqe.fd, issue_flags); else req->file = io_file_get_normal(req, req->cqe.fd); @@ -1719,14 +1728,15 @@ static bool io_assign_file(struct io_kiocb *req, const struct io_issue_def *def, static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags) { + const unsigned int req_flags = req->flags; const struct io_issue_def *def = &io_issue_defs[req->opcode]; const struct cred *creds = NULL; int ret; - if (unlikely(!io_assign_file(req, def, issue_flags))) + if (unlikely(!io_assign_file(req, req_flags, def, issue_flags))) return -EBADF; - if (unlikely((req->flags & REQ_F_CREDS) && req->creds != current_cred())) + if (unlikely((req_flags & REQ_F_CREDS) && req->creds != current_cred())) creds = override_creds(req->creds); if (!def->audit_skip) @@ -1783,18 +1793,19 @@ struct io_wq_work *io_wq_free_work(struct io_wq_work *work) void io_wq_submit_work(struct io_wq_work *work) { struct io_kiocb *req = container_of(work, struct io_kiocb, work); + const unsigned int req_flags = req->flags; const struct io_issue_def *def = &io_issue_defs[req->opcode]; unsigned int issue_flags = IO_URING_F_UNLOCKED | IO_URING_F_IOWQ; bool needs_poll = false; int ret = 0, err = -ECANCELED; /* one will be dropped by ->io_wq_free_work() after returning to io-wq */ - if (!(req->flags & REQ_F_REFCOUNT)) + if (!(req_flags & REQ_F_REFCOUNT)) __io_req_set_refcount(req, 2); else req_ref_get(req); - io_arm_ltimeout(req); + _io_arm_ltimeout(req, req_flags); /* either cancelled or io-wq is dying, so don't touch tctx->iowq */ if (atomic_read(&work->flags) & IO_WQ_WORK_CANCEL) { @@ -1802,7 +1813,7 @@ void io_wq_submit_work(struct io_wq_work *work) io_req_task_queue_fail(req, err); return; } - if (!io_assign_file(req, def, issue_flags)) { + if 
(!io_assign_file(req, req_flags, def, issue_flags)) { err = -EBADF; atomic_or(IO_WQ_WORK_CANCEL, &work->flags); goto fail; @@ -1816,7 +1827,7 @@ void io_wq_submit_work(struct io_wq_work *work) * Don't allow any multishot execution from io-wq. It's more restrictive * than necessary and also cleaner. */ - if (req->flags & REQ_F_APOLL_MULTISHOT) { + if (req_flags & REQ_F_APOLL_MULTISHOT) { err = -EBADFD; if (!io_file_can_poll(req)) goto fail; @@ -1831,7 +1842,7 @@ void io_wq_submit_work(struct io_wq_work *work) } } - if (req->flags & REQ_F_FORCE_ASYNC) { + if (req_flags & REQ_F_FORCE_ASYNC) { bool opcode_poll = def->pollin || def->pollout; if (opcode_poll && io_file_can_poll(req)) { @@ -1849,7 +1860,7 @@ void io_wq_submit_work(struct io_wq_work *work) * If REQ_F_NOWAIT is set, then don't wait or retry with * poll. -EAGAIN is final for that case. */ - if (req->flags & REQ_F_NOWAIT) + if (req_flags & REQ_F_NOWAIT) break; /* From patchwork Tue Jan 28 13:39:27 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Kellermann X-Patchwork-Id: 13952572 Received: from mail-wm1-f53.google.com (mail-wm1-f53.google.com [209.85.128.53]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3C7881B412B for ; Tue, 28 Jan 2025 13:39:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.53 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738071587; cv=none; b=syI5U2HcDu6iK4BPU7vYcLTyI8gKHVUnHDvzovoB3JLGPSBGEONBYaT9FtXwtFqBmMoxIPwg+BTEJcteOzrps5D4oKwPegXLC4JFXEOjHUyltAE8sCVuoLbEkoUj/xx0ojvRBYgYABsLPFsdAt9Rr0q6eJTe6y7Uv5Z66qcHvk0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738071587; c=relaxed/simple; bh=rxwS0rDFX14c0VMUZBaTjErpG3AmBNovvucJwZOZofA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: 
From: Max Kellermann
To: axboe@kernel.dk, asml.silence@gmail.com, io-uring@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Max Kellermann
Subject: [PATCH 8/8] io_uring: skip redundant poll wakeups
Date: Tue, 28 Jan 2025 14:39:27 +0100
Message-ID: <20250128133927.3989681-9-max.kellermann@ionos.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20250128133927.3989681-1-max.kellermann@ionos.com>
References: <20250128133927.3989681-1-max.kellermann@ionos.com>

Using io_uring with epoll is very expensive because every completion
leads to a __wake_up() call, most of which are unnecessary because the
polling process has already been woken up but has not yet had a chance
to process the completions. During this time, wq_has_sleeper() still
returns true; checking it alone is therefore not enough.
Perf diff for an HTTP server pushing a static file with splice() into
the TCP socket:

    37.91%  -2.00%  [kernel.kallsyms]  [k] queued_spin_lock_slowpath
     1.69%  -1.67%  [kernel.kallsyms]  [k] ep_poll_callback
     0.95%  +1.64%  [kernel.kallsyms]  [k] io_wq_free_work
     0.88%  -0.35%  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
     1.66%  +0.28%  [kernel.kallsyms]  [k] io_worker_handle_work
     1.14%  +0.18%  [kernel.kallsyms]  [k] _raw_spin_lock
     0.24%  -0.17%  [kernel.kallsyms]  [k] __wake_up

Signed-off-by: Max Kellermann
---
 include/linux/io_uring_types.h | 10 ++++++++++
 io_uring/io_uring.c            |  4 ++++
 io_uring/io_uring.h            |  2 +-
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 623d8e798a11..01514cb76095 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -384,6 +384,16 @@ struct io_ring_ctx {
 	struct wait_queue_head		poll_wq;
 	struct io_restriction		restrictions;
 
+	/**
+	 * Non-zero if a process is waiting for #poll_wq and reset to
+	 * zero when #poll_wq is woken up.  This is supposed to
+	 * eliminate redundant wakeup calls.  Only checking
+	 * wq_has_sleeper() is not enough because it will return true
+	 * until the sleeper has actually woken up and has been
+	 * scheduled.
+	 */
+	atomic_t poll_wq_waiting;
+
 	u32			pers_next;
 	struct xarray		personalities;
 
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 137c2066c5a3..b65efd07e9f0 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2760,6 +2760,7 @@ static __cold void io_activate_pollwq_cb(struct callback_head *cb)
 	 * Wake ups for some events between start of polling and activation
 	 * might've been lost due to loose synchronisation.
 	 */
+	atomic_set_release(&ctx->poll_wq_waiting, 0);
 	wake_up_all(&ctx->poll_wq);
 	percpu_ref_put(&ctx->refs);
 }
@@ -2793,6 +2794,9 @@ static __poll_t io_uring_poll(struct file *file, poll_table *wait)
 	if (unlikely(!ctx->poll_activated))
 		io_activate_pollwq(ctx);
+
+	atomic_set(&ctx->poll_wq_waiting, 1);
+
 	/*
 	 * provides mb() which pairs with barrier from wq_has_sleeper
 	 * call in io_commit_cqring
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index f65e3f3ede51..186cee066f9f 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -287,7 +287,7 @@ static inline void io_commit_cqring(struct io_ring_ctx *ctx)
 
 static inline void io_poll_wq_wake(struct io_ring_ctx *ctx)
 {
-	if (wq_has_sleeper(&ctx->poll_wq))
+	if (wq_has_sleeper(&ctx->poll_wq) && atomic_xchg_release(&ctx->poll_wq_waiting, 0) > 0)
 		__wake_up(&ctx->poll_wq, TASK_NORMAL, 0,
 			  poll_to_key(EPOLL_URING_WAKE | EPOLLIN));
 }