From patchwork Sat Jun 22 03:58:09 2024
X-Patchwork-Submitter: Leonardo Bras <leobras@redhat.com>
X-Patchwork-Id: 13708196
From: Leonardo Bras <leobras@redhat.com>
To: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton, Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim, Vlastimil Babka, Hyeonggon Yoo <42.hyeyoo@gmail.com>, Leonardo Bras, Thomas Gleixner, Marcelo Tosatti
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH v1 1/4] Introducing qpw_lock() and per-cpu queue & flush work
Date: Sat, 22 Jun 2024 00:58:09 -0300
Message-ID: <20240622035815.569665-2-leobras@redhat.com>
In-Reply-To: <20240622035815.569665-1-leobras@redhat.com>
References: <20240622035815.569665-1-leobras@redhat.com>
Some places in the kernel implement a parallel programming strategy consisting of local_lock()s for most of the work, with the few remaining remote operations scheduled as work on the target CPU. This keeps cache bouncing low, since the cacheline tends to stay local, and avoids the cost of full locks on non-RT kernels, even though the rare remote operations are expensive due to scheduling overhead.

For RT workloads, however, this can be a problem: getting an important workload scheduled out to deal with some unrelated task is sure to introduce unexpected deadline misses.

Notably, local_lock()s on RT kernels become spinlock()s. We can make use of this to avoid scheduling work on a remote CPU at all, by directly updating the other CPU's per-CPU structure while holding its spinlock().

To do that, introduce a new set of functions that make it possible to take another CPU's per-CPU "local" lock (qpw_{un,}lock*()), along with the corresponding queue_percpu_work_on() and flush_percpu_work() helpers to run the remote work.
On non-RT kernels, no behavioral change is expected, since each of the introduced helpers works exactly like the current implementation:

qpw_{un,}lock*()       -> local_{un,}lock*()  (cpu parameter ignored)
queue_percpu_work_on() -> queue_work_on()
flush_percpu_work()    -> flush_work()

On RT kernels, though, qpw_{un,}lock*() uses the extra cpu parameter to select the correct per-CPU structure to work on, and acquires that CPU's spinlock. queue_percpu_work_on() simply calls the requested function on the current CPU, operating on the other CPU's per-CPU object; since local_lock()s become spinlock()s on PREEMPT_RT, this is safe. flush_percpu_work() then becomes a no-op, since no work is actually scheduled on a remote CPU.

Some minimal code rework is needed to make this mechanism work: the local_{un,}lock*() calls in functions that are currently scheduled on remote CPUs need to be replaced by qpw_{un,}lock*(), so that on RT kernels they can reference a different CPU. It is also necessary to use a qpw_struct instead of a work_struct, but it merely wraps a work_struct plus, on PREEMPT_RT, the target CPU.

This should have almost no impact on non-RT kernels: a few this_cpu_ptr() calls become per_cpu_ptr(, smp_processor_id()). On RT kernels, it should improve performance and reduce latency by removing scheduling noise.
Signed-off-by: Leonardo Bras <leobras@redhat.com>
---
 include/linux/qpw.h | 88 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 88 insertions(+)
 create mode 100644 include/linux/qpw.h

diff --git a/include/linux/qpw.h b/include/linux/qpw.h
new file mode 100644
index 000000000000..ea2686a01e5e
--- /dev/null
+++ b/include/linux/qpw.h
@@ -0,0 +1,88 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_QPW_H
+#define _LINUX_QPW_H
+
+#include "linux/local_lock.h"
+#include "linux/workqueue.h"
+
+#ifndef CONFIG_PREEMPT_RT
+
+struct qpw_struct {
+	struct work_struct work;
+};
+
+#define qpw_lock(lock, cpu) \
+	local_lock(lock)
+
+#define qpw_unlock(lock, cpu) \
+	local_unlock(lock)
+
+#define qpw_lock_irqsave(lock, flags, cpu) \
+	local_lock_irqsave(lock, flags)
+
+#define qpw_unlock_irqrestore(lock, flags, cpu) \
+	local_unlock_irqrestore(lock, flags)
+
+#define queue_percpu_work_on(c, wq, qpw) \
+	queue_work_on(c, wq, &(qpw)->work)
+
+#define flush_percpu_work(qpw) \
+	flush_work(&(qpw)->work)
+
+#define qpw_get_cpu(qpw) \
+	smp_processor_id()
+
+#define INIT_QPW(qpw, func, c) \
+	INIT_WORK(&(qpw)->work, (func))
+
+#else /* !CONFIG_PREEMPT_RT */
+
+struct qpw_struct {
+	struct work_struct work;
+	int cpu;
+};
+
+#define qpw_lock(__lock, cpu) \
+	do { \
+		migrate_disable(); \
+		spin_lock(per_cpu_ptr((__lock), cpu)); \
+	} while (0)
+
+#define qpw_unlock(__lock, cpu) \
+	do { \
+		spin_unlock(per_cpu_ptr((__lock), cpu)); \
+		migrate_enable(); \
+	} while (0)
+
+#define qpw_lock_irqsave(lock, flags, cpu) \
+	do { \
+		typecheck(unsigned long, flags); \
+		flags = 0; \
+		qpw_lock(lock, cpu); \
+	} while (0)
+
+#define qpw_unlock_irqrestore(lock, flags, cpu) \
+	qpw_unlock(lock, cpu)
+
+#define queue_percpu_work_on(c, wq, qpw) \
+	do { \
+		struct qpw_struct *__qpw = (qpw); \
+		WARN_ON((c) != __qpw->cpu); \
+		__qpw->work.func(&__qpw->work); \
+	} while (0)
+
+#define flush_percpu_work(qpw) \
+	do {} while (0)
+
+#define qpw_get_cpu(w) \
+	container_of((w), struct qpw_struct, work)->cpu
+
+#define INIT_QPW(qpw, func, c) \
+	do { \
+		struct qpw_struct *__qpw = (qpw); \
+		INIT_WORK(&__qpw->work, (func)); \
+		__qpw->cpu = (c); \
+	} while (0)
+
+#endif /* CONFIG_PREEMPT_RT */
+#endif /* _LINUX_QPW_H */