From patchwork Wed Feb 26 12:03:19 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Byungchul Park <byungchul@sk.com>
X-Patchwork-Id: 13992201
From: Byungchul Park <byungchul@sk.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: kernel_team@skhynix.com, akpm@linux-foundation.org, vernhao@tencent.com,
 mgorman@techsingularity.net, hughd@google.com, willy@infradead.org,
 david@redhat.com, peterz@infradead.org, luto@kernel.org, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, rjgolo@gmail.com
Subject: [RFC PATCH v12 based on v6.14-rc4 08/25] mm: introduce luf_batch to
 be used as hash table to store luf meta data
Date: Wed, 26 Feb 2025 21:03:19 +0900
Message-Id: <20250226120336.29565-8-byungchul@sk.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20250226120336.29565-1-byungchul@sk.com>
References: <20250226113024.GA1935@system.software.com>
 <20250226120336.29565-1-byungchul@sk.com>
Functionally, no change.  This is a preparation for the luf mechanism,
which needs to keep luf metadata per page while the page stays in the
pcp or buddy allocator.  The metadata consists of a cpumask for tlb
shootdown and luf's request generation number.  Since struct page
doesn't have enough room to store luf metadata, this patch introduces a
hash table to store it and makes each page keep its hash key instead.
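For scale (illustration only, not part of the patch): a luf_key is an
unsigned short, so the table has 2^16 = 65536 entries and a lookup is a
direct array index.  A hypothetical helper would look like:

	/*
	 * Hypothetical helper for illustration: a page's stored 16-bit
	 * hash key names its luf_batch entry directly, so no probing is
	 * needed; pages whose keys collide simply share the entry.
	 */
	static inline struct luf_batch *luf_batch_of(unsigned short luf_key)
	{
		return &luf_batch[luf_key];
	}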
Since all the pages in pcp or buddy share the hash table, collisions
are inevitable, so care must be taken when reading or updating an
entry.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/mm_types.h |  10 ++++
 mm/internal.h            |   8 +++
 mm/rmap.c                | 122 +++++++++++++++++++++++++++++++++++++--
 3 files changed, 136 insertions(+), 4 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 7d78a285e52ca..4bfe8d072b0ea 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -32,6 +32,16 @@
 struct address_space;
 struct mem_cgroup;
 
+#ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
+struct luf_batch {
+	struct tlbflush_unmap_batch batch;
+	unsigned long ugen;
+	rwlock_t lock;
+};
+#else
+struct luf_batch {};
+#endif
+
 /*
  * Each physical page in the system has a struct page associated with
  * it to keep track of whatever it is we are using the page for at the
diff --git a/mm/internal.h b/mm/internal.h
index 4c8ed93a792ec..3333d8d461c2c 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1253,6 +1253,8 @@ extern struct workqueue_struct *mm_percpu_wq;
 void try_to_unmap_flush(void);
 void try_to_unmap_flush_dirty(void);
 void flush_tlb_batched_pending(struct mm_struct *mm);
+void fold_batch(struct tlbflush_unmap_batch *dst, struct tlbflush_unmap_batch *src, bool reset);
+void fold_luf_batch(struct luf_batch *dst, struct luf_batch *src);
 #else
 static inline void try_to_unmap_flush(void)
 {
@@ -1263,6 +1265,12 @@ static inline void try_to_unmap_flush_dirty(void)
 static inline void flush_tlb_batched_pending(struct mm_struct *mm)
 {
 }
+static inline void fold_batch(struct tlbflush_unmap_batch *dst, struct tlbflush_unmap_batch *src, bool reset)
+{
+}
+static inline void fold_luf_batch(struct luf_batch *dst, struct luf_batch *src)
+{
+}
 #endif /* CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH */
 
 extern const struct trace_print_flags pageflag_names[];
diff --git a/mm/rmap.c b/mm/rmap.c
index ed345503e4f88..74fbf6c2fb3a7 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -641,7 +641,7 @@ struct anon_vma *folio_lock_anon_vma_read(const struct folio *folio,
  * function, ugen_before(), should be used to evaluate the temporal
  * sequence of events because the number is designed to wraparound.
  */
-static atomic_long_t __maybe_unused luf_ugen = ATOMIC_LONG_INIT(LUF_UGEN_INIT);
+static atomic_long_t luf_ugen = ATOMIC_LONG_INIT(LUF_UGEN_INIT);
 
 /*
  * Don't return invalid luf_ugen, zero.
@@ -656,6 +656,122 @@ static unsigned long __maybe_unused new_luf_ugen(void)
 	return ugen;
 }
 
+static void reset_batch(struct tlbflush_unmap_batch *batch)
+{
+	arch_tlbbatch_clear(&batch->arch);
+	batch->flush_required = false;
+	batch->writable = false;
+}
+
+void fold_batch(struct tlbflush_unmap_batch *dst,
+		struct tlbflush_unmap_batch *src, bool reset)
+{
+	if (!src->flush_required)
+		return;
+
+	/*
+	 * Fold src to dst.
+	 */
+	arch_tlbbatch_fold(&dst->arch, &src->arch);
+	dst->writable = dst->writable || src->writable;
+	dst->flush_required = true;
+
+	if (!reset)
+		return;
+
+	/*
+	 * Reset src.
+	 */
+	reset_batch(src);
+}
+
+/*
+ * The range that luf_key covers, which is 'unsigned short' type.
+ */
+#define NR_LUF_BATCH	(1 << (sizeof(short) * 8))
+
+/*
+ * Use 0th entry as accumulated batch.
+ */
+static struct luf_batch luf_batch[NR_LUF_BATCH];
+
+static void luf_batch_init(struct luf_batch *lb)
+{
+	rwlock_init(&lb->lock);
+	reset_batch(&lb->batch);
+	lb->ugen = atomic_long_read(&luf_ugen) - 1;
+}
+
+static int __init luf_init(void)
+{
+	int i;
+
+	for (i = 0; i < NR_LUF_BATCH; i++)
+		luf_batch_init(&luf_batch[i]);
+
+	return 0;
+}
+early_initcall(luf_init);
+
+/*
+ * key to point to an entry of the luf_batch array
+ *
+ * note: zero means invalid key
+ */
+static atomic_t luf_kgen = ATOMIC_INIT(1);
+
+/*
+ * Don't return invalid luf_key, zero.
+ */
+static unsigned short __maybe_unused new_luf_key(void)
+{
+	unsigned short luf_key = atomic_inc_return(&luf_kgen);
+
+	if (!luf_key)
+		luf_key = atomic_inc_return(&luf_kgen);
+
+	return luf_key;
+}
+
+static void __fold_luf_batch(struct luf_batch *dst_lb,
+			     struct tlbflush_unmap_batch *src_batch,
+			     unsigned long src_ugen)
+{
+	/*
+	 * dst_lb->ugen represents the request that requires tlb
+	 * shootdown, a sort of request number.  The newer it is, the
+	 * more tlb shootdown might be needed to fulfill the newer
+	 * request.  Conservatively keep the newer one.
+	 */
+	if (!dst_lb->ugen || ugen_before(dst_lb->ugen, src_ugen))
+		dst_lb->ugen = src_ugen;
+	fold_batch(&dst_lb->batch, src_batch, false);
+}
+
+void fold_luf_batch(struct luf_batch *dst, struct luf_batch *src)
+{
+	unsigned long flags;
+
+	/*
+	 * Exactly the same.  Nothing to fold.
+	 */
+	if (dst == src)
+		return;
+
+	if (&src->lock < &dst->lock) {
+		read_lock_irqsave(&src->lock, flags);
+		write_lock(&dst->lock);
+	} else {
+		write_lock_irqsave(&dst->lock, flags);
+		read_lock(&src->lock);
+	}
+
+	__fold_luf_batch(dst, &src->batch, src->ugen);
+
+	write_unlock(&dst->lock);
+	read_unlock_irqrestore(&src->lock, flags);
+}
+
 /*
  * Flush TLB entries for recently unmapped pages from remote CPUs. It is
  * important if a PTE was dirty when it was unmapped that it's flushed
@@ -670,9 +786,7 @@ void try_to_unmap_flush(void)
 		return;
 
 	arch_tlbbatch_flush(&tlb_ubc->arch);
-	arch_tlbbatch_clear(&tlb_ubc->arch);
-	tlb_ubc->flush_required = false;
-	tlb_ubc->writable = false;
+	reset_batch(tlb_ubc);
 }
 
 /* Flush iff there are potentially writable TLB entries that can race with IO */
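
Two notes on the code above, for readers skimming the series.  First,
fold_luf_batch() always takes the two rwlocks in address order, the
standard way to avoid an ABBA deadlock when two CPUs concurrently fold
entries into each other.  Second, both generation counters are designed
to wrap.  The standalone userspace sketch below mirrors new_luf_key()
and shows the intended wraparound behavior; ugen_before() is defined
earlier in the series, and the signed-difference implementation here
(like the kernel's time_before()) is an assumption for illustration:

	#include <stdio.h>
	#include <stdbool.h>
	#include <stdatomic.h>

	/* Counter behind new_luf_key(); key 0 is reserved as invalid. */
	static atomic_uint luf_kgen = 1;

	/*
	 * Wraparound-safe "a comes before b" -- assumed to use the same
	 * signed-difference idiom as the kernel's time_before().
	 */
	static bool ugen_before(unsigned long a, unsigned long b)
	{
		return (long)(a - b) < 0;
	}

	/* Userspace mirror of new_luf_key() from the patch above. */
	static unsigned short new_luf_key(void)
	{
		unsigned short luf_key = atomic_fetch_add(&luf_kgen, 1) + 1;

		if (!luf_key)
			luf_key = atomic_fetch_add(&luf_kgen, 1) + 1;

		return luf_key;
	}

	int main(void)
	{
		/* Drive the 16-bit key space over its wrap point. */
		atomic_store(&luf_kgen, 0xffff);
		printf("%u\n", new_luf_key());	/* 1: key 0 is skipped */
		printf("%u\n", new_luf_key());	/* 2 */

		/* Ordering survives wraparound of the ugen counter. */
		printf("%d\n", ugen_before(~0UL, 0UL));	/* 1: ~0UL is older */
		return 0;
	}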