From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: linux-mm@kvack.org
Cc: 42.hyeyoo@gmail.com, Christoph Lameter, Pekka Enberg, David Rientjes,
    Joonsoo Kim, Andrew Morton, Vlastimil Babka, linux-kernel@vger.kernel.org
Subject: [PATCH v2] mm, slub: Use prefetchw instead of prefetch
Date: Mon, 11 Oct 2021 14:43:31 +0000
Message-Id: <20211011144331.70084-1-42.hyeyoo@gmail.com>
X-Patchwork-Submitter: Hyeonggon Yoo <42.hyeyoo@gmail.com>
X-Patchwork-Id: 12550223

Commit 0ad9500e16fe ("slub: prefetch next freelist pointer in
slab_alloc()") introduced prefetch_freepointer() because, when other
CPUs free objects into a page that the current CPU owns, the freelist
link is hot on the CPUs that freed the objects and possibly very cold
on the current CPU.

But if the freelist link chain is hot on the CPUs that freed the
objects, it is better to invalidate their copies of that chain, because
they are not going to access it again within a short time.

So use prefetchw instead of prefetch. On architectures that support it,
such as x86 and arm, prefetchw invalidates the copies of a cache line
held by other CPUs when prefetching it.
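
The difference between the two hints can be sketched in user space. The
following is a minimal illustration, not the kernel code: it assumes GCC
or Clang, whose __builtin_prefetch(addr, rw) takes rw = 0 for a read hint
and rw = 1 for a write hint, and every toy_* name is hypothetical,
standing in for the real SLUB structures and helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kmem_cache; only the field used here. */
struct toy_cache {
	size_t offset;	/* byte offset of the free pointer inside an object */
};

/* Load the link to the next free object, stored inside this free object. */
static void *toy_get_freepointer(const struct toy_cache *s, void *object)
{
	return *(void **)((char *)object + s->offset);
}

/*
 * Write-intent hint: the second argument (rw = 1) asks the CPU to fetch the
 * line ready for a store. Whether other CPUs' copies are invalidated is up
 * to the hardware; the hint never changes program results.
 */
static void toy_prefetch_freepointer(const struct toy_cache *s, void *object)
{
	if (object)
		__builtin_prefetch((char *)object + s->offset, 1);
}

/* Pop one object and warm up the link of the object that follows it. */
static void *toy_alloc(const struct toy_cache *s, void **freelist)
{
	void *object = *freelist;

	if (!object)
		return NULL;
	*freelist = toy_get_freepointer(s, object);
	toy_prefetch_freepointer(s, *freelist);
	return object;
}

/* Build a three-object freelist, drain it, and return the number popped. */
int toy_demo(void)
{
	static void *slab[3][8];	/* three objects, eight pointer slots each */
	struct toy_cache s = { .offset = sizeof(void *) };
	void *freelist = slab[0];
	int popped = 0;

	slab[0][1] = slab[1];	/* slot 1 sits at s.offset and holds the link */
	slab[1][1] = slab[2];
	slab[2][1] = NULL;

	while (toy_alloc(&s, &freelist))
		popped++;
	assert(freelist == NULL);
	return popped;
}
```

Calling toy_demo() from a small main() drains the toy freelist. Changing
the second argument of __builtin_prefetch() from 1 to 0 gives the
read-hint variant; the result is the same either way, since only cache
behaviour can differ.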

Before:

Time: 91.677

 Performance counter stats for 'hackbench -g 100 -l 10000':

        1462938.07 msec cpu-clock                 #   15.908 CPUs utilized
          18072550      context-switches          #   12.354 K/sec
           1018814      cpu-migrations            #  696.416 /sec
            104558      page-faults               #   71.471 /sec
     1580035699271      cycles                    #    1.080 GHz                      (54.51%)
     2003670016013      instructions              #    1.27  insn per cycle           (54.31%)
        5702204863      branch-misses                                                 (54.28%)
      643368500985      cache-references          #  439.778 M/sec                    (54.26%)
       18475582235      cache-misses              #    2.872 % of all cache refs      (54.28%)
      642206796636      L1-dcache-loads           #  438.984 M/sec                    (46.87%)
       18215813147      L1-dcache-load-misses     #    2.84% of all L1-dcache accesses  (46.83%)
      653842996501      dTLB-loads                #  446.938 M/sec                    (46.63%)
        3227179675      dTLB-load-misses          #    0.49% of all dTLB cache accesses  (46.85%)
      537531951350      iTLB-loads                #  367.433 M/sec                    (54.33%)
         114750630      iTLB-load-misses          #    0.02% of all iTLB cache accesses  (54.37%)
      630135543177      L1-icache-loads           #  430.733 M/sec                    (46.80%)
       22923237620      L1-icache-load-misses     #    3.64% of all L1-icache accesses  (46.76%)

      91.964452802 seconds time elapsed

      43.416742000 seconds user
    1422.441123000 seconds sys

After:

Time: 90.220

 Performance counter stats for 'hackbench -g 100 -l 10000':

        1437418.48 msec cpu-clock                 #   15.880 CPUs utilized
          17694068      context-switches          #   12.310 K/sec
            958257      cpu-migrations            #  666.651 /sec
            100604      page-faults               #   69.989 /sec
     1583259429428      cycles                    #    1.101 GHz                      (54.57%)
     2004002484935      instructions              #    1.27  insn per cycle           (54.37%)
        5594202389      branch-misses                                                 (54.36%)
      643113574524      cache-references          #  447.409 M/sec                    (54.39%)
       18233791870      cache-misses              #    2.835 % of all cache refs      (54.37%)
      640205852062      L1-dcache-loads           #  445.386 M/sec                    (46.75%)
       17968160377      L1-dcache-load-misses     #    2.81% of all L1-dcache accesses  (46.79%)
      651747432274      dTLB-loads                #  453.415 M/sec                    (46.59%)
        3127124271      dTLB-load-misses          #    0.48% of all dTLB cache accesses  (46.75%)
      535395273064      iTLB-loads                #  372.470 M/sec                    (54.38%)
         113500056      iTLB-load-misses          #    0.02% of all iTLB cache accesses  (54.35%)
      628871845924      L1-icache-loads           #  437.501 M/sec                    (46.80%)
       22585641203      L1-icache-load-misses     #    3.59% of all L1-icache accesses  (46.79%)

      90.514819303 seconds time elapsed

      43.877656000 seconds user
    1397.176001000 seconds sys

Link: https://lkml.org/lkml/2021/10/8/598
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Acked-by: Vlastimil Babka
---
 mm/slub.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/slub.c b/mm/slub.c
index 3d2025f7163b..ce3d8b11215c 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -354,7 +354,7 @@ static inline void *get_freepointer(struct kmem_cache *s, void *object)
 
 static void prefetch_freepointer(const struct kmem_cache *s, void *object)
 {
-	prefetch(object + s->offset);
+	prefetchw(object + s->offset);
 }
 
 static inline void *get_freepointer_safe(struct kmem_cache *s, void *object)