From patchwork Tue May 19 21:45:24 2020
X-Patchwork-Submitter: "Ahmed S. Darwish"
X-Patchwork-Id: 11558865
From: "Ahmed S. Darwish" <a.darwish@linutronix.de>
To: Peter Zijlstra, Ingo Molnar, Will Deacon
Cc: Thomas Gleixner, "Paul E. McKenney", "Sebastian A. Siewior",
    Steven Rostedt, LKML, "Ahmed S. Darwish", Andrew Morton,
    Konstantin Khlebnikov, linux-mm@kvack.org
Subject: [PATCH v1 02/25] mm/swap: Don't abuse the seqcount latching API
Date: Tue, 19 May 2020 23:45:24 +0200
Message-Id: <20200519214547.352050-3-a.darwish@linutronix.de>
In-Reply-To: <20200519214547.352050-1-a.darwish@linutronix.de>
References: <20200519214547.352050-1-a.darwish@linutronix.de>

Commit eef1a429f234 ("mm/swap.c: piggyback lru_add_drain_all() calls")
implemented an optimization mechanism to exit the to-be-started LRU
drain operation (name it A) if another drain operation *started and
finished* while (A) was blocked on the LRU draining mutex.

This was done through a seqcount latch, which is an abuse of its
semantics:

  1. Seqcount latching should be used for the purpose of switching
     between two storage places with sequence protection to allow
     interruptible, preemptible writer sections. The optimization
     mechanism has absolutely nothing to do with that.

  2. The used raw_write_seqcount_latch() has two smp write memory
     barriers to always ensure one consistent storage place out of the
     two storage places available. This extra smp_wmb() is redundant
     for the optimization use case.

Besides the API abuse, the semantics of a latch sequence counter was
force-fitted into the optimization. What was actually meant is to track
generations of LRU draining operations, where "current lru draining
generation = x" implies that all generations 0 < n <= x are already
*scheduled* for draining.

Remove the conceptually-inappropriate seqcount latch usage and manually
implement the optimization using a counter and SMP memory barriers.
Link: https://lkml.kernel.org/r/CALYGNiPSr-cxV9MX9czaVh6Wz_gzSv3H_8KPvgjBTGbJywUJpA@mail.gmail.com
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 mm/swap.c | 57 +++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 47 insertions(+), 10 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index bf9a79fed62d..d6910eeed43d 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -713,10 +713,20 @@ static void lru_add_drain_per_cpu(struct work_struct *dummy)
  */
 void lru_add_drain_all(void)
 {
-	static seqcount_t seqcount = SEQCNT_ZERO(seqcount);
-	static DEFINE_MUTEX(lock);
+	/*
+	 * lru_drain_gen - Current generation of pages that could be in vectors
+	 *
+	 * (A) Definition: lru_drain_gen = x implies that all generations
+	 *     0 < n <= x are already scheduled for draining.
+	 *
+	 * This is an optimization for the highly-contended use case where a
+	 * user space workload keeps constantly generating a flow of pages
+	 * for each CPU.
+	 */
+	static unsigned int lru_drain_gen;
 	static struct cpumask has_work;
-	int cpu, seq;
+	static DEFINE_MUTEX(lock);
+	int cpu, this_gen;
 
 	/*
 	 * Make sure nobody triggers this path before mm_percpu_wq is fully
@@ -725,21 +735,48 @@ void lru_add_drain_all(void)
 	if (WARN_ON(!mm_percpu_wq))
 		return;
 
-	seq = raw_read_seqcount_latch(&seqcount);
+	/*
+	 * (B) Cache the LRU draining generation number
+	 *
+	 * smp_rmb() ensures that the counter is loaded before the mutex is
+	 * taken. It pairs with the smp_wmb() inside the mutex critical section
+	 * at (D).
+	 */
+	this_gen = READ_ONCE(lru_drain_gen);
+	smp_rmb();
 
 	mutex_lock(&lock);
 
 	/*
-	 * Piggyback on drain started and finished while we waited for lock:
-	 * all pages pended at the time of our enter were drained from vectors.
+	 * (C) Exit the draining operation if a newer generation, from another
+	 *     lru_add_drain_all(), was already scheduled for draining. Check (A).
 	 */
-	if (__read_seqcount_retry(&seqcount, seq))
+	if (unlikely(this_gen != lru_drain_gen))
 		goto done;
 
-	raw_write_seqcount_latch(&seqcount);
+	/*
+	 * (D) Increment generation number
+	 *
+	 * Pairs with READ_ONCE() and smp_rmb() at (B), outside of the critical
+	 * section.
+	 *
+	 * This pairing must be done here, before the for_each_online_cpu loop
+	 * below which drains the page vectors.
+	 *
+	 * Let x, y, and z represent some system CPU numbers, where x < y < z.
+	 * Assume CPU #z is in the middle of the for_each_online_cpu loop
+	 * below and has already reached CPU #y's per-cpu data. CPU #x comes
+	 * along, adds some pages to its per-cpu vectors, then calls
+	 * lru_add_drain_all().
+	 *
+	 * If the paired smp_wmb() below is done at any later step, e.g. after
+	 * the loop, CPU #x will just exit at (C) and miss flushing out all of
+	 * its added pages.
+	 */
+	WRITE_ONCE(lru_drain_gen, lru_drain_gen + 1);
+	smp_wmb();
 
 	cpumask_clear(&has_work);
-
 	for_each_online_cpu(cpu) {
 		struct work_struct *work = &per_cpu(lru_add_drain_work, cpu);
 
@@ -766,7 +803,7 @@ void lru_add_drain_all(void)
 {
 	lru_add_drain();
 }
-#endif
+#endif /* CONFIG_SMP */
 
 /**
  * release_pages - batched put_page()