From patchwork Thu May 28 11:00:43 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Shi
X-Patchwork-Id: 11575577
From: Alex Shi
To: akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org,
 hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com,
 yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org,
 lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com,
 richard.weiyang@gmail.com
Cc: Alex Shi
Subject: [PATCH v11 01/16] mm/vmscan: remove unnecessary lruvec adding
Date: Thu, 28 May 2020 19:00:43 +0800
Message-Id: <1590663658-184131-2-git-send-email-alex.shi@linux.alibaba.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com>
References: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com>

We don't have to add a freeable page to the LRU and then remove it
again. This change saves a couple of operations and makes the page
movement clearer.

The SetPageLRU needs to be kept here for list integrity. Otherwise:

    #0 move_pages_to_lru             #1 release_pages
                                     if (put_page_testzero())
    if !put_page_testzero
                                       !PageLRU //skip lru_lock
                                         list_add(&page->lru,)
      list_add(&page->lru,) //corrupt

[akpm@linux-foundation.org: coding style fixes]
Signed-off-by: Alex Shi
Cc: Andrew Morton
Cc: Johannes Weiner
Cc: Tejun Heo
Cc: Matthew Wilcox
Cc: Hugh Dickins
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/vmscan.c | 32 +++++++++++++++++++++-----------
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 3a482b22fe4e..d856a1545ad6 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1855,26 +1855,29 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
 	while (!list_empty(list)) {
 		page = lru_to_page(list);
 		VM_BUG_ON_PAGE(PageLRU(page), page);
+		list_del(&page->lru);
 		if (unlikely(!page_evictable(page))) {
-			list_del(&page->lru);
 			spin_unlock_irq(&pgdat->lru_lock);
 			putback_lru_page(page);
 			spin_lock_irq(&pgdat->lru_lock);
 			continue;
 		}
-		lruvec = mem_cgroup_page_lruvec(page, pgdat);
 
+		/*
+		 * The SetPageLRU needs to be kept here for list integrity.
+		 * Otherwise:
+		 *   #0 move_pages_to_lru             #1 release_pages
+		 *                                    if (put_page_testzero())
+		 *   if !put_page_testzero
+		 *                                      !PageLRU //skip lru_lock
+		 *                                        list_add(&page->lru,)
+		 *     list_add(&page->lru,) //corrupt
+		 */
 		SetPageLRU(page);
-		lru = page_lru(page);
-
-		nr_pages = hpage_nr_pages(page);
-		update_lru_size(lruvec, lru, page_zonenum(page), nr_pages);
-		list_move(&page->lru, &lruvec->lists[lru]);
 
-		if (put_page_testzero(page)) {
+		if (unlikely(put_page_testzero(page))) {
 			__ClearPageLRU(page);
 			__ClearPageActive(page);
-			del_page_from_lru_list(page, lruvec, lru);
 
 			if (unlikely(PageCompound(page))) {
 				spin_unlock_irq(&pgdat->lru_lock);
@@ -1882,9 +1885,16 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
 				spin_lock_irq(&pgdat->lru_lock);
 			} else
 				list_add(&page->lru, &pages_to_free);
-		} else {
-			nr_moved += nr_pages;
+			continue;
 		}
+
+		lruvec = mem_cgroup_page_lruvec(page, pgdat);
+		lru = page_lru(page);
+		nr_pages = hpage_nr_pages(page);
+
+		update_lru_size(lruvec, lru, page_zonenum(page), nr_pages);
+		list_add(&page->lru, &lruvec->lists[lru]);
+		nr_moved += nr_pages;
 	}
 
 	/*

From patchwork Thu May 28 11:00:44 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Shi
X-Patchwork-Id: 11575557
From: Alex Shi
To: akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org,
 hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com,
 yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org,
 lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com,
 richard.weiyang@gmail.com
Cc: Alex Shi
Subject: [PATCH v11 02/16] mm/page_idle: no unlikely double check for idle page counting
Date: Thu, 28 May 2020 19:00:44 +0800
Message-Id: <1590663658-184131-3-git-send-email-alex.shi@linux.alibaba.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com>
References: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com>

As the function comment mentions, missing a few isolated pages can be
tolerated. So go one step further and drop the unlikely double check.
That won't cause more idle pages, but it reduces contention on lru_lock.

This is also preparation for the later new page isolation feature.
Signed-off-by: Alex Shi
Cc: Andrew Morton
Cc: Johannes Weiner
Cc: Matthew Wilcox
Cc: Hugh Dickins
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/page_idle.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/mm/page_idle.c b/mm/page_idle.c
index 295512465065..914df63948b1 100644
--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -31,7 +31,6 @@
 static struct page *page_idle_get_page(unsigned long pfn)
 {
 	struct page *page;
-	pg_data_t *pgdat;
 
 	if (!pfn_valid(pfn))
 		return NULL;
@@ -41,13 +40,6 @@ static struct page *page_idle_get_page(unsigned long pfn)
 	    !get_page_unless_zero(page))
 		return NULL;
 
-	pgdat = page_pgdat(page);
-	spin_lock_irq(&pgdat->lru_lock);
-	if (unlikely(!PageLRU(page))) {
-		put_page(page);
-		page = NULL;
-	}
-	spin_unlock_irq(&pgdat->lru_lock);
 	return page;
 }
 

From patchwork Thu May 28 11:00:45 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Shi
X-Patchwork-Id: 11575561
From: Alex Shi
To: akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org,
 hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com,
 yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org,
 lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com,
 richard.weiyang@gmail.com
Cc: Alex Shi
Subject: [PATCH v11 03/16] mm/compaction: correct the comments of compact_defer_shift
Date: Thu, 28 May 2020 19:00:45 +0800
Message-Id: <1590663658-184131-4-git-send-email-alex.shi@linux.alibaba.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com>
References: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com>

There is no compact_defer_limit; the field actually in use is
compact_defer_shift. Also add an explanation of compact_order_failed.
Signed-off-by: Alex Shi
Cc: Andrew Morton
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 include/linux/mmzone.h | 1 +
 mm/compaction.c        | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index d8cad09d34ff..545b663678ed 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -510,6 +510,7 @@ struct zone {
 	 * On compaction failure, 1<
[...]

X-Patchwork-Id: 11575559
From: Alex Shi
To: akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org,
 hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com,
 yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org,
 lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com,
 richard.weiyang@gmail.com
Cc: Alex Shi, Steven Rostedt, Ingo Molnar, Vlastimil Babka, Mike Kravetz
Subject: [PATCH v11 04/16] mm/compaction: rename compact_deferred as compact_should_defer
Date: Thu, 28 May 2020 19:00:46 +0800
Message-Id: <1590663658-184131-5-git-send-email-alex.shi@linux.alibaba.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com>
References: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com>

compaction_deferred() only checks whether compaction should be
deferred; the deferring itself is done in defer_compaction(), not here.
Rename it to avoid confusion.

Signed-off-by: Alex Shi
Cc: Steven Rostedt
Cc: Ingo Molnar
Cc: Andrew Morton
Cc: Vlastimil Babka
Cc: Mike Kravetz
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
---
 include/linux/compaction.h        | 4 ++--
 include/trace/events/compaction.h | 2 +-
 mm/compaction.c                   | 8 ++++----
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index 3ed2f22b588a..9626f057b9f2 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -100,7 +100,7 @@ extern enum compact_result compaction_suitable(struct zone *zone, int order,
 		unsigned int alloc_flags, int highest_zoneidx);
 
 extern void defer_compaction(struct zone *zone, int order);
-extern bool compaction_deferred(struct zone *zone, int order);
+extern bool compaction_should_defer(struct zone *zone, int order);
 extern void compaction_defer_reset(struct zone *zone, int order,
 				bool alloc_success);
 extern bool compaction_restarting(struct zone *zone, int order);
@@ -199,7 +199,7 @@ static inline void defer_compaction(struct zone *zone, int order)
 {
 }
 
-static inline bool compaction_deferred(struct zone *zone, int order)
+static inline bool compaction_should_defer(struct zone *zone, int order)
 {
 	return true;
 }
diff --git a/include/trace/events/compaction.h b/include/trace/events/compaction.h
index 54e5bf081171..33633c71df04 100644
--- a/include/trace/events/compaction.h
+++ b/include/trace/events/compaction.h
@@ -274,7 +274,7 @@
 		1UL << __entry->defer_shift)
 );
 
-DEFINE_EVENT(mm_compaction_defer_template, mm_compaction_deferred,
+DEFINE_EVENT(mm_compaction_defer_template, mm_compaction_should_defer,
 
 	TP_PROTO(struct zone *zone, int order),
 
diff --git a/mm/compaction.c b/mm/compaction.c
index 38cdf392837b..c359772dbfcc 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -154,7 +154,7 @@ void defer_compaction(struct zone *zone, int order)
 }
 
 /* Returns true if compaction should be skipped this time */
-bool compaction_deferred(struct zone *zone, int order)
+bool compaction_should_defer(struct zone *zone, int order)
 {
 	unsigned long defer_limit = 1UL << zone->compact_defer_shift;
 
@@ -168,7 +168,7 @@ bool compaction_deferred(struct zone *zone, int order)
 	if (zone->compact_considered >= defer_limit)
 		return false;
 
-	trace_mm_compaction_deferred(zone, order);
+	trace_mm_compaction_should_defer(zone, order);
 
 	return true;
 }
@@ -2379,7 +2379,7 @@ enum compact_result try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
 		enum compact_result status;
 
 		if (prio > MIN_COMPACT_PRIORITY
-					&& compaction_deferred(zone, order)) {
+					&& compaction_should_defer(zone, order)) {
 			rc = max_t(enum compact_result, COMPACT_DEFERRED, rc);
 			continue;
 		}
@@ -2563,7 +2563,7 @@ static void kcompactd_do_work(pg_data_t *pgdat)
 		if (!populated_zone(zone))
 			continue;
 
-		if (compaction_deferred(zone, cc.order))
+		if (compaction_should_defer(zone, cc.order))
 			continue;
 
 		if (compaction_suitable(zone, cc.order, 0, zoneid) !=

From patchwork Thu May 28 11:00:47 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Shi
X-Patchwork-Id: 11575583
From: Alex Shi
To: akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org,
 hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com,
 yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org,
 lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com,
 richard.weiyang@gmail.com
Cc: Alex Shi
Subject: [PATCH v11 05/16] mm/thp: move lru_add_page_tail func to huge_memory.c
Date: Thu, 28 May 2020 19:00:47 +0800
Message-Id: <1590663658-184131-6-git-send-email-alex.shi@linux.alibaba.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com>
References: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com>

The function is only used in huge_memory.c; defining it in another file
under a CONFIG_TRANSPARENT_HUGEPAGE guard just looks odd.

Let's move it to the THP code.

Signed-off-by: Alex Shi
Cc: Andrew Morton
Cc: Johannes Weiner
Cc: Matthew Wilcox
Cc: Hugh Dickins
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
---
 include/linux/swap.h |  2 --
 mm/huge_memory.c     | 30 ++++++++++++++++++++++++++++++
 mm/swap.c            | 33 ---------------------------------
 3 files changed, 30 insertions(+), 35 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index d9e362c7439c..d12ecacce307 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -338,8 +338,6 @@ extern void lru_note_cost(struct lruvec *lruvec, bool file,
 			  unsigned int nr_pages);
 extern void lru_note_cost_page(struct page *);
 extern void lru_cache_add(struct page *);
-extern void lru_add_page_tail(struct page *page, struct page *page_tail,
-			 struct lruvec *lruvec, struct list_head *head);
 extern void activate_page(struct page *);
 extern void mark_page_accessed(struct page *);
 extern void lru_add_drain(void);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 21e6687895e2..4c3990ba29cb 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2316,6 +2316,36 @@ static void remap_page(struct page *page)
 	}
 }
 
+void lru_add_page_tail(struct page *page, struct page *page_tail,
+		       struct lruvec *lruvec, struct list_head *list)
+{
+	VM_BUG_ON_PAGE(!PageHead(page), page);
+	VM_BUG_ON_PAGE(PageCompound(page_tail), page);
+	VM_BUG_ON_PAGE(PageLRU(page_tail), page);
+	lockdep_assert_held(&lruvec_pgdat(lruvec)->lru_lock);
+
+	if (!list)
+		SetPageLRU(page_tail);
+
+	if (likely(PageLRU(page)))
+		list_add_tail(&page_tail->lru, &page->lru);
+	else if (list) {
+		/* page reclaim is reclaiming a huge page */
+		get_page(page_tail);
+		list_add_tail(&page_tail->lru, list);
+	} else {
+		/*
+		 * Head page has not yet been counted, as an hpage,
+		 * so we must account for each subpage individually.
+		 *
+		 * Put page_tail on the list at the correct position
+		 * so they all end up in order.
+		 */
+		add_page_to_lru_list_tail(page_tail, lruvec,
+					  page_lru(page_tail));
+	}
+}
+
 static void __split_huge_page_tail(struct page *head, int tail,
 		struct lruvec *lruvec, struct list_head *list)
 {
diff --git a/mm/swap.c b/mm/swap.c
index acd88873f076..ffb4ea7b82b5 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -880,39 +880,6 @@ void __pagevec_release(struct pagevec *pvec)
 }
 EXPORT_SYMBOL(__pagevec_release);
 
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-/* used by __split_huge_page_refcount() */
-void lru_add_page_tail(struct page *page, struct page *page_tail,
-		       struct lruvec *lruvec, struct list_head *list)
-{
-	VM_BUG_ON_PAGE(!PageHead(page), page);
-	VM_BUG_ON_PAGE(PageCompound(page_tail), page);
-	VM_BUG_ON_PAGE(PageLRU(page_tail), page);
-	lockdep_assert_held(&lruvec_pgdat(lruvec)->lru_lock);
-
-	if (!list)
-		SetPageLRU(page_tail);
-
-	if (likely(PageLRU(page)))
-		list_add_tail(&page_tail->lru, &page->lru);
-	else if (list) {
-		/* page reclaim is reclaiming a huge page */
-		get_page(page_tail);
-		list_add_tail(&page_tail->lru, list);
-	} else {
-		/*
-		 * Head page has not yet been counted, as an hpage,
-		 * so we must account for each subpage individually.
-		 *
-		 * Put page_tail on the list at the correct position
-		 * so they all end up in order.
- */ - add_page_to_lru_list_tail(page_tail, lruvec, - page_lru(page_tail)); - } -} -#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ - static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec, void *arg) { From patchwork Thu May 28 11:00:48 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11575589 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6109E60D for ; Thu, 28 May 2020 11:02:39 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 380A3208E4 for ; Thu, 28 May 2020 11:02:39 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 380A3208E4 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 50255800BE; Thu, 28 May 2020 07:02:38 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 48AC38001A; Thu, 28 May 2020 07:02:38 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 352F1800BE; Thu, 28 May 2020 07:02:38 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0159.hostedemail.com [216.40.44.159]) by kanga.kvack.org (Postfix) with ESMTP id 1B3628001A for ; Thu, 28 May 2020 07:02:38 -0400 (EDT) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id D8334180AD83B for ; Thu, 28 May 2020 11:02:37 +0000 (UTC) X-FDA: 76865839554.03.story60_6dfdf7d149c56 Received: from 
filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin03.hostedemail.com (Postfix) with ESMTP id A091B28A4EA for ; Thu, 28 May 2020 11:02:37 +0000 (UTC) X-Spam-Summary: 2,0,0,28a8d4e4d93bd3bd,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:41:355:379:541:800:960:973:988:989:1260:1261:1345:1359:1431:1437:1534:1541:1711:1730:1747:1777:1792:2393:2559:2562:3138:3139:3140:3141:3142:3352:3865:3867:3871:3872:4321:4605:5007:6261:6737:7903:8957:9010:10004:11026:11232:11473:11658:11914:12043:12048:12296:12297:12555:12895:13069:13311:13357:13846:14096:14181:14384:14394:14721:14915:21060:21080:21451:21627:30054,0,RBL:115.124.30.45:@linux.alibaba.com:.lbl8.mailshell.net-62.20.2.100 64.201.201.201,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:1,LUA_SUMMARY:none X-HE-Tag: story60_6dfdf7d149c56 X-Filterd-Recvd-Size: 2995 Received: from out30-45.freemail.mail.aliyun.com (out30-45.freemail.mail.aliyun.com [115.124.30.45]) by imf36.hostedemail.com (Postfix) with ESMTP for ; Thu, 28 May 2020 11:02:35 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R681e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e07488;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=17;SR=0;TI=SMTPD_---0TztM4bt_1590663687; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TztM4bt_1590663687) by smtp.aliyun-inc.com(127.0.0.1); Thu, 28 May 2020 19:01:27 +0800 From: Alex Shi To: akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com, richard.weiyang@gmail.com Cc: Alex Shi Subject: [PATCH v11 06/16] mm/thp: clean up lru_add_page_tail Date: Thu, 28 May 2020 19:00:48 +0800 Message-Id: 
<1590663658-184131-7-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com> References: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com> X-Rspamd-Queue-Id: A091B28A4EA X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam05 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Since the first parameter is only ever used as the head page, it's better to name it accordingly: rename 'page' to 'head'. Signed-off-by: Alex Shi Cc: Andrew Morton Cc: Johannes Weiner Cc: Matthew Wilcox Cc: Hugh Dickins Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org --- mm/huge_memory.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 4c3990ba29cb..a4ba75e143b3 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2316,19 +2316,19 @@ static void remap_page(struct page *page) } } -void lru_add_page_tail(struct page *page, struct page *page_tail, +void lru_add_page_tail(struct page *head, struct page *page_tail, struct lruvec *lruvec, struct list_head *list) { - VM_BUG_ON_PAGE(!PageHead(page), page); - VM_BUG_ON_PAGE(PageCompound(page_tail), page); - VM_BUG_ON_PAGE(PageLRU(page_tail), page); + VM_BUG_ON_PAGE(!PageHead(head), head); + VM_BUG_ON_PAGE(PageCompound(page_tail), head); + VM_BUG_ON_PAGE(PageLRU(page_tail), head); lockdep_assert_held(&lruvec_pgdat(lruvec)->lru_lock); if (!list) SetPageLRU(page_tail); - if (likely(PageLRU(page))) - list_add_tail(&page_tail->lru, &page->lru); + if (likely(PageLRU(head))) + list_add_tail(&page_tail->lru, &head->lru); else if (list) { /* page reclaim is reclaiming a huge page */ get_page(page_tail); From patchwork Thu May 28 11:00:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id:
11575587 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CE89860D for ; Thu, 28 May 2020 11:02:29 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A46D12088E for ; Thu, 28 May 2020 11:02:29 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A46D12088E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id BE04E800C4; Thu, 28 May 2020 07:02:28 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id B8FCE800BE; Thu, 28 May 2020 07:02:28 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A7F03800C4; Thu, 28 May 2020 07:02:28 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0159.hostedemail.com [216.40.44.159]) by kanga.kvack.org (Postfix) with ESMTP id 8B9F8800BE for ; Thu, 28 May 2020 07:02:28 -0400 (EDT) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 4E79F4DAC for ; Thu, 28 May 2020 11:02:28 +0000 (UTC) X-FDA: 76865839176.03.van49_6cb5ef237822c Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin03.hostedemail.com (Postfix) with ESMTP id 28C6C28A4E9 for ; Thu, 28 May 2020 11:02:28 +0000 (UTC) X-Spam-Summary: 
2,0,0,d27375980f7be9c0,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:41:69:355:379:541:800:960:966:973:988:989:1260:1261:1345:1359:1431:1437:1534:1543:1711:1730:1747:1777:1792:2196:2198:2199:2200:2393:2559:2562:2895:2904:3138:3139:3140:3141:3142:3354:3865:3867:3868:4321:4385:5007:6261:6737:7514:7903:8957:9010:9592:10004:11026:11232:11473:11658:11914:12043:12048:12296:12297:12438:12555:12679:12895:13846:14096:14181:14394:14721:14915:21060:21080:21451:21627:21740:30054:30070,0,RBL:115.124.30.43:@linux.alibaba.com:.lbl8.mailshell.net-64.201.201.201 62.20.2.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: van49_6cb5ef237822c X-Filterd-Recvd-Size: 4868 Received: from out30-43.freemail.mail.aliyun.com (out30-43.freemail.mail.aliyun.com [115.124.30.43]) by imf50.hostedemail.com (Postfix) with ESMTP for ; Thu, 28 May 2020 11:02:26 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R811e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e07425;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=19;SR=0;TI=SMTPD_---0TztfJfN_1590663687; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TztfJfN_1590663687) by smtp.aliyun-inc.com(127.0.0.1); Thu, 28 May 2020 19:01:27 +0800 From: Alex Shi To: akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com, richard.weiyang@gmail.com Cc: Alex Shi , "Kirill A. 
Shutemov" , Andrea Arcangeli Subject: [PATCH v11 07/16] mm/thp: narrow lru locking Date: Thu, 28 May 2020 19:00:49 +0800 Message-Id: <1590663658-184131-8-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com> References: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com> X-Rspamd-Queue-Id: 28C6C28A4E9 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam05 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: There is no reason for lru_lock and the page cache xa_lock to be taken in the current sequence, so holding them together isn't necessary. Let's narrow the lru locking, but keep interrupts and preemption disabled (local_irq_save()/preempt_disable()) to block interrupt re-entry and statistics updates. Signed-off-by: Alex Shi Signed-off-by: Wei Yang Cc: Kirill A. Shutemov Cc: Andrea Arcangeli Cc: Johannes Weiner Cc: Andrew Morton Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org --- mm/huge_memory.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index a4ba75e143b3..44d4b45281a3 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2418,8 +2418,6 @@ static void __split_huge_page(struct page *page, struct list_head *list, unsigned long offset = 0; int i; - lruvec = mem_cgroup_page_lruvec(head, pgdat); - /* complete memcg works before add pages to LRU */ mem_cgroup_split_huge_fixup(head); @@ -2431,6 +2429,11 @@ static void __split_huge_page(struct page *page, struct list_head *list, xa_lock(&swap_cache->i_pages); } + /* lock lru list/PageCompound, isolate freezed by page_ref_freeze */ + spin_lock(&pgdat->lru_lock); + + lruvec = mem_cgroup_page_lruvec(head, pgdat); + for (i = HPAGE_PMD_NR - 1; i >= 1; i--) { __split_huge_page_tail(head, i, lruvec, list); /* Some pages can be beyond i_size: drop them from page cache
*/ @@ -2448,8 +2451,8 @@ static void __split_huge_page(struct page *page, struct list_head *list, head + i, 0); } } - ClearPageCompound(head); + spin_unlock(&pgdat->lru_lock); split_page_owner(head, HPAGE_PMD_ORDER); @@ -2467,8 +2470,8 @@ static void __split_huge_page(struct page *page, struct list_head *list, page_ref_add(head, 2); xa_unlock(&head->mapping->i_pages); } - - spin_unlock_irqrestore(&pgdat->lru_lock, flags); + preempt_enable(); + local_irq_restore(flags); remap_page(head); @@ -2607,7 +2610,6 @@ bool can_split_huge_page(struct page *page, int *pextra_pins) int split_huge_page_to_list(struct page *page, struct list_head *list) { struct page *head = compound_head(page); - struct pglist_data *pgdata = NODE_DATA(page_to_nid(head)); struct deferred_split *ds_queue = get_deferred_split_queue(head); struct anon_vma *anon_vma = NULL; struct address_space *mapping = NULL; @@ -2673,9 +2675,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list) unmap_page(head); VM_BUG_ON_PAGE(compound_mapcount(head), head); - /* prevent PageLRU to go away from under us, and freeze lru stats */ - spin_lock_irqsave(&pgdata->lru_lock, flags); - + local_irq_save(flags); + preempt_disable(); if (mapping) { XA_STATE(xas, &mapping->i_pages, page_index(head)); @@ -2724,7 +2725,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list) spin_unlock(&ds_queue->split_queue_lock); fail: if (mapping) xa_unlock(&mapping->i_pages); - spin_unlock_irqrestore(&pgdata->lru_lock, flags); + preempt_enable(); + local_irq_restore(flags); remap_page(head); ret = -EBUSY; } From patchwork Thu May 28 11:00:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11575565 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A150492A for ; Thu, 28 May 2020 11:01:45 +0000 
(UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 786D720888 for ; Thu, 28 May 2020 11:01:45 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 786D720888 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 15A78800BA; Thu, 28 May 2020 07:01:36 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 13202800B8; Thu, 28 May 2020 07:01:36 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 045FF800BA; Thu, 28 May 2020 07:01:36 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0041.hostedemail.com [216.40.44.41]) by kanga.kvack.org (Postfix) with ESMTP id E456D800B8 for ; Thu, 28 May 2020 07:01:35 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id A0F9F180AD83E for ; Thu, 28 May 2020 11:01:35 +0000 (UTC) X-FDA: 76865836950.29.ant56_650eeb7bed555 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin29.hostedemail.com (Postfix) with ESMTP id 2AEDB180868F2 for ; Thu, 28 May 2020 11:01:35 +0000 (UTC) X-Spam-Summary: 
2,0,0,0b1f9924323506a5,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:41:355:379:541:800:960:973:988:989:1260:1261:1345:1359:1431:1437:1534:1541:1711:1714:1730:1747:1777:1792:2393:2559:2562:3138:3139:3140:3141:3142:3351:3870:3876:4321:5007:6261:6737:7514:10004:11026:11473:11658:11914:12043:12048:12296:12297:12438:12555:12895:13069:13311:13357:13846:14096:14181:14384:14394:14721:14915:21060:21080:21450:21451:21627:21990:30054:30070,0,RBL:115.124.30.42:@linux.alibaba.com:.lbl8.mailshell.net-64.201.201.201 62.20.2.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: ant56_650eeb7bed555 X-Filterd-Recvd-Size: 2571 Received: from out30-42.freemail.mail.aliyun.com (out30-42.freemail.mail.aliyun.com [115.124.30.42]) by imf36.hostedemail.com (Postfix) with ESMTP for ; Thu, 28 May 2020 11:01:34 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R171e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e01419;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=19;SR=0;TI=SMTPD_---0Tztb9j4_1590663687; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0Tztb9j4_1590663687) by smtp.aliyun-inc.com(127.0.0.1); Thu, 28 May 2020 19:01:28 +0800 From: Alex Shi To: akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com, richard.weiyang@gmail.com Cc: Alex Shi , Michal Hocko , Vladimir Davydov Subject: [PATCH v11 08/16] mm/memcg: add debug checking in lock_page_memcg Date: Thu, 28 May 2020 19:00:50 +0800 Message-Id: <1590663658-184131-9-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: 
<1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com> References: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com> X-Rspamd-Queue-Id: 2AEDB180868F2 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam05 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Add a debug check in lock_page_memcg() so that we get an alarm if anything is wrong here. Suggested-by: Johannes Weiner Signed-off-by: Alex Shi Cc: Johannes Weiner Cc: Michal Hocko Cc: Vladimir Davydov Cc: Andrew Morton Cc: cgroups@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org --- mm/memcontrol.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index d04f1e242d47..91b073891d06 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1983,6 +1983,12 @@ struct mem_cgroup *lock_page_memcg(struct page *page) if (unlikely(!memcg)) return NULL; +#ifdef CONFIG_PROVE_LOCKING + local_irq_save(flags); + might_lock(&memcg->move_lock); + local_irq_restore(flags); +#endif + if (atomic_read(&memcg->moving_account) <= 0) return memcg; From patchwork Thu May 28 11:00:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11575579 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6EA3760D for ; Thu, 28 May 2020 11:02:04 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 45E5A2088E for ; Thu, 28 May 2020 11:02:04 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 45E5A2088E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass
smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id B446A800C0; Thu, 28 May 2020 07:01:43 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id AF73C800BE; Thu, 28 May 2020 07:01:43 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8B0D2800C1; Thu, 28 May 2020 07:01:43 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0136.hostedemail.com [216.40.44.136]) by kanga.kvack.org (Postfix) with ESMTP id 6B681800C0 for ; Thu, 28 May 2020 07:01:43 -0400 (EDT) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 1E61F52A8 for ; Thu, 28 May 2020 11:01:43 +0000 (UTC) X-FDA: 76865837286.12.owl16_66104662ff424 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin12.hostedemail.com (Postfix) with ESMTP id 986A41801EC26 for ; Thu, 28 May 2020 11:01:42 +0000 (UTC) X-Spam-Summary: 2,0,0,84879f7f4805f535,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:2:41:69:355:379:541:800:960:967:968:973:988:989:1260:1261:1345:1359:1431:1437:1535:1605:1606:1730:1747:1777:1792:2393:2525:2553:2559:2563:2682:2685:2693:2859:2898:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3865:3867:3868:3872:3874:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4117:4250:4321:4605:5007:6261:6737:7514:7903:8603:8957:9010:9025:9592:10004:11026:11232:11473:11638:11639:11658:11914:12043:12048:12296:12297:12438:12555:12895:12986:13161:13229:13845:13846:14096:14394:14915:21060:21080:21451:21627:21740:21788:21809:21987:21990:30054:30064:30070:30090,0,RBL:115.124.30.57:@linux.alibaba.com:.lbl8.mailshell.net-62.20.2.100 64.201.201.201,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not 
bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: owl16_66104662ff424 X-Filterd-Recvd-Size: 6827 Received: from out30-57.freemail.mail.aliyun.com (out30-57.freemail.mail.aliyun.com [115.124.30.57]) by imf13.hostedemail.com (Postfix) with ESMTP for ; Thu, 28 May 2020 11:01:40 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R131e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04357;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=19;SR=0;TI=SMTPD_---0TztNDDE_1590663688; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TztNDDE_1590663688) by smtp.aliyun-inc.com(127.0.0.1); Thu, 28 May 2020 19:01:28 +0800 From: Alex Shi To: akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com, richard.weiyang@gmail.com Cc: Alex Shi , Michal Hocko , Vladimir Davydov Subject: [PATCH v11 09/16] mm/lru: introduce TestClearPageLRU Date: Thu, 28 May 2020 19:00:51 +0800 Message-Id: <1590663658-184131-10-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com> References: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com> X-Rspamd-Queue-Id: 986A41801EC26 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam04 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Combine the PageLRU check and ClearPageLRU into a single operation via the newly introduced function TestClearPageLRU. This function will be used as a page isolation precondition, to exclude other isolations elsewhere.
As a result, there may be non-PageLRU pages on the lru list, so the BUG checks need to be removed accordingly. As Andrew Morton mentioned, this change would dirty the cacheline of a page that isn't on the LRU. But the loss is acceptable according to Rong Chen's report: https://lkml.org/lkml/2020/3/4/173 Suggested-by: Johannes Weiner Signed-off-by: Alex Shi Cc: Johannes Weiner Cc: Michal Hocko Cc: Vladimir Davydov Cc: Andrew Morton Cc: linux-kernel@vger.kernel.org Cc: cgroups@vger.kernel.org Cc: linux-mm@kvack.org --- include/linux/page-flags.h | 1 + mm/mlock.c | 3 +-- mm/swap.c | 8 ++------ mm/vmscan.c | 29 +++++++++++++---------------- 4 files changed, 17 insertions(+), 24 deletions(-) diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index 222f6f7b2bb3..45a576631a94 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -326,6 +326,7 @@ static inline void page_init_poison(struct page *page, size_t size) PAGEFLAG(Dirty, dirty, PF_HEAD) TESTSCFLAG(Dirty, dirty, PF_HEAD) __CLEARPAGEFLAG(Dirty, dirty, PF_HEAD) PAGEFLAG(LRU, lru, PF_HEAD) __CLEARPAGEFLAG(LRU, lru, PF_HEAD) + TESTCLEARFLAG(LRU, lru, PF_HEAD) PAGEFLAG(Active, active, PF_HEAD) __CLEARPAGEFLAG(Active, active, PF_HEAD) TESTCLEARFLAG(Active, active, PF_HEAD) PAGEFLAG(Workingset, workingset, PF_HEAD) diff --git a/mm/mlock.c b/mm/mlock.c index a72c1eeded77..03b3a5d99ad7 100644 --- a/mm/mlock.c +++ b/mm/mlock.c @@ -108,13 +108,12 @@ void mlock_vma_page(struct page *page) */ static bool __munlock_isolate_lru_page(struct page *page, bool getpage) { - if (PageLRU(page)) { + if (TestClearPageLRU(page)) { struct lruvec *lruvec; lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); if (getpage) get_page(page); - ClearPageLRU(page); del_page_from_lru_list(page, lruvec, page_lru(page)); return true; } diff --git a/mm/swap.c b/mm/swap.c index ffb4ea7b82b5..2898efc24135 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -59,15 +59,13 @@ */ static void __page_cache_release(struct page *page) { - if (PageLRU(page)) { + if
(TestClearPageLRU(page)) { pg_data_t *pgdat = page_pgdat(page); struct lruvec *lruvec; unsigned long flags; spin_lock_irqsave(&pgdat->lru_lock, flags); lruvec = mem_cgroup_page_lruvec(page, pgdat); - VM_BUG_ON_PAGE(!PageLRU(page), page); - __ClearPageLRU(page); del_page_from_lru_list(page, lruvec, page_off_lru(page)); spin_unlock_irqrestore(&pgdat->lru_lock, flags); } @@ -827,7 +825,7 @@ void release_pages(struct page **pages, int nr) continue; } - if (PageLRU(page)) { + if (TestClearPageLRU(page)) { struct pglist_data *pgdat = page_pgdat(page); if (pgdat != locked_pgdat) { @@ -840,8 +838,6 @@ void release_pages(struct page **pages, int nr) } lruvec = mem_cgroup_page_lruvec(page, locked_pgdat); - VM_BUG_ON_PAGE(!PageLRU(page), page); - __ClearPageLRU(page); del_page_from_lru_list(page, lruvec, page_off_lru(page)); } diff --git a/mm/vmscan.c b/mm/vmscan.c index d856a1545ad6..8a88a907c19d 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1547,16 +1547,16 @@ int __isolate_lru_page(struct page *page, isolate_mode_t mode) { int ret = -EINVAL; - /* Only take pages on the LRU. */ - if (!PageLRU(page)) - return ret; - /* Compaction should not handle unevictable pages but CMA can do so */ if (PageUnevictable(page) && !(mode & ISOLATE_UNEVICTABLE)) return ret; ret = -EBUSY; + /* Only take pages on the LRU. 
*/ + if (!PageLRU(page)) + return ret; + /* * To minimise LRU disruption, the caller can indicate that it only * wants to isolate pages it will be able to operate on without @@ -1670,8 +1670,6 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan, page = lru_to_page(src); prefetchw_prev_lru_page(page, src, flags); - VM_BUG_ON_PAGE(!PageLRU(page), page); - nr_pages = compound_nr(page); total_scan += nr_pages; @@ -1768,21 +1766,20 @@ int isolate_lru_page(struct page *page) VM_BUG_ON_PAGE(!page_count(page), page); WARN_RATELIMIT(PageTail(page), "trying to isolate tail page"); - if (PageLRU(page)) { + get_page(page); + if (TestClearPageLRU(page)) { pg_data_t *pgdat = page_pgdat(page); struct lruvec *lruvec; + int lru = page_lru(page); - spin_lock_irq(&pgdat->lru_lock); lruvec = mem_cgroup_page_lruvec(page, pgdat); - if (PageLRU(page)) { - int lru = page_lru(page); - get_page(page); - ClearPageLRU(page); - del_page_from_lru_list(page, lruvec, lru); - ret = 0; - } + spin_lock_irq(&pgdat->lru_lock); + del_page_from_lru_list(page, lruvec, lru); spin_unlock_irq(&pgdat->lru_lock); - } + ret = 0; + } else + put_page(page); + return ret; } From patchwork Thu May 28 11:00:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11575567 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2372592A for ; Thu, 28 May 2020 11:01:48 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id EE0E820888 for ; Thu, 28 May 2020 11:01:47 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EE0E820888 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by 
kanga.kvack.org (Postfix) id 5520C800B9; Thu, 28 May 2020 07:01:36 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 4DDF9800B8; Thu, 28 May 2020 07:01:36 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3575E800BB; Thu, 28 May 2020 07:01:36 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0226.hostedemail.com [216.40.44.226]) by kanga.kvack.org (Postfix) with ESMTP id 007CB800B9 for ; Thu, 28 May 2020 07:01:35 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id A7B5B52B7 for ; Thu, 28 May 2020 11:01:35 +0000 (UTC) X-FDA: 76865836950.22.list85_64fef5e08331a Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin22.hostedemail.com (Postfix) with ESMTP id 0D63918038E78 for ; Thu, 28 May 2020 11:01:35 +0000 (UTC) X-Spam-Summary: 2,0,0,29fb9ed91dc45078,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:2:41:69:355:379:541:800:960:966:973:988:989:1260:1261:1345:1359:1431:1437:1535:1605:1730:1747:1777:1792:2196:2198:2199:2200:2393:2553:2559:2562:2693:2731:2736:2895:2898:2899:3138:3139:3140:3141:3142:3369:3865:3867:3868:3870:3871:3872:3874:4049:4118:4250:4321:4385:4605:5007:6119:6261:6737:7903:8603:8957:9010:9592:10004:11026:11232:11658:11914:12043:12048:12291:12296:12297:12438:12555:12683:12895:12986:13846:14394:14915:21060:21080:21450:21451:21627:21987:21990:30012:30054:30070:30090,0,RBL:115.124.30.132:@linux.alibaba.com:.lbl8.mailshell.net-64.201.201.201 62.20.2.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: list85_64fef5e08331a X-Filterd-Recvd-Size: 7623 Received: from 
out30-132.freemail.mail.aliyun.com (out30-132.freemail.mail.aliyun.com [115.124.30.132]) by imf06.hostedemail.com (Postfix) with ESMTP for ; Thu, 28 May 2020 11:01:33 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R231e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01f04427;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=17;SR=0;TI=SMTPD_---0TztNb5P_1590663688; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TztNb5P_1590663688) by smtp.aliyun-inc.com(127.0.0.1); Thu, 28 May 2020 19:01:29 +0800 From: Alex Shi To: akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com, richard.weiyang@gmail.com Cc: Alex Shi Subject: [PATCH v11 10/16] mm/compaction: do page isolation first in compaction Date: Thu, 28 May 2020 19:00:52 +0800 Message-Id: <1590663658-184131-11-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com> References: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com> X-Rspamd-Queue-Id: 0D63918038E78 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam04 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Johannes Weiner has suggested: "So here is a crazy idea that may be worth exploring: Right now, pgdat->lru_lock protects both PageLRU *and* the lruvec's linked list. Can we make PageLRU atomic and use it to stabilize the lru_lock instead, and then use the lru_lock only serialize list operations? ..." 
Yes, this patch does so for __isolate_lru_page, which is the core page isolation function in the compaction and shrinking paths. With this patch, compaction will only deal with pages that had PageLRU set and that it has now isolated, skipping just-allocated pages that have no LRU bit. The isolation also excludes the other isolations in memcg move_account, page migration and THP split_huge_page. As a side effect, PageLRU may already have been cleared by another isolation when a page is seen in the shrink_inactive_list path. If so, we can skip that page. Suggested-by: Johannes Weiner Signed-off-by: Alex Shi Cc: Andrew Morton Cc: Matthew Wilcox Cc: Hugh Dickins Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org --- include/linux/swap.h | 2 +- mm/compaction.c | 33 +++++++++++++++++++++++---------- mm/vmscan.c | 38 ++++++++++++++++++++++---------------- 3 files changed, 46 insertions(+), 27 deletions(-) diff --git a/include/linux/swap.h b/include/linux/swap.h index d12ecacce307..8baf0c2928e2 100644 --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -356,7 +356,7 @@ extern void lru_cache_add_active_or_unevictable(struct page *page, extern unsigned long zone_reclaimable_pages(struct zone *zone); extern unsigned long try_to_free_pages(struct zonelist *zonelist, int order, gfp_t gfp_mask, nodemask_t *mask); -extern int __isolate_lru_page(struct page *page, isolate_mode_t mode); +extern int __isolate_lru_page_prepare(struct page *page, isolate_mode_t mode); extern unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg, unsigned long nr_pages, gfp_t gfp_mask, diff --git a/mm/compaction.c b/mm/compaction.c index c359772dbfcc..c36d832b2a84 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -954,6 +954,23 @@ static bool too_many_isolated(pg_data_t *pgdat) if (!(cc->gfp_mask & __GFP_FS) && page_mapping(page)) goto isolate_fail; + if (__isolate_lru_page_prepare(page, isolate_mode) != 0) + goto isolate_fail; + + /* + * Be careful not to clear PageLRU until after we're + * sure the page is not being freed
elsewhere -- the + * page release code relies on it. + */ + if (unlikely(!get_page_unless_zero(page))) + goto isolate_fail; + + /* Try isolate the page */ + if (!TestClearPageLRU(page)) { + put_page(page); + goto isolate_fail; + } + /* If we already hold the lock, we can skip some rechecking */ if (!locked) { locked = compact_lock_irqsave(&pgdat->lru_lock, @@ -966,10 +983,6 @@ static bool too_many_isolated(pg_data_t *pgdat) goto isolate_abort; } - /* Recheck PageLRU and PageCompound under lock */ - if (!PageLRU(page)) - goto isolate_fail; - /* * Page become compound since the non-locked check, * and it's on LRU. It can only be a THP so the order @@ -980,18 +993,18 @@ static bool too_many_isolated(pg_data_t *pgdat) goto isolate_fail; } - /* Recheck page extra references under lock */ - if (page_count(page) > page_mapcount(page) + + /* + * Recheck page extra references under lock. The + * extra page_count comes from above + * get_page_unless_zero(). + */ + if (page_count(page) > page_mapcount(page) + 1 + (!PageAnon(page) || PageSwapCache(page))) goto isolate_fail; } lruvec = mem_cgroup_page_lruvec(page, pgdat); - /* Try isolate the page */ - if (__isolate_lru_page(page, isolate_mode) != 0) - goto isolate_fail; - /* The whole page is taken off the LRU; skip the tail pages. */ if (PageCompound(page)) low_pfn += compound_nr(page) - 1; diff --git a/mm/vmscan.c b/mm/vmscan.c index 8a88a907c19d..df0765203473 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1543,7 +1543,7 @@ unsigned int reclaim_clean_pages_from_list(struct zone *zone, * * returns 0 on success, -ve errno on failure. 
*/ -int __isolate_lru_page(struct page *page, isolate_mode_t mode) +int __isolate_lru_page_prepare(struct page *page, isolate_mode_t mode) { int ret = -EINVAL; @@ -1597,20 +1597,9 @@ int __isolate_lru_page(struct page *page, isolate_mode_t mode) if ((mode & ISOLATE_UNMAPPED) && page_mapped(page)) return ret; - if (likely(get_page_unless_zero(page))) { - /* - * Be careful not to clear PageLRU until after we're - * sure the page is not being freed elsewhere -- the - * page release code relies on it. - */ - ClearPageLRU(page); - ret = 0; - } - - return ret; + return 0; } - /* * Update LRU sizes after isolating pages. The LRU size updates must * be complete before mem_cgroup_update_lru_size due to a sanity check. @@ -1690,17 +1679,34 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan, * only when the page is being freed somewhere else. */ scan += nr_pages; - switch (__isolate_lru_page(page, mode)) { + switch (__isolate_lru_page_prepare(page, mode)) { case 0: + /* + * Be careful not to clear PageLRU until after we're + * sure the page is not being freed elsewhere -- the + * page release code relies on it. + */ + if (unlikely(!get_page_unless_zero(page))) + goto busy; + + if (!TestClearPageLRU(page)) { + /* + * This page may in other isolation path, + * but we still hold lru_lock. 
+ */ + put_page(page); + goto busy; + } + nr_taken += nr_pages; nr_zone_taken[page_zonenum(page)] += nr_pages; list_move(&page->lru, dst); break; - +busy: case -EBUSY: /* else it is being freed elsewhere */ list_move(&page->lru, src); - continue; + break; default: BUG(); From patchwork Thu May 28 11:00:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11575573 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A7FCB92A for ; Thu, 28 May 2020 11:01:55 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 7EA5920B80 for ; Thu, 28 May 2020 11:01:55 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7EA5920B80 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 7762B800BD; Thu, 28 May 2020 07:01:41 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 6B56E800B8; Thu, 28 May 2020 07:01:41 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 59C7E800BD; Thu, 28 May 2020 07:01:41 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 3DB05800B8 for ; Thu, 28 May 2020 07:01:41 -0400 (EDT) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id E6CF34DAB for ; Thu, 28 May 2020 11:01:40 +0000 (UTC) X-FDA: 
76865837160.06.class50_65ca64be81203 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin06.hostedemail.com (Postfix) with ESMTP id 9FD1510040F14 for ; Thu, 28 May 2020 11:01:40 +0000 (UTC) X-Spam-Summary: 2,0,0,f1970e51448005d8,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:2:41:69:355:379:541:560:800:960:968:973:988:989:1260:1261:1345:1359:1431:1437:1535:1605:1606:1730:1747:1777:1792:2198:2199:2393:2559:2562:2693:2898:3138:3139:3140:3141:3142:3865:3866:3867:3868:3870:3871:3872:3874:4117:4250:4321:5007:6261:6737:8957:9592:11026:11473:11658:11914:12043:12048:12291:12296:12297:12438:12555:12683:12895:13846:14394:14915:21060:21080:21451:21627:21987:21990:30054:30064:30070:30079,0,RBL:115.124.30.44:@linux.alibaba.com:.lbl8.mailshell.net-62.20.2.100 64.201.201.201,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:2:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: class50_65ca64be81203 X-Filterd-Recvd-Size: 6973 Received: from out30-44.freemail.mail.aliyun.com (out30-44.freemail.mail.aliyun.com [115.124.30.44]) by imf25.hostedemail.com (Postfix) with ESMTP for ; Thu, 28 May 2020 11:01:37 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R111e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e01358;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=18;SR=0;TI=SMTPD_---0TztNDDP_1590663689; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TztNDDP_1590663689) by smtp.aliyun-inc.com(127.0.0.1); Thu, 28 May 2020 19:01:29 +0800 From: Alex Shi To: akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com, richard.weiyang@gmail.com Cc: Alex Shi , "Kirill A. 
Shutemov" Subject: [PATCH v11 11/16] mm/mlock: reorder isolation sequence during munlock Date: Thu, 28 May 2020 19:00:53 +0800 Message-Id: <1590663658-184131-12-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com> References: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com> X-Rspamd-Queue-Id: 9FD1510040F14 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam04 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This patch reorders the isolation steps during munlock, moves the lru lock so that it guards each page individually, and unfolds the __munlock_isolate_lru_page function, in preparation for the lru lock change. __split_huge_page_refcount no longer exists, but we still have to guard PageMlocked and PageLRU in __split_huge_page_tail. [lkp@intel.com: found a sleeping function bug ... at mm/rmap.c] Signed-off-by: Alex Shi Cc: Kirill A. Shutemov Cc: Andrew Morton Cc: Johannes Weiner Cc: Matthew Wilcox Cc: Hugh Dickins Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org --- mm/mlock.c | 93 ++++++++++++++++++++++++++++++++++---------------------------- 1 file changed, 51 insertions(+), 42 deletions(-) diff --git a/mm/mlock.c b/mm/mlock.c index 03b3a5d99ad7..a0856085c4b7 100644 --- a/mm/mlock.c +++ b/mm/mlock.c @@ -103,25 +103,6 @@ void mlock_vma_page(struct page *page) } /* - * Isolate a page from LRU with optional get_page() pin. - * Assumes lru_lock already held and page already pinned.
- */ -static bool __munlock_isolate_lru_page(struct page *page, bool getpage) -{ - if (TestClearPageLRU(page)) { - struct lruvec *lruvec; - - lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); - if (getpage) - get_page(page); - del_page_from_lru_list(page, lruvec, page_lru(page)); - return true; - } - - return false; -} - -/* * Finish munlock after successful page isolation * * Page must be locked. This is a wrapper for try_to_munlock() @@ -181,6 +162,7 @@ static void __munlock_isolation_failed(struct page *page) unsigned int munlock_vma_page(struct page *page) { int nr_pages; + bool clearlru = false; pg_data_t *pgdat = page_pgdat(page); /* For try_to_munlock() and to serialize with page migration */ @@ -189,32 +171,42 @@ unsigned int munlock_vma_page(struct page *page) VM_BUG_ON_PAGE(PageTail(page), page); /* - * Serialize with any parallel __split_huge_page_refcount() which + * Serialize with any parallel __split_huge_page_tail() which * might otherwise copy PageMlocked to part of the tail pages before * we clear it in the head page. It also stabilizes hpage_nr_pages(). */ + get_page(page); spin_lock_irq(&pgdat->lru_lock); + clearlru = TestClearPageLRU(page); if (!TestClearPageMlocked(page)) { - /* Potentially, PTE-mapped THP: do not skip the rest PTEs */ - nr_pages = 1; - goto unlock_out; + if (clearlru) + SetPageLRU(page); + /* + * Potentially, PTE-mapped THP: do not skip the rest PTEs + * Reuse lock as memory barrier for release_pages racing. 
+ */ + spin_unlock_irq(&pgdat->lru_lock); + put_page(page); + return 0; } nr_pages = hpage_nr_pages(page); __mod_zone_page_state(page_zone(page), NR_MLOCK, -nr_pages); - if (__munlock_isolate_lru_page(page, true)) { + if (clearlru) { + struct lruvec *lruvec; + + lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); + del_page_from_lru_list(page, lruvec, page_lru(page)); spin_unlock_irq(&pgdat->lru_lock); __munlock_isolated_page(page); - goto out; + } else { + spin_unlock_irq(&pgdat->lru_lock); + put_page(page); + __munlock_isolation_failed(page); } - __munlock_isolation_failed(page); - -unlock_out: - spin_unlock_irq(&pgdat->lru_lock); -out: return nr_pages - 1; } @@ -297,34 +289,51 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone) pagevec_init(&pvec_putback); /* Phase 1: page isolation */ - spin_lock_irq(&zone->zone_pgdat->lru_lock); for (i = 0; i < nr; i++) { struct page *page = pvec->pages[i]; + struct lruvec *lruvec; + bool clearlru; - if (TestClearPageMlocked(page)) { - /* - * We already have pin from follow_page_mask() - * so we can spare the get_page() here. - */ - if (__munlock_isolate_lru_page(page, false)) - continue; - else - __munlock_isolation_failed(page); - } else { + clearlru = TestClearPageLRU(page); + spin_lock_irq(&zone->zone_pgdat->lru_lock); + + if (!TestClearPageMlocked(page)) { delta_munlocked++; + if (clearlru) + SetPageLRU(page); + goto putback; + } + + if (!clearlru) { + __munlock_isolation_failed(page); + goto putback; } /* + * Isolate this page. + * We already have pin from follow_page_mask() + * so we can spare the get_page() here. + */ + lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); + del_page_from_lru_list(page, lruvec, page_lru(page)); + spin_unlock_irq(&zone->zone_pgdat->lru_lock); + continue; + + /* * We won't be munlocking this page in the next phase * but we still need to release the follow_page_mask() * pin. We cannot do it under lru_lock however. 
If it's * the last pin, __page_cache_release() would deadlock. */ +putback: + spin_unlock_irq(&zone->zone_pgdat->lru_lock); pagevec_add(&pvec_putback, pvec->pages[i]); pvec->pages[i] = NULL; } + /* tempary disable irq, will remove later */ + local_irq_disable(); __mod_zone_page_state(zone, NR_MLOCK, delta_munlocked); - spin_unlock_irq(&zone->zone_pgdat->lru_lock); + local_irq_enable(); /* Now we can release pins of pages that we are not munlocking */ pagevec_release(&pvec_putback); From patchwork Thu May 28 11:00:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11575575 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 946E892A for ; Thu, 28 May 2020 11:01:58 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4642920888 for ; Thu, 28 May 2020 11:01:58 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4642920888 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id CCB00800B8; Thu, 28 May 2020 07:01:41 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id C2C8C800BE; Thu, 28 May 2020 07:01:41 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B1CFD800B8; Thu, 28 May 2020 07:01:41 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0019.hostedemail.com [216.40.44.19]) by kanga.kvack.org (Postfix) with ESMTP id 8CD24800BE for ; Thu, 28 May 2020 07:01:41 -0400 
(EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 47DD4180ACF61 for ; Thu, 28 May 2020 11:01:41 +0000 (UTC) X-FDA: 76865837202.09.doll30_65adcec08145a Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin09.hostedemail.com (Postfix) with ESMTP id 011F2180AD83E for ; Thu, 28 May 2020 11:01:40 +0000 (UTC) X-Spam-Summary: 2,0,0,9480037f1987b93f,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:69:327:355:379:541:960:966:967:968:973:981:988:989:1260:1261:1345:1359:1431:1437:1605:1730:1747:1777:1792:1801:2194:2195:2196:2198:2199:2200:2201:2202:2393:2525:2559:2563:2682:2685:2693:2731:2859:2890:2895:2898:2901:2924:2926:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3308:3865:3866:3867:3868:3870:3871:3872:3874:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4042:4250:4321:4385:4605:5007:6261:6737:7514:7875:7903:8957:9010:9025:9207:9592:10004:11026:11657:11914:12043:12048:12291:12296:12297:12438:12555:12683:12895:12986:13845:13846:13868:14096:14394:14915:21060:21080:21450:21451:21627:21740:21966:21987:21990:30001:30045:30054:30070:30090,0,RBL:115.124.30.45:@linux.alibaba.com:.lbl8.mailshell.net-62.20.2.100 64.201.201.201,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA _SUMMARY X-HE-Tag: doll30_65adcec08145a X-Filterd-Recvd-Size: 32317 Received: from out30-45.freemail.mail.aliyun.com (out30-45.freemail.mail.aliyun.com [115.124.30.45]) by imf16.hostedemail.com (Postfix) with ESMTP for ; Thu, 28 May 2020 11:01:37 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R161e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04357;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=19;SR=0;TI=SMTPD_---0TztdQAT_1590663689; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TztdQAT_1590663689) by 
smtp.aliyun-inc.com(127.0.0.1); Thu, 28 May 2020 19:01:30 +0800 From: Alex Shi To: akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com, richard.weiyang@gmail.com Cc: Alex Shi , Michal Hocko , Vladimir Davydov Subject: [PATCH v11 12/16] mm/lru: replace pgdat lru_lock with lruvec lock Date: Thu, 28 May 2020 19:00:54 +0800 Message-Id: <1590663658-184131-13-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com> References: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com> X-Rspamd-Queue-Id: 011F2180AD83E X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam05 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This patch moves the per-node lru_lock into the lruvec, giving each memcg its own lru_lock on each node. On a large machine, memcgs then no longer have to contend for the per-node pgdat->lru_lock; each can proceed quickly with its own lru_lock. Now that the memcg charge has been moved before LRU insertion, page isolation can stabilize a page's memcg, so the per-memcg lruvec lock is stable and can replace the per-node lru lock. Following Daniel Jordan's suggestion, I ran 208 'dd' tasks in 104 containers on a 2s * 26cores * HT box with a modified case: https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-lru-file-readtwice With this and later patches, readtwice performance increases by about 80% with concurrent containers. Also add a debug function to the locking paths, which may give some clues if something goes wrong.
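The relock-on-change pattern that the diff below applies in pagevec_lru_move_fn() and __munlock_pagevec() can be sketched in user space as follows. This is a simplified illustration under stated assumptions, not the kernel implementation: struct demo_lruvec, struct demo_page and the demo_* helpers are hypothetical, the lock is a plain C11 atomic_flag spinlock, and the RCU protection the patch uses while looking up the lruvec is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct lruvec: one lock per (memcg, node). */
struct demo_lruvec {
	atomic_flag lock;
	int nr_acquired;  /* how often the lock was taken (illustration only) */
	int nr_moved;     /* pages handled while holding the lock */
};

#define DEMO_LRUVEC_INIT { ATOMIC_FLAG_INIT, 0, 0 }

/* Hypothetical stand-in for struct page: it only knows its lruvec. */
struct demo_page {
	struct demo_lruvec *lruvec;
};

static void demo_lock(struct demo_lruvec *lv)
{
	while (atomic_flag_test_and_set(&lv->lock))
		;  /* spin until the flag was previously clear */
	lv->nr_acquired++;
}

static void demo_unlock(struct demo_lruvec *lv)
{
	atomic_flag_clear(&lv->lock);
}

/*
 * Walk a batch of pages and keep the current lruvec lock held across
 * consecutive pages that share an lruvec; relock only on a change.
 */
static void demo_move_pages(struct demo_page *pages, size_t n)
{
	struct demo_lruvec *locked = NULL;

	assert(pages != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct demo_lruvec *lv = pages[i].lruvec;

		if (lv != locked) {
			if (locked)
				demo_unlock(locked);
			demo_lock(lv);
			locked = lv;
		}
		lv->nr_moved++;  /* stands in for the per-page LRU work */
	}

	if (locked)
		demo_unlock(locked);
}
```

With pages belonging to lruvecs a, a, b, b, a, this sketch takes the lock of a twice and the lock of b once, so pages from different memcgs never serialize on one shared per-node lock.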
Signed-off-by: Alex Shi Cc: Andrew Morton Cc: Johannes Weiner Cc: Michal Hocko Cc: Vladimir Davydov Cc: Yang Shi Cc: Matthew Wilcox Cc: Konstantin Khlebnikov Cc: Hugh Dickins Cc: Tejun Heo Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: cgroups@vger.kernel.org --- include/linux/memcontrol.h | 56 ++++++++++++++++++++++++++++++++ include/linux/mmzone.h | 2 ++ mm/compaction.c | 61 +++++++++++++++++++++++------------ mm/huge_memory.c | 9 ++---- mm/memcontrol.c | 79 ++++++++++++++++++++++++++++++++++++++++++++-- mm/mlock.c | 32 +++++++++---------- mm/mmzone.c | 1 + mm/swap.c | 75 ++++++++++++++++++++----------------------- mm/swap_state.c | 6 ++-- mm/vmscan.c | 70 ++++++++++++++++++++++------------------ mm/workingset.c | 4 +-- 11 files changed, 275 insertions(+), 120 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 0ba84f1c3f91..a4601169bf7d 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -413,6 +413,17 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg, struct mem_cgroup *get_mem_cgroup_from_page(struct page *page); +struct lruvec *lock_page_lruvec(struct page *page); +struct lruvec *lock_page_lruvec_irq(struct page *page); +struct lruvec *lock_page_lruvec_irqsave(struct page *page, + unsigned long *flags); + +void unlock_page_lruvec(struct lruvec *lruvec); +void unlock_page_lruvec_irq(struct lruvec *lruvec); +void unlock_page_lruvec_irqrestore(struct lruvec *lruvec, unsigned long flags); + +void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page); + static inline struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *css){ return css ? 
container_of(css, struct mem_cgroup, css) : NULL; @@ -894,6 +905,47 @@ static inline void mem_cgroup_put(struct mem_cgroup *memcg) { } +static inline struct lruvec *lock_page_lruvec(struct page *page) +{ + struct pglist_data *pgdat = page_pgdat(page); + + spin_lock(&pgdat->__lruvec.lru_lock); + return &pgdat->__lruvec; +} + +static inline struct lruvec *lock_page_lruvec_irq(struct page *page) +{ + struct pglist_data *pgdat = page_pgdat(page); + + spin_lock_irq(&pgdat->__lruvec.lru_lock); + return &pgdat->__lruvec; +} + +static inline struct lruvec *lock_page_lruvec_irqsave(struct page *page, + unsigned long *flagsp) +{ + struct pglist_data *pgdat = page_pgdat(page); + + spin_lock_irqsave(&pgdat->__lruvec.lru_lock, *flagsp); + return &pgdat->__lruvec; +} + +static inline void unlock_page_lruvec(struct lruvec *lruvec) +{ + spin_unlock(&lruvec->lru_lock); +} + +static inline void unlock_page_lruvec_irq(struct lruvec *lruvec) +{ + spin_unlock_irq(&lruvec->lru_lock); +} + +static inline void unlock_page_lruvec_irqrestore(struct lruvec *lruvec, + unsigned long flags) +{ + spin_unlock_irqrestore(&lruvec->lru_lock, flags); +} + static inline struct mem_cgroup * mem_cgroup_iter(struct mem_cgroup *root, struct mem_cgroup *prev, @@ -1128,6 +1180,10 @@ static inline void count_memcg_page_event(struct page *page, void count_memcg_event_mm(struct mm_struct *mm, enum vm_event_item idx) { } + +static inline void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page) +{ +} #endif /* CONFIG_MEMCG */ /* idx can be of type enum memcg_stat_item or node_stat_item */ diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 545b663678ed..d70a12214936 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -259,6 +259,8 @@ struct lruvec { atomic_long_t inactive_age; /* Refaults at the time of last reclaim cycle */ unsigned long refaults; + /* per lruvec lru_lock for memcg */ + spinlock_t lru_lock; /* Various lruvec state flags (enum lruvec_flags) */ unsigned 
long flags; #ifdef CONFIG_MEMCG diff --git a/mm/compaction.c b/mm/compaction.c index c36d832b2a84..e15aa83ebc07 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -787,7 +787,7 @@ static bool too_many_isolated(pg_data_t *pgdat) unsigned long nr_scanned = 0, nr_isolated = 0; struct lruvec *lruvec; unsigned long flags = 0; - bool locked = false; + struct lruvec *locked_lruvec = NULL; struct page *page = NULL, *valid_page = NULL; unsigned long start_pfn = low_pfn; bool skip_on_failure = false; @@ -847,11 +847,21 @@ static bool too_many_isolated(pg_data_t *pgdat) * contention, to give chance to IRQs. Abort completely if * a fatal signal is pending. */ - if (!(low_pfn % SWAP_CLUSTER_MAX) - && compact_unlock_should_abort(&pgdat->lru_lock, - flags, &locked, cc)) { - low_pfn = 0; - goto fatal_pending; + if (!(low_pfn % SWAP_CLUSTER_MAX)) { + if (locked_lruvec) { + unlock_page_lruvec_irqrestore(locked_lruvec, + flags); + locked_lruvec = NULL; + } + + if (fatal_signal_pending(current)) { + cc->contended = true; + + low_pfn = 0; + goto fatal_pending; + } + + cond_resched(); } if (!pfn_valid_within(low_pfn)) @@ -921,10 +931,9 @@ static bool too_many_isolated(pg_data_t *pgdat) */ if (unlikely(__PageMovable(page)) && !PageIsolated(page)) { - if (locked) { - spin_unlock_irqrestore(&pgdat->lru_lock, - flags); - locked = false; + if (locked_lruvec) { + unlock_page_lruvec_irqrestore(locked_lruvec, flags); + locked_lruvec = NULL; } if (!isolate_movable_page(page, isolate_mode)) @@ -971,10 +980,20 @@ static bool too_many_isolated(pg_data_t *pgdat) goto isolate_fail; } + rcu_read_lock(); + lruvec = mem_cgroup_page_lruvec(page, pgdat); + /* If we already hold the lock, we can skip some rechecking */ - if (!locked) { - locked = compact_lock_irqsave(&pgdat->lru_lock, - &flags, cc); + if (lruvec != locked_lruvec) { + if (locked_lruvec) + unlock_page_lruvec_irqrestore(locked_lruvec, + flags); + + compact_lock_irqsave(&lruvec->lru_lock, &flags, cc); + locked_lruvec = lruvec; + 
rcu_read_unlock(); + + lruvec_memcg_debug(lruvec, page); /* Try get exclusive access under lock */ if (!skip_updated) { @@ -1001,9 +1020,8 @@ static bool too_many_isolated(pg_data_t *pgdat) if (page_count(page) > page_mapcount(page) + 1 + (!PageAnon(page) || PageSwapCache(page))) goto isolate_fail; - } - - lruvec = mem_cgroup_page_lruvec(page, pgdat); + } else + rcu_read_unlock(); /* The whole page is taken off the LRU; skip the tail pages. */ if (PageCompound(page)) @@ -1043,9 +1061,10 @@ static bool too_many_isolated(pg_data_t *pgdat) * page anyway. */ if (nr_isolated) { - if (locked) { - spin_unlock_irqrestore(&pgdat->lru_lock, flags); - locked = false; + if (locked_lruvec) { + unlock_page_lruvec_irqrestore(locked_lruvec, + flags); + locked_lruvec = NULL; } putback_movable_pages(&cc->migratepages); cc->nr_migratepages = 0; @@ -1070,8 +1089,8 @@ static bool too_many_isolated(pg_data_t *pgdat) low_pfn = end_pfn; isolate_abort: - if (locked) - spin_unlock_irqrestore(&pgdat->lru_lock, flags); + if (locked_lruvec) + unlock_page_lruvec_irqrestore(locked_lruvec, flags); /* * Updated the cached scanner pfn once the pageblock has been scanned diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 44d4b45281a3..39025c651692 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2322,7 +2322,7 @@ void lru_add_page_tail(struct page *head, struct page *page_tail, VM_BUG_ON_PAGE(!PageHead(head), head); VM_BUG_ON_PAGE(PageCompound(page_tail), head); VM_BUG_ON_PAGE(PageLRU(page_tail), head); - lockdep_assert_held(&lruvec_pgdat(lruvec)->lru_lock); + lockdep_assert_held(&lruvec->lru_lock); if (!list) SetPageLRU(page_tail); @@ -2412,7 +2412,6 @@ static void __split_huge_page(struct page *page, struct list_head *list, pgoff_t end, unsigned long flags) { struct page *head = compound_head(page); - pg_data_t *pgdat = page_pgdat(head); struct lruvec *lruvec; struct address_space *swap_cache = NULL; unsigned long offset = 0; @@ -2430,9 +2429,7 @@ static void __split_huge_page(struct 
page *page, struct list_head *list, } /* lock lru list/PageCompound, isolate freezed by page_ref_freeze */ - spin_lock(&pgdat->lru_lock); - - lruvec = mem_cgroup_page_lruvec(head, pgdat); + lruvec = lock_page_lruvec(head); for (i = HPAGE_PMD_NR - 1; i >= 1; i--) { __split_huge_page_tail(head, i, lruvec, list); @@ -2452,7 +2449,7 @@ static void __split_huge_page(struct page *page, struct list_head *list, } } ClearPageCompound(head); - spin_unlock(&pgdat->lru_lock); + unlock_page_lruvec(lruvec); split_page_owner(head, HPAGE_PMD_ORDER); diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 91b073891d06..b106e3b86fff 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1196,6 +1196,20 @@ int mem_cgroup_scan_tasks(struct mem_cgroup *memcg, return ret; } + +void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page) +{ +#ifdef CONFIG_DEBUG_VM + if (mem_cgroup_disabled()) + return; + + if (!page->mem_cgroup) + VM_BUG_ON_PAGE(lruvec_memcg(lruvec) != root_mem_cgroup, page); + else + VM_BUG_ON_PAGE(lruvec_memcg(lruvec) != page->mem_cgroup, page); +#endif +} + /** * mem_cgroup_page_lruvec - return lruvec for isolating/putting an LRU page * @page: the page @@ -1215,7 +1229,7 @@ struct lruvec *mem_cgroup_page_lruvec(struct page *page, struct pglist_data *pgd goto out; } - memcg = page->mem_cgroup; + memcg = READ_ONCE(page->mem_cgroup); /* * Swapcache readahead pages are added to the LRU - and * possibly migrated - before they are charged. 
@@ -1236,6 +1250,67 @@ struct lruvec *mem_cgroup_page_lruvec(struct page *page, struct pglist_data *pgd return lruvec; } +/* page was isolated */ +struct lruvec *lock_page_lruvec(struct page *page) +{ + struct lruvec *lruvec; + struct pglist_data *pgdat = page_pgdat(page); + + rcu_read_lock(); + lruvec = mem_cgroup_page_lruvec(page, pgdat); + spin_lock(&lruvec->lru_lock); + rcu_read_unlock(); + + lruvec_memcg_debug(lruvec, page); + + return lruvec; +} + +struct lruvec *lock_page_lruvec_irq(struct page *page) +{ + struct lruvec *lruvec; + struct pglist_data *pgdat = page_pgdat(page); + + rcu_read_lock(); + lruvec = mem_cgroup_page_lruvec(page, pgdat); + spin_lock_irq(&lruvec->lru_lock); + rcu_read_unlock(); + + lruvec_memcg_debug(lruvec, page); + + return lruvec; +} + +struct lruvec *lock_page_lruvec_irqsave(struct page *page, unsigned long *flags) +{ + struct lruvec *lruvec; + struct pglist_data *pgdat = page_pgdat(page); + + rcu_read_lock(); + lruvec = mem_cgroup_page_lruvec(page, pgdat); + spin_lock_irqsave(&lruvec->lru_lock, *flags); + rcu_read_unlock(); + + lruvec_memcg_debug(lruvec, page); + + return lruvec; +} + +void unlock_page_lruvec(struct lruvec *lruvec) +{ + spin_unlock(&lruvec->lru_lock); +} + +void unlock_page_lruvec_irq(struct lruvec *lruvec) +{ + spin_unlock_irq(&lruvec->lru_lock); +} + +void unlock_page_lruvec_irqrestore(struct lruvec *lruvec, unsigned long flags) +{ + spin_unlock_irqrestore(&lruvec->lru_lock, flags); +} + /** * mem_cgroup_update_lru_size - account for adding or removing an lru page * @lruvec: mem_cgroup per zone lru vector @@ -2942,7 +3017,7 @@ void __memcg_kmem_uncharge_page(struct page *page, int order) /* * Because tail pages are not marked as "used", set it. We're under - * pgdat->lru_lock and migration entries setup in all page mappings. + * lruvec->lru_lock and migration entries setup in all page mappings. 
  */
 void mem_cgroup_split_huge_fixup(struct page *head)
 {
diff --git a/mm/mlock.c b/mm/mlock.c
index a0856085c4b7..c1ef4ac7a744 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -163,7 +163,7 @@ unsigned int munlock_vma_page(struct page *page)
 {
 	int nr_pages;
 	bool clearlru = false;
-	pg_data_t *pgdat = page_pgdat(page);
+	struct lruvec *lruvec;
 
 	/* For try_to_munlock() and to serialize with page migration */
 	BUG_ON(!PageLocked(page));
@@ -176,7 +176,7 @@ unsigned int munlock_vma_page(struct page *page)
 	 * we clear it in the head page. It also stabilizes hpage_nr_pages().
 	 */
 	get_page(page);
-	spin_lock_irq(&pgdat->lru_lock);
+	lruvec = lock_page_lruvec_irq(page);
 	clearlru = TestClearPageLRU(page);
 
 	if (!TestClearPageMlocked(page)) {
@@ -186,7 +186,7 @@ unsigned int munlock_vma_page(struct page *page)
 		 * Potentially, PTE-mapped THP: do not skip the rest PTEs
 		 * Reuse lock as memory barrier for release_pages racing.
 		 */
-		spin_unlock_irq(&pgdat->lru_lock);
+		unlock_page_lruvec_irq(lruvec);
 		put_page(page);
 		return 0;
 	}
@@ -195,14 +195,11 @@ unsigned int munlock_vma_page(struct page *page)
 	__mod_zone_page_state(page_zone(page), NR_MLOCK, -nr_pages);
 
 	if (clearlru) {
-		struct lruvec *lruvec;
-
-		lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
 		del_page_from_lru_list(page, lruvec, page_lru(page));
-		spin_unlock_irq(&pgdat->lru_lock);
+		unlock_page_lruvec_irq(lruvec);
 		__munlock_isolated_page(page);
 	} else {
-		spin_unlock_irq(&pgdat->lru_lock);
+		unlock_page_lruvec_irq(lruvec);
 		put_page(page);
 		__munlock_isolation_failed(page);
 	}
@@ -284,6 +281,7 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
 	int nr = pagevec_count(pvec);
 	int delta_munlocked = -nr;
 	struct pagevec pvec_putback;
+	struct lruvec *lruvec = NULL;
 	int pgrescued = 0;
 
 	pagevec_init(&pvec_putback);
@@ -291,11 +289,17 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
 	/* Phase 1: page isolation */
 	for (i = 0; i < nr; i++) {
 		struct page *page = pvec->pages[i];
-		struct lruvec *lruvec;
+		struct lruvec *new_lruvec;
 		bool clearlru;
 
 		clearlru = TestClearPageLRU(page);
-		spin_lock_irq(&zone->zone_pgdat->lru_lock);
+
+		new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
+		if (new_lruvec != lruvec) {
+			if (lruvec)
+				unlock_page_lruvec_irq(lruvec);
+			lruvec = lock_page_lruvec_irq(page);
+		}
 
 		if (!TestClearPageMlocked(page)) {
 			delta_munlocked++;
@@ -314,9 +318,7 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
 		 * We already have pin from follow_page_mask()
 		 * so we can spare the get_page() here.
 		 */
-		lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
 		del_page_from_lru_list(page, lruvec, page_lru(page));
-		spin_unlock_irq(&zone->zone_pgdat->lru_lock);
 		continue;
 
 		/*
@@ -326,14 +328,12 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
 		 * the last pin, __page_cache_release() would deadlock.
 		 */
 putback:
-		spin_unlock_irq(&zone->zone_pgdat->lru_lock);
 		pagevec_add(&pvec_putback, pvec->pages[i]);
 		pvec->pages[i] = NULL;
 	}
-	/* tempary disable irq, will remove later */
-	local_irq_disable();
 	__mod_zone_page_state(zone, NR_MLOCK, delta_munlocked);
-	local_irq_enable();
+	if (lruvec)
+		unlock_page_lruvec_irq(lruvec);
 
 	/* Now we can release pins of pages that we are not munlocking */
 	pagevec_release(&pvec_putback);
diff --git a/mm/mmzone.c b/mm/mmzone.c
index 4686fdc23bb9..3750a90ed4a0 100644
--- a/mm/mmzone.c
+++ b/mm/mmzone.c
@@ -91,6 +91,7 @@ void lruvec_init(struct lruvec *lruvec)
 	enum lru_list lru;
 
 	memset(lruvec, 0, sizeof(struct lruvec));
+	spin_lock_init(&lruvec->lru_lock);
 
 	for_each_lru(lru)
 		INIT_LIST_HEAD(&lruvec->lists[lru]);
diff --git a/mm/swap.c b/mm/swap.c
index 2898efc24135..91ff3d4a7751 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -60,14 +60,12 @@
 static void __page_cache_release(struct page *page)
 {
 	if (TestClearPageLRU(page)) {
-		pg_data_t *pgdat = page_pgdat(page);
 		struct lruvec *lruvec;
 		unsigned long flags;
 
-		spin_lock_irqsave(&pgdat->lru_lock, flags);
-		lruvec = mem_cgroup_page_lruvec(page, pgdat);
+		lruvec = lock_page_lruvec_irqsave(page, &flags);
 		del_page_from_lru_list(page, lruvec, page_off_lru(page));
-		spin_unlock_irqrestore(&pgdat->lru_lock, flags);
+		unlock_page_lruvec_irqrestore(lruvec, flags);
 	}
 	__ClearPageWaiters(page);
 }
@@ -187,26 +185,24 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
 	void *arg)
 {
 	int i;
-	struct pglist_data *pgdat = NULL;
-	struct lruvec *lruvec;
+	struct lruvec *lruvec = NULL;
 	unsigned long flags = 0;
 
 	for (i = 0; i < pagevec_count(pvec); i++) {
 		struct page *page = pvec->pages[i];
-		struct pglist_data *pagepgdat = page_pgdat(page);
+		struct lruvec *new_lruvec;
 
-		if (pagepgdat != pgdat) {
-			if (pgdat)
-				spin_unlock_irqrestore(&pgdat->lru_lock, flags);
-			pgdat = pagepgdat;
-			spin_lock_irqsave(&pgdat->lru_lock, flags);
+		new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
+		if (lruvec != new_lruvec) {
+			if (lruvec)
+				unlock_page_lruvec_irqrestore(lruvec, flags);
+			lruvec = lock_page_lruvec_irqsave(page, &flags);
 		}
 
-		lruvec = mem_cgroup_page_lruvec(page, pgdat);
 		(*move_fn)(page, lruvec, arg);
 	}
-	if (pgdat)
-		spin_unlock_irqrestore(&pgdat->lru_lock, flags);
+	if (lruvec)
+		unlock_page_lruvec_irqrestore(lruvec, flags);
 	release_pages(pvec->pages, pvec->nr);
 	pagevec_reinit(pvec);
 }
@@ -345,11 +341,12 @@ static inline void activate_page_drain(int cpu)
 void activate_page(struct page *page)
 {
 	pg_data_t *pgdat = page_pgdat(page);
+	struct lruvec *lruvec;
 
 	page = compound_head(page);
-	spin_lock_irq(&pgdat->lru_lock);
-	__activate_page(page, mem_cgroup_page_lruvec(page, pgdat), NULL);
-	spin_unlock_irq(&pgdat->lru_lock);
+	lruvec = lock_page_lruvec_irq(page);
+	__activate_page(page, lruvec, NULL);
+	unlock_page_lruvec_irq(lruvec);
 }
 #endif
 
@@ -773,8 +770,7 @@ void release_pages(struct page **pages, int nr)
 {
 	int i;
 	LIST_HEAD(pages_to_free);
-	struct pglist_data *locked_pgdat = NULL;
-	struct lruvec *lruvec;
+	struct lruvec *lruvec = NULL;
 	unsigned long uninitialized_var(flags);
 	unsigned int uninitialized_var(lock_batch);
 
@@ -784,21 +780,20 @@ void release_pages(struct page **pages, int nr)
 		/*
 		 * Make sure the IRQ-safe lock-holding time does not get
 		 * excessive with a continuous string of pages from the
-		 * same pgdat. The lock is held only if pgdat != NULL.
+		 * same lruvec. The lock is held only if lruvec != NULL.
 		 */
-		if (locked_pgdat && ++lock_batch == SWAP_CLUSTER_MAX) {
-			spin_unlock_irqrestore(&locked_pgdat->lru_lock, flags);
-			locked_pgdat = NULL;
+		if (lruvec && ++lock_batch == SWAP_CLUSTER_MAX) {
+			unlock_page_lruvec_irqrestore(lruvec, flags);
+			lruvec = NULL;
 		}
 
 		if (is_huge_zero_page(page))
 			continue;
 
 		if (is_zone_device_page(page)) {
-			if (locked_pgdat) {
-				spin_unlock_irqrestore(&locked_pgdat->lru_lock,
-						       flags);
-				locked_pgdat = NULL;
+			if (lruvec) {
+				unlock_page_lruvec_irqrestore(lruvec, flags);
+				lruvec = NULL;
 			}
 			/*
 			 * ZONE_DEVICE pages that return 'false' from
@@ -817,27 +812,27 @@ void release_pages(struct page **pages, int nr)
 			continue;
 
 		if (PageCompound(page)) {
-			if (locked_pgdat) {
-				spin_unlock_irqrestore(&locked_pgdat->lru_lock, flags);
-				locked_pgdat = NULL;
+			if (lruvec) {
+				unlock_page_lruvec_irqrestore(lruvec, flags);
+				lruvec = NULL;
 			}
 			__put_compound_page(page);
 			continue;
 		}
 
 		if (TestClearPageLRU(page)) {
-			struct pglist_data *pgdat = page_pgdat(page);
+			struct lruvec *new_lruvec;
 
-			if (pgdat != locked_pgdat) {
-				if (locked_pgdat)
-					spin_unlock_irqrestore(&locked_pgdat->lru_lock,
+			new_lruvec = mem_cgroup_page_lruvec(page,
+							page_pgdat(page));
+			if (new_lruvec != lruvec) {
+				if (lruvec)
+					unlock_page_lruvec_irqrestore(lruvec,
 									flags);
 				lock_batch = 0;
-				locked_pgdat = pgdat;
-				spin_lock_irqsave(&locked_pgdat->lru_lock, flags);
+				lruvec = lock_page_lruvec_irqsave(page, &flags);
 			}
 
-			lruvec = mem_cgroup_page_lruvec(page, locked_pgdat);
 			del_page_from_lru_list(page, lruvec, page_off_lru(page));
 		}
 
@@ -847,8 +842,8 @@ void release_pages(struct page **pages, int nr)
 
 		list_add(&page->lru, &pages_to_free);
 	}
-	if (locked_pgdat)
-		spin_unlock_irqrestore(&locked_pgdat->lru_lock, flags);
+	if (lruvec)
+		unlock_page_lruvec_irqrestore(lruvec, flags);
 
 	mem_cgroup_uncharge_list(&pages_to_free);
 	free_unref_page_list(&pages_to_free);
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 9d20b00627af..cbaa1a60434d 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -362,6 +362,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 {
 	struct swap_info_struct *si;
 	struct page *page;
+	struct lruvec *lruvec = NULL;
 
 	*new_page_allocated = false;
 
@@ -441,9 +442,10 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 	}
 
 	/* XXX: Move to lru_cache_add() when it supports new vs putback */
-	spin_lock_irq(&page_pgdat(page)->lru_lock);
+	lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
+	spin_lock_irq(&lruvec->lru_lock);
 	lru_note_cost_page(page);
-	spin_unlock_irq(&page_pgdat(page)->lru_lock);
+	spin_unlock_irq(&lruvec->lru_lock);
 
 	/* Caller will initiate read into locked page */
 	SetPageWorkingset(page);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index df0765203473..c4c30b530876 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1774,14 +1774,12 @@ int isolate_lru_page(struct page *page)
 
 	get_page(page);
 	if (TestClearPageLRU(page)) {
-		pg_data_t *pgdat = page_pgdat(page);
 		struct lruvec *lruvec;
 		int lru = page_lru(page);
 
-		lruvec = mem_cgroup_page_lruvec(page, pgdat);
-		spin_lock_irq(&pgdat->lru_lock);
+		lruvec = lock_page_lruvec_irq(page);
 		del_page_from_lru_list(page, lruvec, lru);
-		spin_unlock_irq(&pgdat->lru_lock);
+		unlock_page_lruvec_irq(lruvec);
 		ret = 0;
 	} else
 		put_page(page);
@@ -1849,20 +1847,22 @@ static int too_many_isolated(struct pglist_data *pgdat, int file,
 static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
 						     struct list_head *list)
 {
-	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
 	int nr_pages, nr_moved = 0;
 	LIST_HEAD(pages_to_free);
 	struct page *page;
+	struct lruvec *orig_lruvec = lruvec;
 	enum lru_list lru;
 
 	while (!list_empty(list)) {
+		struct lruvec *new_lruvec = NULL;
+
 		page = lru_to_page(list);
 		VM_BUG_ON_PAGE(PageLRU(page), page);
 		list_del(&page->lru);
 		if (unlikely(!page_evictable(page))) {
-			spin_unlock_irq(&pgdat->lru_lock);
+			spin_unlock_irq(&lruvec->lru_lock);
 			putback_lru_page(page);
-			spin_lock_irq(&pgdat->lru_lock);
+			spin_lock_irq(&lruvec->lru_lock);
 			continue;
 		}
 
@@ -1876,6 +1876,12 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
 		 * list_add(&page->lru,)
 		 * list_add(&page->lru,) //corrupt
 		 */
+		new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
+		if (new_lruvec != lruvec) {
+			if (lruvec)
+				spin_unlock_irq(&lruvec->lru_lock);
+			lruvec = lock_page_lruvec_irq(page);
+		}
 		SetPageLRU(page);
 
 		if (unlikely(put_page_testzero(page))) {
@@ -1883,15 +1889,14 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
 			__ClearPageActive(page);
 
 			if (unlikely(PageCompound(page))) {
-				spin_unlock_irq(&pgdat->lru_lock);
+				spin_unlock_irq(&lruvec->lru_lock);
 				destroy_compound_page(page);
-				spin_lock_irq(&pgdat->lru_lock);
+				spin_lock_irq(&lruvec->lru_lock);
 			} else
 				list_add(&page->lru, &pages_to_free);
 			continue;
 		}
 
-		lruvec = mem_cgroup_page_lruvec(page, pgdat);
 		lru = page_lru(page);
 		nr_pages = hpage_nr_pages(page);
 
@@ -1899,6 +1904,11 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
 		list_add(&page->lru, &lruvec->lists[lru]);
 		nr_moved += nr_pages;
 	}
+	if (orig_lruvec != lruvec) {
+		if (lruvec)
+			spin_unlock_irq(&lruvec->lru_lock);
+		spin_lock_irq(&orig_lruvec->lru_lock);
+	}
 
 	/*
 	 * To save our caller's stack, now use input list for pages to free.
@@ -1954,7 +1964,7 @@ static int current_may_throttle(void)
 
 	lru_add_drain();
 
-	spin_lock_irq(&pgdat->lru_lock);
+	spin_lock_irq(&lruvec->lru_lock);
 
 	nr_taken = isolate_lru_pages(nr_to_scan, lruvec, &page_list,
 				     &nr_scanned, sc, lru);
@@ -1966,7 +1976,7 @@ static int current_may_throttle(void)
 	__count_memcg_events(lruvec_memcg(lruvec), item, nr_scanned);
 	__count_vm_events(PGSCAN_ANON + file, nr_scanned);
 
-	spin_unlock_irq(&pgdat->lru_lock);
+	spin_unlock_irq(&lruvec->lru_lock);
 
 	if (nr_taken == 0)
 		return 0;
@@ -1974,7 +1984,7 @@ static int current_may_throttle(void)
 	nr_reclaimed = shrink_page_list(&page_list, pgdat, sc, 0,
 				&stat, false);
 
-	spin_lock_irq(&pgdat->lru_lock);
+	spin_lock_irq(&lruvec->lru_lock);
 
 	move_pages_to_lru(lruvec, &page_list);
 
@@ -1986,7 +1996,7 @@ static int current_may_throttle(void)
 	__count_memcg_events(lruvec_memcg(lruvec), item, nr_reclaimed);
 	__count_vm_events(PGSTEAL_ANON + file, nr_reclaimed);
 
-	spin_unlock_irq(&pgdat->lru_lock);
+	spin_unlock_irq(&lruvec->lru_lock);
 
 	mem_cgroup_uncharge_list(&page_list);
 	free_unref_page_list(&page_list);
@@ -2038,7 +2048,7 @@ static void shrink_active_list(unsigned long nr_to_scan,
 
 	lru_add_drain();
 
-	spin_lock_irq(&pgdat->lru_lock);
+	spin_lock_irq(&lruvec->lru_lock);
 
 	nr_taken = isolate_lru_pages(nr_to_scan, lruvec, &l_hold,
 				     &nr_scanned, sc, lru);
@@ -2048,7 +2058,7 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	__count_vm_events(PGREFILL, nr_scanned);
 	__count_memcg_events(lruvec_memcg(lruvec), PGREFILL, nr_scanned);
 
-	spin_unlock_irq(&pgdat->lru_lock);
+	spin_unlock_irq(&lruvec->lru_lock);
 
 	while (!list_empty(&l_hold)) {
 		cond_resched();
@@ -2094,7 +2104,7 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	/*
 	 * Move pages back to the lru list.
 	 */
-	spin_lock_irq(&pgdat->lru_lock);
+	spin_lock_irq(&lruvec->lru_lock);
 
 	nr_activate = move_pages_to_lru(lruvec, &l_active);
 	nr_deactivate = move_pages_to_lru(lruvec, &l_inactive);
@@ -2105,7 +2115,7 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	__count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE, nr_deactivate);
 
 	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
-	spin_unlock_irq(&pgdat->lru_lock);
+	spin_unlock_irq(&lruvec->lru_lock);
 
 	mem_cgroup_uncharge_list(&l_active);
 	free_unref_page_list(&l_active);
@@ -2695,10 +2705,10 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 	/*
 	 * Determine the scan balance between anon and file LRUs.
 	 */
-	spin_lock_irq(&pgdat->lru_lock);
+	spin_lock_irq(&target_lruvec->lru_lock);
 	sc->anon_cost = target_lruvec->anon_cost;
 	sc->file_cost = target_lruvec->file_cost;
-	spin_unlock_irq(&pgdat->lru_lock);
+	spin_unlock_irq(&target_lruvec->lru_lock);
 
 	/*
 	 * Target desirable inactive:active list ratios for the anon
@@ -4274,24 +4284,22 @@ int node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned int order)
  */
 void check_move_unevictable_pages(struct pagevec *pvec)
 {
-	struct lruvec *lruvec;
-	struct pglist_data *pgdat = NULL;
+	struct lruvec *lruvec = NULL;
 	int pgscanned = 0;
 	int pgrescued = 0;
 	int i;
 
 	for (i = 0; i < pvec->nr; i++) {
 		struct page *page = pvec->pages[i];
-		struct pglist_data *pagepgdat = page_pgdat(page);
+		struct lruvec *new_lruvec;
 
 		pgscanned++;
-		if (pagepgdat != pgdat) {
-			if (pgdat)
-				spin_unlock_irq(&pgdat->lru_lock);
-			pgdat = pagepgdat;
-			spin_lock_irq(&pgdat->lru_lock);
+		new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
+		if (lruvec != new_lruvec) {
+			if (lruvec)
+				unlock_page_lruvec_irq(lruvec);
+			lruvec = lock_page_lruvec_irq(page);
 		}
-		lruvec = mem_cgroup_page_lruvec(page, pgdat);
 
 		if (!PageLRU(page) || !PageUnevictable(page))
 			continue;
@@ -4307,10 +4315,10 @@ void check_move_unevictable_pages(struct pagevec *pvec)
 		}
 	}
 
-	if (pgdat) {
+	if
(lruvec) {
 		__count_vm_events(UNEVICTABLE_PGRESCUED, pgrescued);
 		__count_vm_events(UNEVICTABLE_PGSCANNED, pgscanned);
-		spin_unlock_irq(&pgdat->lru_lock);
+		unlock_page_lruvec_irq(lruvec);
 	}
 }
 EXPORT_SYMBOL_GPL(check_move_unevictable_pages);
diff --git a/mm/workingset.c b/mm/workingset.c
index d481ea452eeb..7423a022c27b 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -366,9 +366,9 @@ void workingset_refault(struct page *page, void *shadow)
 	if (workingset) {
 		SetPageWorkingset(page);
 		/* XXX: Move to lru_cache_add() when it supports new vs putback */
-		spin_lock_irq(&page_pgdat(page)->lru_lock);
+		spin_lock_irq(&lruvec->lru_lock);
 		lru_note_cost_page(page);
-		spin_unlock_irq(&page_pgdat(page)->lru_lock);
+		spin_unlock_irq(&lruvec->lru_lock);
 		inc_lruvec_state(lruvec, WORKINGSET_RESTORE);
 	}
 out:

From patchwork Thu May 28 11:00:55 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Shi
X-Patchwork-Id: 11575569
From: Alex Shi
To: akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com, richard.weiyang@gmail.com
Cc: Alex Shi, Thomas Gleixner, Andrey Ryabinin
Subject: [PATCH v11 13/16] mm/lru: introduce the relock_page_lruvec function
Date: Thu, 28 May 2020 19:00:55 +0800
Message-Id: <1590663658-184131-14-git-send-email-alex.shi@linux.alibaba.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com>
References: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com>

Add relock_page_lruvec_irq() and relock_page_lruvec_irqsave(). Each takes
the lruvec that is currently locked (or NULL) and returns with the page's
lruvec locked; the lock is dropped and retaken only when the page belongs
to a different lruvec. Use them to replace the open-coded copies of this
sequence in mm/mlock.c, mm/swap.c and mm/vmscan.c.
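
The pattern the helpers capture can be sketched in plain userspace C. This is only an illustration under stated assumptions, not kernel code: toy_lruvec, toy_page, toy_lock(), toy_unlock(), toy_relock() and toy_walk() are hypothetical names, and an integer flag stands in for the spinlock and the IRQ state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct lruvec: a lock flag plus a counter. */
struct toy_lruvec {
	int locked;		/* 1 while the toy lock is held */
	int acquisitions;	/* number of times the lock was taken */
};

/* Hypothetical stand-in for struct page: only records its lruvec. */
struct toy_page {
	struct toy_lruvec *lruvec;
};

static void toy_lock(struct toy_lruvec *lv)
{
	assert(!lv->locked);
	lv->locked = 1;
	lv->acquisitions++;
}

static void toy_unlock(struct toy_lruvec *lv)
{
	assert(lv->locked);
	lv->locked = 0;
}

/*
 * Same shape as relock_page_lruvec_irq(): keep the lock when it already
 * covers the page, otherwise drop the old one (if any) and take the new one.
 */
static struct toy_lruvec *toy_relock(struct toy_page *page,
				     struct toy_lruvec *locked)
{
	struct toy_lruvec *lv = page->lruvec;

	if (locked == lv)
		return lv;
	if (locked)
		toy_unlock(locked);
	toy_lock(lv);
	return lv;
}

/* Walk a batch of pages the way the converted callers do. */
void toy_walk(struct toy_page *pages, int nr)
{
	struct toy_lruvec *locked = NULL;
	int i;

	for (i = 0; i < nr; i++)
		locked = toy_relock(&pages[i], locked);
	if (locked)
		toy_unlock(locked);
}
```

With a batch whose pages belong to lruvecs A, A, B, B, A, the walk takes A's lock twice and B's lock once, instead of locking once per page.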

Signed-off-by: Alex Shi
Cc: Johannes Weiner
Cc: Andrew Morton
Cc: Thomas Gleixner
Cc: Andrey Ryabinin
Cc: Matthew Wilcox
Cc: Mel Gorman
Cc: Konstantin Khlebnikov
Cc: Hugh Dickins
Cc: Tejun Heo
Cc: linux-kernel@vger.kernel.org
Cc: cgroups@vger.kernel.org
Cc: linux-mm@kvack.org
---
 include/linux/memcontrol.h | 36 ++++++++++++++++++++++++++++++++++++
 mm/mlock.c                 |  9 +--------
 mm/swap.c                  | 24 ++++++------------------
 mm/vmscan.c                |  8 +-------
 4 files changed, 44 insertions(+), 33 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index a4601169bf7d..ed09bf53c70f 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1313,6 +1313,42 @@ static inline struct lruvec *parent_lruvec(struct lruvec *lruvec)
 	return mem_cgroup_lruvec(memcg, lruvec_pgdat(lruvec));
 }
 
+/* Don't lock again iff page's lruvec locked */
+static inline struct lruvec *relock_page_lruvec_irq(struct page *page,
+		struct lruvec *locked_lruvec)
+{
+	struct pglist_data *pgdat = page_pgdat(page);
+	struct lruvec *lruvec;
+
+	lruvec = mem_cgroup_page_lruvec(page, pgdat);
+
+	if (likely(locked_lruvec == lruvec))
+		return lruvec;
+
+	if (unlikely(locked_lruvec))
+		unlock_page_lruvec_irq(locked_lruvec);
+
+	return lock_page_lruvec_irq(page);
+}
+
+/* Don't lock again iff page's lruvec locked */
+static inline struct lruvec *relock_page_lruvec_irqsave(struct page *page,
+		struct lruvec *locked_lruvec, unsigned long *flags)
+{
+	struct pglist_data *pgdat = page_pgdat(page);
+	struct lruvec *lruvec;
+
+	lruvec = mem_cgroup_page_lruvec(page, pgdat);
+
+	if (likely(locked_lruvec == lruvec))
+		return lruvec;
+
+	if (unlikely(locked_lruvec))
+		unlock_page_lruvec_irqrestore(locked_lruvec, *flags);
+
+	return lock_page_lruvec_irqsave(page, flags);
+}
+
 #ifdef CONFIG_CGROUP_WRITEBACK
 
 struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb);
diff --git a/mm/mlock.c b/mm/mlock.c
index c1ef4ac7a744..5b79757e5d02 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -289,17 +289,10 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
 	/* Phase 1: page isolation */
 	for (i = 0; i < nr; i++) {
 		struct page *page = pvec->pages[i];
-		struct lruvec *new_lruvec;
 		bool clearlru;
 
 		clearlru = TestClearPageLRU(page);
-
-		new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
-		if (new_lruvec != lruvec) {
-			if (lruvec)
-				unlock_page_lruvec_irq(lruvec);
-			lruvec = lock_page_lruvec_irq(page);
-		}
+		lruvec = relock_page_lruvec_irq(page, lruvec);
 
 		if (!TestClearPageMlocked(page)) {
 			delta_munlocked++;
diff --git a/mm/swap.c b/mm/swap.c
index 91ff3d4a7751..bea9497bbda3 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -190,15 +190,8 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
 
 	for (i = 0; i < pagevec_count(pvec); i++) {
 		struct page *page = pvec->pages[i];
-		struct lruvec *new_lruvec;
-
-		new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
-		if (lruvec != new_lruvec) {
-			if (lruvec)
-				unlock_page_lruvec_irqrestore(lruvec, flags);
-			lruvec = lock_page_lruvec_irqsave(page, &flags);
-		}
 
+		lruvec = relock_page_lruvec_irqsave(page, lruvec, &flags);
 		(*move_fn)(page, lruvec, arg);
 	}
 	if (lruvec)
@@ -821,17 +814,12 @@ void release_pages(struct page **pages, int nr)
 		}
 
 		if (TestClearPageLRU(page)) {
-			struct lruvec *new_lruvec;
-
-			new_lruvec = mem_cgroup_page_lruvec(page,
-							page_pgdat(page));
-			if (new_lruvec != lruvec) {
-				if (lruvec)
-					unlock_page_lruvec_irqrestore(lruvec,
-									flags);
+			struct lruvec *pre_lruvec = lruvec;
+
+			lruvec = relock_page_lruvec_irqsave(page, lruvec,
+									&flags);
+			if (pre_lruvec != lruvec)
 				lock_batch = 0;
-				lruvec = lock_page_lruvec_irqsave(page, &flags);
-			}
 
 			del_page_from_lru_list(page, lruvec, page_off_lru(page));
 		}
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c4c30b530876..7a0d4ac71558 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4291,15 +4291,9 @@ void check_move_unevictable_pages(struct pagevec *pvec)
 
 	for (i = 0; i < pvec->nr; i++) {
 		struct page *page = pvec->pages[i];
-		struct lruvec *new_lruvec;
 
 		pgscanned++;
-		new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
-		if (lruvec != new_lruvec) {
-			if (lruvec)
-				unlock_page_lruvec_irq(lruvec);
-			lruvec = lock_page_lruvec_irq(page);
-		}
+		lruvec = relock_page_lruvec_irq(page, lruvec);
 
 		if (!PageLRU(page) || !PageUnevictable(page))
 			continue;

From patchwork Thu May 28 11:00:56 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Shi
X-Patchwork-Id: 11575585
From: Alex Shi
To: akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com, richard.weiyang@gmail.com
Cc: Alex Shi, Andrey Ryabinin, Jann Horn
Subject: [PATCH v11 14/16] mm/vmscan: use relock for move_pages_to_lru
Date: Thu, 28 May 2020 19:00:56 +0800
Message-Id: <1590663658-184131-15-git-send-email-alex.shi@linux.alibaba.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com>
References: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com>

From: Hugh Dickins

Use relock_page_lruvec_irq() in place of the open-coded relocking in
move_pages_to_lru(). After dropping the lock for putback_lru_page() or
destroy_compound_page(), set lruvec to NULL instead of retaking the lock
immediately; the next relock call takes it only when it is needed, which
saves a few lock acquisitions.

Signed-off-by: Hugh Dickins
Signed-off-by: Alex Shi
Cc: Andrew Morton
Cc: Tejun Heo
Cc: Andrey Ryabinin
Cc: Jann Horn
Cc: Mel Gorman
Cc: Johannes Weiner
Cc: Matthew Wilcox
Cc: Hugh Dickins
Cc: cgroups@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
---
 mm/vmscan.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 7a0d4ac71558..672e7304f211 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1854,15 +1854,15 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
 	enum lru_list lru;
 
 	while (!list_empty(list)) {
-		struct lruvec *new_lruvec = NULL;
-
 		page = lru_to_page(list);
 		VM_BUG_ON_PAGE(PageLRU(page), page);
 		list_del(&page->lru);
 		if (unlikely(!page_evictable(page))) {
-			spin_unlock_irq(&lruvec->lru_lock);
+			if (lruvec) {
+				spin_unlock_irq(&lruvec->lru_lock);
+				lruvec = NULL;
+			}
 			putback_lru_page(page);
-			spin_lock_irq(&lruvec->lru_lock);
 			continue;
 		}
 
@@ -1876,12 +1876,7 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
 		 * list_add(&page->lru,)
 		 * list_add(&page->lru,) //corrupt
 		 */
-		new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
-		if (new_lruvec != lruvec) {
-			if (lruvec)
-				spin_unlock_irq(&lruvec->lru_lock);
-			lruvec = lock_page_lruvec_irq(page);
-		}
+		lruvec = relock_page_lruvec_irq(page, lruvec);
 		SetPageLRU(page);
 
 		if (unlikely(put_page_testzero(page))) {
@@ -1890,8 +1885,8 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
 
 			if (unlikely(PageCompound(page))) {
spin_unlock_irq(&lruvec->lru_lock); + lruvec = NULL; destroy_compound_page(page); - spin_lock_irq(&lruvec->lru_lock); } else list_add(&page->lru, &pages_to_free); continue; From patchwork Thu May 28 11:00:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11575591 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C742160D for ; Thu, 28 May 2020 11:02:41 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9E3472088E for ; Thu, 28 May 2020 11:02:41 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9E3472088E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 5568C800B6; Thu, 28 May 2020 07:02:39 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 464F48001A; Thu, 28 May 2020 07:02:39 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 32FE0800B6; Thu, 28 May 2020 07:02:39 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0187.hostedemail.com [216.40.44.187]) by kanga.kvack.org (Postfix) with ESMTP id 163568001A for ; Thu, 28 May 2020 07:02:39 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id C72A94DDA for ; Thu, 28 May 2020 11:02:38 +0000 (UTC) X-FDA: 76865839596.09.boys63_6e250d2d87015 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) 
by smtpin09.hostedemail.com (Postfix) with ESMTP id A9ABF180AD81D for ; Thu, 28 May 2020 11:02:38 +0000 (UTC) X-Spam-Summary: 2,0,0,f1589ff0d4f74692,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:41:355:379:541:800:960:973:988:989:1260:1261:1345:1359:1431:1437:1534:1541:1711:1730:1747:1777:1792:2393:2559:2562:3138:3139:3140:3141:3142:3352:3872:3876:4321:4605:5007:6261:6737:7903:9207:10004:11026:11473:11658:11914:12043:12048:12296:12297:12438:12555:12895:12986:13069:13311:13357:13846:14096:14181:14384:14394:14721:14915:21060:21080:21451:21627,0,RBL:115.124.30.44:@linux.alibaba.com:.lbl8.mailshell.net-62.20.2.100 64.201.201.201,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: boys63_6e250d2d87015 X-Filterd-Recvd-Size: 2738 Received: from out30-44.freemail.mail.aliyun.com (out30-44.freemail.mail.aliyun.com [115.124.30.44]) by imf17.hostedemail.com (Postfix) with ESMTP for ; Thu, 28 May 2020 11:02:36 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R821e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04407;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=17;SR=0;TI=SMTPD_---0TztNb5w_1590663691; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TztNb5w_1590663691) by smtp.aliyun-inc.com(127.0.0.1); Thu, 28 May 2020 19:01:31 +0800 From: Alex Shi To: akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com, richard.weiyang@gmail.com Cc: Alex Shi Subject: [PATCH v11 15/16] mm/pgdat: remove pgdat lru_lock Date: Thu, 28 May 2020 19:00:57 +0800 Message-Id: <1590663658-184131-16-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: 
git-send-email 1.8.3.1 In-Reply-To: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com> References: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com> X-Rspamd-Queue-Id: A9ABF180AD81D X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam02 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Now pgdat.lru_lock was replaced by lruvec lock. It's not used anymore. Signed-off-by: Alex Shi Cc: Andrew Morton Cc: Konstantin Khlebnikov Cc: Hugh Dickins Cc: Johannes Weiner Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org Cc: cgroups@vger.kernel.org --- include/linux/mmzone.h | 1 - mm/page_alloc.c | 1 - 2 files changed, 2 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index d70a12214936..42e646f7f30d 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -733,7 +733,6 @@ struct deferred_split { /* Write-intensive fields used by page reclaim */ ZONE_PADDING(_pad1_) - spinlock_t lru_lock; #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT /* diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 79a3a6d62532..30081dc0cc15 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -6765,7 +6765,6 @@ static void __meminit pgdat_init_internals(struct pglist_data *pgdat) init_waitqueue_head(&pgdat->pfmemalloc_wait); pgdat_page_ext_init(pgdat); - spin_lock_init(&pgdat->lru_lock); lruvec_init(&pgdat->__lruvec); } From patchwork Thu May 28 11:00:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11575571 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EE5A060D for ; Thu, 28 May 2020 11:01:52 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) 
with ESMTP id B8AAF20888 for ; Thu, 28 May 2020 11:01:52 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B8AAF20888 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id D8DF4800BC; Thu, 28 May 2020 07:01:39 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id D421F800B8; Thu, 28 May 2020 07:01:39 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C5615800BC; Thu, 28 May 2020 07:01:39 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0175.hostedemail.com [216.40.44.175]) by kanga.kvack.org (Postfix) with ESMTP id AB8FE800B8 for ; Thu, 28 May 2020 07:01:39 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 4B43F824556B for ; Thu, 28 May 2020 11:01:39 +0000 (UTC) Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin22.hostedemail.com (Postfix) with ESMTP id D9EE918038E78 for ; Thu, 28 May 2020 11:01:38 +0000 (UTC) 
Received: from out30-132.freemail.mail.aliyun.com (out30-132.freemail.mail.aliyun.com [115.124.30.132]) by imf02.hostedemail.com (Postfix) with ESMTP for ; Thu, 28 May 2020 11:01:37 +0000 (UTC) Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TztdQAz_1590663691) by smtp.aliyun-inc.com(127.0.0.1); Thu, 28 May 2020 19:01:32 +0800 From: Alex Shi To: akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com, richard.weiyang@gmail.com Cc: Alex Shi , Andrey Ryabinin , Jann Horn 
Subject: [PATCH v11 16/16] mm/lru: revise the comments of lru_lock Date: Thu, 28 May 2020 19:00:58 +0800 Message-Id: <1590663658-184131-17-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com> References: <1590663658-184131-1-git-send-email-alex.shi@linux.alibaba.com> Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Hugh Dickins Since pgdat->lru_lock has been replaced by lruvec->lru_lock, fix the now-incorrect comments in the code. Also fix some stale zone->lru_lock comments left over from long ago. Signed-off-by: Hugh Dickins Signed-off-by: Alex Shi Cc: Andrew Morton Cc: Tejun Heo Cc: Andrey Ryabinin Cc: Jann Horn Cc: Mel Gorman Cc: Johannes Weiner Cc: Matthew Wilcox Cc: Hugh Dickins Cc: cgroups@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org --- Documentation/admin-guide/cgroup-v1/memcg_test.rst | 15 +++------------ Documentation/admin-guide/cgroup-v1/memory.rst | 8 ++++---- Documentation/trace/events-kmem.rst | 2 +- Documentation/vm/unevictable-lru.rst | 22 ++++++++-------------- include/linux/mm_types.h | 2 +- include/linux/mmzone.h | 2 +- mm/filemap.c | 4 ++-- mm/memcontrol.c | 2 +- mm/rmap.c | 2 +- mm/vmscan.c | 12 ++++++++---- 10 files changed, 30 insertions(+), 41 deletions(-) diff --git a/Documentation/admin-guide/cgroup-v1/memcg_test.rst b/Documentation/admin-guide/cgroup-v1/memcg_test.rst index 3f7115e07b5d..0b9f91589d3d 100644 --- a/Documentation/admin-guide/cgroup-v1/memcg_test.rst +++ b/Documentation/admin-guide/cgroup-v1/memcg_test.rst @@ -133,18 +133,9 @@ Under below explanation, we assume CONFIG_MEM_RES_CTRL_SWAP=y. 8. LRU ====== - Each memcg has its own private LRU. 
Now, its handling is under global - VM's control (means that it's handled under global pgdat->lru_lock). - Almost all routines around memcg's LRU is called by global LRU's - list management functions under pgdat->lru_lock. - - A special function is mem_cgroup_isolate_pages(). This scans - memcg's private LRU and call __isolate_lru_page() to extract a page - from LRU. - - (By __isolate_lru_page(), the page is removed from both of global and - private LRU.) - + Each memcg has its own vector of LRUs (inactive anon, active anon, + inactive file, active file, unevictable) of pages from each node, + each LRU handled under a single lru_lock for that memcg and node. 9. Typical Tests. ================= diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst index 12757e63b26c..669277c82769 100644 --- a/Documentation/admin-guide/cgroup-v1/memory.rst +++ b/Documentation/admin-guide/cgroup-v1/memory.rst @@ -292,13 +292,13 @@ When oom event notifier is registered, event will be delivered. PG_locked. mm->page_table_lock - pgdat->lru_lock - lock_page_cgroup. + lruvec->lru_lock + lock_page_cgroup. In many cases, just lock_page_cgroup() is called. - per-zone-per-cgroup LRU (cgroup's private LRU) is just guarded by - pgdat->lru_lock, it has no lock of its own. + per-node-per-cgroup LRU (cgroup's private LRU) is just guarded by + lruvec->lru_lock, it has no lock of its own. 2.7 Kernel Memory Extension (CONFIG_MEMCG_KMEM) ----------------------------------------------- diff --git a/Documentation/trace/events-kmem.rst b/Documentation/trace/events-kmem.rst index 555484110e36..68fa75247488 100644 --- a/Documentation/trace/events-kmem.rst +++ b/Documentation/trace/events-kmem.rst @@ -69,7 +69,7 @@ When pages are freed in batch, the also mm_page_free_batched is triggered. Broadly speaking, pages are taken off the LRU lock in bulk and freed in batch with a page list. 
Significant amounts of activity here could indicate that the system is under memory pressure and can also indicate -contention on the zone->lru_lock. +contention on the lruvec->lru_lock. 4. Per-CPU Allocator Activity ============================= diff --git a/Documentation/vm/unevictable-lru.rst b/Documentation/vm/unevictable-lru.rst index 17d0861b0f1d..0e1490524f53 100644 --- a/Documentation/vm/unevictable-lru.rst +++ b/Documentation/vm/unevictable-lru.rst @@ -33,7 +33,7 @@ reclaim in Linux. The problems have been observed at customer sites on large memory x86_64 systems. To illustrate this with an example, a non-NUMA x86_64 platform with 128GB of -main memory will have over 32 million 4k pages in a single zone. When a large +main memory will have over 32 million 4k pages in a single node. When a large fraction of these pages are not evictable for any reason [see below], vmscan will spend a lot of time scanning the LRU lists looking for the small fraction of pages that are evictable. This can result in a situation where all CPUs are @@ -55,7 +55,7 @@ unevictable, either by definition or by circumstance, in the future. The Unevictable Page List ------------------------- -The Unevictable LRU infrastructure consists of an additional, per-zone, LRU list +The Unevictable LRU infrastructure consists of an additional, per-node, LRU list called the "unevictable" list and an associated page flag, PG_unevictable, to indicate that the page is being managed on the unevictable list. @@ -84,15 +84,9 @@ The unevictable list does not differentiate between file-backed and anonymous, swap-backed pages. This differentiation is only important while the pages are, in fact, evictable. -The unevictable list benefits from the "arrayification" of the per-zone LRU +The unevictable list benefits from the "arrayification" of the per-node LRU lists and statistics originally proposed and posted by Christoph Lameter. -The unevictable list does not use the LRU pagevec mechanism. 
Rather, -unevictable pages are placed directly on the page's zone's unevictable list -under the zone lru_lock. This allows us to prevent the stranding of pages on -the unevictable list when one task has the page isolated from the LRU and other -tasks are changing the "evictability" state of the page. - Memory Control Group Interaction -------------------------------- @@ -101,8 +95,8 @@ The unevictable LRU facility interacts with the memory control group [aka memory controller; see Documentation/admin-guide/cgroup-v1/memory.rst] by extending the lru_list enum. -The memory controller data structure automatically gets a per-zone unevictable -list as a result of the "arrayification" of the per-zone LRU lists (one per +The memory controller data structure automatically gets a per-node unevictable +list as a result of the "arrayification" of the per-node LRU lists (one per lru_list enum element). The memory controller tracks the movement of pages to and from the unevictable list. @@ -196,7 +190,7 @@ for the sake of expediency, to leave a unevictable page on one of the regular active/inactive LRU lists for vmscan to deal with. vmscan checks for such pages in all of the shrink_{active|inactive|page}_list() functions and will "cull" such pages that it encounters: that is, it diverts those pages to the -unevictable list for the zone being scanned. +unevictable list for the node being scanned. There may be situations where a page is mapped into a VM_LOCKED VMA, but the page is not marked as PG_mlocked. Such pages will make it all the way to @@ -328,7 +322,7 @@ If the page was NOT already mlocked, mlock_vma_page() attempts to isolate the page from the LRU, as it is likely on the appropriate active or inactive list at that time. If the isolate_lru_page() succeeds, mlock_vma_page() will put back the page - by calling putback_lru_page() - which will notice that the page -is now mlocked and divert the page to the zone's unevictable list. 
If +is now mlocked and divert the page to the node's unevictable list. If mlock_vma_page() is unable to isolate the page from the LRU, vmscan will handle it later if and when it attempts to reclaim the page. @@ -603,7 +597,7 @@ Some examples of these unevictable pages on the LRU lists are: unevictable list in mlock_vma_page(). shrink_inactive_list() also diverts any unevictable pages that it finds on the -inactive lists to the appropriate zone's unevictable list. +inactive lists to the appropriate node's unevictable list. shrink_inactive_list() should only see SHM_LOCK'd pages that became SHM_LOCK'd after shrink_active_list() had moved them to the inactive list, or pages mapped diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index ef6d3aface8a..6f2a61e35deb 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -78,7 +78,7 @@ struct page { struct { /* Page cache and anonymous pages */ /** * @lru: Pageout list, eg. active_list protected by - * pgdat->lru_lock. Sometimes used as a generic list + * lruvec->lru_lock. Sometimes used as a generic list * by the page owner. */ struct list_head lru; diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 42e646f7f30d..1df5cd06da04 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -115,7 +115,7 @@ static inline bool free_area_empty(struct free_area *area, int migratetype) struct pglist_data; /* - * zone->lock and the zone lru_lock are two of the hottest locks in the kernel. + * zone->lock and the lru_lock are two of the hottest locks in the kernel. * So add a wild amount of padding here to ensure that they fall into separate * cachelines. There are very few zone structures in the machine, so space * consumption is not a concern here. 
diff --git a/mm/filemap.c b/mm/filemap.c index 5fda0ed6ee19..147736acd6f2 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -101,8 +101,8 @@ * ->swap_lock (try_to_unmap_one) * ->private_lock (try_to_unmap_one) * ->i_pages lock (try_to_unmap_one) - * ->pgdat->lru_lock (follow_page->mark_page_accessed) - * ->pgdat->lru_lock (check_pte_range->isolate_lru_page) + * ->lruvec->lru_lock (follow_page->mark_page_accessed) + * ->lruvec->lru_lock (check_pte_range->isolate_lru_page) * ->private_lock (page_remove_rmap->set_page_dirty) * ->i_pages lock (page_remove_rmap->set_page_dirty) * bdi.wb->list_lock (page_remove_rmap->set_page_dirty) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index b106e3b86fff..ca4bbc25dde8 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -3016,7 +3016,7 @@ void __memcg_kmem_uncharge_page(struct page *page, int order) #ifdef CONFIG_TRANSPARENT_HUGEPAGE /* - * Because tail pages are not marked as "used", set it. We're under + * Because tail pages are not marked as "used", set it. Don't need * lruvec->lru_lock and migration entries setup in all page mappings. */ void mem_cgroup_split_huge_fixup(struct page *head) diff --git a/mm/rmap.c b/mm/rmap.c index ad4a0fdcc94c..d3717d21c992 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -28,7 +28,7 @@ * hugetlb_fault_mutex (hugetlbfs specific page fault mutex) * anon_vma->rwsem * mm->page_table_lock or pte_lock - * pgdat->lru_lock (in mark_page_accessed, isolate_lru_page) + * lruvec->lru_lock (in mark_page_accessed, isolate_lru_page) * swap_lock (in swap_duplicate, swap_info_get) * mmlist_lock (in mmput, drain_mmlist and others) * mapping->private_lock (in __set_page_dirty_buffers) diff --git a/mm/vmscan.c b/mm/vmscan.c index 672e7304f211..fb3a5e580a1e 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1619,14 +1619,16 @@ static __always_inline void update_lru_sizes(struct lruvec *lruvec, } /** - * pgdat->lru_lock is heavily contended. 
Some of the functions that + * Isolating page from the lruvec to fill in @dst list by nr_to_scan times. + * + * lruvec->lru_lock is heavily contended. Some of the functions that * shrink the lists perform better by taking out a batch of pages * and working on them outside the LRU lock. * * For pagecache intensive workloads, this function is the hottest * spot in the kernel (apart from copy_*_user functions). * - * Appropriate locks must be held before calling this function. + * Lru_lock must be held before calling this function. * * @nr_to_scan: The number of eligible pages to look through on the list. * @lruvec: The LRU vector to pull pages from. @@ -1826,14 +1828,16 @@ static int too_many_isolated(struct pglist_data *pgdat, int file, /* * This moves pages from @list to corresponding LRU list. + * The pages from @list is out of any lruvec, and in the end list reuses as + * pages_to_free list. * * We move them the other way if the page is referenced by one or more * processes, from rmap. * * If the pages are mostly unmapped, the processing is fast and it is - * appropriate to hold zone_lru_lock across the whole operation. But if + * appropriate to hold lru_lock across the whole operation. But if * the pages are mapped, the processing is slow (page_referenced()) so we - * should drop zone_lru_lock around each page. It's impossible to balance + * should drop lru_lock around each page. It's impossible to balance * this, so instead we remove the pages from the LRU while processing them. * It is safe to rely on PG_active against the non-LRU pages in here because * nobody will play with that bit on a non-LRU page.