From patchwork Tue Feb 11 00:19:58 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Minchan Kim
X-Patchwork-Id: 11374359
From: Minchan Kim
To: Andrew Morton
Cc: linux-mm, Josef Bacik, Johannes Weiner, Jan Kara, LKML, Minchan Kim
Subject: [PATCH] mm: fix long time stall from mm_populate
Date: Mon, 10 Feb 2020 16:19:58 -0800
Message-Id: <20200211001958.170261-1-minchan@kernel.org>

The fault handler releases mmap_sem before requesting readahead and is
then supposed to retry the page cache lookup with FAULT_FLAG_TRIED set,
so that it avoids a livelock of infinite retries.

However, what happens if the fault handler finds a page in the page
cache that has the readahead marker set but is still under writeback?
Add one more condition: the fault happens under mm_populate, which
keeps faulting until it encounters an error. Putting the conditions
together:

__mm_populate
for (...)
    get_user_pages(faulty_address)
        handle_mm_fault
            filemap_fault
                find a page from page cache
                (PG_uptodate|PG_readahead|PG_writeback)
                it will return VM_FAULT_RETRY
    continue with faulty_address

IOW, the fault retry logic repeats until the page has finally been
written back, which causes latency spikes of several seconds.

This patch solves the issue by turning off the fault retry logic on the
second attempt.

Signed-off-by: Minchan Kim
Reviewed-by: Jan Kara
---
This originated from code review after I saw several user reports, but
I have not yet confirmed that it is the root cause.

 mm/gup.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 1b521e0ac1de..b3f825092abf 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1196,6 +1196,7 @@ int __mm_populate(unsigned long start, unsigned long len, int ignore_errors)
 	struct vm_area_struct *vma = NULL;
 	int locked = 0;
 	long ret = 0;
+	bool tried = false;
 
 	end = start + len;
 
@@ -1226,14 +1227,18 @@ int __mm_populate(unsigned long start, unsigned long len, int ignore_errors)
 		 * double checks the vma flags, so that it won't mlock pages
 		 * if the vma was already munlocked.
 		 */
-		ret = populate_vma_page_range(vma, nstart, nend, &locked);
+		ret = populate_vma_page_range(vma, nstart, nend,
+						tried ? NULL : &locked);
 		if (ret < 0) {
 			if (ignore_errors) {
 				ret = 0;
 				continue;	/* continue at next VMA */
 			}
 			break;
-		}
+		} else if (ret == 0)
+			tried = true;
+		else
+			tried = false;
 		nend = nstart + ret * PAGE_SIZE;
 		ret = 0;
 	}
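
For illustration only, here is a minimal userspace sketch of the control
flow this patch changes. It is not kernel code. struct fake_page,
fake_populate_range() and populate_one_page() are hypothetical names
made up for this model; fake_populate_range() merely stands in for
populate_vma_page_range(). The model assumes that a fault which is
allowed to retry (locked != NULL) returns 0 for as long as the page is
under writeback, and that a fault which is not allowed to retry
(locked == NULL) completes without asking for a retry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a page cache page that is under writeback. */
struct fake_page {
	int writeback_faults_left;	/* retryable faults until writeback ends */
};

/*
 * Stand-in for populate_vma_page_range() (an assumption of this sketch).
 * Returns the number of pages populated, or 0 if the fault asked for a
 * retry.  A NULL 'locked' models a fault that may not drop mmap_sem and
 * therefore cannot return a retry.
 */
static long fake_populate_range(struct fake_page *page, int *locked)
{
	if (locked && page->writeback_faults_left > 0) {
		page->writeback_faults_left--;
		*locked = 0;		/* model: mmap_sem was dropped */
		return 0;		/* model: VM_FAULT_RETRY */
	}
	return 1;			/* one page populated */
}

/*
 * Models the per-address part of the __mm_populate() loop and returns
 * how many times the populate helper had to be called for one page.
 */
static int populate_one_page(struct fake_page *page, bool use_fix)
{
	bool tried = false;
	int locked;
	int calls = 0;
	long ret;

	assert(page != NULL);

	do {
		locked = 1;		/* model: mmap_sem taken again */
		ret = fake_populate_range(page,
					  (use_fix && tried) ? NULL : &locked);
		calls++;
		tried = (ret == 0);	/* same bookkeeping as the patch */
	} while (ret == 0);

	return calls;
}
```

In this model, a page that stays under writeback for N retryable faults
costs N + 1 calls without the 'tried' bookkeeping and only 2 calls with
it, because the second call passes NULL instead of &locked.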