From patchwork Sun Dec 1 01:56:18 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 11268415
Date: Sat, 30 Nov 2019 17:56:18 -0800
From: akpm@linux-foundation.org
To: akpm@linux-foundation.org, hughd@google.com, linux-man@vger.kernel.org, linux-mm@kvack.org,
 lixinhai.lxh@gmail.com, mhocko@suse.com, mm-commits@vger.kernel.org, n-horiguchi@ah.jp.nec.com, torvalds@linux-foundation.org, vbabka@suse.cz
Subject: [patch 119/158] mm/mempolicy.c: fix checking unmapped holes for mbind
Message-ID: <20191201015618.fbLx16qnj%akpm@linux-foundation.org>
User-Agent: s-nail v14.8.16
Sender: owner-linux-mm@kvack.org
Precedence: bulk

From: Li Xinhai
Subject: mm/mempolicy.c: fix checking unmapped holes for mbind

mbind() is required to report EFAULT if the range specified by addr and
len contains unmapped holes.  The current implementation applies the
following rules for this check:

1: Unmapped holes in any part of the specified range should be reported
   as EFAULT when mbind() is called with a policy other than
   MPOL_DEFAULT;

2: Unmapped holes in any part of the specified range should be ignored
   (EFAULT is not reported) when mbind() is called with MPOL_DEFAULT;

3: A range that lies entirely within an unmapped hole should be reported
   as EFAULT;

Note that rule 2 does not fulfill the mbind() API definition, but since
that behavior has existed for a long time (the internal flag
MPOL_MF_DISCONTIG_OK exists for this purpose), this patch does not plan
to change it.

With the current code, applications observe inconsistent behavior for
both rule 1 and rule 2.  That inconsistency is fixed as detailed below.

Cases of rule 1:

1) Hole at head side of range.  Current code reports EFAULT, no change by
   this patch.

[  vma  ][ hole ][  vma  ]
            [  range  ]

2) Hole at middle of range.  Current code reports EFAULT, no change by
   this patch.

[  vma  ][ hole ][  vma  ]
   [      range       ]

3) Hole at tail side of range.  Current code does not report EFAULT, this
   patch fixes it.

[  vma  ][ hole ][  vma  ]
   [  range  ]

Cases of rule 2:

1) Hole at head side of range.  Current code reports EFAULT, this patch
   fixes it.

[  vma  ][ hole ][  vma  ]
            [  range  ]

2) Hole at middle of range.  Current code does not report EFAULT, no
   change by this patch.

[  vma  ][ hole ][  vma  ]
   [      range       ]

3) Hole at tail side of range.  Current code does not report EFAULT, no
   change by this patch.

[  vma  ][ hole ][  vma  ]
   [  range  ]

This patch makes no changes to rule 3.

The unmapped hole checking could also be handled by using .pte_hole()
instead of .test_walk().  But .pte_hole() is called for holes both inside
and outside a vma, which costs more, so this patch keeps the original
design with .test_walk().

Link: http://lkml.kernel.org/r/1573218104-11021-3-git-send-email-lixinhai.lxh@gmail.com
Fixes: 6f4576e3687b ("mempolicy: apply page table walker on queue_pages_range()")
Signed-off-by: Li Xinhai
Reviewed-by: Naoya Horiguchi
Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Hugh Dickins
Cc: linux-man
Signed-off-by: Andrew Morton
---

 mm/mempolicy.c |   40 +++++++++++++++++++++++++++-------------
 1 file changed, 27 insertions(+), 13 deletions(-)

--- a/mm/mempolicy.c~mm-fix-checking-unmapped-holes-for-mbind
+++ a/mm/mempolicy.c
@@ -410,7 +410,9 @@ struct queue_pages {
 	struct list_head *pagelist;
 	unsigned long flags;
 	nodemask_t *nmask;
-	struct vm_area_struct *prev;
+	unsigned long start;
+	unsigned long end;
+	struct vm_area_struct *first;
 };

 /*
@@ -619,14 +621,20 @@ static int queue_pages_test_walk(unsigne
 	unsigned long flags = qp->flags;

 	/* range check first */
-	if (!(flags & MPOL_MF_DISCONTIG_OK)) {
-		if (!vma->vm_next && vma->vm_end < end)
-			return -EFAULT;
-		if (qp->prev && qp->prev->vm_end < vma->vm_start)
+	VM_BUG_ON((vma->vm_start > start) || (vma->vm_end < end));
+
+	if (!qp->first) {
+		qp->first = vma;
+		if (!(flags & MPOL_MF_DISCONTIG_OK) &&
+			(qp->start < vma->vm_start))
+			/* hole at head side of range */
 			return -EFAULT;
 	}
-
-	qp->prev = vma;
+	if (!(flags & MPOL_MF_DISCONTIG_OK) &&
+		((vma->vm_end < qp->end) &&
+		(!vma->vm_next || vma->vm_end < vma->vm_next->vm_start)))
+		/* hole at middle or tail of range */
+		return -EFAULT;

 	/*
 	 * Need check MPOL_MF_STRICT to return -EIO if possible
@@ -638,8 +646,6 @@ static int queue_pages_test_walk(unsigne

 	if (endvma > end)
 		endvma = end;
-	if (vma->vm_start > start)
-		start = vma->vm_start;

 	if (flags & MPOL_MF_LAZY) {
 		/* Similar to task_numa_work, skip inaccessible VMAs */
@@ -682,14 +688,23 @@ queue_pages_range(struct mm_struct *mm,
 		nodemask_t *nodes, unsigned long flags,
 		struct list_head *pagelist)
 {
+	int err;
 	struct queue_pages qp = {
 		.pagelist = pagelist,
 		.flags = flags,
 		.nmask = nodes,
-		.prev = NULL,
+		.start = start,
+		.end = end,
+		.first = NULL,
 	};

-	return walk_page_range(mm, start, end, &queue_pages_walk_ops, &qp);
+	err = walk_page_range(mm, start, end, &queue_pages_walk_ops, &qp);
+
+	if (!qp.first)
+		/* whole range in hole */
+		err = -EFAULT;
+
+	return err;
 }

 /*
@@ -741,8 +756,7 @@ static int mbind_range(struct mm_struct
 	unsigned long vmend;

 	vma = find_vma(mm, start);
-	if (!vma || vma->vm_start > start)
-		return -EFAULT;
+	VM_BUG_ON(!vma);

 	prev = vma->vm_prev;
 	if (start > vma->vm_start)
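
Not part of the patch: the three rules from the changelog can be modelled
in user space to see which layouts are expected to return EFAULT after
this change.  The sketch below is only an illustration under stated
assumptions.  struct toy_vma and toy_check_holes() are hypothetical names
that do not exist in the kernel, the VMAs are a sorted array of half-open
[start, end) intervals rather than a real mm, and the discontig_ok
parameter stands in for MPOL_MF_DISCONTIG_OK (the MPOL_DEFAULT case).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA: the half-open interval [start, end). */
struct toy_vma {
	unsigned long start;
	unsigned long end;
};

/*
 * Toy model of the hole checks described in the changelog.
 * vmas[] must be sorted by address and must not overlap.
 * discontig_ok models MPOL_MF_DISCONTIG_OK (rule 2).
 * Returns 0 if the range is acceptable, -EFAULT otherwise.
 */
int toy_check_holes(const struct toy_vma *vmas, size_t n,
		    unsigned long start, unsigned long end,
		    bool discontig_ok)
{
	const struct toy_vma *first = NULL;
	size_t i;

	assert(start < end);

	for (i = 0; i < n; i++) {
		const struct toy_vma *vma = &vmas[i];
		const struct toy_vma *next = (i + 1 < n) ? &vmas[i + 1] : NULL;

		/* Skip VMAs that do not intersect the range at all. */
		if (vma->end <= start || vma->start >= end)
			continue;

		if (!first) {
			first = vma;
			/* hole at head side of range */
			if (!discontig_ok && start < vma->start)
				return -EFAULT;
		}

		/* hole at middle or tail of range */
		if (!discontig_ok && vma->end < end &&
		    (!next || vma->end < next->start))
			return -EFAULT;
	}

	/* Rule 3: no VMA intersects the range, so it lies in a hole. */
	return first ? 0 : -EFAULT;
}
```

With a layout of two VMAs, [0x1000, 0x3000) and [0x5000, 0x7000), this
model returns -EFAULT for a range with the hole at its head, middle or
tail when discontig_ok is false, returns 0 for the same three ranges when
discontig_ok is true, and returns -EFAULT in both modes for the range
[0x3000, 0x5000), which lies entirely in the hole.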