From patchwork Mon Dec 3 20:08:50 2018
X-Patchwork-Submitter: Mike Kravetz
X-Patchwork-Id: 10710497
From: Mike Kravetz <mike.kravetz@oracle.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Michal Hocko, Hugh Dickins, Naoya Horiguchi, "Aneesh Kumar K . V",
    Andrea Arcangeli, "Kirill A . Shutemov", Davidlohr Bueso,
    Prakash Sangappa, Andrew Morton, Mike Kravetz, stable@vger.kernel.org
Subject: [PATCH 3/3] hugetlbfs: remove unnecessary code after i_mmap_rwsem
 synchronization
Date: Mon, 3 Dec 2018 12:08:50 -0800
Message-Id: <20181203200850.6460-4-mike.kravetz@oracle.com>
In-Reply-To: <20181203200850.6460-1-mike.kravetz@oracle.com>
References: <20181203200850.6460-1-mike.kravetz@oracle.com>
X-Mailer: git-send-email 2.17.2

After expanding i_mmap_rwsem use for better shared pmd and page fault/
truncation synchronization, remove
code that is no longer necessary.

Cc: <stable@vger.kernel.org>
Fixes: ebed4bfc8da8 ("hugetlb: fix absurd HugePages_Rsvd")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 fs/hugetlbfs/inode.c | 46 +++++++++++++++-----------------------------
 mm/hugetlb.c         | 21 ++++++++++----------
 2 files changed, 25 insertions(+), 42 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 3244147fc42b..a9c00c6ef80d 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -383,17 +383,16 @@ hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end)
  *	truncation is indicated by end of range being LLONG_MAX
  *	In this case, we first scan the range and release found pages.
  *	After releasing pages, hugetlb_unreserve_pages cleans up region/reserv
- *	maps and global counts.  Page faults can not race with truncation
- *	in this routine.  hugetlb_no_page() prevents page faults in the
- *	truncated range.  It checks i_size before allocation, and again after
- *	with the page table lock for the page held.  The same lock must be
- *	acquired to unmap a page.
+ *	maps and global counts.
  *	hole punch is indicated if end is not LLONG_MAX
  *	In the hole punch case we scan the range and release found pages.
  *	Only when releasing a page is the associated region/reserv map
  *	deleted.  The region/reserv map for ranges without associated
- *	pages are not modified.  Page faults can race with hole punch.
- *	This is indicated if we find a mapped page.
+ *	pages are not modified.
+ *
+ * Callers of this routine must hold the i_mmap_rwsem in write mode to prevent
+ * races with page faults.
+ *
  * Note: If the passed end of range value is beyond the end of file, but
  * not LLONG_MAX this routine still performs a hole punch operation.
  */
@@ -423,32 +422,14 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
 		for (i = 0; i < pagevec_count(&pvec); ++i) {
 			struct page *page = pvec.pages[i];
-			u32 hash;
 
 			index = page->index;
-			hash = hugetlb_fault_mutex_hash(h, current->mm,
-							&pseudo_vma,
-							mapping, index, 0);
-			mutex_lock(&hugetlb_fault_mutex_table[hash]);
-
 			/*
-			 * If page is mapped, it was faulted in after being
-			 * unmapped in caller.  Unmap (again) now after taking
-			 * the fault mutex.  The mutex will prevent faults
-			 * until we finish removing the page.
-			 *
-			 * This race can only happen in the hole punch case.
-			 * Getting here in a truncate operation is a bug.
+			 * A mapped page is impossible as callers should unmap
+			 * all references before calling.  And, i_mmap_rwsem
+			 * prevents the creation of additional mappings.
 			 */
-			if (unlikely(page_mapped(page))) {
-				BUG_ON(truncate_op);
-
-				i_mmap_lock_write(mapping);
-				hugetlb_vmdelete_list(&mapping->i_mmap,
-					index * pages_per_huge_page(h),
-					(index + 1) * pages_per_huge_page(h));
-				i_mmap_unlock_write(mapping);
-			}
+			VM_BUG_ON(page_mapped(page));
 
 			lock_page(page);
 			/*
@@ -470,7 +451,6 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
 			}
 
 			unlock_page(page);
-			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 		}
 		huge_pagevec_release(&pvec);
 		cond_resched();
@@ -624,7 +604,11 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 		/* addr is the offset within the file (zero based) */
 		addr = index * hpage_size;
 
-		/* mutex taken here, fault path and hole punch */
+		/*
+		 * fault mutex taken here, protects against fault path
+		 * and hole punch.  inode_lock previously taken protects
+		 * against truncation.
+		 */
 		hash = hugetlb_fault_mutex_hash(h, mm, &pseudo_vma, mapping,
 						index, addr);
 		mutex_lock(&hugetlb_fault_mutex_table[hash]);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 362601b69c56..89e1a253a40b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3760,16 +3760,16 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	}
 
 	/*
-	 * Use page lock to guard against racing truncation
-	 * before we get page_table_lock.
+	 * We can not race with truncation due to holding i_mmap_rwsem.
+	 * Check once here for faults beyond end of file.
 	 */
+	size = i_size_read(mapping->host) >> huge_page_shift(h);
+	if (idx >= size)
+		goto out;
+
 retry:
 	page = find_lock_page(mapping, idx);
 	if (!page) {
-		size = i_size_read(mapping->host) >> huge_page_shift(h);
-		if (idx >= size)
-			goto out;
-
 		/*
 		 * Check for page in userfault range
 		 */
@@ -3859,9 +3859,6 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	}
 
 	ptl = huge_pte_lock(h, mm, ptep);
-	size = i_size_read(mapping->host) >> huge_page_shift(h);
-	if (idx >= size)
-		goto backout;
 
 	ret = 0;
 	if (!huge_pte_none(huge_ptep_get(ptep)))
@@ -3964,8 +3961,10 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 
 	/*
 	 * Acquire i_mmap_rwsem before calling huge_pte_alloc and hold
-	 * until finished with ptep.  This prevents huge_pmd_unshare from
-	 * being called elsewhere and making the ptep no longer valid.
+	 * until finished with ptep.  This serves two purposes:
+	 * 1) It prevents huge_pmd_unshare from being called elsewhere
+	 *    and making the ptep no longer valid.
+	 * 2) It synchronizes us with file truncation.
 	 *
 	 * ptep could have already be assigned via huge_pte_offset.  That
 	 * is OK, as huge_pte_alloc will return the same value unless