From patchwork Thu Aug 23 20:59:17 2018
X-Patchwork-Submitter: Mike Kravetz
X-Patchwork-Id: 10574575
From: Mike Kravetz <mike.kravetz@oracle.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Shutemov" , =?utf-8?b?SsOp?= =?utf-8?b?csO0bWUgR2xpc3Nl?= , Vlastimil Babka , Naoya Horiguchi , Davidlohr Bueso , Michal Hocko , Andrew Morton , Mike Kravetz Subject: [PATCH v6 2/2] hugetlb: take PMD sharing into account when flushing tlb/caches Date: Thu, 23 Aug 2018 13:59:17 -0700 Message-Id: <20180823205917.16297-3-mike.kravetz@oracle.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180823205917.16297-1-mike.kravetz@oracle.com> References: <20180823205917.16297-1-mike.kravetz@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=8994 signatures=668707 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=822 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1808230215 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP When fixing an issue with PMD sharing and migration, it was discovered via code inspection that other callers of huge_pmd_unshare potentially have an issue with cache and tlb flushing. Use the routine adjust_range_if_pmd_sharing_possible() to calculate worst case ranges for mmu notifiers. Ensure that this range is flushed if huge_pmd_unshare succeeds and unmaps a PUD_SUZE area. Signed-off-by: Mike Kravetz Reviewed-by: Naoya Horiguchi --- mm/hugetlb.c | 53 +++++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 44 insertions(+), 9 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index a73c5728e961..082cddf46b4f 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3333,8 +3333,8 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, struct page *page; struct hstate *h = hstate_vma(vma); unsigned long sz = huge_page_size(h); - const unsigned long mmun_start = start; /* For mmu_notifiers */ - const unsigned long mmun_end = end; /* For mmu_notifiers */ + unsigned long mmun_start = start; /* For mmu_notifiers */ + unsigned long mmun_end = end; /* For mmu_notifiers */ WARN_ON(!is_vm_hugetlb_page(vma)); BUG_ON(start & ~huge_page_mask(h)); @@ -3346,6 +3346,11 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, */ tlb_remove_check_page_size_change(tlb, sz); tlb_start_vma(tlb, vma); + + /* + * If sharing possible, alert mmu notifiers of worst case. + */ + adjust_range_if_pmd_sharing_possible(vma, &mmun_start, &mmun_end); mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end); address = start; for (; address < end; address += sz) { @@ -3356,6 +3361,10 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, ptl = huge_pte_lock(h, mm, ptep); if (huge_pmd_unshare(mm, &address, ptep)) { spin_unlock(ptl); + /* + * We just unmapped a page of PMDs by clearing a PUD. + * The caller's TLB flush range should cover this area. + */ continue; } @@ -3438,12 +3447,23 @@ void unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start, { struct mm_struct *mm; struct mmu_gather tlb; + unsigned long tlb_start = start; + unsigned long tlb_end = end; + + /* + * If shared PMDs were possibly used within this vma range, adjust + * start/end for worst case tlb flushing. + * Note that we can not be sure if PMDs are shared until we try to + * unmap pages. However, we want to make sure TLB flushing covers + * the largest possible range. 
+	 */
+	adjust_range_if_pmd_sharing_possible(vma, &tlb_start, &tlb_end);
 
 	mm = vma->vm_mm;
-	tlb_gather_mmu(&tlb, mm, start, end);
+	tlb_gather_mmu(&tlb, mm, tlb_start, tlb_end);
 	__unmap_hugepage_range(&tlb, vma, start, end, ref_page);
-	tlb_finish_mmu(&tlb, start, end);
+	tlb_finish_mmu(&tlb, tlb_start, tlb_end);
 }
 
 /*
@@ -4309,11 +4329,21 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	pte_t pte;
 	struct hstate *h = hstate_vma(vma);
 	unsigned long pages = 0;
+	unsigned long f_start = start;
+	unsigned long f_end = end;
+	bool shared_pmd = false;
+
+	/*
+	 * In the case of shared PMDs, the area to flush could be beyond
+	 * start/end.  Set f_start/f_end to cover the maximum possible
+	 * range if PMD sharing is possible.
+	 */
+	adjust_range_if_pmd_sharing_possible(vma, &f_start, &f_end);
 
 	BUG_ON(address >= end);
-	flush_cache_range(vma, address, end);
+	flush_cache_range(vma, f_start, f_end);
 
-	mmu_notifier_invalidate_range_start(mm, start, end);
+	mmu_notifier_invalidate_range_start(mm, f_start, f_end);
 	i_mmap_lock_write(vma->vm_file->f_mapping);
 	for (; address < end; address += huge_page_size(h)) {
 		spinlock_t *ptl;
@@ -4324,6 +4354,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 		if (huge_pmd_unshare(mm, &address, ptep)) {
 			pages++;
 			spin_unlock(ptl);
+			shared_pmd = true;
 			continue;
 		}
 		pte = huge_ptep_get(ptep);
@@ -4359,9 +4390,13 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	 * Must flush TLB before releasing i_mmap_rwsem: x86's huge_pmd_unshare
 	 * may have cleared our pud entry and done put_page on the page table:
 	 * once we release i_mmap_rwsem, another task can do the final put_page
-	 * and that page table be reused and filled with junk.
+	 * and that page table be reused and filled with junk.  If we actually
+	 * did unshare a page of pmds, flush the range corresponding to the pud.
 	 */
-	flush_hugetlb_tlb_range(vma, start, end);
+	if (shared_pmd)
+		flush_hugetlb_tlb_range(vma, f_start, f_end);
+	else
+		flush_hugetlb_tlb_range(vma, start, end);
 	/*
 	 * No need to call mmu_notifier_invalidate_range() we are downgrading
 	 * page table protection not changing it to point to a new page.
@@ -4369,7 +4404,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	 * See Documentation/vm/mmu_notifier.rst
 	 */
 	i_mmap_unlock_write(vma->vm_file->f_mapping);
-	mmu_notifier_invalidate_range_end(mm, start, end);
+	mmu_notifier_invalidate_range_end(mm, f_start, f_end);
 
 	return pages << h->order;
 }
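
A note for readers not familiar with hugetlb PMD sharing: a shared PMD page
covers a full PUD_SIZE region, so a successful huge_pmd_unshare() affects more
address space than the caller's start/end arguments.  The standalone sketch
below is not part of the patch and is only an approximation of what
adjust_range_if_pmd_sharing_possible() (added in patch 1/2 of this series)
does for the mmu notifier and TLB flush ranges.  The constants assume x86_64
(512 2MB PMD entries per PMD page, i.e. 1GB == PUD_SIZE), and the helper name
widen_for_pmd_sharing is made up for illustration.

/*
 * Standalone userspace sketch, not kernel code: shows a flush range being
 * widened to PUD_SIZE boundaries, the worst case if the vma might use
 * shared PMD pages.
 */
#include <stdio.h>

#define PUD_SIZE (1UL << 30)	/* address range covered by one PMD page (x86_64) */
#define PUD_MASK (~(PUD_SIZE - 1))

/* Hypothetical helper: round [*start, *end) out to PUD_SIZE boundaries. */
static void widen_for_pmd_sharing(unsigned long *start, unsigned long *end)
{
	*start &= PUD_MASK;				/* round start down */
	*end = (*end + PUD_SIZE - 1) & PUD_MASK;	/* round end up */
}

int main(void)
{
	/* An unmap request covering only 2MB inside the [1GB, 2GB) region. */
	unsigned long start = 0x40200000UL;	/* 1GB + 2MB */
	unsigned long end   = 0x40400000UL;	/* 1GB + 4MB */

	widen_for_pmd_sharing(&start, &end);

	/*
	 * If huge_pmd_unshare() drops a shared PMD page, the whole
	 * [1GB, 2GB) range is unmapped, so the TLB flush and mmu notifier
	 * range must cover it, not just the 2MB that was requested.
	 */
	printf("worst case flush range: 0x%lx - 0x%lx\n", start, end);
	return 0;
}

The real helper also checks whether sharing is actually possible for the vma
before widening anything; the point of the sketch is only that the ranges have
to be computed pessimistically up front, because whether a PMD is shared is
not known until huge_pmd_unshare() runs.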