From patchwork Sat Feb 18 00:27:53 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145386
Date: Sat, 18 Feb 2023 00:27:53 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog
Message-ID: <20230218002819.1486479-21-jthoughton@google.com>
Subject: [PATCH v2 20/46] hugetlb: add HGM support to follow_hugetlb_page
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
 "Zach O'Keefe", Manish Mishra, Naoya Horiguchi,
 "Dr . David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka,
 Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton

Enable high-granularity mapping support in GUP.

In case it is confusing, pfn_offset is the offset (in PAGE_SIZE units)
that vaddr points to within the subpage that hpte points to.
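For illustration only (not part of the patch), the pfn_offset arithmetic
described above can be sketched in userspace C. The sketch assumes 4 KiB
base pages and assumes that hugetlb_pte_mask() amounts to ~(size - 1) for
a power-of-two mapping size; sketch_hpte_mask() and sketch_pfn_offset()
are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: 4 KiB base pages. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical stand-in for hugetlb_pte_mask(): clears the address bits
 * below the size mapped by one HGM entry. hpte_size must be a power of two.
 */
static inline uint64_t sketch_hpte_mask(uint64_t hpte_size)
{
	assert(hpte_size != 0 && (hpte_size & (hpte_size - 1)) == 0);
	return ~(hpte_size - 1);
}

/* Mirrors: pfn_offset = (vaddr & ~hugetlb_pte_mask(&hpte)) >> PAGE_SHIFT */
static inline uint64_t sketch_pfn_offset(uint64_t vaddr, uint64_t hpte_size)
{
	return (vaddr & ~sketch_hpte_mask(hpte_size)) >> SKETCH_PAGE_SHIFT;
}
```

With a 2 MiB mapping, an address five base pages past the start of the
mapping gives an offset of 5. If the same range were mapped at a
hypothetical 64 KiB granularity, the offset would instead be counted from
the start of the enclosing 64 KiB piece.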
Signed-off-by: James Houghton

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 7321c6602d6f..c26b040f4fb5 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6634,11 +6634,9 @@ static void record_subpages_vmas(struct page *page, struct vm_area_struct *vma,
 }
 
 static inline bool __follow_hugetlb_must_fault(struct vm_area_struct *vma,
-					       unsigned int flags, pte_t *pte,
+					       unsigned int flags, pte_t pteval,
 					       bool *unshare)
 {
-	pte_t pteval = huge_ptep_get(pte);
-
 	*unshare = false;
 	if (is_swap_pte(pteval))
 		return true;
@@ -6713,11 +6711,13 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 	int err = -EFAULT, refs;
 
 	while (vaddr < vma->vm_end && remainder) {
-		pte_t *pte;
+		pte_t *ptep, pte;
 		spinlock_t *ptl = NULL;
 		bool unshare = false;
 		int absent;
-		struct page *page;
+		unsigned long pages_per_hpte;
+		struct page *page, *subpage;
+		struct hugetlb_pte hpte;
 
 		/*
 		 * If we have a pending SIGKILL, don't keep faulting pages and
@@ -6734,13 +6734,19 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		 * each hugepage. We have to make sure we get the
 		 * first, for the page indexing below to work.
 		 *
-		 * Note that page table lock is not held when pte is null.
+		 * hugetlb_full_walk will mask the address appropriately.
+		 *
+		 * Note that page table lock is not held when ptep is null.
 		 */
-		pte = hugetlb_walk(vma, vaddr & huge_page_mask(h),
-				   huge_page_size(h));
-		if (pte)
-			ptl = huge_pte_lock(h, mm, pte);
-		absent = !pte || huge_pte_none(huge_ptep_get(pte));
+		if (hugetlb_full_walk(&hpte, vma, vaddr)) {
+			ptep = NULL;
+			absent = true;
+		} else {
+			ptl = hugetlb_pte_lock(&hpte);
+			ptep = hpte.ptep;
+			pte = huge_ptep_get(ptep);
+			absent = huge_pte_none(pte);
+		}
 
 		/*
 		 * When coredumping, it suits get_dump_page if we just return
@@ -6751,13 +6757,21 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		 */
 		if (absent && (flags & FOLL_DUMP) &&
 		    !hugetlbfs_pagecache_present(h, vma, vaddr)) {
-			if (pte)
+			if (ptep)
 				spin_unlock(ptl);
 			hugetlb_vma_unlock_read(vma);
 			remainder = 0;
 			break;
 		}
 
+		if (!absent && pte_present(pte) &&
+		    !hugetlb_pte_present_leaf(&hpte, pte)) {
+			/* We raced with someone splitting the PTE, so retry. */
+			spin_unlock(ptl);
+			hugetlb_vma_unlock_read(vma);
+			continue;
+		}
+
 		/*
 		 * We need call hugetlb_fault for both hugepages under migration
 		 * (in which case hugetlb_fault waits for the migration,) and
@@ -6773,7 +6787,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			vm_fault_t ret;
 			unsigned int fault_flags = 0;
 
-			if (pte)
+			if (ptep)
 				spin_unlock(ptl);
 			hugetlb_vma_unlock_read(vma);
 
@@ -6822,8 +6836,10 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			continue;
 		}
 
-		pfn_offset = (vaddr & ~huge_page_mask(h)) >> PAGE_SHIFT;
-		page = pte_page(huge_ptep_get(pte));
+		pfn_offset = (vaddr & ~hugetlb_pte_mask(&hpte)) >> PAGE_SHIFT;
+		subpage = pte_page(pte);
+		pages_per_hpte = hugetlb_pte_size(&hpte) / PAGE_SIZE;
+		page = compound_head(subpage);
 
 		VM_BUG_ON_PAGE((flags & FOLL_PIN) && PageAnon(page) &&
 			       !PageAnonExclusive(page), page);
@@ -6833,22 +6849,22 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		 * and skip the same_page loop below.
 		 */
 		if (!pages && !vmas && !pfn_offset &&
-		    (vaddr + huge_page_size(h) < vma->vm_end) &&
-		    (remainder >= pages_per_huge_page(h))) {
-			vaddr += huge_page_size(h);
-			remainder -= pages_per_huge_page(h);
-			i += pages_per_huge_page(h);
+		    (vaddr + hugetlb_pte_size(&hpte) < vma->vm_end) &&
+		    (remainder >= pages_per_hpte)) {
+			vaddr += hugetlb_pte_size(&hpte);
+			remainder -= pages_per_hpte;
+			i += pages_per_hpte;
 			spin_unlock(ptl);
 			hugetlb_vma_unlock_read(vma);
 			continue;
 		}
 
 		/* vaddr may not be aligned to PAGE_SIZE */
-		refs = min3(pages_per_huge_page(h) - pfn_offset, remainder,
+		refs = min3(pages_per_hpte - pfn_offset, remainder,
 		    (vma->vm_end - ALIGN_DOWN(vaddr, PAGE_SIZE)) >> PAGE_SHIFT);
 
 		if (pages || vmas)
-			record_subpages_vmas(nth_page(page, pfn_offset),
+			record_subpages_vmas(nth_page(subpage, pfn_offset),
 					     vma, refs,
 					     likely(pages) ? pages + i : NULL,
 					     vmas ? vmas + i : NULL);
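
Reviewer-style note, for illustration only (not part of the patch): the
refs computation in the last hunk bounds the number of base pages taken
in one iteration by three limits: the pages left in the current HGM
mapping, the pages the caller still wants, and the pages left in the VMA.
A minimal userspace sketch follows. It assumes 4 KiB base pages and a
power-of-two mapping size; sketch_min3() and sketch_refs() are
hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: 4 KiB base pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE (UINT64_C(1) << SKETCH_PAGE_SHIFT)

static inline uint64_t sketch_min3(uint64_t a, uint64_t b, uint64_t c)
{
	uint64_t m = a < b ? a : b;

	return m < c ? m : c;
}

/*
 * Mirrors the patched computation:
 *   refs = min3(pages_per_hpte - pfn_offset, remainder,
 *               (vm_end - ALIGN_DOWN(vaddr, PAGE_SIZE)) >> PAGE_SHIFT);
 * hpte_size is the size mapped by the current entry, in bytes.
 */
static inline uint64_t sketch_refs(uint64_t vaddr, uint64_t vm_end,
				   uint64_t hpte_size, uint64_t remainder)
{
	assert(hpte_size >= SKETCH_PAGE_SIZE);
	assert((hpte_size & (hpte_size - 1)) == 0);
	assert(vaddr < vm_end);

	uint64_t pages_per_hpte = hpte_size >> SKETCH_PAGE_SHIFT;
	uint64_t pfn_offset = (vaddr & (hpte_size - 1)) >> SKETCH_PAGE_SHIFT;
	uint64_t aligned = vaddr & ~(SKETCH_PAGE_SIZE - 1);

	return sketch_min3(pages_per_hpte - pfn_offset, remainder,
			   (vm_end - aligned) >> SKETCH_PAGE_SHIFT);
}
```

When the range is mapped at base-page granularity, pages_per_hpte is 1
and pfn_offset is 0, so each iteration in this sketch yields a single
page. With a 2 MiB mapping and no other limit in play, one iteration
covers the rest of the 2 MiB piece.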