From patchwork Mon Apr 1 20:26:49 2024
X-Patchwork-Submitter: Vishal Moola
X-Patchwork-Id: 13613000
From: "Vishal Moola (Oracle)"
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, akpm@linux-foundation.org, muchun.song@linux.dev, willy@infradead.org, "Vishal Moola (Oracle)"
Subject: [PATCH v2 1/3] hugetlb: Convert hugetlb_fault() to use struct vm_fault
Date: Mon, 1 Apr 2024 13:26:49 -0700
Message-ID: <20240401202651.31440-2-vishal.moola@gmail.com>
In-Reply-To: <20240401202651.31440-1-vishal.moola@gmail.com>
References: <20240401202651.31440-1-vishal.moola@gmail.com>

Now that hugetlb_fault() has a vm_fault available for fault tracking, use
it throughout. This cleans up the code by removing 2 variables, and
prepares hugetlb_fault() to take in a struct vm_fault argument.

Signed-off-by: Vishal Moola (Oracle)
Reviewed-by: Oscar Salvador
Reviewed-by: Muchun Song
---
 mm/hugetlb.c | 84 +++++++++++++++++++++++++---------------------------
 1 file changed, 41 insertions(+), 43 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 8267e221ca5d..360b82374a89 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6423,8 +6423,6 @@ u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t idx)
 vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 			unsigned long address, unsigned int flags)
 {
-	pte_t *ptep, entry;
-	spinlock_t *ptl;
 	vm_fault_t ret;
 	u32 hash;
 	struct folio *folio = NULL;
@@ -6432,13 +6430,13 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	struct hstate *h = hstate_vma(vma);
 	struct address_space *mapping;
 	int need_wait_lock = 0;
-	unsigned long haddr = address & huge_page_mask(h);
 	struct vm_fault vmf = {
 		.vma = vma,
-		.address = haddr,
+		.address = address & huge_page_mask(h),
 		.real_address = address,
 		.flags = flags,
-		.pgoff = vma_hugecache_offset(h, vma, haddr),
+		.pgoff = vma_hugecache_offset(h, vma,
+				address & huge_page_mask(h)),
 		/* TODO: Track hugetlb faults using vm_fault */
 
 		/*
@@ -6458,22 +6456,22 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 
 	/*
 	 * Acquire vma lock before calling huge_pte_alloc and hold
-	 * until finished with ptep. This prevents huge_pmd_unshare from
-	 * being called elsewhere and making the ptep no longer valid.
+	 * until finished with vmf.pte. This prevents huge_pmd_unshare from
+	 * being called elsewhere and making the vmf.pte no longer valid.
 	 */
 	hugetlb_vma_lock_read(vma);
-	ptep = huge_pte_alloc(mm, vma, haddr, huge_page_size(h));
-	if (!ptep) {
+	vmf.pte = huge_pte_alloc(mm, vma, vmf.address, huge_page_size(h));
+	if (!vmf.pte) {
 		hugetlb_vma_unlock_read(vma);
 		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 		return VM_FAULT_OOM;
 	}
 
-	entry = huge_ptep_get(ptep);
-	if (huge_pte_none_mostly(entry)) {
-		if (is_pte_marker(entry)) {
+	vmf.orig_pte = huge_ptep_get(vmf.pte);
+	if (huge_pte_none_mostly(vmf.orig_pte)) {
+		if (is_pte_marker(vmf.orig_pte)) {
 			pte_marker marker =
-				pte_marker_get(pte_to_swp_entry(entry));
+				pte_marker_get(pte_to_swp_entry(vmf.orig_pte));
 
 			if (marker & PTE_MARKER_POISONED) {
 				ret = VM_FAULT_HWPOISON_LARGE;
@@ -6488,20 +6486,20 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		 * mutex internally, which make us return immediately.
 		 */
 		return hugetlb_no_page(mm, vma, mapping, vmf.pgoff, address,
-				      ptep, entry, flags, &vmf);
+				      vmf.pte, vmf.orig_pte, flags, &vmf);
 	}
 
 	ret = 0;
 
 	/*
-	 * entry could be a migration/hwpoison entry at this point, so this
-	 * check prevents the kernel from going below assuming that we have
-	 * an active hugepage in pagecache. This goto expects the 2nd page
-	 * fault, and is_hugetlb_entry_(migration|hwpoisoned) check will
-	 * properly handle it.
+	 * vmf.orig_pte could be a migration/hwpoison vmf.orig_pte at this
+	 * point, so this check prevents the kernel from going below assuming
+	 * that we have an active hugepage in pagecache. This goto expects
+	 * the 2nd page fault, and is_hugetlb_entry_(migration|hwpoisoned)
+	 * check will properly handle it.
 	 */
-	if (!pte_present(entry)) {
-		if (unlikely(is_hugetlb_entry_migration(entry))) {
+	if (!pte_present(vmf.orig_pte)) {
+		if (unlikely(is_hugetlb_entry_migration(vmf.orig_pte))) {
 			/*
 			 * Release the hugetlb fault lock now, but retain
 			 * the vma lock, because it is needed to guard the
@@ -6510,9 +6508,9 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 			 * be released there.
 			 */
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
-			migration_entry_wait_huge(vma, ptep);
+			migration_entry_wait_huge(vma, vmf.pte);
 			return 0;
-		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
+		} else if (unlikely(is_hugetlb_entry_hwpoisoned(vmf.orig_pte)))
 			ret = VM_FAULT_HWPOISON_LARGE |
 			    VM_FAULT_SET_HINDEX(hstate_index(h));
 		goto out_mutex;
@@ -6526,13 +6524,13 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * determine if a reservation has been consumed.
 	 */
 	if ((flags & (FAULT_FLAG_WRITE|FAULT_FLAG_UNSHARE)) &&
-	    !(vma->vm_flags & VM_MAYSHARE) && !huge_pte_write(entry)) {
-		if (vma_needs_reservation(h, vma, haddr) < 0) {
+	    !(vma->vm_flags & VM_MAYSHARE) && !huge_pte_write(vmf.orig_pte)) {
+		if (vma_needs_reservation(h, vma, vmf.address) < 0) {
 			ret = VM_FAULT_OOM;
 			goto out_mutex;
 		}
 		/* Just decrements count, does not deallocate */
-		vma_end_reservation(h, vma, haddr);
+		vma_end_reservation(h, vma, vmf.address);
 
 		pagecache_folio = filemap_lock_hugetlb_folio(h, mapping,
 							     vmf.pgoff);
@@ -6540,17 +6538,17 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 			pagecache_folio = NULL;
 	}
 
-	ptl = huge_pte_lock(h, mm, ptep);
+	vmf.ptl = huge_pte_lock(h, mm, vmf.pte);
 
 	/* Check for a racing update before calling hugetlb_wp() */
-	if (unlikely(!pte_same(entry, huge_ptep_get(ptep))))
+	if (unlikely(!pte_same(vmf.orig_pte, huge_ptep_get(vmf.pte))))
 		goto out_ptl;
 
 	/* Handle userfault-wp first, before trying to lock more pages */
-	if (userfaultfd_wp(vma) && huge_pte_uffd_wp(huge_ptep_get(ptep)) &&
-	    (flags & FAULT_FLAG_WRITE) && !huge_pte_write(entry)) {
+	if (userfaultfd_wp(vma) && huge_pte_uffd_wp(huge_ptep_get(vmf.pte)) &&
+	    (flags & FAULT_FLAG_WRITE) && !huge_pte_write(vmf.orig_pte)) {
 		if (!userfaultfd_wp_async(vma)) {
-			spin_unlock(ptl);
+			spin_unlock(vmf.ptl);
 			if (pagecache_folio) {
 				folio_unlock(pagecache_folio);
 				folio_put(pagecache_folio);
@@ -6560,18 +6558,18 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 			return handle_userfault(&vmf, VM_UFFD_WP);
 		}
 
-		entry = huge_pte_clear_uffd_wp(entry);
-		set_huge_pte_at(mm, haddr, ptep, entry,
+		vmf.orig_pte = huge_pte_clear_uffd_wp(vmf.orig_pte);
+		set_huge_pte_at(mm, vmf.address, vmf.pte, vmf.orig_pte,
 				huge_page_size(hstate_vma(vma)));
 		/* Fallthrough to CoW */
 	}
 
 	/*
-	 * hugetlb_wp() requires page locks of pte_page(entry) and
+	 * hugetlb_wp() requires page locks of pte_page(vmf.orig_pte) and
 	 * pagecache_folio, so here we need take the former one
 	 * when folio != pagecache_folio or !pagecache_folio.
 	 */
-	folio = page_folio(pte_page(entry));
+	folio = page_folio(pte_page(vmf.orig_pte));
 	if (folio != pagecache_folio)
 		if (!folio_trylock(folio)) {
 			need_wait_lock = 1;
@@ -6581,24 +6579,24 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	folio_get(folio);
 
 	if (flags & (FAULT_FLAG_WRITE|FAULT_FLAG_UNSHARE)) {
-		if (!huge_pte_write(entry)) {
-			ret = hugetlb_wp(mm, vma, address, ptep, flags,
-					 pagecache_folio, ptl, &vmf);
+		if (!huge_pte_write(vmf.orig_pte)) {
+			ret = hugetlb_wp(mm, vma, address, vmf.pte, flags,
+					 pagecache_folio, vmf.ptl, &vmf);
 			goto out_put_page;
 		} else if (likely(flags & FAULT_FLAG_WRITE)) {
-			entry = huge_pte_mkdirty(entry);
+			vmf.orig_pte = huge_pte_mkdirty(vmf.orig_pte);
 		}
 	}
-	entry = pte_mkyoung(entry);
-	if (huge_ptep_set_access_flags(vma, haddr, ptep, entry,
+	vmf.orig_pte = pte_mkyoung(vmf.orig_pte);
+	if (huge_ptep_set_access_flags(vma, vmf.address, vmf.pte, vmf.orig_pte,
 						flags & FAULT_FLAG_WRITE))
-		update_mmu_cache(vma, haddr, ptep);
+		update_mmu_cache(vma, vmf.address, vmf.pte);
 out_put_page:
 	if (folio != pagecache_folio)
 		folio_unlock(folio);
 	folio_put(folio);
 out_ptl:
-	spin_unlock(ptl);
+	spin_unlock(vmf.ptl);
 
 	if (pagecache_folio) {
 		folio_unlock(pagecache_folio);