From patchwork Fri Dec 9 17:00:52 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13070014
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Andrew Morton, Miaohe Lin, David Hildenbrand, Nadav Amit,
 peterx@redhat.com, Andrea Arcangeli, Jann Horn, John Hubbard,
 Mike Kravetz, James Houghton, Rik van Riel, Muchun Song
Subject: [PATCH v3 1/9] mm/hugetlb: Let vma_offset_start() to return start
Date: Fri, 9 Dec 2022 12:00:52 -0500
Message-Id: <20221209170100.973970-2-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221209170100.973970-1-peterx@redhat.com>
References: <20221209170100.973970-1-peterx@redhat.com>

Despite its name, vma_offset_start() does not return "the start address
of the range"; it returns the offset that has to be added to the
vma->vm_start address.

Make it return the real value of the start vaddr instead.  This also
simplifies all the callers, because whenever the retval is used it is
ultimately added to vma->vm_start anyway.

Reviewed-by: Mike Kravetz
Reviewed-by: David Hildenbrand
Reviewed-by: John Hubbard
Signed-off-by: Peter Xu
---
 fs/hugetlbfs/inode.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 790d2727141a..fdb16246f46e 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -412,10 +412,12 @@ static bool hugetlb_vma_maps_page(struct vm_area_struct *vma,
  */
 static unsigned long vma_offset_start(struct vm_area_struct *vma, pgoff_t start)
 {
+	unsigned long offset = 0;
+
 	if (vma->vm_pgoff < start)
-		return (start - vma->vm_pgoff) << PAGE_SHIFT;
-	else
-		return 0;
+		offset = (start - vma->vm_pgoff) << PAGE_SHIFT;
+
+	return vma->vm_start + offset;
 }
 
 static unsigned long vma_offset_end(struct vm_area_struct *vma, pgoff_t end)
@@ -457,7 +459,7 @@ static void hugetlb_unmap_file_folio(struct hstate *h,
 		v_start = vma_offset_start(vma, start);
 		v_end = vma_offset_end(vma, end);
 
-		if (!hugetlb_vma_maps_page(vma, vma->vm_start + v_start, page))
+		if (!hugetlb_vma_maps_page(vma, v_start, page))
 			continue;
 
 		if (!hugetlb_vma_trylock_write(vma)) {
@@ -473,8 +475,8 @@ static void hugetlb_unmap_file_folio(struct hstate *h,
 			break;
 		}
 
-		unmap_hugepage_range(vma, vma->vm_start + v_start, v_end,
-				NULL, ZAP_FLAG_DROP_MARKER);
+		unmap_hugepage_range(vma, v_start, v_end, NULL,
+				     ZAP_FLAG_DROP_MARKER);
 		hugetlb_vma_unlock_write(vma);
 	}
 
@@ -507,10 +509,9 @@ static void hugetlb_unmap_file_folio(struct hstate *h,
 		 */
 		v_start = vma_offset_start(vma, start);
 		v_end = vma_offset_end(vma, end);
-		if (hugetlb_vma_maps_page(vma, vma->vm_start + v_start, page))
-			unmap_hugepage_range(vma, vma->vm_start + v_start,
-						v_end, NULL,
-						ZAP_FLAG_DROP_MARKER);
+		if (hugetlb_vma_maps_page(vma, v_start, page))
+			unmap_hugepage_range(vma, v_start, v_end, NULL,
+					     ZAP_FLAG_DROP_MARKER);
 
 		kref_put(&vma_lock->refs, hugetlb_vma_lock_release);
 		hugetlb_vma_unlock_write(vma);
@@ -540,8 +541,7 @@ hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end,
 		v_start = vma_offset_start(vma, start);
 		v_end = vma_offset_end(vma, end);
 
-		unmap_hugepage_range(vma, vma->vm_start + v_start, v_end,
-				     NULL, zap_flags);
+		unmap_hugepage_range(vma, v_start, v_end, NULL, zap_flags);
 
 		/*
 		 * Note that vma lock only exists for shared/non-private

From patchwork Fri Dec 9 17:00:53 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13070015
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Andrew Morton, Miaohe Lin, David Hildenbrand, Nadav Amit,
 peterx@redhat.com, Andrea Arcangeli, Jann Horn, John Hubbard,
 Mike Kravetz, James Houghton, Rik van Riel, Muchun Song
Subject: [PATCH v3 2/9] mm/hugetlb: Don't wait for migration entry during follow page
Date: Fri, 9 Dec 2022 12:00:53 -0500
Message-Id: <20221209170100.973970-3-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221209170100.973970-1-peterx@redhat.com>
References: <20221209170100.973970-1-peterx@redhat.com>

That's what the code does for !hugetlb pages, so logically we should do
the same for hugetlb: a migration entry is then also treated as no page.

This is probably also the last piece of follow_page code that may sleep;
the previous one should have been removed in cf994dd8af27 ("mm/gup:
remove FOLL_MIGRATION", 2022-11-16).
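
As a rough illustration of the resulting behaviour, the following
userspace model (not kernel code) shows a lookup that only ever returns
a page for a present entry and never blocks.  All names here
(sketch_entry_kind, sketch_entry, sketch_page, sketch_follow_page) are
hypothetical and exist only for this example; the real function also
deals with locking, FOLL flags and reference counts.

```c
#include <stddef.h>

/* Hypothetical classification of what a hugetlb page table slot holds. */
enum sketch_entry_kind {
	SKETCH_NONE,		/* nothing mapped */
	SKETCH_PRESENT,		/* a real page is mapped */
	SKETCH_MIGRATION,	/* page is being migrated */
	SKETCH_HWPOISON,	/* page was hardware-poisoned */
};

struct sketch_page {
	int id;
};

struct sketch_entry {
	enum sketch_entry_kind kind;
	struct sketch_page *page;	/* only meaningful for SKETCH_PRESENT */
};

/*
 * Model of the lookup after this patch: a non-present entry, including a
 * migration entry, is reported as "no page" instead of being waited on.
 */
static struct sketch_page *sketch_follow_page(const struct sketch_entry *entry)
{
	if (entry == NULL)
		return NULL;

	if (entry->kind == SKETCH_PRESENT)
		return entry->page;

	/* Migration, hwpoison and empty entries all yield no page. */
	return NULL;
}
```

In this model a caller that sees NULL for a migration entry has to
decide for itself whether to retry later, which matches how the !hugetlb
path already behaves.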

Reviewed-by: Mike Kravetz
Reviewed-by: David Hildenbrand
Reviewed-by: John Hubbard
Signed-off-by: Peter Xu
---
 mm/hugetlb.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1088f2f41c88..c8a6673fe5b4 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6232,7 +6232,6 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 	if (WARN_ON_ONCE(flags & FOLL_PIN))
 		return NULL;
 
-retry:
 	pte = huge_pte_offset(mm, haddr, huge_page_size(h));
 	if (!pte)
 		return NULL;
@@ -6255,16 +6254,6 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 			page = NULL;
 			goto out;
 		}
-	} else {
-		if (is_hugetlb_entry_migration(entry)) {
-			spin_unlock(ptl);
-			__migration_entry_wait_huge(pte, ptl);
-			goto retry;
-		}
-		/*
-		 * hwpoisoned entry is treated as no_page_table in
-		 * follow_page_mask().
-		 */
 	}
 out:
 	spin_unlock(ptl);

From patchwork Fri Dec 9 17:00:54 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13070016
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Andrew Morton, Miaohe Lin, David Hildenbrand, Nadav Amit,
 peterx@redhat.com, Andrea Arcangeli, Jann Horn, John Hubbard,
 Mike Kravetz, James Houghton, Rik van Riel, Muchun Song
Subject: [PATCH v3 3/9] mm/hugetlb: Document huge_pte_offset usage
Date: Fri, 9 Dec 2022 12:00:54 -0500
Message-Id: <20221209170100.973970-4-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221209170100.973970-1-peterx@redhat.com>
References: <20221209170100.973970-1-peterx@redhat.com>

huge_pte_offset() is potentially a pgtable walker, looking up pte_t* for
a hugetlb address.

Normally, it's always safe to walk a generic pgtable as long as the mmap
lock is held for either read or write, because that guarantees the
pgtable pages stay valid during the walk.

But that's not true for hugetlbfs, especially for shared mappings:
hugetlbfs can have its pgtable freed by pmd unsharing.  This means that
even with the mmap lock held for the current mm, the PMD pgtable page can
still go away from under us if pmd unsharing is possible during the walk.

So we have two ways to make it safe even for a shared mapping:

  (1) If the hugetlb vma lock is held for either read or write, it's
      okay because pmd unshare cannot happen at all.

  (2) If the i_mmap_rwsem lock is held for either read or write, it's
      okay because even if pmd unshare can happen, the pgtable page
      cannot be freed from under us.

Document it.

Reviewed-by: John Hubbard
Reviewed-by: David Hildenbrand
Signed-off-by: Peter Xu
---
 include/linux/hugetlb.h | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 551834cd5299..d755e2a7c0db 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -192,6 +192,38 @@ extern struct list_head huge_boot_pages;
 
 pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
 			unsigned long addr, unsigned long sz);
+/*
+ * huge_pte_offset(): Walk the hugetlb pgtable until the last level PTE.
+ * Returns the pte_t* if found, or NULL if the address is not mapped.
+ *
+ * Since this function will walk all the pgtable pages (including not only
+ * high-level pgtable page, but also PUD entry that can be unshared
+ * concurrently for VM_SHARED), the caller of this function should be
+ * responsible of its thread safety. One can follow this rule:
+ *
+ *  (1) For private mappings: pmd unsharing is not possible, so holding the
+ *      mmap_lock for either read or write is sufficient. Most callers
+ *      already hold the mmap_lock, so normally, no special action is
+ *      required.
+ *
+ *  (2) For shared mappings: pmd unsharing is possible (so the PUD-ranged
+ *      pgtable page can go away from under us! It can be done by a pmd
+ *      unshare with a follow up munmap() on the other process), then we
+ *      need either:
+ *
+ *     (2.1) hugetlb vma lock read or write held, to make sure pmd unshare
+ *           won't happen upon the range (it also makes sure the pte_t we
+ *           read is the right and stable one), or,
+ *
+ *     (2.2) hugetlb mapping i_mmap_rwsem lock held read or write, to make
+ *           sure even if unshare happened the racy unmap() will wait until
+ *           i_mmap_rwsem is released.
+ *
+ * Option (2.1) is the safest, which guarantees pte stability from pmd
+ * sharing pov, until the vma lock released. Option (2.2) doesn't protect
+ * a concurrent pmd unshare, but it makes sure the pgtable page is safe to
+ * access.
+ */
 pte_t *huge_pte_offset(struct mm_struct *mm,
 		       unsigned long addr, unsigned long sz);
 unsigned long hugetlb_mask_last_page(struct hstate *h);

From patchwork Fri Dec 9 17:00:55 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13070018
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Andrew Morton, Miaohe Lin, David Hildenbrand, Nadav Amit,
 peterx@redhat.com, Andrea Arcangeli, Jann Horn, John Hubbard,
 Mike Kravetz, James Houghton, Rik van Riel, Muchun Song
Subject: [PATCH v3 4/9] mm/hugetlb: Move swap entry handling into vma lock when faulted
Date: Fri, 9 Dec 2022 12:00:55 -0500
Message-Id: <20221209170100.973970-5-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221209170100.973970-1-peterx@redhat.com>
References: <20221209170100.973970-1-peterx@redhat.com>
X-HE-Meta: 
U2FsdGVkX19aPyEX2uIEH57Mexvb5A9um8QXoSiexlUiIt//YuR9sWJDTSbxW2UxneG9dP6wBaA4dDjjxx+P8oy7rmTB/j5WBfZrqbDUxZ+n7btTzGXCa4TqjLJ+Rpw99y5rcP4Zj4deiHLGrWwchupoyeCyK60a/XAq8qjw5j7bq1NENwJ9xvSHdRIhv7KdOr4HjuL8Ni9sMvE6Co/z1SdwVt0+XH685ujQBW/iwrC9bNt2Z3isq+0h99bEK6VN/yLuANu4+GcPsDq3i58rz1GCTKW92RGJCNh2XZar5Panlsprqn0JEE43E9wrks9HD8viE3ZBdOcJdK/fTVjdzP/JAl/9mC1BOwSZIONsUhdY7NOqi30kyH0Dfj7GQH5QzSUtHIPXiYMl/Xr4/FLCCKMqzak9CGDQ7+ZfVMghmRiTXGpM2ddHyuKegTiHToiH47rChH4/TFperMVX+sUWeyIJ5yxkWrINMVOEeBqbBVLAYjVvCFEMsvZsq1g+YElmWzeUotRwTHZPWv39PdaotdwFxHVyR+sLkjr40e3SSfAs/h3O0BlVFE8x2LeFkfk0VquBvUJkZmPjr6UFsPvFdyc6kuecTjCkdNG6bwMxmuaMV+gjgrskSmT98hKBMEswwMi8XbxTKtPguXIehqGrgs++J70EF0Wlg4ITAV8WTmJvVMXBy6EwBxJr80vH6KGeBjGYbZGOlaHRKx6HlUTDqUl6w4Q8MFz7May68ndzkrTekuz7239qP3vFlCcwmFSTC36SDYwJoGXHqX+GmfCQIrpdBd5BLgV94L4/6Rhh3Ft/hduILFaXj0ZRWvjFCN88jpoiGDKHc0Q9mAIT9hLzCFP+kWZMcbtnSPNobLNPeVk92+oE/1O6tX+hqOHqo2TR76/+I6GAUFdkivoEHL2YohQHN4gTsty0DtWOr7f0kv7bntFQi3psgM/ftnybwGU+0BFnS+GqN9DdtV34qYS je5Qslv7 vcWo871RRDyfh6CoKNsPnYjAYdaSkp5oAGA279jckFn8MpkpbQvIUz+hcXZ+Euf/+aWzpdSaIafebOvwdf0IUg1x7JfX0ZJ3tV2D2RA6v13593hU2BaeRdSWUYKheZFQUsO4bwbdismoRjzYElMqUATVO1/EMpX2/pHuAYJMrFdnY/rgpzCmaF3oIkLgs3srxQoe1aHvwZ+pxPd301SFM840WLYGsJfCFpYDF X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: In hugetlb_fault(), there used to have a special path to handle swap entry at the entrance using huge_pte_offset(). That's unsafe because huge_pte_offset() for a pmd sharable range can access freed pgtables if without any lock to protect the pgtable from being freed after pmd unshare. Here the simplest solution to make it safe is to move the swap handling to be after the vma lock being held. We may need to take the fault mutex on either migration or hwpoison entries now (also the vma lock, but that's really needed), however neither of them is hot path. 
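As a rough illustration of the ordering described above, here is a user-space sketch in which the entry is only read once the lock is held. It is not kernel code: `toy_lock`, `toy_vma`, `toy_read_entry()`, `toy_fault()` and the entry kinds are hypothetical names invented for this example, and the lock is a plain reader counter standing in for the hugetlb vma lock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the hugetlb vma lock: it only counts readers. */
struct toy_lock {
	int readers;
};

static void toy_lock_read(struct toy_lock *l)
{
	l->readers++;
}

static void toy_unlock_read(struct toy_lock *l)
{
	assert(l->readers > 0);
	l->readers--;
}

/* Hypothetical entry kinds, loosely mirroring present/migration/hwpoison. */
enum toy_entry { TOY_NONE, TOY_PRESENT, TOY_MIGRATION, TOY_HWPOISON };

enum toy_result { TOY_HANDLED, TOY_WAIT_MIGRATION, TOY_POISONED };

struct toy_vma {
	struct toy_lock lock;
	enum toy_entry *table;	/* stands in for a pgtable page that may go away */
};

/* Reading an entry is only legal while the lock is held. */
static enum toy_entry toy_read_entry(struct toy_vma *vma, size_t idx)
{
	assert(vma->lock.readers > 0);
	return vma->table[idx];
}

/* Classify the entry under the lock instead of peeking at it locklessly. */
static enum toy_result toy_fault(struct toy_vma *vma, size_t idx)
{
	enum toy_result ret = TOY_HANDLED;
	enum toy_entry entry;

	toy_lock_read(&vma->lock);
	entry = toy_read_entry(vma, idx);
	if (entry == TOY_MIGRATION)
		ret = TOY_WAIT_MIGRATION;
	else if (entry == TOY_HWPOISON)
		ret = TOY_POISONED;
	toy_unlock_read(&vma->lock);
	return ret;
}
```

The assertion in `toy_read_entry()` plays a role similar to a lock assertion: a lockless early check, like the one removed from the fault path, would trip it immediately.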
Note that the vma lock cannot be released in hugetlb_fault() when the
migration entry is detected, because in migration_entry_wait_huge() the
pgtable page will be used again (by taking the pgtable lock), so that
access also needs to be protected by the vma lock.  Modify
migration_entry_wait_huge() so that it must be called with the vma read
lock held, and properly release the lock in __migration_entry_wait_huge().

Reviewed-by: Mike Kravetz
Reviewed-by: John Hubbard
Signed-off-by: Peter Xu
---
 include/linux/swapops.h |  6 ++++--
 mm/hugetlb.c            | 37 ++++++++++++++++---------------------
 mm/migrate.c            | 25 +++++++++++++++++++++----
 3 files changed, 41 insertions(+), 27 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index a70b5c3a68d7..b134c5eb75cb 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -337,7 +337,8 @@ extern void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep,
 extern void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
 					unsigned long address);
 #ifdef CONFIG_HUGETLB_PAGE
-extern void __migration_entry_wait_huge(pte_t *ptep, spinlock_t *ptl);
+extern void __migration_entry_wait_huge(struct vm_area_struct *vma,
+					pte_t *ptep, spinlock_t *ptl);
 extern void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte);
 #endif	/* CONFIG_HUGETLB_PAGE */
 #else  /* CONFIG_MIGRATION */
@@ -366,7 +367,8 @@ static inline void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep,
 static inline void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
 					 unsigned long address) { }
 #ifdef CONFIG_HUGETLB_PAGE
-static inline void __migration_entry_wait_huge(pte_t *ptep, spinlock_t *ptl) { }
+static inline void __migration_entry_wait_huge(struct vm_area_struct *vma,
+					       pte_t *ptep, spinlock_t *ptl) { }
 static inline void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte) { }
 #endif	/* CONFIG_HUGETLB_PAGE */
 static inline int is_writable_migration_entry(swp_entry_t entry)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c8a6673fe5b4..247702eb9f88 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5824,22 +5824,6 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	int need_wait_lock = 0;
 	unsigned long haddr = address & huge_page_mask(h);
 
-	ptep = huge_pte_offset(mm, haddr, huge_page_size(h));
-	if (ptep) {
-		/*
-		 * Since we hold no locks, ptep could be stale.  That is
-		 * OK as we are only making decisions based on content and
-		 * not actually modifying content here.
-		 */
-		entry = huge_ptep_get(ptep);
-		if (unlikely(is_hugetlb_entry_migration(entry))) {
-			migration_entry_wait_huge(vma, ptep);
-			return 0;
-		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
-			return VM_FAULT_HWPOISON_LARGE |
-				VM_FAULT_SET_HINDEX(hstate_index(h));
-	}
-
 	/*
 	 * Serialize hugepage allocation and instantiation, so that we don't
 	 * get spurious allocation failures if two CPUs race to instantiate
@@ -5854,10 +5838,6 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * Acquire vma lock before calling huge_pte_alloc and hold
 	 * until finished with ptep.  This prevents huge_pmd_unshare from
 	 * being called elsewhere and making the ptep no longer valid.
-	 *
-	 * ptep could have already be assigned via huge_pte_offset.  That
-	 * is OK, as huge_pte_alloc will return the same value unless
-	 * something has changed.
 	 */
 	hugetlb_vma_lock_read(vma);
 	ptep = huge_pte_alloc(mm, vma, haddr, huge_page_size(h));
@@ -5886,8 +5866,23 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * fault, and is_hugetlb_entry_(migration|hwpoisoned) check will
 	 * properly handle it.
 	 */
-	if (!pte_present(entry))
+	if (!pte_present(entry)) {
+		if (unlikely(is_hugetlb_entry_migration(entry))) {
+			/*
+			 * Release the hugetlb fault lock now, but retain
+			 * the vma lock, because it is needed to guard the
+			 * huge_pte_lockptr() later in
+			 * migration_entry_wait_huge(). The vma lock will
+			 * be released there.
+			 */
+			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+			migration_entry_wait_huge(vma, ptep);
+			return 0;
+		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
+			ret = VM_FAULT_HWPOISON_LARGE |
+			    VM_FAULT_SET_HINDEX(hstate_index(h));
 		goto out_mutex;
+	}
 
 	/*
 	 * If we are going to COW/unshare the mapping later, we examine the
diff --git a/mm/migrate.c b/mm/migrate.c
index 48584b032ea9..9c4e3a833449 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -333,24 +333,41 @@ void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
 }
 
 #ifdef CONFIG_HUGETLB_PAGE
-void __migration_entry_wait_huge(pte_t *ptep, spinlock_t *ptl)
+/*
+ * The vma read lock must be held upon entry. Holding that lock prevents either
+ * the pte or the ptl from being freed.
+ *
+ * This function will release the vma lock before returning.
+ */
+void __migration_entry_wait_huge(struct vm_area_struct *vma,
+				 pte_t *ptep, spinlock_t *ptl)
 {
 	pte_t pte;
 
+	hugetlb_vma_assert_locked(vma);
 	spin_lock(ptl);
 	pte = huge_ptep_get(ptep);
 
-	if (unlikely(!is_hugetlb_entry_migration(pte)))
+	if (unlikely(!is_hugetlb_entry_migration(pte))) {
 		spin_unlock(ptl);
-	else
+		hugetlb_vma_unlock_read(vma);
+	} else {
+		/*
+		 * If migration entry existed, safe to release vma lock
+		 * here because the pgtable page won't be freed without the
+		 * pgtable lock released.  See comment right above pgtable
+		 * lock release in migration_entry_wait_on_locked().
+		 */
+		hugetlb_vma_unlock_read(vma);
 		migration_entry_wait_on_locked(pte_to_swp_entry(pte), NULL, ptl);
+	}
 }
 
 void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte)
 {
 	spinlock_t *ptl = huge_pte_lockptr(hstate_vma(vma), vma->vm_mm, pte);
 
-	__migration_entry_wait_huge(pte, ptl);
+	__migration_entry_wait_huge(vma, pte, ptl);
 }
 
 #endif

From patchwork Fri Dec 9 17:00:56 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13070017
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Andrew Morton, Miaohe Lin, David Hildenbrand, Nadav Amit,
 peterx@redhat.com, Andrea Arcangeli, Jann Horn, John Hubbard,
 Mike Kravetz, James Houghton, Rik van Riel, Muchun Song
Subject: [PATCH v3 5/9] mm/hugetlb: Make userfaultfd_huge_must_wait() safe to pmd unshare
Date: Fri, 9 Dec 2022 12:00:56 -0500
Message-Id: <20221209170100.973970-6-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221209170100.973970-1-peterx@redhat.com>
References: <20221209170100.973970-1-peterx@redhat.com>

We can take the hugetlb walker lock here, by taking the vma lock
directly.
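The comment this patch adds to handle_userfault() notes that acquiring the (sleepable) vma lock can modify the current task state, so the lock has to be taken before set_current_state(). The user-space sketch below models only that ordering concern. Everything in it is hypothetical: `toy_task`, `toy_sleepable_lock_read()` and the two ordering helpers are invented for illustration, and the lock pessimistically assumes it always slept and came back in the running state.

```c
#include <assert.h>

/* Hypothetical task states, loosely mirroring running vs. blocking. */
enum toy_state { TOY_RUNNING, TOY_INTERRUPTIBLE };

struct toy_task {
	enum toy_state state;
	int lock_readers;	/* stand-in for the hugetlb vma lock */
};

/*
 * Hypothetical sleepable lock. Assumption for this sketch: the caller
 * always had to sleep, so it returns with the task state reset.
 */
static void toy_sleepable_lock_read(struct toy_task *t)
{
	t->lock_readers++;
	t->state = TOY_RUNNING;
}

static void toy_unlock_read(struct toy_task *t)
{
	assert(t->lock_readers > 0);
	t->lock_readers--;
}

static void toy_set_state(struct toy_task *t, enum toy_state s)
{
	t->state = s;
}

/* Lock first, then publish the blocking state: the state survives. */
static enum toy_state toy_lock_then_set_state(struct toy_task *t)
{
	enum toy_state seen;

	toy_sleepable_lock_read(t);
	toy_set_state(t, TOY_INTERRUPTIBLE);
	seen = t->state;	/* state when the must-wait check would run */
	toy_unlock_read(t);
	return seen;
}

/* Reversed order: taking the lock clobbers the blocking state. */
static enum toy_state toy_set_state_then_lock(struct toy_task *t)
{
	enum toy_state seen;

	toy_set_state(t, TOY_INTERRUPTIBLE);
	toy_sleepable_lock_read(t);
	seen = t->state;
	toy_unlock_read(t);
	return seen;
}
```

With the first ordering the blocking state set by the caller is still in place when the must-wait check runs; with the reversed ordering it has been overwritten, which is the situation the patch avoids.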
Reviewed-by: David Hildenbrand
Reviewed-by: Mike Kravetz
Reviewed-by: John Hubbard
Signed-off-by: Peter Xu
---
 fs/userfaultfd.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 07c81ab3fd4d..969f4be967c6 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -376,7 +376,8 @@ static inline unsigned int userfaultfd_get_blocking_state(unsigned int flags)
  */
 vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 {
-	struct mm_struct *mm = vmf->vma->vm_mm;
+	struct vm_area_struct *vma = vmf->vma;
+	struct mm_struct *mm = vma->vm_mm;
 	struct userfaultfd_ctx *ctx;
 	struct userfaultfd_wait_queue uwq;
 	vm_fault_t ret = VM_FAULT_SIGBUS;
@@ -403,7 +404,7 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 	 */
 	mmap_assert_locked(mm);
 
-	ctx = vmf->vma->vm_userfaultfd_ctx.ctx;
+	ctx = vma->vm_userfaultfd_ctx.ctx;
 	if (!ctx)
 		goto out;
 
@@ -493,6 +494,15 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 
 	blocking_state = userfaultfd_get_blocking_state(vmf->flags);
 
+	/*
+	 * Take the vma lock now, in order to safely call
+	 * userfaultfd_huge_must_wait() later. Since acquiring the
+	 * (sleepable) vma lock can modify the current task state, that
+	 * must be before explicitly calling set_current_state().
+	 */
+	if (is_vm_hugetlb_page(vma))
+		hugetlb_vma_lock_read(vma);
+
 	spin_lock_irq(&ctx->fault_pending_wqh.lock);
 	/*
 	 * After the __add_wait_queue the uwq is visible to userland
@@ -507,13 +517,15 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 	set_current_state(blocking_state);
 	spin_unlock_irq(&ctx->fault_pending_wqh.lock);
 
-	if (!is_vm_hugetlb_page(vmf->vma))
+	if (!is_vm_hugetlb_page(vma))
 		must_wait = userfaultfd_must_wait(ctx, vmf->address, vmf->flags,
 						  reason);
 	else
-		must_wait = userfaultfd_huge_must_wait(ctx, vmf->vma,
+		must_wait = userfaultfd_huge_must_wait(ctx, vma,
 						       vmf->address,
 						       vmf->flags, reason);
+	if (is_vm_hugetlb_page(vma))
+		hugetlb_vma_unlock_read(vma);
 	mmap_read_unlock(mm);
 
 	if (likely(must_wait && !READ_ONCE(ctx->released))) {

From patchwork Fri Dec 9 17:00:57 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13070021
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Andrew Morton, Miaohe Lin, David Hildenbrand, Nadav Amit,
 peterx@redhat.com, Andrea Arcangeli, Jann Horn, John Hubbard,
 Mike Kravetz, James Houghton, Rik van Riel, Muchun Song
Subject: [PATCH v3 6/9] mm/hugetlb: Make hugetlb_follow_page_mask() safe to pmd unshare
Date: Fri, 9 Dec 2022 12:00:57 -0500
Message-Id: <20221209170100.973970-7-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221209170100.973970-1-peterx@redhat.com>
References: <20221209170100.973970-1-peterx@redhat.com>

Since hugetlb_follow_page_mask() walks the pgtable, it needs the vma lock
to make sure the pgtable page will not be freed concurrently.
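The shape of this change (take the lock before the lookup, and route the early "nothing found" exit through an unlock label) can be sketched in user space as follows. This is only an analogy under stated assumptions: `toy_vma`, `toy_lookup()` and `toy_follow()` are hypothetical names, and the lock is a bare reader counter rather than the real hugetlb vma lock.

```c
#include <assert.h>
#include <stddef.h>

struct toy_vma {
	int readers;		/* hypothetical stand-in for the vma lock */
	const int *entries;	/* stands in for the pgtable being walked */
	size_t nr;
};

/* The walk is only legal with the lock held; NULL means "no entry". */
static const int *toy_lookup(const struct toy_vma *vma, size_t idx)
{
	assert(vma->readers > 0);
	return idx < vma->nr ? &vma->entries[idx] : NULL;
}

/* Returns the entry value, or -1 when nothing is found at idx. */
static int toy_follow(struct toy_vma *vma, size_t idx)
{
	const int *slot;
	int val = -1;

	vma->readers++;			/* lock before walking */
	slot = toy_lookup(vma, idx);
	if (!slot)
		goto out_unlock;	/* early exit must still unlock */
	val = *slot;
out_unlock:
	vma->readers--;
	return val;
}
```

Using a single unlock label keeps the lock balanced on both the success path and the early-exit path, which is why the plain early return is replaced by a jump in the patch.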
Acked-by: David Hildenbrand
Reviewed-by: Mike Kravetz
Reviewed-by: John Hubbard
Signed-off-by: Peter Xu
---
 mm/hugetlb.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 247702eb9f88..e3af347470ac 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6227,9 +6227,10 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 	if (WARN_ON_ONCE(flags & FOLL_PIN))
 		return NULL;
 
+	hugetlb_vma_lock_read(vma);
 	pte = huge_pte_offset(mm, haddr, huge_page_size(h));
 	if (!pte)
-		return NULL;
+		goto out_unlock;
 
 	ptl = huge_pte_lock(h, mm, pte);
 	entry = huge_ptep_get(pte);
@@ -6252,6 +6253,8 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 	}
 out:
 	spin_unlock(ptl);
+out_unlock:
+	hugetlb_vma_unlock_read(vma);
 	return page;
 }
 

From patchwork Fri Dec 9 17:00:58 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13070019
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Andrew Morton, Miaohe Lin, David Hildenbrand, Nadav Amit,
 peterx@redhat.com, Andrea Arcangeli, Jann Horn, John Hubbard,
 Mike Kravetz, James Houghton, Rik van Riel, Muchun Song
Subject: [PATCH v3 7/9] mm/hugetlb: Make follow_hugetlb_page() safe to pmd unshare
Date: Fri, 9 Dec 2022 12:00:58 -0500
Message-Id: <20221209170100.973970-8-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221209170100.973970-1-peterx@redhat.com>
References: <20221209170100.973970-1-peterx@redhat.com>
X-HE-Meta:
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Since follow_hugetlb_page() walks the pgtable, it needs the vma lock to make sure the pgtable page will not be freed concurrently.
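A minimal user-space sketch of this locking pattern, not kernel code: every toy_* name is hypothetical, and the "lock" is a plain reader counter standing in for the hugetlb vma lock. It only illustrates that the lock is taken before each walk and dropped again on every exit path of the loop, including the early break.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a hugetlb vma; not a kernel structure. */
struct toy_vma {
	int readers;		/* toy read-lock: number of holders */
	const int *ptes;	/* toy "page table": 0 means not mapped */
	size_t nr_ptes;
};

static void toy_vma_lock_read(struct toy_vma *vma)
{
	vma->readers++;
}

static void toy_vma_unlock_read(struct toy_vma *vma)
{
	assert(vma->readers > 0);
	vma->readers--;
}

/* Walking is only valid while the read lock is held. */
static const int *toy_walk(const struct toy_vma *vma, size_t idx)
{
	assert(vma->readers > 0);
	if (idx >= vma->nr_ptes || vma->ptes[idx] == 0)
		return NULL;
	return &vma->ptes[idx];
}

/*
 * Count mapped entries in [start, start + nr), stopping at the first
 * hole. The lock is taken per iteration and released on both the
 * early-exit path and the normal path.
 */
size_t toy_follow(struct toy_vma *vma, size_t start, size_t nr)
{
	size_t done = 0;

	for (size_t i = start; i < start + nr; i++) {
		const int *pte;

		toy_vma_lock_read(vma);
		pte = toy_walk(vma, i);
		if (!pte) {
			toy_vma_unlock_read(vma);	/* early exit */
			break;
		}
		done++;
		toy_vma_unlock_read(vma);		/* normal path */
	}
	return done;
}
```

With a toy table of {1, 1, 0, 1}, toy_follow(&vma, 0, 4) stops at the hole after two entries, and the reader count is back to zero afterwards.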
Acked-by: David Hildenbrand Reviewed-by: Mike Kravetz Reviewed-by: John Hubbard Signed-off-by: Peter Xu --- mm/hugetlb.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index e3af347470ac..9d8bb6508288 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6285,6 +6285,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, break; } + hugetlb_vma_lock_read(vma); /* * Some archs (sparc64, sh*) have multiple pte_ts to * each hugepage. We have to make sure we get the @@ -6309,6 +6310,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, !hugetlbfs_pagecache_present(h, vma, vaddr)) { if (pte) spin_unlock(ptl); + hugetlb_vma_unlock_read(vma); remainder = 0; break; } @@ -6330,6 +6332,8 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, if (pte) spin_unlock(ptl); + hugetlb_vma_unlock_read(vma); + if (flags & FOLL_WRITE) fault_flags |= FAULT_FLAG_WRITE; else if (unshare) @@ -6389,6 +6393,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, remainder -= pages_per_huge_page(h); i += pages_per_huge_page(h); spin_unlock(ptl); + hugetlb_vma_unlock_read(vma); continue; } @@ -6416,6 +6421,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, if (WARN_ON_ONCE(!try_grab_folio(pages[i], refs, flags))) { spin_unlock(ptl); + hugetlb_vma_unlock_read(vma); remainder = 0; err = -ENOMEM; break; @@ -6427,6 +6433,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, i += refs; spin_unlock(ptl); + hugetlb_vma_unlock_read(vma); } *nr_pages = remainder; /* From patchwork Fri Dec 9 17:00:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 13070020 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org 
[205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2532EC04FDE for ; Fri, 9 Dec 2022 17:01:37 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id AED1B8E000B; Fri, 9 Dec 2022 12:01:36 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id A7BA18E000C; Fri, 9 Dec 2022 12:01:36 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 875348E000B; Fri, 9 Dec 2022 12:01:36 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 74D878E0001 for ; Fri, 9 Dec 2022 12:01:36 -0500 (EST) Received: from smtpin27.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 3B8A312099F for ; Fri, 9 Dec 2022 17:01:36 +0000 (UTC) X-FDA: 80223384192.27.96FD28F Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf01.hostedemail.com (Postfix) with ESMTP id C0B6340043 for ; Fri, 9 Dec 2022 17:01:32 +0000 (UTC) Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=TJDdxKJs; spf=pass (imf01.hostedemail.com: domain of peterx@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1670605293; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=SiCXWzCdYbU9QHTa4B1WTrTi25NBF8hxZPF60vWGZSs=; b=bYyrxEk5tcJDXphHqPb8m34DYXPjygBBm4zgpSn4C3K4IfWI0MSb6KpnZCWT83X83gynsK 7nrdHiT4N/0P7f9jza3qJ1uNufP1xjEGWyL96noe4ZfVxlkdEuIItwjXOyX4ciD/6rqCmR 
9EAsF6hCZ//jDcBJvdgUktHYUXJU0kw= ARC-Authentication-Results: i=1; imf01.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=TJDdxKJs; spf=pass (imf01.hostedemail.com: domain of peterx@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1670605293; a=rsa-sha256; cv=none; b=nHKJM6AKi5WWeu4cCIKO6Ei55G/7DVVwde2jzlNKNIn4ASlw1oy3C5IoPISXhXSgiIZFrA 2oU1NoDqRdlDJ9wAAM7Fiqq6jQBW1n8iUV2YzNZ0Rz/KG6EIsfKFUxXpWgRB7fxTC2H/4m zhJUqGkzNgHR5dLqBKPUHY/yJpxPZes= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1670605292; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SiCXWzCdYbU9QHTa4B1WTrTi25NBF8hxZPF60vWGZSs=; b=TJDdxKJsKIsNSSdoJ+Zz9nypPwz0yALia0hIFQbRa+yl2yxi63utwuqJpL2aNA6bcQAvTo BbI2tutviRMFTTrDVjigpe9wskqxa3+nB88b15wSgfYBRQMxYsMIZAbMDH2WPuI28EGvw4 mFDaMbTyib8YvD0rfPLEtRBHRHyA9f4= Received: from mail-oo1-f69.google.com (mail-oo1-f69.google.com [209.85.161.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-587-mf_N1Z8RM0ORnkk2KI8wYw-1; Fri, 09 Dec 2022 12:01:30 -0500 X-MC-Unique: mf_N1Z8RM0ORnkk2KI8wYw-1 Received: by mail-oo1-f69.google.com with SMTP id d16-20020a4a3c10000000b004a096d64cabso1559664ooa.16 for ; Fri, 09 Dec 2022 09:01:30 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=SiCXWzCdYbU9QHTa4B1WTrTi25NBF8hxZPF60vWGZSs=; b=hfFsR0nR9JxM6AKr5UG0AoKeuacMBt3656QEEZwqYBS9IzdUW7Of90GGg3/vzwO2an 
RxJJFpZGpmQCv+9lLmgu3axOrpM2EUnGpgwY//XxOy0c5sJMkZKDBA/ngnSsrhEa6Fcs kvhfkvyx7i7elGtvFdk4Y0ZVvvleKaGlMVKHYkoHtErCNjBFT9pwJkBYGq983QDhKkEX B7tXcfJamGsw9gZ6hry3wvOsdc4o4XSPZQukC3W++PaY36qVvMPeTrrvvybDt/CvftJf 9AadRNAGIhqyTjj4tYlKjSln9K9xlXuxeRtH5312TnEjOBzqC5Ya8454hKUxH/l+25iz Wmzw== X-Gm-Message-State: ANoB5pkwQWTvCo2DjgHbhOP1PTLYKPUbH89dD56EVIxgvdQOzcB4gpZ1 yfsYFSwc8AvMq8zORl84chHeGlmmY2xeCFf90xFKD40Jix7F0V9/riqjyB0vQAWQCxO5mTrkVFp +bjD6uX09DD9M2Pylxg/4MMc+Dw0qu9iIFJ3J+joafOzxvds4qxfE5BBcjwJ3 X-Received: by 2002:a05:6808:141:b0:35a:640d:300e with SMTP id h1-20020a056808014100b0035a640d300emr2638008oie.19.1670605288865; Fri, 09 Dec 2022 09:01:28 -0800 (PST) X-Google-Smtp-Source: AA0mqf6z8153BBKEHldWTDZCjSK48fzRxQgu6nHidb9TN0qLtv3uxM5cswjC0U3ES+rGDimiirmJAQ== X-Received: by 2002:a05:6808:141:b0:35a:640d:300e with SMTP id h1-20020a056808014100b0035a640d300emr2637970oie.19.1670605288528; Fri, 09 Dec 2022 09:01:28 -0800 (PST) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-46-70-31-27-79.dsl.bell.ca. 
[70.31.27.79]) by smtp.gmail.com with ESMTPSA id q7-20020a05620a0d8700b006cf38fd659asm178907qkl.103.2022.12.09.09.01.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 09 Dec 2022 09:01:27 -0800 (PST) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Andrew Morton , Miaohe Lin , David Hildenbrand , Nadav Amit , peterx@redhat.com, Andrea Arcangeli , Jann Horn , John Hubbard , Mike Kravetz , James Houghton , Rik van Riel , Muchun Song Subject: [PATCH v3 8/9] mm/hugetlb: Make walk_hugetlb_range() safe to pmd unshare Date: Fri, 9 Dec 2022 12:00:59 -0500 Message-Id: <20221209170100.973970-9-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221209170100.973970-1-peterx@redhat.com> References: <20221209170100.973970-1-peterx@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-type: text/plain X-Rspamd-Queue-Id: C0B6340043 X-Stat-Signature: n3ihtgagnxgrwjs1y1b7d6rprwwn7btd X-Rspam-User: X-Rspamd-Server: rspam08 X-HE-Tag: 1670605292-150471 X-HE-Meta: 
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Since walk_hugetlb_range() walks the pgtable, it needs the vma lock to make sure the pgtable page will not be freed concurrently.
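A user-space sketch of the resulting contract between the walker and its per-entry hook, under loud assumptions: the toy_* names are hypothetical, the lock is a simple reader counter rather than a real rwsem, and toy_fault() merely models a callee that takes and drops the lock on its own. The hook drops the lock before calling such a function, does not touch the entry afterwards, and retakes the lock so the walker can release it as usual.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a vma with a read lock; not kernel code. */
struct toy_vma {
	int readers;
};

typedef int (*toy_entry_fn)(struct toy_vma *vma, int *pte);

static void toy_lock_read(struct toy_vma *vma)
{
	vma->readers++;
}

static void toy_unlock_read(struct toy_vma *vma)
{
	assert(vma->readers > 0);
	vma->readers--;
}

/*
 * Models a fault path that takes and drops the lock itself, so it
 * must be entered with the lock released.
 */
static int toy_fault(struct toy_vma *vma)
{
	assert(vma->readers == 0);
	toy_lock_read(vma);
	toy_unlock_read(vma);
	return -1;	/* toy error code: ask the caller to retry */
}

/* Per-entry hook: called with the lock held, returns with it held. */
int toy_entry(struct toy_vma *vma, int *pte)
{
	if (*pte == 0) {
		int ret;

		toy_unlock_read(vma);
		/* pte must not be dereferenced past this point. */
		ret = toy_fault(vma);
		toy_lock_read(vma);
		return ret;
	}
	return 0;
}

/* Walker: holds the lock across the whole range. */
int toy_walk_range(struct toy_vma *vma, int *ptes, size_t n, toy_entry_fn fn)
{
	int err = 0;

	toy_lock_read(vma);
	for (size_t i = 0; i < n; i++) {
		err = fn(vma, &ptes[i]);
		if (err)
			break;
	}
	toy_unlock_read(vma);
	return err;
}
```

The hook retakes the lock only to keep the walker's lock/unlock pairing balanced; in this sketch nothing is read under it after the fault call.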
Reviewed-by: Mike Kravetz Signed-off-by: Peter Xu Reviewed-by: John Hubbard --- include/linux/pagewalk.h | 11 ++++++++++- mm/hmm.c | 15 ++++++++++++++- mm/pagewalk.c | 2 ++ 3 files changed, 26 insertions(+), 2 deletions(-) diff --git a/include/linux/pagewalk.h b/include/linux/pagewalk.h index 959f52e5867d..27a6df448ee5 100644 --- a/include/linux/pagewalk.h +++ b/include/linux/pagewalk.h @@ -21,7 +21,16 @@ struct mm_walk; * depth is -1 if not known, 0:PGD, 1:P4D, 2:PUD, 3:PMD. * Any folded depths (where PTRS_PER_P?D is equal to 1) * are skipped. - * @hugetlb_entry: if set, called for each hugetlb entry + * @hugetlb_entry: if set, called for each hugetlb entry. This hook + * function is called with the vma lock held, in order to + * protect against a concurrent freeing of the pte_t* or + * the ptl. In some cases, the hook function needs to drop + * and retake the vma lock in order to avoid deadlocks + * while calling other functions. In such cases the hook + * function must either refrain from accessing the pte or + * ptl after dropping the vma lock, or else revalidate + * those items after re-acquiring the vma lock and before + * accessing them. * @test_walk: caller specific callback function to determine whether * we walk over the current vma or not. Returning 0 means * "do page table walk over the current vma", returning diff --git a/mm/hmm.c b/mm/hmm.c index 3850fb625dda..796de6866089 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -493,8 +493,21 @@ static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask, required_fault = hmm_pte_need_fault(hmm_vma_walk, pfn_req_flags, cpu_flags); if (required_fault) { + int ret; + spin_unlock(ptl); - return hmm_vma_fault(addr, end, required_fault, walk); + hugetlb_vma_unlock_read(vma); + /* + * Avoid deadlock: drop the vma lock before calling + * hmm_vma_fault(), which will itself potentially take and + * drop the vma lock. 
This is also correct from a + * protection point of view, because there is no further + * use here of either pte or ptl after dropping the vma + * lock. + */ + ret = hmm_vma_fault(addr, end, required_fault, walk); + hugetlb_vma_lock_read(vma); + return ret; } pfn = pte_pfn(entry) + ((start & ~hmask) >> PAGE_SHIFT); diff --git a/mm/pagewalk.c b/mm/pagewalk.c index 7f1c9b274906..d98564a7be57 100644 --- a/mm/pagewalk.c +++ b/mm/pagewalk.c @@ -302,6 +302,7 @@ static int walk_hugetlb_range(unsigned long addr, unsigned long end, const struct mm_walk_ops *ops = walk->ops; int err = 0; + hugetlb_vma_lock_read(vma); do { next = hugetlb_entry_end(h, addr, end); pte = huge_pte_offset(walk->mm, addr & hmask, sz); @@ -314,6 +315,7 @@ static int walk_hugetlb_range(unsigned long addr, unsigned long end, if (err) break; } while (addr = next, addr != end); + hugetlb_vma_unlock_read(vma); return err; } From patchwork Fri Dec 9 17:01:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 13070022 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5C16FC4332F for ; Fri, 9 Dec 2022 17:01:41 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D0A4E8E000D; Fri, 9 Dec 2022 12:01:40 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id C93C28E0001; Fri, 9 Dec 2022 12:01:40 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AE9E18E000D; Fri, 9 Dec 2022 12:01:40 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 8B78F8E0001 for ; Fri, 9 Dec 2022 12:01:40 -0500 (EST) Received: from smtpin22.hostedemail.com 
(a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 45E954125B for ; Fri, 9 Dec 2022 17:01:40 +0000 (UTC) X-FDA: 80223384360.22.26C8C5C Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf13.hostedemail.com (Postfix) with ESMTP id 2E5F520039 for ; Fri, 9 Dec 2022 17:01:37 +0000 (UTC) Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=IQNOl9Vn; spf=pass (imf13.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1670605298; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=L7MEdlvbVLjBM5+9KWX/Vi/7wZkpFuDB6TgvbozGtp8=; b=yGeJ5eeTgJa0cHniAHAQGHxEXRk6dq2f5EI8iZNKdrwOefO9goUsjy811hdobbEhhh9uJS ekWBeURH8AzzH53yBnOj2O1Zqn6kNa6OC6MjPmSN8bP7ya/IQnTJZt4YLLr0HiHrCfMcBY DWH0k7AwYwb0VlNZKWvG4wQu+2AIjUU= ARC-Authentication-Results: i=1; imf13.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=IQNOl9Vn; spf=pass (imf13.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1670605298; a=rsa-sha256; cv=none; b=LM3fWlU4HtaStnhZmqHHHgkeux1IQ4FOVudRe8vvYRgNHjz/cVbZMeuyClkpYbzmMdnM9u rvjZezmr09teiLr/RrQLxF5iG/QpZuA8TVJi9N/DCU/TFPOL3HJpYhKBs/aiXUFMowaL79 YVATBVCG3p/vAE33PRizHb6fsEXHgJo= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1670605297; 
h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=L7MEdlvbVLjBM5+9KWX/Vi/7wZkpFuDB6TgvbozGtp8=; b=IQNOl9Vn1d7CZvjuP9glpRe4g9KqZR2RgfWVQZORWLdVyfmAQmhKTgzM4XyaWgdExjarfQ ddHckQQ4UAkomyy+xXfzAhU+6VMF9VRZsX/uei/4tQxiZMf7Jr5KfAjUbpzJFmhjamUsSZ gAbqJZ7zb/5voPrbQijixSsKxgSG9Dg= Received: from mail-pf1-f197.google.com (mail-pf1-f197.google.com [209.85.210.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-630-bGAjMfiKOwW4Zs2unQkrAQ-1; Fri, 09 Dec 2022 12:01:35 -0500 X-MC-Unique: bGAjMfiKOwW4Zs2unQkrAQ-1 Received: by mail-pf1-f197.google.com with SMTP id n16-20020a056a000d5000b005764608bb24so3657590pfv.12 for ; Fri, 09 Dec 2022 09:01:34 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=L7MEdlvbVLjBM5+9KWX/Vi/7wZkpFuDB6TgvbozGtp8=; b=m7rOPxiNAFYTbVYwsIPKMAGjZyIH0XZQdz7yEX5NjxU2Y2t+NFl/esXoaCUOKyahXm R7G2g4WdKhaTX77Oupsxnz5O96eF00pN7jCa6OQi9HYt9eaKZ4PmaDXBYRbkWQlFWDGp 25t0vJ+AxlIcFKhQ0aaFETKwl/eDXCkZE488GP5DEQyIoHsofQ/3p4JIu7F1mGh4Fw8z DUBCPQpdFH2PojAEQZkJSEEEwrGsO4qZZ1wwzJEi/5n6MlsAcqaz/pBacvzJ+dbBKroB raSFgNxXI5EM1c5PNrctoitoNRCxncUEM45l7aevc+oxbN+Ql1xuJw0Z2Lm3TjTY+Xqj YWBw== X-Gm-Message-State: ANoB5pkJCwZvRs/aS/19SGNgOFJEjqhYJDB2ACKgHvqFlhHcmDQfIyy7 7ueM5Lxuu1nkZuhds8tZNJrxaNUCHd9o/UiimCNzHvtzYe3Rr76a0W+jU+PiUk1HM2sNVEF5Gwi UYUF9GVRngzQUDKMMdOIHh36uZMebCvVwWvdLT0OfgKXc9yFyh9SSMqVGUt91 X-Received: by 2002:a05:6a20:9f4a:b0:9d:efbf:6618 with SMTP id ml10-20020a056a209f4a00b0009defbf6618mr9037945pzb.38.1670605292630; Fri, 09 Dec 2022 09:01:32 -0800 (PST) X-Google-Smtp-Source: 
AA0mqf4691rJPo5hULuegymNELlZK/OS4Isy0pkxFdZAl/xogaBGdec9hd8nNWbKv9OTMLROMYo/pQ== X-Received: by 2002:a05:6a20:9f4a:b0:9d:efbf:6618 with SMTP id ml10-20020a056a209f4a00b0009defbf6618mr9037889pzb.38.1670605292058; Fri, 09 Dec 2022 09:01:32 -0800 (PST) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-46-70-31-27-79.dsl.bell.ca. [70.31.27.79]) by smtp.gmail.com with ESMTPSA id q7-20020a05620a0d8700b006cf38fd659asm178907qkl.103.2022.12.09.09.01.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 09 Dec 2022 09:01:30 -0800 (PST) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Andrew Morton , Miaohe Lin , David Hildenbrand , Nadav Amit , peterx@redhat.com, Andrea Arcangeli , Jann Horn , John Hubbard , Mike Kravetz , James Houghton , Rik van Riel , Muchun Song Subject: [PATCH v3 9/9] mm/hugetlb: Introduce hugetlb_walk() Date: Fri, 9 Dec 2022 12:01:00 -0500 Message-Id: <20221209170100.973970-10-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221209170100.973970-1-peterx@redhat.com> References: <20221209170100.973970-1-peterx@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-type: text/plain X-Rspam-User: X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 2E5F520039 X-Stat-Signature: 8i1r4bqazzuojwfqfo53xgshqkwm49sg X-HE-Tag: 1670605297-409401 X-HE-Meta: 
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: huge_pte_offset() is the main walker function for hugetlb pgtables. The name does not really represent what it does, though. Instead of renaming it, introduce a wrapper function called hugetlb_walk() which will use huge_pte_offset() inside. Assert on the locks when walking the pgtable. Note, the vma lock assertion will be a no-op for private mappings. Document the last special case in the page_vma_mapped_walk() path where no additional lock is needed to call hugetlb_walk().
Taking the vma lock there is not needed because either: (1) potential callers of hugetlb pvmw already hold i_mmap_rwsem (from one rmap_walk()), or (2) the caller will not walk a hugetlb vma at all, so the hugetlb code path is not reachable (e.g. in ksm or uprobe paths). This lock requirement is slightly implicit for future page_vma_mapped_walk() callers. But if this rule is ever broken, one will get a straightforward warning in hugetlb_walk() with lockdep, which makes the problem easy to spot and fix. Reviewed-by: Mike Kravetz Signed-off-by: Peter Xu Reviewed-by: John Hubbard --- fs/hugetlbfs/inode.c | 4 +--- fs/userfaultfd.c | 6 ++---- include/linux/hugetlb.h | 39 +++++++++++++++++++++++++++++++++++++++ mm/hugetlb.c | 32 +++++++++++++------------------- mm/page_vma_mapped.c | 9 ++++++--- mm/pagewalk.c | 4 +--- 6 files changed, 62 insertions(+), 32 deletions(-) diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index fdb16246f46e..48f1a8ad2243 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -388,9 +388,7 @@ static bool hugetlb_vma_maps_page(struct vm_area_struct *vma, { pte_t *ptep, pte; - ptep = huge_pte_offset(vma->vm_mm, addr, - huge_page_size(hstate_vma(vma))); - + ptep = hugetlb_walk(vma, addr, huge_page_size(hstate_vma(vma))); if (!ptep) return false; diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 969f4be967c6..6a278941ec84 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -237,14 +237,12 @@ static inline bool userfaultfd_huge_must_wait(struct userfaultfd_ctx *ctx, unsigned long flags, unsigned long reason) { - struct mm_struct *mm = ctx->mm; pte_t *ptep, pte; bool ret = true; - mmap_assert_locked(mm); - - ptep = huge_pte_offset(mm, address, vma_mmu_pagesize(vma)); + mmap_assert_locked(ctx->mm); + ptep = hugetlb_walk(vma, address, vma_mmu_pagesize(vma)); if (!ptep) goto out; diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index d755e2a7c0db..a5e87ec7fa6e 100644 --- a/include/linux/hugetlb.h +++
b/include/linux/hugetlb.h @@ -2,6 +2,7 @@ #ifndef _LINUX_HUGETLB_H #define _LINUX_HUGETLB_H +#include #include #include #include @@ -196,6 +197,11 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, * huge_pte_offset(): Walk the hugetlb pgtable until the last level PTE. * Returns the pte_t* if found, or NULL if the address is not mapped. * + * IMPORTANT: we should normally not directly call this function, instead + * this is only a common interface to implement arch-specific + * walker. Please use hugetlb_walk() instead, because that will attempt to + * verify the locking for you. + * * Since this function will walk all the pgtable pages (including not only * high-level pgtable page, but also PUD entry that can be unshared * concurrently for VM_SHARED), the caller of this function should be @@ -1229,4 +1235,37 @@ bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr); #define flush_hugetlb_tlb_range(vma, addr, end) flush_tlb_range(vma, addr, end) #endif +static inline bool +__vma_shareable_flags_pmd(struct vm_area_struct *vma) +{ + return vma->vm_flags & (VM_MAYSHARE | VM_SHARED) && + vma->vm_private_data; +} + +/* + * Safe version of huge_pte_offset() to check the locks. See comments + * above huge_pte_offset(). + */ +static inline pte_t * +hugetlb_walk(struct vm_area_struct *vma, unsigned long addr, unsigned long sz) +{ +#if defined(CONFIG_HUGETLB_PAGE) && \ + defined(CONFIG_ARCH_WANT_HUGE_PMD_SHARE) && defined(CONFIG_LOCKDEP) + struct hugetlb_vma_lock *vma_lock = vma->vm_private_data; + + /* + * If pmd sharing possible, locking needed to safely walk the + * hugetlb pgtables. More information can be found at the comment + * above huge_pte_offset() in the same file. + * + * NOTE: lockdep_is_held() is only defined with CONFIG_LOCKDEP. 
+ */ + if (__vma_shareable_flags_pmd(vma)) + WARN_ON_ONCE(!lockdep_is_held(&vma_lock->rw_sema) && + !lockdep_is_held( + &vma->vm_file->f_mapping->i_mmap_rwsem)); +#endif + return huge_pte_offset(vma->vm_mm, addr, sz); +} + #endif /* _LINUX_HUGETLB_H */ diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 9d8bb6508288..b20120d14a71 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4814,7 +4814,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, } else { /* * For shared mappings the vma lock must be held before - * calling huge_pte_offset in the src vma. Otherwise, the + * calling hugetlb_walk() in the src vma. Otherwise, the * returned ptep could go away if part of a shared pmd and * another thread calls huge_pmd_unshare. */ @@ -4824,7 +4824,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, last_addr_mask = hugetlb_mask_last_page(h); for (addr = src_vma->vm_start; addr < src_vma->vm_end; addr += sz) { spinlock_t *src_ptl, *dst_ptl; - src_pte = huge_pte_offset(src, addr, sz); + src_pte = hugetlb_walk(src_vma, addr, sz); if (!src_pte) { addr |= last_addr_mask; continue; @@ -5028,7 +5028,7 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma, hugetlb_vma_lock_write(vma); i_mmap_lock_write(mapping); for (; old_addr < old_end; old_addr += sz, new_addr += sz) { - src_pte = huge_pte_offset(mm, old_addr, sz); + src_pte = hugetlb_walk(vma, old_addr, sz); if (!src_pte) { old_addr |= last_addr_mask; new_addr |= last_addr_mask; @@ -5091,7 +5091,7 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct last_addr_mask = hugetlb_mask_last_page(h); address = start; for (; address < end; address += sz) { - ptep = huge_pte_offset(mm, address, sz); + ptep = hugetlb_walk(vma, address, sz); if (!ptep) { address |= last_addr_mask; continue; @@ -5404,7 +5404,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma, mutex_lock(&hugetlb_fault_mutex_table[hash]); 
hugetlb_vma_lock_read(vma); spin_lock(ptl); - ptep = huge_pte_offset(mm, haddr, huge_page_size(h)); + ptep = hugetlb_walk(vma, haddr, huge_page_size(h)); if (likely(ptep && pte_same(huge_ptep_get(ptep), pte))) goto retry_avoidcopy; @@ -5442,7 +5442,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma, * before the page tables are altered */ spin_lock(ptl); - ptep = huge_pte_offset(mm, haddr, huge_page_size(h)); + ptep = hugetlb_walk(vma, haddr, huge_page_size(h)); if (likely(ptep && pte_same(huge_ptep_get(ptep), pte))) { /* Break COW or unshare */ huge_ptep_clear_flush(vma, haddr, ptep); @@ -6228,7 +6228,7 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma, return NULL; hugetlb_vma_lock_read(vma); - pte = huge_pte_offset(mm, haddr, huge_page_size(h)); + pte = hugetlb_walk(vma, haddr, huge_page_size(h)); if (!pte) goto out_unlock; @@ -6293,8 +6293,8 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, * * Note that page table lock is not held when pte is null. 
 		 */
-		pte = huge_pte_offset(mm, vaddr & huge_page_mask(h),
-				      huge_page_size(h));
+		pte = hugetlb_walk(vma, vaddr & huge_page_mask(h),
+				   huge_page_size(h));
 		if (pte)
 			ptl = huge_pte_lock(h, mm, pte);
 		absent = !pte || huge_pte_none(huge_ptep_get(pte));
@@ -6480,7 +6480,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	last_addr_mask = hugetlb_mask_last_page(h);
 	for (; address < end; address += psize) {
 		spinlock_t *ptl;
-		ptep = huge_pte_offset(mm, address, psize);
+		ptep = hugetlb_walk(vma, address, psize);
 		if (!ptep) {
 			address |= last_addr_mask;
 			continue;
@@ -6858,12 +6858,6 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
 		*end = ALIGN(*end, PUD_SIZE);
 }
 
-static bool __vma_shareable_flags_pmd(struct vm_area_struct *vma)
-{
-	return vma->vm_flags & (VM_MAYSHARE | VM_SHARED) &&
-		vma->vm_private_data;
-}
-
 void hugetlb_vma_lock_read(struct vm_area_struct *vma)
 {
 	if (__vma_shareable_flags_pmd(vma)) {
@@ -7029,8 +7023,8 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 
 		saddr = page_table_shareable(svma, vma, addr, idx);
 		if (saddr) {
-			spte = huge_pte_offset(svma->vm_mm, saddr,
-					       vma_mmu_pagesize(svma));
+			spte = hugetlb_walk(svma, saddr,
+					    vma_mmu_pagesize(svma));
 			if (spte) {
 				get_page(virt_to_page(spte));
 				break;
@@ -7388,7 +7382,7 @@ void hugetlb_unshare_all_pmds(struct vm_area_struct *vma)
 	hugetlb_vma_lock_write(vma);
 	i_mmap_lock_write(vma->vm_file->f_mapping);
 	for (address = start; address < end; address += PUD_SIZE) {
-		ptep = huge_pte_offset(mm, address, sz);
+		ptep = hugetlb_walk(vma, address, sz);
 		if (!ptep)
 			continue;
 		ptl = huge_pte_lock(h, mm, ptep);
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 93e13fc17d3c..f3729b23dd0e 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -168,9 +168,12 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 		/* The only possible mapping was handled on last iteration */
 		if (pvmw->pte)
 			return not_found(pvmw);
-
-		/* when pud is not present, pte will be NULL */
-		pvmw->pte = huge_pte_offset(mm, pvmw->address, size);
+		/*
+		 * All callers that get here will already hold the
+		 * i_mmap_rwsem. Therefore, no additional locks need to be
+		 * taken before calling hugetlb_walk().
+		 */
+		pvmw->pte = hugetlb_walk(vma, pvmw->address, size);
 		if (!pvmw->pte)
 			return false;
 
diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index d98564a7be57..cb23f8a15c13 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -305,13 +305,11 @@ static int walk_hugetlb_range(unsigned long addr, unsigned long end,
 	hugetlb_vma_lock_read(vma);
 	do {
 		next = hugetlb_entry_end(h, addr, end);
-		pte = huge_pte_offset(walk->mm, addr & hmask, sz);
-
+		pte = hugetlb_walk(vma, addr & hmask, sz);
 		if (pte)
 			err = ops->hugetlb_entry(pte, hmask, addr, next, walk);
 		else if (ops->pte_hole)
 			err = ops->pte_hole(addr, next, -1, walk);
-
 		if (err)
 			break;
 	} while (addr = next, addr != end);