From patchwork Thu Apr 10 00:00:21 2025
X-Patchwork-Submitter: SeongJae Park
X-Patchwork-Id: 14045659
From: SeongJae Park
To: Andrew Morton
Cc: SeongJae Park, "Liam R. Howlett", David Hildenbrand, Lorenzo Stoakes,
    Rik van Riel, Shakeel Butt, Vlastimil Babka, kernel-team@meta.com,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v3 3/4] mm/memory: split non-tlb flushing part from
 zap_page_range_single()
Date: Wed, 9 Apr 2025 17:00:21 -0700
Message-Id: <20250410000022.1901-4-sj@kernel.org>
In-Reply-To: <20250410000022.1901-1-sj@kernel.org>
References: <20250410000022.1901-1-sj@kernel.org>

Some zap_page_range_single() callers, such as [process_]madvise() with
MADV_DONTNEED[_LOCKED], cannot batch tlb flushes because
zap_page_range_single() flushes the tlb on every invocation.  Split out
the body of zap_page_range_single(), except for the mmu_gather object
initialization and the flushing of the gathered tlb entries, so that
such callers can do batched tlb flushing.

To avoid hugetlb page allocation failures from concurrent page faults,
though, the tlb flush should be done before the hugetlb fault path is
unblocked again.  For the hugetlb vma case, do the flush and the
unblocking in that order inside the split-out function.  Refer to
commit 2820b0f09be9 ("hugetlbfs: close race between MADV_DONTNEED and
page fault") for more details of the page allocation failure problem
under concurrent faults.
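For illustration, a batching caller could look roughly like the sketch
below.  This is not part of the patch: the helper is still static to
mm/memory.c here, and the function name zap_ranges_batched() and its
parameters are hypothetical.  It only shows the intended pattern of one
mmu_gather spanning multiple zaps of the same mm:

	/*
	 * Hypothetical sketch of a batching caller, assuming
	 * zap_page_range_single_batched() were visible to it.  All vmas
	 * must belong to the same mm, since one mmu_gather covers them.
	 */
	static void zap_ranges_batched(struct vm_area_struct **vmas,
			unsigned long *addrs, unsigned long *sizes,
			int nr_ranges)
	{
		struct mmu_gather tlb;
		int i;

		tlb_gather_mmu(&tlb, vmas[0]->vm_mm);
		for (i = 0; i < nr_ranges; i++)
			/* gathers entries; flushes early only for hugetlb */
			zap_page_range_single_batched(&tlb, vmas[i],
					addrs[i], sizes[i], NULL);
		/* single tlb flush for all the non-hugetlb zaps above */
		tlb_finish_mmu(&tlb);
	}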
Signed-off-by: SeongJae Park
Reviewed-by: Lorenzo Stoakes
---
 mm/memory.c | 49 +++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 39 insertions(+), 10 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index fda6d6429a27..690695643dfb 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1998,36 +1998,65 @@ void unmap_vmas(struct mmu_gather *tlb, struct ma_state *mas,
 	mmu_notifier_invalidate_range_end(&range);
 }
 
-/**
- * zap_page_range_single - remove user pages in a given range
+/*
+ * zap_page_range_single_batched - remove user pages in a given range
+ * @tlb: pointer to the caller's struct mmu_gather
  * @vma: vm_area_struct holding the applicable pages
- * @address: starting address of pages to zap
- * @size: number of bytes to zap
+ * @address: starting address of pages to remove
+ * @size: number of bytes to remove
  * @details: details of shared cache invalidation
  *
- * The range must fit into one VMA.
+ * @tlb shouldn't be NULL.  The range must fit into one VMA.  If @vma is for
+ * hugetlb, @tlb is flushed and re-initialized by this function.
  */
-void zap_page_range_single(struct vm_area_struct *vma, unsigned long address,
+static void zap_page_range_single_batched(struct mmu_gather *tlb,
+		struct vm_area_struct *vma, unsigned long address,
 		unsigned long size, struct zap_details *details)
 {
 	const unsigned long end = address + size;
 	struct mmu_notifier_range range;
-	struct mmu_gather tlb;
+
+	VM_WARN_ON_ONCE(!tlb || tlb->mm != vma->vm_mm);
 
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma->vm_mm,
 				address, end);
 	hugetlb_zap_begin(vma, &range.start, &range.end);
-	tlb_gather_mmu(&tlb, vma->vm_mm);
 	update_hiwater_rss(vma->vm_mm);
 	mmu_notifier_invalidate_range_start(&range);
 	/*
 	 * unmap 'address-end' not 'range.start-range.end' as range
 	 * could have been expanded for hugetlb pmd sharing.
 	 */
-	unmap_single_vma(&tlb, vma, address, end, details, false);
+	unmap_single_vma(tlb, vma, address, end, details, false);
 	mmu_notifier_invalidate_range_end(&range);
+	if (is_vm_hugetlb_page(vma)) {
+		/*
+		 * flush tlb and free resources before hugetlb_zap_end(), to
+		 * avoid concurrent page faults' allocation failure.
+		 */
+		tlb_finish_mmu(tlb);
+		hugetlb_zap_end(vma, details);
+		tlb_gather_mmu(tlb, vma->vm_mm);
+	}
+}
+
+/**
+ * zap_page_range_single - remove user pages in a given range
+ * @vma: vm_area_struct holding the applicable pages
+ * @address: starting address of pages to zap
+ * @size: number of bytes to zap
+ * @details: details of shared cache invalidation
+ *
+ * The range must fit into one VMA.
+ */
+void zap_page_range_single(struct vm_area_struct *vma, unsigned long address,
+		unsigned long size, struct zap_details *details)
+{
+	struct mmu_gather tlb;
+
+	tlb_gather_mmu(&tlb, vma->vm_mm);
+	zap_page_range_single_batched(&tlb, vma, address, size, details);
 	tlb_finish_mmu(&tlb);
-	hugetlb_zap_end(vma, details);
 }
 
 /**