From patchwork Wed Apr 7 01:44:49 2021
X-Patchwork-Submitter: Michel Lespinasse
X-Patchwork-Id: 12186489
From: Michel Lespinasse
To: Linux-MM
Cc: Laurent Dufour, Peter Zijlstra, Michal Hocko, Matthew Wilcox,
 Rik van Riel, Paul McKenney, Andrew Morton, Suren Baghdasaryan,
 Joel Fernandes, Rom Lemarchand, Linux-Kernel, Michel Lespinasse
Subject: [RFC PATCH 24/37] mm: implement speculative handling in __do_fault()
Date: Tue, 6 Apr 2021 18:44:49 -0700
Message-Id: <20210407014502.24091-25-michel@lespinasse.org>
In-Reply-To: <20210407014502.24091-1-michel@lespinasse.org>
References: <20210407014502.24091-1-michel@lespinasse.org>

In the speculative case, call the vm_ops->fault() method from within
an rcu read locked section, and verify the mmap sequence lock at the
start of the section. A match guarantees that the original vma is still
valid at that time, and that the associated vma->vm_file stays valid
while the vm_ops->fault() method is running.

Note that this implies that speculative faults can not sleep within
the vm_ops->fault method. We will only attempt to fetch existing pages
from the page cache during speculative faults; any miss (or prefetch)
will be handled by falling back to non-speculative fault handling.

The speculative handling case also does not preallocate page tables,
as it is always called with a pre-existing page table.

Signed-off-by: Michel Lespinasse
---
 mm/memory.c | 63 +++++++++++++++++++++++++++++++++++------------------
 1 file changed, 42 insertions(+), 21 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 6eddd7b4e89c..7139004c624d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3709,29 +3709,50 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
 	struct vm_area_struct *vma = vmf->vma;
 	vm_fault_t ret;
 
-	/*
-	 * Preallocate pte before we take page_lock because this might lead to
-	 * deadlocks for memcg reclaim which waits for pages under writeback:
-	 *				lock_page(A)
-	 *				SetPageWriteback(A)
-	 *				unlock_page(A)
-	 * lock_page(B)
-	 *				lock_page(B)
-	 * pte_alloc_one
-	 *   shrink_page_list
-	 *     wait_on_page_writeback(A)
-	 *				SetPageWriteback(B)
-	 *				unlock_page(B)
-	 *				# flush A, B to clear the writeback
-	 */
-	if (pmd_none(*vmf->pmd) && !vmf->prealloc_pte) {
-		vmf->prealloc_pte = pte_alloc_one(vma->vm_mm);
-		if (!vmf->prealloc_pte)
-			return VM_FAULT_OOM;
-		smp_wmb(); /* See comment in __pte_alloc() */
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+	if (vmf->flags & FAULT_FLAG_SPECULATIVE) {
+		rcu_read_lock();
+		if (!mmap_seq_read_check(vmf->vma->vm_mm, vmf->seq)) {
+			ret = VM_FAULT_RETRY;
+		} else {
+			/*
+			 * The mmap sequence count check guarantees that the
+			 * vma we fetched at the start of the fault was still
+			 * current at that point in time. The rcu read lock
+			 * ensures vmf->vma->vm_file stays valid.
+			 */
+			ret = vma->vm_ops->fault(vmf);
+		}
+		rcu_read_unlock();
+	} else
+#endif
+	{
+		/*
+		 * Preallocate pte before we take page_lock because
+		 * this might lead to deadlocks for memcg reclaim
+		 * which waits for pages under writeback:
+		 *				lock_page(A)
+		 *				SetPageWriteback(A)
+		 *				unlock_page(A)
+		 * lock_page(B)
+		 *				lock_page(B)
+		 * pte_alloc_one
+		 *   shrink_page_list
+		 *     wait_on_page_writeback(A)
+		 *				SetPageWriteback(B)
+		 *				unlock_page(B)
+		 *				# flush A, B to clear writeback
+		 */
+		if (pmd_none(*vmf->pmd) && !vmf->prealloc_pte) {
+			vmf->prealloc_pte = pte_alloc_one(vma->vm_mm);
+			if (!vmf->prealloc_pte)
+				return VM_FAULT_OOM;
+			smp_wmb(); /* See comment in __pte_alloc() */
+		}
+
+		ret = vma->vm_ops->fault(vmf);
 	}
 
-	ret = vma->vm_ops->fault(vmf);
 	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY |
 			    VM_FAULT_DONE_COW)))
 		return ret;
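
For readers who want to see the control flow outside the kernel, here is a
rough userspace sketch of what the commit message describes: check the
sequence count first, call the fault handler only on a match, and fall back
to the non-speculative path otherwise. Everything in it is a hypothetical
stand-in (struct mm_model, struct fault_model, seq_unchanged(),
fault_handler(), do_fault_model(), handle_fault()); none of these are kernel
APIs. It is single-threaded, so the rcu read lock and the mmap lock taken by
the real fallback path appear only as comments. Having the handler signal a
page cache miss by returning a retry code is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; these are not kernel structures. */
enum fault_result { FAULT_OK, FAULT_RETRY };

struct mm_model {
	unsigned long seq;	/* bumped whenever the address space changes */
	bool page_cached;	/* is the wanted page already in the page cache? */
};

struct fault_model {
	struct mm_model *mm;
	unsigned long seq;	/* count sampled when the vma was looked up */
	bool speculative;
};

/* Stand-in for the sequence check: true if no writer ran since the sample. */
static bool seq_unchanged(const struct mm_model *mm, unsigned long seq)
{
	return mm->seq == seq;
}

/* Stand-in for a ->fault() handler that must not sleep when speculative. */
static enum fault_result fault_handler(const struct fault_model *vmf)
{
	if (vmf->mm->page_cached)
		return FAULT_OK;
	/* Miss: the speculative path may not block on I/O, so ask for a retry. */
	return vmf->speculative ? FAULT_RETRY : FAULT_OK;
}

/* Mirrors the shape of the patched __do_fault(): check, then call. */
static enum fault_result do_fault_model(const struct fault_model *vmf)
{
	enum fault_result ret;

	assert(vmf && vmf->mm);
	if (vmf->speculative) {
		/* the kernel takes the rcu read lock here */
		if (!seq_unchanged(vmf->mm, vmf->seq))
			ret = FAULT_RETRY;
		else
			ret = fault_handler(vmf);
		/* the kernel drops the rcu read lock here */
	} else {
		/* page table preallocation would happen here */
		ret = fault_handler(vmf);
	}
	return ret;
}

/* Try speculatively first, then fall back to the regular path. */
static enum fault_result handle_fault(struct mm_model *mm,
				      unsigned long sampled_seq)
{
	struct fault_model vmf = {
		.mm = mm, .seq = sampled_seq, .speculative = true,
	};
	enum fault_result ret = do_fault_model(&vmf);

	if (ret == FAULT_RETRY) {
		/* the real fallback would take the mmap lock first */
		vmf.speculative = false;
		ret = do_fault_model(&vmf);
	}
	return ret;
}
```

In this model a stale sequence count and a page cache miss both make the
speculative attempt return FAULT_RETRY, and handle_fault() then completes the
fault through the non-speculative branch.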