From patchwork Tue Mar 21 19:18:40 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13183115
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Andrea Arcangeli, Andrew Morton, peterx@redhat.com, Axel Rasmussen,
 Mike Rapoport, Nadav Amit, David Hildenbrand, Muhammad Usama Anjum,
 linux-stable
Subject: [PATCH] mm/hugetlb: Fix uffd wr-protection for CoW optimization path
Date: Tue, 21 Mar 2023 15:18:40 -0400
Message-Id: <20230321191840.1897940-1-peterx@redhat.com>
X-Mailer: git-send-email 2.39.1

This patch fixes an issue where a hugetlb uffd-wr-protected mapping can be
written to even with the uffd-wp bit set.  It only happens when all of these
conditions are met: (1) hugetlb memory, (2) private mapping, (3) the original
mapping was missing, and then (4) it was wr-protected (IOW, a pte marker was
installed).  A write to the page then triggers the issue.

The userfaultfd-wp trap for hugetlb was implemented in hugetlb_fault(),
before even reaching hugetlb_wp(), to avoid taking more locks that userfault
won't need.  However, there is one CoW optimization path for a missing
hugetlb page that can trigger hugetlb_wp() inside hugetlb_no_page(), and
that path can bypass the userfaultfd-wp traps.

A few ways to resolve this:

  (1) Skip the CoW optimization for hugetlb private mappings, considering
      that private mappings for hugetlb should be very rare, so it may not
      really be helpful to major workloads.  The worst case is we only skip
      the optimization if userfaultfd_wp(vma)==true, because uffd-wp needs
      another fault anyway.

  (2) Move the userfaultfd-wp handling for hugetlb from hugetlb_fault()
      into hugetlb_wp().
      The major con is that a bunch of locks are taken when calling
      hugetlb_wp(), and that will make the changeset unnecessarily
      complicated due to the lock operations.

  (3) Carry over the uffd-wp bit in hugetlb_wp(), so it'll need to fault
      again for uffd-wp privately mapped pages.

This patch chooses option (3), which contains the minimum changeset
(simplest for backport) and also makes sure hugetlb_wp() itself will always
be safe with uffd-wp ptes, even if called elsewhere in the future.

This patch is needed for v5.19+, hence the copy to stable.

Reported-by: Muhammad Usama Anjum
Cc: linux-stable
Fixes: 166f3ecc0daf ("mm/hugetlb: hook page faults for uffd write protection")
Signed-off-by: Peter Xu
Acked-by: David Hildenbrand
Reviewed-by: Mike Kravetz
Tested-by: Muhammad Usama Anjum
---
 mm/hugetlb.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 8bfd07f4c143..22337b191eae 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5478,7 +5478,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 		       struct folio *pagecache_folio, spinlock_t *ptl)
 {
 	const bool unshare = flags & FAULT_FLAG_UNSHARE;
-	pte_t pte;
+	pte_t pte, newpte;
 	struct hstate *h = hstate_vma(vma);
 	struct page *old_page;
 	struct folio *new_folio;
@@ -5622,8 +5622,10 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 		mmu_notifier_invalidate_range(mm, range.start, range.end);
 		page_remove_rmap(old_page, vma, true);
 		hugepage_add_new_anon_rmap(new_folio, vma, haddr);
-		set_huge_pte_at(mm, haddr, ptep,
-				make_huge_pte(vma, &new_folio->page, !unshare));
+		newpte = make_huge_pte(vma, &new_folio->page, !unshare);
+		if (huge_pte_uffd_wp(pte))
+			newpte = huge_pte_mkuffd_wp(newpte);
+		set_huge_pte_at(mm, haddr, ptep, newpte);
 		folio_set_hugetlb_migratable(new_folio);
 		/* Make the old page be freed below */
 		new_folio = page_folio(old_page);
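
For readers who want to see the effect of option (3) without a kernel tree,
the following is a minimal userspace sketch of the bit carry-over.  It is a
toy model, not kernel code: toy_pte_t, the TOY_PTE_* bit positions and every
toy_* helper are hypothetical names invented for this illustration, and the
assumption that marking an entry uffd-wp also clears its write bit is part of
the model rather than something taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy page table entry: a plain bitmask.  Bit positions are made up. */
typedef uint64_t toy_pte_t;

#define TOY_PTE_PRESENT	(1ULL << 0)
#define TOY_PTE_WRITE	(1ULL << 1)
#define TOY_PTE_UFFD_WP	(1ULL << 2)

/* Build a fresh present entry, optionally writable. */
static inline toy_pte_t toy_make_pte(bool writable)
{
	toy_pte_t pte = TOY_PTE_PRESENT;

	if (writable)
		pte |= TOY_PTE_WRITE;
	return pte;
}

/* Does the entry carry the uffd-wp marker? */
static inline bool toy_pte_uffd_wp(toy_pte_t pte)
{
	return (pte & TOY_PTE_UFFD_WP) != 0;
}

/*
 * Set the uffd-wp marker.  Model assumption: the write bit is cleared too,
 * so the next write faults again and can be trapped.
 */
static inline toy_pte_t toy_pte_mkuffd_wp(toy_pte_t pte)
{
	return (pte | TOY_PTE_UFFD_WP) & ~TOY_PTE_WRITE;
}

/* Old behaviour in the model: the replacement entry is built from scratch. */
static inline toy_pte_t toy_cow_replace_old(toy_pte_t old, bool unshare)
{
	assert(old & TOY_PTE_PRESENT);
	return toy_make_pte(!unshare);
}

/* New behaviour in the model: the marker on the old entry is carried over. */
static inline toy_pte_t toy_cow_replace_fixed(toy_pte_t old, bool unshare)
{
	toy_pte_t newpte;

	assert(old & TOY_PTE_PRESENT);
	newpte = toy_make_pte(!unshare);
	if (toy_pte_uffd_wp(old))
		newpte = toy_pte_mkuffd_wp(newpte);
	return newpte;
}
```

In this model, replacing a uffd-wp entry with toy_cow_replace_old() yields a
writable entry with the marker lost, while toy_cow_replace_fixed() yields an
entry that is still marked and not writable, so a later write would fault
again.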