diff mbox series

[v3,20/37] mm/mprotect: Exclude shadow stack from preserve_write

Message ID 20221104223604.29615-21-rick.p.edgecombe@intel.com (mailing list archive)
State New
Series Shadow stacks for userspace

Commit Message

Rick Edgecombe Nov. 4, 2022, 10:35 p.m. UTC
From: Yu-cheng Yu <yu-cheng.yu@intel.com>

The x86 Control-flow Enforcement Technology (CET) feature includes a new
type of memory called shadow stack. This shadow stack memory has some
unusual properties, which require some core mm changes to function
properly.

In change_pte_range(), when a PTE is changed for prot_numa, _PAGE_RW is
preserved to avoid the additional write fault after the NUMA hinting fault.
However, pte_write() now returns true for both normal writable PTEs and
shadow stack (Write=0, Dirty=1) PTEs. The latter never have _PAGE_RW set,
so there is nothing to preserve for them.

Exclude shadow stack from preserve_write test, and apply the same change to
change_huge_pmd().
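
To make the reasoning above concrete, here is a minimal userspace sketch of
the preserve_write decision. It is not kernel code: the model_* helpers and
MODEL_* constants are hypothetical stand-ins invented for illustration, and
the bit values are assumptions chosen only to make the example
self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for _PAGE_RW, _PAGE_DIRTY and VM_SHADOW_STACK.
 * The values are illustrative assumptions, not taken from kernel headers.
 */
#define MODEL_PAGE_RW          (1u << 1)
#define MODEL_PAGE_DIRTY       (1u << 6)
#define MODEL_VM_SHADOW_STACK  (1u << 0)

/* A shadow stack PTE is encoded as Write=0, Dirty=1. */
static bool model_pte_shstk(uint32_t pte)
{
	return (pte & (MODEL_PAGE_RW | MODEL_PAGE_DIRTY)) == MODEL_PAGE_DIRTY;
}

/* Models a pte_write() that is true for normal writable and shadow stack PTEs. */
static bool model_pte_write(uint32_t pte)
{
	return (pte & MODEL_PAGE_RW) || model_pte_shstk(pte);
}

/*
 * Models the preserve_write test after this patch: only a prot_numa change
 * on a writable PTE outside a shadow stack VMA asks to keep the write bit.
 */
static bool model_preserve_write(bool prot_numa, uint32_t pte, uint32_t vm_flags)
{
	bool preserve_write = prot_numa && model_pte_write(pte);

	/* Shadow stack PTEs have no RW bit, so there is nothing to preserve. */
	if (vm_flags & MODEL_VM_SHADOW_STACK)
		preserve_write = false;

	return preserve_write;
}
```

In this model, model_preserve_write(true, MODEL_PAGE_RW, 0) is true, while
model_preserve_write(true, MODEL_PAGE_DIRTY, MODEL_VM_SHADOW_STACK) is false
even though model_pte_write(MODEL_PAGE_DIRTY) is true.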

Tested-by: Pengfei Xu <pengfei.xu@intel.com>
Tested-by: John Allen <john.allen@amd.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>

---

Yu-cheng v25:
 - Move is_shadow_stack_mapping() to a separate line.

Yu-cheng v24:
 - Change arch_shadow_stack_mapping() to is_shadow_stack_mapping().

 mm/huge_memory.c | 7 +++++++
 mm/mprotect.c    | 7 +++++++
 2 files changed, 14 insertions(+)

Comments

Peter Zijlstra Nov. 15, 2022, 12:05 p.m. UTC | #1
On Fri, Nov 04, 2022 at 03:35:47PM -0700, Rick Edgecombe wrote:
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 73b9b78f8cf4..7643a4db1b50 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1803,6 +1803,13 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  		return 0;
>  
>  	preserve_write = prot_numa && pmd_write(*pmd);
> +
> +	/*
> +	 * Preserve only normal writable huge PMD, but not shadow
> +	 * stack (RW=0, Dirty=1).
> +	 */
> +	if (vma->vm_flags & VM_SHADOW_STACK)
> +		preserve_write = false;
>  	ret = 1;
>  
>  #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index 668bfaa6ed2a..ea82ce5f38fe 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -115,6 +115,13 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
>  			pte_t ptent;
>  			bool preserve_write = prot_numa && pte_write(oldpte);
>  
> +			/*
> +			 * Preserve only normal writable PTE, but not shadow
> +			 * stack (RW=0, Dirty=1).
> +			 */
> +			if (vma->vm_flags & VM_SHADOW_STACK)
> +				preserve_write = false;
> +
>  			/*
>  			 * Avoid trapping faults against the zero or KSM
>  			 * pages. See similar comment in change_huge_pmd.

These comments lack a why component; someone is going to wonder wtf this
code is doing in the near future -- that someone might be you.

Rick Edgecombe Nov. 15, 2022, 8:41 p.m. UTC | #2
On Tue, 2022-11-15 at 13:05 +0100, Peter Zijlstra wrote:
> On Fri, Nov 04, 2022 at 03:35:47PM -0700, Rick Edgecombe wrote:
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 73b9b78f8cf4..7643a4db1b50 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -1803,6 +1803,13 @@ int change_huge_pmd(struct mmu_gather *tlb,
> > struct vm_area_struct *vma,
> >                return 0;
> >   
> >        preserve_write = prot_numa && pmd_write(*pmd);
> > +
> > +     /*
> > +      * Preserve only normal writable huge PMD, but not shadow
> > +      * stack (RW=0, Dirty=1).
> > +      */
> > +     if (vma->vm_flags & VM_SHADOW_STACK)
> > +             preserve_write = false;
> >        ret = 1;
> >   
> >   #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
> > diff --git a/mm/mprotect.c b/mm/mprotect.c
> > index 668bfaa6ed2a..ea82ce5f38fe 100644
> > --- a/mm/mprotect.c
> > +++ b/mm/mprotect.c
> > @@ -115,6 +115,13 @@ static unsigned long change_pte_range(struct
> > mmu_gather *tlb,
> >                        pte_t ptent;
> >                        bool preserve_write = prot_numa &&
> > pte_write(oldpte);
> >   
> > +                     /*
> > +                      * Preserve only normal writable PTE, but not
> > shadow
> > +                      * stack (RW=0, Dirty=1).
> > +                      */
> > +                     if (vma->vm_flags & VM_SHADOW_STACK)
> > +                             preserve_write = false;
> > +
> >                        /*
> >                         * Avoid trapping faults against the zero or
> > KSM
> >                         * pages. See similar comment in
> > change_huge_pmd.
> 
> These comments lack a why component; someone is going to wonder wtf
> this
> code is doing in the near future -- that someone might be you.

Good point, I'll expand it.

Patch

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 73b9b78f8cf4..7643a4db1b50 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1803,6 +1803,13 @@  int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		return 0;
 
 	preserve_write = prot_numa && pmd_write(*pmd);
+
+	/*
+	 * Preserve only normal writable huge PMD, but not shadow
+	 * stack (RW=0, Dirty=1).
+	 */
+	if (vma->vm_flags & VM_SHADOW_STACK)
+		preserve_write = false;
 	ret = 1;
 
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 668bfaa6ed2a..ea82ce5f38fe 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -115,6 +115,13 @@  static unsigned long change_pte_range(struct mmu_gather *tlb,
 			pte_t ptent;
 			bool preserve_write = prot_numa && pte_write(oldpte);
 
+			/*
+			 * Preserve only normal writable PTE, but not shadow
+			 * stack (RW=0, Dirty=1).
+			 */
+			if (vma->vm_flags & VM_SHADOW_STACK)
+				preserve_write = false;
+
 			/*
 			 * Avoid trapping faults against the zero or KSM
 			 * pages. See similar comment in change_huge_pmd.