Message ID | 20230218211433.26859-12-rick.p.edgecombe@intel.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Shadow stacks for userspace | expand |
On Sat, Feb 18, 2023 at 01:14:03PM -0800, Rick Edgecombe wrote: > The x86 Control-flow Enforcement Technology (CET) feature includes a new > type of memory called shadow stack. This shadow stack memory has some > unusual properties, which requires some core mm changes to function > properly. > > One of these changes is to allow for pte_mkwrite() to create different > types of writable memory (the existing conventionally writable type and > also the new shadow stack type). Future patches will convert pte_mkwrite() > to take a VMA in order to facilitate this, however there are places in the > kernel where pte_mkwrite() is called outside of the context of a VMA. > These are for kernel memory. So create a new variant called > pte_mkwrite_kernel() and switch the kernel users over to it. Have > pte_mkwrite() and pte_mkwrite_kernel() be the same for now. Future patches > will introduce changes to make pte_mkwrite() take a VMA. > > Only do this for architectures that need it because they call pte_mkwrite() > in arch code without an associated VMA. Since it will only currently be > used in arch code, so do not include it in arch_pgtable_helpers.rst. > > Cc: linux-doc@vger.kernel.org > Cc: linux-arm-kernel@lists.infradead.org > Cc: linux-s390@vger.kernel.org > Cc: xen-devel@lists.xenproject.org > Cc: linux-arch@vger.kernel.org > Cc: linux-mm@kvack.org > Tested-by: Pengfei Xu <pengfei.xu@intel.com> > Suggested-by: David Hildenbrand <david@redhat.com> > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> I think it's a little weird that it's the only PTE helper taking a vma, but it does seem like the right approach. Reviewed-by: Kees Cook <keescook@chromium.org>
On 19.02.23 21:38, Kees Cook wrote: > On Sat, Feb 18, 2023 at 01:14:03PM -0800, Rick Edgecombe wrote: >> The x86 Control-flow Enforcement Technology (CET) feature includes a new >> type of memory called shadow stack. This shadow stack memory has some >> unusual properties, which requires some core mm changes to function >> properly. >> >> One of these changes is to allow for pte_mkwrite() to create different >> types of writable memory (the existing conventionally writable type and >> also the new shadow stack type). Future patches will convert pte_mkwrite() >> to take a VMA in order to facilitate this, however there are places in the >> kernel where pte_mkwrite() is called outside of the context of a VMA. >> These are for kernel memory. So create a new variant called >> pte_mkwrite_kernel() and switch the kernel users over to it. Have >> pte_mkwrite() and pte_mkwrite_kernel() be the same for now. Future patches >> will introduce changes to make pte_mkwrite() take a VMA. >> >> Only do this for architectures that need it because they call pte_mkwrite() >> in arch code without an associated VMA. Since it will only currently be >> used in arch code, so do not include it in arch_pgtable_helpers.rst. >> >> Cc: linux-doc@vger.kernel.org >> Cc: linux-arm-kernel@lists.infradead.org >> Cc: linux-s390@vger.kernel.org >> Cc: xen-devel@lists.xenproject.org >> Cc: linux-arch@vger.kernel.org >> Cc: linux-mm@kvack.org >> Tested-by: Pengfei Xu <pengfei.xu@intel.com> >> Suggested-by: David Hildenbrand <david@redhat.com> >> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> > > I think it's a little weird that it's the only PTE helper taking a vma, > but it does seem like the right approach. Right. We could pass the vm flags instead, but not sure if that really improves the situation. So unless someone has a better idea, this LGTM.
On 18.02.23 22:14, Rick Edgecombe wrote: > The x86 Control-flow Enforcement Technology (CET) feature includes a new > type of memory called shadow stack. This shadow stack memory has some > unusual properties, which requires some core mm changes to function > properly. > > One of these changes is to allow for pte_mkwrite() to create different > types of writable memory (the existing conventionally writable type and > also the new shadow stack type). Future patches will convert pte_mkwrite() > to take a VMA in order to facilitate this, however there are places in the > kernel where pte_mkwrite() is called outside of the context of a VMA. > These are for kernel memory. So create a new variant called > pte_mkwrite_kernel() and switch the kernel users over to it. Have > pte_mkwrite() and pte_mkwrite_kernel() be the same for now. Future patches > will introduce changes to make pte_mkwrite() take a VMA. > > Only do this for architectures that need it because they call pte_mkwrite() > in arch code without an associated VMA. Since it will only currently be > used in arch code, so do not include it in arch_pgtable_helpers.rst. > > Cc: linux-doc@vger.kernel.org > Cc: linux-arm-kernel@lists.infradead.org > Cc: linux-s390@vger.kernel.org > Cc: xen-devel@lists.xenproject.org > Cc: linux-arch@vger.kernel.org > Cc: linux-mm@kvack.org > Tested-by: Pengfei Xu <pengfei.xu@intel.com> > Suggested-by: David Hildenbrand <david@redhat.com> > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> > Acked-by: David Hildenbrand <david@redhat.com> Do we also have to care about pmd_mkwrite() ?
On Sat, Feb 18, 2023 at 01:14:03PM -0800, Rick Edgecombe wrote: >The x86 Control-flow Enforcement Technology (CET) feature includes a new >type of memory called shadow stack. This shadow stack memory has some >unusual properties, which requires some core mm changes to function >properly. > >One of these changes is to allow for pte_mkwrite() to create different >types of writable memory (the existing conventionally writable type and >also the new shadow stack type). Future patches will convert pte_mkwrite() >to take a VMA in order to facilitate this, however there are places in the >kernel where pte_mkwrite() is called outside of the context of a VMA. >These are for kernel memory. So create a new variant called >pte_mkwrite_kernel() and switch the kernel users over to it. Have >pte_mkwrite() and pte_mkwrite_kernel() be the same for now. Future patches >will introduce changes to make pte_mkwrite() take a VMA. > >Only do this for architectures that need it because they call pte_mkwrite() >in arch code without an associated VMA. Since it will only currently be >used in arch code, so do not include it in arch_pgtable_helpers.rst. > >Cc: linux-doc@vger.kernel.org >Cc: linux-arm-kernel@lists.infradead.org >Cc: linux-s390@vger.kernel.org >Cc: xen-devel@lists.xenproject.org >Cc: linux-arch@vger.kernel.org >Cc: linux-mm@kvack.org >Tested-by: Pengfei Xu <pengfei.xu@intel.com> >Suggested-by: David Hildenbrand <david@redhat.com> >Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> > Acked-by: Deepak Gupta <debug@rivosinc.com>
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 65e78999c75d..ed555f947697 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -180,13 +180,18 @@ static inline pmd_t set_pmd_bit(pmd_t pmd, pgprot_t prot) return pmd; } -static inline pte_t pte_mkwrite(pte_t pte) +static inline pte_t pte_mkwrite_kernel(pte_t pte) { pte = set_pte_bit(pte, __pgprot(PTE_WRITE)); pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY)); return pte; } +static inline pte_t pte_mkwrite(pte_t pte) +{ + return pte_mkwrite_kernel(pte); +} + static inline pte_t pte_mkclean(pte_t pte) { pte = clear_pte_bit(pte, __pgprot(PTE_DIRTY)); diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c index 4ea2eefbc053..5c07e68d80ea 100644 --- a/arch/arm64/mm/trans_pgd.c +++ b/arch/arm64/mm/trans_pgd.c @@ -40,7 +40,7 @@ static void _copy_pte(pte_t *dst_ptep, pte_t *src_ptep, unsigned long addr) * read only (code, rodata). Clear the RDONLY bit from * the temporary mappings we use during restore. */ - set_pte(dst_ptep, pte_mkwrite(pte)); + set_pte(dst_ptep, pte_mkwrite_kernel(pte)); } else if (debug_pagealloc_enabled() && !pte_none(pte)) { /* * debug_pagealloc will removed the PTE_VALID bit if @@ -53,7 +53,7 @@ static void _copy_pte(pte_t *dst_ptep, pte_t *src_ptep, unsigned long addr) */ BUG_ON(!pfn_valid(pte_pfn(pte))); - set_pte(dst_ptep, pte_mkpresent(pte_mkwrite(pte))); + set_pte(dst_ptep, pte_mkpresent(pte_mkwrite_kernel(pte))); } } diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h index b26cbf1c533c..29522418b5f4 100644 --- a/arch/s390/include/asm/pgtable.h +++ b/arch/s390/include/asm/pgtable.h @@ -991,7 +991,7 @@ static inline pte_t pte_wrprotect(pte_t pte) return set_pte_bit(pte, __pgprot(_PAGE_PROTECT)); } -static inline pte_t pte_mkwrite(pte_t pte) +static inline pte_t pte_mkwrite_kernel(pte_t pte) { pte = set_pte_bit(pte, __pgprot(_PAGE_WRITE)); if (pte_val(pte) & _PAGE_DIRTY) @@ -999,6 +999,11 @@ static inline pte_t pte_mkwrite(pte_t pte) return pte; } +static inline pte_t pte_mkwrite(pte_t pte) +{ + return pte_mkwrite_kernel(pte); +} + static inline pte_t pte_mkclean(pte_t pte) { pte = clear_pte_bit(pte, __pgprot(_PAGE_DIRTY)); diff --git a/arch/s390/mm/pageattr.c b/arch/s390/mm/pageattr.c index 85195c18b2e8..4ee5fe5caa23 100644 --- a/arch/s390/mm/pageattr.c +++ b/arch/s390/mm/pageattr.c @@ -96,7 +96,7 @@ static int walk_pte_level(pmd_t *pmdp, unsigned long addr, unsigned long end, if (flags & SET_MEMORY_RO) new = pte_wrprotect(new); else if (flags & SET_MEMORY_RW) - new = pte_mkwrite(pte_mkdirty(new)); + new = pte_mkwrite_kernel(pte_mkdirty(new)); if (flags & SET_MEMORY_NX) new = set_pte_bit(new, __pgprot(_PAGE_NOEXEC)); else if (flags & SET_MEMORY_X) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index b39f16c0d507..4f9fddcff2b9 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -364,11 +364,16 @@ static inline pte_t pte_mkyoung(pte_t pte) return pte_set_flags(pte, _PAGE_ACCESSED); } -static inline pte_t pte_mkwrite(pte_t pte) +static inline pte_t pte_mkwrite_kernel(pte_t pte) { return pte_set_flags(pte, _PAGE_RW); } +static inline pte_t pte_mkwrite(pte_t pte) +{ + return pte_mkwrite_kernel(pte); +} + static inline pte_t pte_mkhuge(pte_t pte) { return pte_set_flags(pte, _PAGE_PSE); diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c index ee29fb558f2e..a23f04243c19 100644 --- a/arch/x86/xen/mmu_pv.c +++ b/arch/x86/xen/mmu_pv.c @@ -150,7 +150,7 @@ void make_lowmem_page_readwrite(void *vaddr) if (pte == NULL) return; /* vaddr missing */ - ptev = pte_mkwrite(*pte); + ptev = pte_mkwrite_kernel(*pte); if (HYPERVISOR_update_va_mapping(address, ptev, 0)) BUG();