Message ID | 20240725090345.28461-1-will@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | arm64: mm: Fix lockless walks with static and dynamic page-table folding |
On Thu, 25 Jul 2024 at 11:03, Will Deacon <will@kernel.org> wrote:
>
> Lina reports random oopsen originating from the fast GUP code when
> 16K pages are used with 4-level page-tables, the fourth level being
> folded at runtime due to lack of LPA2.
>
> In this configuration, the generic implementation of
> p4d_offset_lockless() will return a 'p4d_t *' corresponding to the
> 'pgd_t' allocated on the stack of the caller, gup_fast_pgd_range().
> This is normally fine, but when the fourth level of page-table is folded
> at runtime, pud_offset_lockless() will offset from the address of the
> 'p4d_t' to calculate the address of the PUD in the same page-table page.
> This results in a stray stack read when the 'p4d_t' has been allocated
> on the stack and can send the walker into the weeds.
>
> Fix the problem by providing our own definition of p4d_offset_lockless()
> when CONFIG_PGTABLE_LEVELS <= 4 which returns the real page-table
> pointer rather than the address of the local stack variable.
>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Ard Biesheuvel <ardb@kernel.org>
> Cc: <stable@vger.kernel.org>
> Link: https://lore.kernel.org/r/50360968-13fb-4e6f-8f52-1725b3177215@asahilina.net
> Fixes: 0dd4f60a2c76 ("arm64: mm: Add support for folding PUDs at runtime")
> Reported-by: Asahi Lina <lina@asahilina.net>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/pgtable.h | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>

On Thu, 25 Jul 2024 10:03:45 +0100, Will Deacon wrote:
> Lina reports random oopsen originating from the fast GUP code when
> 16K pages are used with 4-level page-tables, the fourth level being
> folded at runtime due to lack of LPA2.
>
> In this configuration, the generic implementation of
> p4d_offset_lockless() will return a 'p4d_t *' corresponding to the
> 'pgd_t' allocated on the stack of the caller, gup_fast_pgd_range().
> This is normally fine, but when the fourth level of page-table is folded
> at runtime, pud_offset_lockless() will offset from the address of the
> 'p4d_t' to calculate the address of the PUD in the same page-table page.
> This results in a stray stack read when the 'p4d_t' has been allocated
> on the stack and can send the walker into the weeds.
>
> [...]

Applied to arm64 (for-next/core), thanks!

[1/1] arm64: mm: Fix lockless walks with static and dynamic page-table folding
      https://git.kernel.org/arm64/c/36639013b346

Cheers,

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index f8efbc128446..7a4f5604be3f 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -1065,6 +1065,28 @@ static inline bool pgtable_l5_enabled(void) { return false; }
 
 #define p4d_offset_kimg(dir,addr)	((p4d_t *)dir)
 
+static inline
+p4d_t *p4d_offset_lockless_folded(pgd_t *pgdp, pgd_t pgd, unsigned long addr)
+{
+	/*
+	 * With runtime folding of the pud, pud_offset_lockless() passes
+	 * the 'pgd_t *' we return here to p4d_to_folded_pud(), which
+	 * will offset the pointer assuming that it points into
+	 * a page-table page. However, the fast GUP path passes us a
+	 * pgd_t allocated on the stack and so we must use the original
+	 * pointer in 'pgdp' to construct the p4d pointer instead of
+	 * using the generic p4d_offset_lockless() implementation.
+	 *
+	 * Note: reusing the original pointer means that we may
+	 * dereference the same (live) page-table entry multiple times.
+	 * This is safe because it is still only loaded once in the
+	 * context of each level and the CPU guarantees same-address
+	 * read-after-read ordering.
+	 */
+	return p4d_offset(pgdp, addr);
+}
+#define p4d_offset_lockless p4d_offset_lockless_folded
+
 #endif  /* CONFIG_PGTABLE_LEVELS > 4 */
 
 #define pgd_ERROR(e)	\

Lina reports random oopsen originating from the fast GUP code when
16K pages are used with 4-level page-tables, the fourth level being
folded at runtime due to lack of LPA2.

In this configuration, the generic implementation of
p4d_offset_lockless() will return a 'p4d_t *' corresponding to the
'pgd_t' allocated on the stack of the caller, gup_fast_pgd_range().
This is normally fine, but when the fourth level of page-table is folded
at runtime, pud_offset_lockless() will offset from the address of the
'p4d_t' to calculate the address of the PUD in the same page-table page.
This results in a stray stack read when the 'p4d_t' has been allocated
on the stack and can send the walker into the weeds.

Fix the problem by providing our own definition of p4d_offset_lockless()
when CONFIG_PGTABLE_LEVELS <= 4 which returns the real page-table
pointer rather than the address of the local stack variable.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/50360968-13fb-4e6f-8f52-1725b3177215@asahilina.net
Fixes: 0dd4f60a2c76 ("arm64: mm: Add support for folding PUDs at runtime")
Reported-by: Asahi Lina <lina@asahilina.net>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/pgtable.h | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)
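
For readers who want to see the failure mode in isolation, the following is a minimal user-space sketch of it, not kernel code. Every name in it (entry_t, ENTRIES, offset_generic, offset_fixed, folded_next_level, walk_stays_in_table) is hypothetical and chosen only for illustration. It assumes that the folded next-level lookup behaves like "align the incoming pointer down to the start of its table, then index", which is how the comment in the patch describes p4d_to_folded_pud(). Under that assumption, handing the lookup the address of a stack copy yields an address on the stack, while handing it the real slot yields an address inside the table. The sketch only computes addresses and never dereferences the stray one.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRIES 8u                      /* hypothetical entries per table */

typedef struct { uint64_t val; } entry_t;

#define TABLE_BYTES (ENTRIES * sizeof(entry_t))

_Static_assert((TABLE_BYTES & (TABLE_BYTES - 1)) == 0,
	       "table size must be a power of two for the mask below");

/* Stand-in for one page-table page, aligned to its own size. */
static _Alignas(TABLE_BYTES) entry_t table[ENTRIES];

/* Models the generic helper: returns the address of the caller's copy. */
static entry_t *offset_generic(entry_t *slot, entry_t *copy)
{
	(void)slot;
	return copy;
}

/* Models the fix: returns the real slot inside the table. */
static entry_t *offset_fixed(entry_t *slot, entry_t *copy)
{
	(void)copy;
	return slot;
}

/*
 * Models a folded next-level lookup: assume 'p' points into a table,
 * align it down to the table base and add the index. The result is
 * returned as an integer so that no invalid pointer is ever formed.
 */
static uintptr_t folded_next_level(const entry_t *p, unsigned int index)
{
	uintptr_t base = (uintptr_t)p & ~((uintptr_t)TABLE_BYTES - 1);

	return base + index * sizeof(entry_t);
}

/* Returns 1 if the computed next-level address lies inside 'table'. */
int walk_stays_in_table(int use_fix, unsigned int index)
{
	entry_t *slot = &table[0];
	entry_t copy = *slot;           /* one-time snapshot on the stack */
	entry_t *next = use_fix ? offset_fixed(slot, &copy)
				: offset_generic(slot, &copy);
	uintptr_t addr, lo = (uintptr_t)table;

	assert(index < ENTRIES);
	addr = folded_next_level(next, index);
	return addr >= lo && addr < lo + TABLE_BYTES;
}
```

Calling walk_stays_in_table(0, 3) models the generic helper and reports that the computed address falls outside the table, whereas walk_stays_in_table(1, 3) models the fix and stays inside it. The snapshot in 'copy' is still taken in both cases; only the pointer handed to the next level differs.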