Message ID | 2e391fa6c6f9b3fcf1b41cefbace02ee4ab4bf59.1715750938.git.namcao@linutronix.de (mailing list archive) |
---|---|
State | Accepted |
Commit | c67ddf59ac44adc60649730bf8347e37c516b001 |
Series | riscv: fix debug_pagealloc |
Hi Nam,

On 15/05/2024 07:50, Nam Cao wrote:
> debug_pagealloc is a debug feature which clears the valid bit in page table
> entry for freed pages to detect illegal accesses to freed memory.
>
> For this feature to work, virtual mapping must have PAGE_SIZE resolution.
> (No, we cannot map with huge pages and split them only when needed; because
> pages can be allocated/freed in atomic context and page splitting cannot be
> done in atomic context)
>
> Force linear mapping to use small pages if debug_pagealloc is enabled.
>
> Note that it is not necessary to force the entire linear mapping, but only
> those that are given to memory allocator. Some parts of memory can keep
> using huge page mapping (for example, kernel's executable code). But these
> parts are minority, so keep it simple. This is just a debug feature, some
> extra overhead should be acceptable.
>
> Fixes: 5fde3db5eb02 ("riscv: add ARCH_SUPPORTS_DEBUG_PAGEALLOC support")
> Signed-off-by: Nam Cao <namcao@linutronix.de>
> Cc: stable@vger.kernel.org
> ---
> Interestingly this feature somehow still worked when first introduced.
> My guess is that back then only 2MB page size is used. When a 4KB page is
> freed, the entire 2MB will be (incorrectly) invalidated by this feature.
> But 2MB is quite small, so no one else happen to use other 4KB pages in
> this 2MB area. In other words, it used to work by luck.
>
> Now larger page sizes are used, so this feature invalidate large chunk of
> memory, and the probability that someone else access this chunk and
> trigger a page fault is much higher.
>
> arch/riscv/mm/init.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 2574f6a3b0e7..73914afa3aba 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -682,6 +682,9 @@ void __init create_pgd_mapping(pgd_t *pgdp,
>  static uintptr_t __init best_map_size(phys_addr_t pa, uintptr_t va,
>  				      phys_addr_t size)
>  {
> +	if (debug_pagealloc_enabled())
> +		return PAGE_SIZE;
> +
>  	if (pgtable_l5_enabled &&
>  	    !(pa & (P4D_SIZE - 1)) && !(va & (P4D_SIZE - 1)) && size >= P4D_SIZE)
>  		return P4D_SIZE;

You can add:

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>

Thanks,

Alex
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 2574f6a3b0e7..73914afa3aba 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -682,6 +682,9 @@ void __init create_pgd_mapping(pgd_t *pgdp,
 static uintptr_t __init best_map_size(phys_addr_t pa, uintptr_t va,
 				      phys_addr_t size)
 {
+	if (debug_pagealloc_enabled())
+		return PAGE_SIZE;
+
 	if (pgtable_l5_enabled &&
 	    !(pa & (P4D_SIZE - 1)) && !(va & (P4D_SIZE - 1)) && size >= P4D_SIZE)
 		return P4D_SIZE;
debug_pagealloc is a debug feature which clears the valid bit in page table
entry for freed pages to detect illegal accesses to freed memory.

For this feature to work, virtual mapping must have PAGE_SIZE resolution.
(No, we cannot map with huge pages and split them only when needed; because
pages can be allocated/freed in atomic context and page splitting cannot be
done in atomic context)

Force linear mapping to use small pages if debug_pagealloc is enabled.

Note that it is not necessary to force the entire linear mapping, but only
those that are given to memory allocator. Some parts of memory can keep
using huge page mapping (for example, kernel's executable code). But these
parts are minority, so keep it simple. This is just a debug feature, some
extra overhead should be acceptable.

Fixes: 5fde3db5eb02 ("riscv: add ARCH_SUPPORTS_DEBUG_PAGEALLOC support")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
---
Interestingly this feature somehow still worked when first introduced.
My guess is that back then only 2MB page size is used. When a 4KB page is
freed, the entire 2MB will be (incorrectly) invalidated by this feature.
But 2MB is quite small, so no one else happen to use other 4KB pages in
this 2MB area. In other words, it used to work by luck.

Now larger page sizes are used, so this feature invalidate large chunk of
memory, and the probability that someone else access this chunk and
trigger a page fault is much higher.

 arch/riscv/mm/init.c | 3 +++
 1 file changed, 3 insertions(+)
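
For readers who want to see the size argument in numbers, here is a rough user-space sketch of the mapping-size choice. It is not the kernel code: every `sketch_`/`SKETCH_` name is hypothetical, only two huge-page levels (2 MiB and 1 GiB, assuming a 4 KiB base page) are modelled, and `debug_pagealloc_enabled()` is replaced by a plain boolean parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for PAGE_SIZE, PMD_SIZE and PUD_SIZE,
 * assuming a 4 KiB base page. These are not the kernel macros. */
#define SKETCH_PAGE_SIZE 0x1000ULL      /* 4 KiB */
#define SKETCH_PMD_SIZE  0x200000ULL    /* 2 MiB */
#define SKETCH_PUD_SIZE  0x40000000ULL  /* 1 GiB */

/* Simplified model of best_map_size(): pick the largest mapping size
 * that pa, va and size allow, unless debug_pagealloc forces 4 KiB.
 * The flag is a parameter here; the patch calls debug_pagealloc_enabled(). */
static uint64_t sketch_best_map_size(uint64_t pa, uint64_t va,
                                     uint64_t size, bool debug_pagealloc)
{
    /* Mirrors the early return added by the patch. */
    if (debug_pagealloc)
        return SKETCH_PAGE_SIZE;

    if (!(pa & (SKETCH_PUD_SIZE - 1)) && !(va & (SKETCH_PUD_SIZE - 1)) &&
        size >= SKETCH_PUD_SIZE)
        return SKETCH_PUD_SIZE;

    if (!(pa & (SKETCH_PMD_SIZE - 1)) && !(va & (SKETCH_PMD_SIZE - 1)) &&
        size >= SKETCH_PMD_SIZE)
        return SKETCH_PMD_SIZE;

    return SKETCH_PAGE_SIZE;
}

/* Number of 4 KiB pages covered by one page table entry of map_size.
 * Clearing the valid bit of that entry unmaps all of them at once. */
static uint64_t sketch_pages_per_entry(uint64_t map_size)
{
    return map_size / SKETCH_PAGE_SIZE;
}

/* Sanity check: one 2 MiB entry covers 512 base pages. */
static void sketch_self_check(void)
{
    assert(sketch_pages_per_entry(SKETCH_PMD_SIZE) == 512);
}
```

In this model, freeing one 4 KiB page behind a 2 MiB entry takes 511 unrelated pages with it, and behind a 1 GiB entry 262143, which fits the guess above about why the problem only became visible once larger mappings were used. With the flag set, every entry covers exactly one page.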