Message ID | 20230105215025.422635-1-fvdl@google.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm/debug: use valid physical memory for pmd/pud tests | expand |
Hi Frank, Thanks for the patch, in principle this LGTM. Did a quick run on arm64, did not find anything problematic. Although I have some comments below. On 1/6/23 03:20, Frank van der Linden wrote: > The page table debug tests need a physical address to validate > low-level page table manipulation with. The memory at this address > is not actually touched, it just encoded in the page table entries > at various levels during the tests only. > > Since the memory is not used, the code just picks the physical > address of the start_kernel symbol. This value is then truncated > to get a properly aligned address that is to be used for various > tests. Because of the truncation, the address might not actually > exist, or might not describe a complete huge page. That's not a > problem for most tests, but the arch-specific code may check > for attribute validity and consistency. The x86 version of > {pud,pmd}_set_huge actually validates the MTRRs for the PMD/PUD > range. This may fail with an address derived from start_kernel, > depending on where the kernel was loaded and what the physical > memory layout of the system is. This then leads to false negatives > for the {pud,pmd}_set_huge tests. > > Avoid this by finding a properly aligned memory range that exists > and is usable. If such a range is not found, skip the tests that > needed it. > > Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page table helpers") > Cc: Anshuman Khandual <anshuman.khandual@arm.com> > Signed-off-by: Frank van der Linden <fvdl@google.com> > --- > mm/debug_vm_pgtable.c | 70 +++++++++++++++++++++++++++++++++++++------ > 1 file changed, 61 insertions(+), 9 deletions(-) > > diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c > index c631ade3f1d2..e9b52600904a 100644 > --- a/mm/debug_vm_pgtable.c > +++ b/mm/debug_vm_pgtable.c > @@ -15,6 +15,7 @@ > #include <linux/hugetlb.h> > #include <linux/kernel.h> > #include <linux/kconfig.h> > +#include <linux/memblock.h> > #include <linux/mm.h> > #include <linux/mman.h> > #include <linux/mm_types.h> > @@ -80,6 +81,8 @@ struct pgtable_debug_args { > unsigned long pmd_pfn; > unsigned long pte_pfn; > > + phys_addr_t fixed_alignment; > + This should not be a 'phys_addr_t', as it does not really contain a physical address. Alignment value can be captured in 'unsigned long' like other elements. > unsigned long fixed_pgd_pfn; > unsigned long fixed_p4d_pfn; > unsigned long fixed_pud_pfn; > @@ -430,7 +433,8 @@ static void __init pmd_huge_tests(struct pgtable_debug_args *args) > { > pmd_t pmd; > > - if (!arch_vmap_pmd_supported(args->page_prot)) > + if (!arch_vmap_pmd_supported(args->page_prot) || > + args->fixed_alignment < PMD_SIZE) > return; Small nit. Additional line not need for the conditional statement. > > pr_debug("Validating PMD huge\n"); > @@ -449,7 +453,8 @@ static void __init pud_huge_tests(struct pgtable_debug_args *args) > { > pud_t pud; > > - if (!arch_vmap_pud_supported(args->page_prot)) > + if (!arch_vmap_pud_supported(args->page_prot) || > + args->fixed_alignment < PUD_SIZE) > return; Small nit. Additional line not needed for the conditional statement. > > pr_debug("Validating PUD huge\n"); > @@ -1077,11 +1082,41 @@ debug_vm_pgtable_alloc_huge_page(struct pgtable_debug_args *args, int order) > return page; > } > > +/* > + * Check if a physical memory range described by <pstart, pend> contains > + * an area that is of size psize, and aligned to the same. > + * > + * Don't use address 0, and check for overflow. > + */ > +static int __init phys_align_check(phys_addr_t pstart, > + phys_addr_t pend, phys_addr_t psize, phys_addr_t *physp, > + phys_addr_t *alignp) > +{ > + phys_addr_t aligned_start, aligned_end; > + > + if (pstart == 0) > + pstart = PAGE_SIZE; Why ? > + > + aligned_start = ALIGN(pstart, psize); > + aligned_end = aligned_start + psize; > + > + if (aligned_end > aligned_start && aligned_end <= pend) { > + *alignp = psize; > + *physp = aligned_start; > + return 1; > + } > + > + return 0; > +} To be more clear, this function should return a 'bool' instead > + > + > static int __init init_args(struct pgtable_debug_args *args) > { > struct page *page = NULL; > phys_addr_t phys; > int ret = 0; > + u64 idx; > + phys_addr_t pstart, pend; This declaration can be merged into the previous line containing 'phys'. > > /* > * Initialize the debugging data. > @@ -1161,15 +1196,32 @@ static int __init init_args(struct pgtable_debug_args *args) > WARN_ON(!args->start_ptep); > > /* > - * PFN for mapping at PTE level is determined from a standard kernel > - * text symbol. But pfns for higher page table levels are derived by > - * masking lower bits of this real pfn. These derived pfns might not > - * exist on the platform but that does not really matter as pfn_pxx() > - * helpers will still create appropriate entries for the test. This > - * helps avoid large memory block allocations to be used for mapping > - * at higher page table levels in some of the tests. > + * Find a valid physical range, preferably aligned to PUD_SIZE. > + * Return the address and the alignment. It doesn't need to be > + * allocated, it just needs to exist as usable memory. The memory > + * won't be touched. > + * > + * The alignment is recorded, and can be checked to see if we > + * can run the tests that require and actual valid physical s/and/an ? > + * address range on some architectures ({pmd,pud}_huge_test > + * on x86). > */ > + > phys = __pa_symbol(&start_kernel); This original 'phys' will still be used as fallback, in case the below attempt does not find a physical address with required alignments i.e [PUD|PMD]_SIZE ? > + args->fixed_alignment = PAGE_SIZE; > + > + for_each_mem_range(idx, &pstart, &pend) { > + if (phys_align_check(pstart, pend, PUD_SIZE, &phys, > + &args->fixed_alignment)) > + break; > + > + if (args->fixed_alignment >= PMD_SIZE) > + continue; > + > + (void)phys_align_check(pstart, pend, PMD_SIZE, &phys, > + &args->fixed_alignment); (void) ? Why not check the return value here ? > + } > + > args->fixed_pgd_pfn = __phys_to_pfn(phys & PGDIR_MASK); > args->fixed_p4d_pfn = __phys_to_pfn(phys & P4D_MASK); > args->fixed_pud_pfn = __phys_to_pfn(phys & PUD_MASK); This loops attempts to find a PUD_SIZE aligned address but breaks out in case it atleast finds a PMD_SIZE aligned address, while looping through available memory ranges. The entire process of finding 'phys' and 'args->fixed_alignment' should be encapsulated inside a helper that also updates 'args->fixed_pxx_pfn' elements. - Anshuman
Hi Anshuman, thanks for looking at this. On Thu, Jan 5, 2023 at 8:24 PM Anshuman Khandual <anshuman.khandual@arm.com> wrote: > > Hi Frank, > > Thanks for the patch, in principle this LGTM. Did a quick run on arm64, > did not find anything problematic. Although I have some comments below. > [...] > > diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c > > index c631ade3f1d2..e9b52600904a 100644 > > --- a/mm/debug_vm_pgtable.c > > +++ b/mm/debug_vm_pgtable.c > > @@ -15,6 +15,7 @@ > > #include <linux/hugetlb.h> > > #include <linux/kernel.h> > > #include <linux/kconfig.h> > > +#include <linux/memblock.h> > > #include <linux/mm.h> > > #include <linux/mman.h> > > #include <linux/mm_types.h> > > @@ -80,6 +81,8 @@ struct pgtable_debug_args { > > unsigned long pmd_pfn; > > unsigned long pte_pfn; > > > > + phys_addr_t fixed_alignment; > > + > > This should not be a 'phys_addr_t', as it does not really contain a > physical address. Alignment value can be captured in 'unsigned long' > like other elements. True, yep. > > > unsigned long fixed_pgd_pfn; > > unsigned long fixed_p4d_pfn; > > unsigned long fixed_pud_pfn; > > @@ -430,7 +433,8 @@ static void __init pmd_huge_tests(struct pgtable_debug_args *args) > > { > > pmd_t pmd; > > > > - if (!arch_vmap_pmd_supported(args->page_prot)) > > + if (!arch_vmap_pmd_supported(args->page_prot) || > > + args->fixed_alignment < PMD_SIZE) > > return; > > Small nit. Additional line not need for the conditional statement. > You mean the line break in the condition? Not breaking it would push it to 90 characters (if tab=8). Most of this file, except for a few lines, does stick to 80. I don't feel particularly strongly about this either way, though :) > > > > > pr_debug("Validating PMD huge\n"); > > @@ -449,7 +453,8 @@ static void __init pud_huge_tests(struct pgtable_debug_args *args) > > { > > pud_t pud; > > > > - if (!arch_vmap_pud_supported(args->page_prot)) > > + if (!arch_vmap_pud_supported(args->page_prot) || > > + args->fixed_alignment < PUD_SIZE) > > return; > Small nit. Additional line not needed for the conditional statement. See above. > > > > > pr_debug("Validating PUD huge\n"); > > @@ -1077,11 +1082,41 @@ debug_vm_pgtable_alloc_huge_page(struct pgtable_debug_args *args, int order) > > return page; > > } > > > > +/* > > + * Check if a physical memory range described by <pstart, pend> contains > > + * an area that is of size psize, and aligned to the same. > > + * > > + * Don't use address 0, and check for overflow. > > + */ > > +static int __init phys_align_check(phys_addr_t pstart, > > + phys_addr_t pend, phys_addr_t psize, phys_addr_t *physp, > > + phys_addr_t *alignp) > > +{ > > + phys_addr_t aligned_start, aligned_end; > > + > > + if (pstart == 0) > > + pstart = PAGE_SIZE; > > Why ? Since the physical address will be used for page table tests, I think that avoiding 0 is probably a good idea. If e.g. a masking mistake crept into the code somewhere, using physical address 0 might not find it. Also, physical address 0 isn't used on x86. > > > + > > + aligned_start = ALIGN(pstart, psize); > > + aligned_end = aligned_start + psize; > > + > > + if (aligned_end > aligned_start && aligned_end <= pend) { > > + *alignp = psize; > > + *physp = aligned_start; > > + return 1; > > + } > > + > > + return 0; > > +} > > To be more clear, this function should return a 'bool' instead That would be better, yes. > > > + > > + > > static int __init init_args(struct pgtable_debug_args *args) > > { > > struct page *page = NULL; > > phys_addr_t phys; > > int ret = 0; > > + u64 idx; > > + phys_addr_t pstart, pend; > > This declaration can be merged into the previous line containing 'phys'. Sure, yes. > > > > > /* > > * Initialize the debugging data. > > @@ -1161,15 +1196,32 @@ static int __init init_args(struct pgtable_debug_args *args) > > WARN_ON(!args->start_ptep); > > > > /* > > - * PFN for mapping at PTE level is determined from a standard kernel > > - * text symbol. But pfns for higher page table levels are derived by > > - * masking lower bits of this real pfn. These derived pfns might not > > - * exist on the platform but that does not really matter as pfn_pxx() > > - * helpers will still create appropriate entries for the test. This > > - * helps avoid large memory block allocations to be used for mapping > > - * at higher page table levels in some of the tests. > > + * Find a valid physical range, preferably aligned to PUD_SIZE. > > + * Return the address and the alignment. It doesn't need to be > > + * allocated, it just needs to exist as usable memory. The memory > > + * won't be touched. > > + * > > + * The alignment is recorded, and can be checked to see if we > > + * can run the tests that require and actual valid physical > > s/and/an ? Indeed, that's a typo. > > > + * address range on some architectures ({pmd,pud}_huge_test > > + * on x86). > > */ > > + > > phys = __pa_symbol(&start_kernel); > > This original 'phys' will still be used as fallback, in case the below attempt > does not find a physical address with required alignments i.e [PUD|PMD]_SIZE ? Right, the original value (as it is done now) is there as a fallback. > > > + args->fixed_alignment = PAGE_SIZE; > > + > > + for_each_mem_range(idx, &pstart, &pend) { > > + if (phys_align_check(pstart, pend, PUD_SIZE, &phys, > > + &args->fixed_alignment)) > > + break; > > + > > + if (args->fixed_alignment >= PMD_SIZE) > > + continue; > > + > > + (void)phys_align_check(pstart, pend, PMD_SIZE, &phys, > > + &args->fixed_alignment); > > (void) ? Why not check the return value here ? If you get to that function call, you know that no aligned area has been found so far, so checking the return value won't change what you're going to do: you're going to keep going, since even if you get a PMD_SIZE aligned area, you still want to try to get a PUD_SIZE aligned area. So there's no point in checking it. > > > + } > > + > > args->fixed_pgd_pfn = __phys_to_pfn(phys & PGDIR_MASK); > > args->fixed_p4d_pfn = __phys_to_pfn(phys & P4D_MASK); > > args->fixed_pud_pfn = __phys_to_pfn(phys & PUD_MASK); > > This loops attempts to find a PUD_SIZE aligned address but breaks out in case it > atleast finds a PMD_SIZE aligned address, while looping through available memory > ranges. The entire process of finding 'phys' and 'args->fixed_alignment' should > be encapsulated inside a helper that also updates 'args->fixed_pxx_pfn' elements. The loop keeps going until it either runs out of physical memory ranges to check, or until it finds a PUD_SIZE-aligned area. It won't break out for a PMD_SIZE-aligned area. It could be made in to a separate function, yes, that might look a little cleaner. > > - Anshuman Thanks again for the comments. I see that this was added to mm-unstable by now. I can send an mm-unstable follow-up patch (though there won't be any functional changes). - Frank
On 1/6/23 23:24, Frank van der Linden wrote: > Hi Anshuman, thanks for looking at this. > > > On Thu, Jan 5, 2023 at 8:24 PM Anshuman Khandual > <anshuman.khandual@arm.com> wrote: >> >> Hi Frank, >> >> Thanks for the patch, in principle this LGTM. Did a quick run on arm64, >> did not find anything problematic. Although I have some comments below. >> > [...] > >>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c >>> index c631ade3f1d2..e9b52600904a 100644 >>> --- a/mm/debug_vm_pgtable.c >>> +++ b/mm/debug_vm_pgtable.c >>> @@ -15,6 +15,7 @@ >>> #include <linux/hugetlb.h> >>> #include <linux/kernel.h> >>> #include <linux/kconfig.h> >>> +#include <linux/memblock.h> >>> #include <linux/mm.h> >>> #include <linux/mman.h> >>> #include <linux/mm_types.h> >>> @@ -80,6 +81,8 @@ struct pgtable_debug_args { >>> unsigned long pmd_pfn; >>> unsigned long pte_pfn; >>> >>> + phys_addr_t fixed_alignment; >>> + >> >> This should not be a 'phys_addr_t', as it does not really contain a >> physical address. Alignment value can be captured in 'unsigned long' >> like other elements. > > True, yep. > >> >>> unsigned long fixed_pgd_pfn; >>> unsigned long fixed_p4d_pfn; >>> unsigned long fixed_pud_pfn; >>> @@ -430,7 +433,8 @@ static void __init pmd_huge_tests(struct pgtable_debug_args *args) >>> { >>> pmd_t pmd; >>> >>> - if (!arch_vmap_pmd_supported(args->page_prot)) >>> + if (!arch_vmap_pmd_supported(args->page_prot) || >>> + args->fixed_alignment < PMD_SIZE) >>> return; >> >> Small nit. Additional line not need for the conditional statement. >> > > You mean the line break in the condition? Not breaking it would push > it to 90 characters (if tab=8). > > Most of this file, except for a few lines, does stick to 80. I don't > feel particularly strongly about this either way, though :) I guess currently the lines could extend up to 100 instead. > >> >>> >>> pr_debug("Validating PMD huge\n"); >>> @@ -449,7 +453,8 @@ static void __init pud_huge_tests(struct pgtable_debug_args *args) >>> { >>> pud_t pud; >>> >>> - if (!arch_vmap_pud_supported(args->page_prot)) >>> + if (!arch_vmap_pud_supported(args->page_prot) || >>> + args->fixed_alignment < PUD_SIZE) >>> return; >> Small nit. Additional line not needed for the conditional statement. > > See above. > >> >>> >>> pr_debug("Validating PUD huge\n"); >>> @@ -1077,11 +1082,41 @@ debug_vm_pgtable_alloc_huge_page(struct pgtable_debug_args *args, int order) >>> return page; >>> } >>> >>> +/* >>> + * Check if a physical memory range described by <pstart, pend> contains >>> + * an area that is of size psize, and aligned to the same. >>> + * >>> + * Don't use address 0, and check for overflow. >>> + */ >>> +static int __init phys_align_check(phys_addr_t pstart, >>> + phys_addr_t pend, phys_addr_t psize, phys_addr_t *physp, >>> + phys_addr_t *alignp) >>> +{ >>> + phys_addr_t aligned_start, aligned_end; >>> + >>> + if (pstart == 0) >>> + pstart = PAGE_SIZE; >> >> Why ? > > Since the physical address will be used for page table tests, I think > that avoiding 0 is probably a good idea. If e.g. a masking mistake > crept into the code somewhere, using physical address 0 might not find > it. Also, physical address 0 isn't used on x86. Make sense, but will need a small comment explaining the same. >> >>> + >>> + aligned_start = ALIGN(pstart, psize); >>> + aligned_end = aligned_start + psize; >>> + >>> + if (aligned_end > aligned_start && aligned_end <= pend) { >>> + *alignp = psize; >>> + *physp = aligned_start; >>> + return 1; >>> + } >>> + >>> + return 0; >>> +} >> >> To be more clear, this function should return a 'bool' instead > > That would be better, yes. > >> >>> + >>> + >>> static int __init init_args(struct pgtable_debug_args *args) >>> { >>> struct page *page = NULL; >>> phys_addr_t phys; >>> int ret = 0; >>> + u64 idx; >>> + phys_addr_t pstart, pend; >> >> This declaration can be merged into the previous line containing 'phys'. > > Sure, yes. >> >>> >>> /* >>> * Initialize the debugging data. >>> @@ -1161,15 +1196,32 @@ static int __init init_args(struct pgtable_debug_args *args) >>> WARN_ON(!args->start_ptep); >>> >>> /* >>> - * PFN for mapping at PTE level is determined from a standard kernel >>> - * text symbol. But pfns for higher page table levels are derived by >>> - * masking lower bits of this real pfn. These derived pfns might not >>> - * exist on the platform but that does not really matter as pfn_pxx() >>> - * helpers will still create appropriate entries for the test. This >>> - * helps avoid large memory block allocations to be used for mapping >>> - * at higher page table levels in some of the tests. >>> + * Find a valid physical range, preferably aligned to PUD_SIZE. >>> + * Return the address and the alignment. It doesn't need to be >>> + * allocated, it just needs to exist as usable memory. The memory >>> + * won't be touched. >>> + * >>> + * The alignment is recorded, and can be checked to see if we >>> + * can run the tests that require and actual valid physical >> >> s/and/an ? > > Indeed, that's a typo. > >> >>> + * address range on some architectures ({pmd,pud}_huge_test >>> + * on x86). >>> */ >>> + >>> phys = __pa_symbol(&start_kernel); >> >> This original 'phys' will still be used as fallback, in case the below attempt >> does not find a physical address with required alignments i.e [PUD|PMD]_SIZE ? > > Right, the original value (as it is done now) is there as a fallback. > >> >>> + args->fixed_alignment = PAGE_SIZE; >>> + >>> + for_each_mem_range(idx, &pstart, &pend) { >>> + if (phys_align_check(pstart, pend, PUD_SIZE, &phys, >>> + &args->fixed_alignment)) >>> + break; >>> + >>> + if (args->fixed_alignment >= PMD_SIZE) >>> + continue; >>> + >>> + (void)phys_align_check(pstart, pend, PMD_SIZE, &phys, >>> + &args->fixed_alignment); >> >> (void) ? Why not check the return value here ? > > If you get to that function call, you know that no aligned area has > been found so far, so checking the return value won't change what > you're going to do: you're going to keep going, since even if you get > a PMD_SIZE aligned area, you still want to try to get a PUD_SIZE > aligned area. So there's no point in checking it. Okay but does a void is really necessary here even if the return value is not checked ? > >> >>> + } >>> + >>> args->fixed_pgd_pfn = __phys_to_pfn(phys & PGDIR_MASK); >>> args->fixed_p4d_pfn = __phys_to_pfn(phys & P4D_MASK); >>> args->fixed_pud_pfn = __phys_to_pfn(phys & PUD_MASK); >> >> This loops attempts to find a PUD_SIZE aligned address but breaks out in case it >> atleast finds a PMD_SIZE aligned address, while looping through available memory >> ranges. The entire process of finding 'phys' and 'args->fixed_alignment' should >> be encapsulated inside a helper that also updates 'args->fixed_pxx_pfn' elements. > > The loop keeps going until it either runs out of physical memory > ranges to check, or until it finds a PUD_SIZE-aligned area. It won't > break out for a PMD_SIZE-aligned area. > > It could be made in to a separate function, yes, that might look a > little cleaner. Indeed. >> >> - Anshuman > > Thanks again for the comments. I see that this was added to > mm-unstable by now. I can send an mm-unstable follow-up patch (though > there won't be any functional changes). I think you could still send an updated version with the suggested changes, which can be pulled again for mm-unstable. These changes should be part of a single commit being merged, for future clarity while reading these code.
Sure, v2 sent, addressing your comments. I got rid of the return value of phys_align_check() entirely, instead just checking the recorded alignment value. It's more consistent. Added more comments, made types consistent, split off things into a function. Thanks, - Frank On Mon, Jan 9, 2023 at 12:48 AM Anshuman Khandual <anshuman.khandual@arm.com> wrote: > > > On 1/6/23 23:24, Frank van der Linden wrote: > > Hi Anshuman, thanks for looking at this. > > > > > > On Thu, Jan 5, 2023 at 8:24 PM Anshuman Khandual > > <anshuman.khandual@arm.com> wrote: > >> > >> Hi Frank, > >> > >> Thanks for the patch, in principle this LGTM. Did a quick run on arm64, > >> did not find anything problematic. Although I have some comments below. > >> > > [...] > > > >>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c > >>> index c631ade3f1d2..e9b52600904a 100644 > >>> --- a/mm/debug_vm_pgtable.c > >>> +++ b/mm/debug_vm_pgtable.c > >>> @@ -15,6 +15,7 @@ > >>> #include <linux/hugetlb.h> > >>> #include <linux/kernel.h> > >>> #include <linux/kconfig.h> > >>> +#include <linux/memblock.h> > >>> #include <linux/mm.h> > >>> #include <linux/mman.h> > >>> #include <linux/mm_types.h> > >>> @@ -80,6 +81,8 @@ struct pgtable_debug_args { > >>> unsigned long pmd_pfn; > >>> unsigned long pte_pfn; > >>> > >>> + phys_addr_t fixed_alignment; > >>> + > >> > >> This should not be a 'phys_addr_t', as it does not really contain a > >> physical address. Alignment value can be captured in 'unsigned long' > >> like other elements. > > > > True, yep. > > > >> > >>> unsigned long fixed_pgd_pfn; > >>> unsigned long fixed_p4d_pfn; > >>> unsigned long fixed_pud_pfn; > >>> @@ -430,7 +433,8 @@ static void __init pmd_huge_tests(struct pgtable_debug_args *args) > >>> { > >>> pmd_t pmd; > >>> > >>> - if (!arch_vmap_pmd_supported(args->page_prot)) > >>> + if (!arch_vmap_pmd_supported(args->page_prot) || > >>> + args->fixed_alignment < PMD_SIZE) > >>> return; > >> > >> Small nit. Additional line not need for the conditional statement. > >> > > > > You mean the line break in the condition? Not breaking it would push > > it to 90 characters (if tab=8). > > > > Most of this file, except for a few lines, does stick to 80. I don't > > feel particularly strongly about this either way, though :) > > I guess currently the lines could extend up to 100 instead. > > > > >> > >>> > >>> pr_debug("Validating PMD huge\n"); > >>> @@ -449,7 +453,8 @@ static void __init pud_huge_tests(struct pgtable_debug_args *args) > >>> { > >>> pud_t pud; > >>> > >>> - if (!arch_vmap_pud_supported(args->page_prot)) > >>> + if (!arch_vmap_pud_supported(args->page_prot) || > >>> + args->fixed_alignment < PUD_SIZE) > >>> return; > >> Small nit. Additional line not needed for the conditional statement. > > > > See above. > > > >> > >>> > >>> pr_debug("Validating PUD huge\n"); > >>> @@ -1077,11 +1082,41 @@ debug_vm_pgtable_alloc_huge_page(struct pgtable_debug_args *args, int order) > >>> return page; > >>> } > >>> > >>> +/* > >>> + * Check if a physical memory range described by <pstart, pend> contains > >>> + * an area that is of size psize, and aligned to the same. > >>> + * > >>> + * Don't use address 0, and check for overflow. > >>> + */ > >>> +static int __init phys_align_check(phys_addr_t pstart, > >>> + phys_addr_t pend, phys_addr_t psize, phys_addr_t *physp, > >>> + phys_addr_t *alignp) > >>> +{ > >>> + phys_addr_t aligned_start, aligned_end; > >>> + > >>> + if (pstart == 0) > >>> + pstart = PAGE_SIZE; > >> > >> Why ? > > > > Since the physical address will be used for page table tests, I think > > that avoiding 0 is probably a good idea. If e.g. a masking mistake > > crept into the code somewhere, using physical address 0 might not find > > it. Also, physical address 0 isn't used on x86. > > Make sense, but will need a small comment explaining the same. > > >> > >>> + > >>> + aligned_start = ALIGN(pstart, psize); > >>> + aligned_end = aligned_start + psize; > >>> + > >>> + if (aligned_end > aligned_start && aligned_end <= pend) { > >>> + *alignp = psize; > >>> + *physp = aligned_start; > >>> + return 1; > >>> + } > >>> + > >>> + return 0; > >>> +} > >> > >> To be more clear, this function should return a 'bool' instead > > > > That would be better, yes. > > > >> > >>> + > >>> + > >>> static int __init init_args(struct pgtable_debug_args *args) > >>> { > >>> struct page *page = NULL; > >>> phys_addr_t phys; > >>> int ret = 0; > >>> + u64 idx; > >>> + phys_addr_t pstart, pend; > >> > >> This declaration can be merged into the previous line containing 'phys'. > > > > Sure, yes. > >> > >>> > >>> /* > >>> * Initialize the debugging data. > >>> @@ -1161,15 +1196,32 @@ static int __init init_args(struct pgtable_debug_args *args) > >>> WARN_ON(!args->start_ptep); > >>> > >>> /* > >>> - * PFN for mapping at PTE level is determined from a standard kernel > >>> - * text symbol. But pfns for higher page table levels are derived by > >>> - * masking lower bits of this real pfn. These derived pfns might not > >>> - * exist on the platform but that does not really matter as pfn_pxx() > >>> - * helpers will still create appropriate entries for the test. This > >>> - * helps avoid large memory block allocations to be used for mapping > >>> - * at higher page table levels in some of the tests. > >>> + * Find a valid physical range, preferably aligned to PUD_SIZE. > >>> + * Return the address and the alignment. It doesn't need to be > >>> + * allocated, it just needs to exist as usable memory. The memory > >>> + * won't be touched. > >>> + * > >>> + * The alignment is recorded, and can be checked to see if we > >>> + * can run the tests that require and actual valid physical > >> > >> s/and/an ? > > > > Indeed, that's a typo. > > > >> > >>> + * address range on some architectures ({pmd,pud}_huge_test > >>> + * on x86). > >>> */ > >>> + > >>> phys = __pa_symbol(&start_kernel); > >> > >> This original 'phys' will still be used as fallback, in case the below attempt > >> does not find a physical address with required alignments i.e [PUD|PMD]_SIZE ? > > > > Right, the original value (as it is done now) is there as a fallback. > > > >> > >>> + args->fixed_alignment = PAGE_SIZE; > >>> + > >>> + for_each_mem_range(idx, &pstart, &pend) { > >>> + if (phys_align_check(pstart, pend, PUD_SIZE, &phys, > >>> + &args->fixed_alignment)) > >>> + break; > >>> + > >>> + if (args->fixed_alignment >= PMD_SIZE) > >>> + continue; > >>> + > >>> + (void)phys_align_check(pstart, pend, PMD_SIZE, &phys, > >>> + &args->fixed_alignment); > >> > >> (void) ? Why not check the return value here ? > > > > If you get to that function call, you know that no aligned area has > > been found so far, so checking the return value won't change what > > you're going to do: you're going to keep going, since even if you get > > a PMD_SIZE aligned area, you still want to try to get a PUD_SIZE > > aligned area. So there's no point in checking it. > > Okay but does a void is really necessary here even if the return value > is not checked ? > > > > >> > >>> + } > >>> + > >>> args->fixed_pgd_pfn = __phys_to_pfn(phys & PGDIR_MASK); > >>> args->fixed_p4d_pfn = __phys_to_pfn(phys & P4D_MASK); > >>> args->fixed_pud_pfn = __phys_to_pfn(phys & PUD_MASK); > >> > >> This loops attempts to find a PUD_SIZE aligned address but breaks out in case it > >> atleast finds a PMD_SIZE aligned address, while looping through available memory > >> ranges. The entire process of finding 'phys' and 'args->fixed_alignment' should > >> be encapsulated inside a helper that also updates 'args->fixed_pxx_pfn' elements. > > > > The loop keeps going until it either runs out of physical memory > > ranges to check, or until it finds a PUD_SIZE-aligned area. It won't > > break out for a PMD_SIZE-aligned area. > > > > It could be made in to a separate function, yes, that might look a > > little cleaner. > > Indeed. > > >> > >> - Anshuman > > > > Thanks again for the comments. I see that this was added to > > mm-unstable by now. I can send an mm-unstable follow-up patch (though > > there won't be any functional changes). > > I think you could still send an updated version with the suggested changes, > which can be pulled again for mm-unstable. These changes should be part of > a single commit being merged, for future clarity while reading these code.
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c index c631ade3f1d2..e9b52600904a 100644 --- a/mm/debug_vm_pgtable.c +++ b/mm/debug_vm_pgtable.c @@ -15,6 +15,7 @@ #include <linux/hugetlb.h> #include <linux/kernel.h> #include <linux/kconfig.h> +#include <linux/memblock.h> #include <linux/mm.h> #include <linux/mman.h> #include <linux/mm_types.h> @@ -80,6 +81,8 @@ struct pgtable_debug_args { unsigned long pmd_pfn; unsigned long pte_pfn; + phys_addr_t fixed_alignment; + unsigned long fixed_pgd_pfn; unsigned long fixed_p4d_pfn; unsigned long fixed_pud_pfn; @@ -430,7 +433,8 @@ static void __init pmd_huge_tests(struct pgtable_debug_args *args) { pmd_t pmd; - if (!arch_vmap_pmd_supported(args->page_prot)) + if (!arch_vmap_pmd_supported(args->page_prot) || + args->fixed_alignment < PMD_SIZE) return; pr_debug("Validating PMD huge\n"); @@ -449,7 +453,8 @@ static void __init pud_huge_tests(struct pgtable_debug_args *args) { pud_t pud; - if (!arch_vmap_pud_supported(args->page_prot)) + if (!arch_vmap_pud_supported(args->page_prot) || + args->fixed_alignment < PUD_SIZE) return; pr_debug("Validating PUD huge\n"); @@ -1077,11 +1082,41 @@ debug_vm_pgtable_alloc_huge_page(struct pgtable_debug_args *args, int order) return page; } +/* + * Check if a physical memory range described by <pstart, pend> contains + * an area that is of size psize, and aligned to the same. + * + * Don't use address 0, and check for overflow. + */ +static int __init phys_align_check(phys_addr_t pstart, + phys_addr_t pend, phys_addr_t psize, phys_addr_t *physp, + phys_addr_t *alignp) +{ + phys_addr_t aligned_start, aligned_end; + + if (pstart == 0) + pstart = PAGE_SIZE; + + aligned_start = ALIGN(pstart, psize); + aligned_end = aligned_start + psize; + + if (aligned_end > aligned_start && aligned_end <= pend) { + *alignp = psize; + *physp = aligned_start; + return 1; + } + + return 0; +} + + static int __init init_args(struct pgtable_debug_args *args) { struct page *page = NULL; phys_addr_t phys; int ret = 0; + u64 idx; + phys_addr_t pstart, pend; /* * Initialize the debugging data. @@ -1161,15 +1196,32 @@ static int __init init_args(struct pgtable_debug_args *args) WARN_ON(!args->start_ptep); /* - * PFN for mapping at PTE level is determined from a standard kernel - * text symbol. But pfns for higher page table levels are derived by - * masking lower bits of this real pfn. These derived pfns might not - * exist on the platform but that does not really matter as pfn_pxx() - * helpers will still create appropriate entries for the test. This - * helps avoid large memory block allocations to be used for mapping - * at higher page table levels in some of the tests. + * Find a valid physical range, preferably aligned to PUD_SIZE. + * Return the address and the alignment. It doesn't need to be + * allocated, it just needs to exist as usable memory. The memory + * won't be touched. + * + * The alignment is recorded, and can be checked to see if we + * can run the tests that require and actual valid physical + * address range on some architectures ({pmd,pud}_huge_test + * on x86). */ + phys = __pa_symbol(&start_kernel); + args->fixed_alignment = PAGE_SIZE; + + for_each_mem_range(idx, &pstart, &pend) { + if (phys_align_check(pstart, pend, PUD_SIZE, &phys, + &args->fixed_alignment)) + break; + + if (args->fixed_alignment >= PMD_SIZE) + continue; + + (void)phys_align_check(pstart, pend, PMD_SIZE, &phys, + &args->fixed_alignment); + } + args->fixed_pgd_pfn = __phys_to_pfn(phys & PGDIR_MASK); args->fixed_p4d_pfn = __phys_to_pfn(phys & P4D_MASK); args->fixed_pud_pfn = __phys_to_pfn(phys & PUD_MASK);
The page table debug tests need a physical address to validate low-level page table manipulation with. The memory at this address is not actually touched, it just encoded in the page table entries at various levels during the tests only. Since the memory is not used, the code just picks the physical address of the start_kernel symbol. This value is then truncated to get a properly aligned address that is to be used for various tests. Because of the truncation, the address might not actually exist, or might not describe a complete huge page. That's not a problem for most tests, but the arch-specific code may check for attribute validity and consistency. The x86 version of {pud,pmd}_set_huge actually validates the MTRRs for the PMD/PUD range. This may fail with an address derived from start_kernel, depending on where the kernel was loaded and what the physical memory layout of the system is. This then leads to false negatives for the {pud,pmd}_set_huge tests. Avoid this by finding a properly aligned memory range that exists and is usable. If such a range is not found, skip the tests that needed it. Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page table helpers") Cc: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Frank van der Linden <fvdl@google.com> --- mm/debug_vm_pgtable.c | 70 +++++++++++++++++++++++++++++++++++++------ 1 file changed, 61 insertions(+), 9 deletions(-)