Message ID | 20250311073043.96795-2-ardb+git@google.com |
---|---|
State | New |
Series | [v2] arm64/kernel: Always use level 2 or higher for early mappings |
On 3/11/25 13:00, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> The page table population code in map_range() uses a recursive algorithm
> to create the early mappings of the kernel, the DTB and the ID mapped
> text and data pages, and this fails to take into account that the way
> these page tables may be constructed is not precisely the same at each
> level. In particular, block mappings are not permitted at each level,
> and the code as it exists today might inadvertently create such a
> forbidden block mapping if it were used to map a region of the
> appropriate size and alignment.
>
> This never happens in practice, given the limited size of the assets
> being mapped by the early boot code. Nonetheless, it would be better if
> this code would behave correctly in all circumstances.
>
> So only permit block mappings at level 2, and page mappings at level 3,
> for any page size, and use table mappings exclusively at all other
> levels. This change should have no impact in practice, but it makes the
> code more robust.
>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Reported-by: Ryan Roberts <ryan.roberts@arm.com>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
> v2: take the assignment of protval out of the loop again, so that
>     clearing a mapping works as expected wrt the PTE_CONT bit
>
>  arch/arm64/kernel/pi/map_range.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kernel/pi/map_range.c b/arch/arm64/kernel/pi/map_range.c
> index 2b69e3beeef8..5778697f3062 100644
> --- a/arch/arm64/kernel/pi/map_range.c
> +++ b/arch/arm64/kernel/pi/map_range.c
> @@ -45,12 +45,12 @@ void __init map_range(u64 *pte, u64 start, u64 end, u64 pa, pgprot_t prot,
>  	 * clearing the mapping
>  	 */
>  	if (protval)
> -		protval |= (level < 3) ? PMD_TYPE_SECT : PTE_TYPE_PAGE;
> +		protval |= (level == 2) ? PMD_TYPE_SECT : PTE_TYPE_PAGE;
>
>  	while (start < end) {
>  		u64 next = min((start | lmask) + 1, PAGE_ALIGN(end));
>
> -		if (level < 3 && (start | next | pa) & lmask) {
> +		if (level < 2 || (level == 2 && (start | next | pa) & lmask)) {
>  			/*
>  			 * This chunk needs a finer grained mapping. Create a
>  			 * table mapping if necessary and recurse.

Again did not see any problem running this on various 4K/16K/64K page configs.

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
On 11/03/2025 07:30, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> The page table population code in map_range() uses a recursive algorithm
> to create the early mappings of the kernel, the DTB and the ID mapped
> text and data pages, and this fails to take into account that the way
> these page tables may be constructed is not precisely the same at each
> level. In particular, block mappings are not permitted at each level,
> and the code as it exists today might inadvertently create such a
> forbidden block mapping if it were used to map a region of the
> appropriate size and alignment.
>
> This never happens in practice, given the limited size of the assets
> being mapped by the early boot code. Nonetheless, it would be better if
> this code would behave correctly in all circumstances.
>
> So only permit block mappings at level 2, and page mappings at level 3,
> for any page size, and use table mappings exclusively at all other
> levels. This change should have no impact in practice, but it makes the
> code more robust.
>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Reported-by: Ryan Roberts <ryan.roberts@arm.com>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
> v2: take the assignment of protval out of the loop again, so that
>     clearing a mapping works as expected wrt the PTE_CONT bit

Ouch, totally missed that. Now that I've looked at the code properly, I wonder
if it is worth supporting CONT_PMD mappings at level 2? That would be 32M for
4K pages, so is likely to actually get used. I think it would just be something
like:

u64 cmask = (level == 3) ? CONT_PTE_SIZE - 1 :
            ((level == 2) ? CONT_PMD_SIZE - 1 : U64_MAX);

Anyway, probably not worth the complexity for these boot-time tables, so
regardless:

Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>

>
>  arch/arm64/kernel/pi/map_range.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kernel/pi/map_range.c b/arch/arm64/kernel/pi/map_range.c
> index 2b69e3beeef8..5778697f3062 100644
> --- a/arch/arm64/kernel/pi/map_range.c
> +++ b/arch/arm64/kernel/pi/map_range.c
> @@ -45,12 +45,12 @@ void __init map_range(u64 *pte, u64 start, u64 end, u64 pa, pgprot_t prot,
>  	 * clearing the mapping
>  	 */
>  	if (protval)
> -		protval |= (level < 3) ? PMD_TYPE_SECT : PTE_TYPE_PAGE;
> +		protval |= (level == 2) ? PMD_TYPE_SECT : PTE_TYPE_PAGE;
>
>  	while (start < end) {
>  		u64 next = min((start | lmask) + 1, PAGE_ALIGN(end));
>
> -		if (level < 3 && (start | next | pa) & lmask) {
> +		if (level < 2 || (level == 2 && (start | next | pa) & lmask)) {
>  			/*
>  			 * This chunk needs a finer grained mapping. Create a
>  			 * table mapping if necessary and recurse.
On Tue, 11 Mar 2025 at 12:31, Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> On 11/03/2025 07:30, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > [...]
> >
> > ---
> > v2: take the assignment of protval out of the loop again, so that
> >     clearing a mapping works as expected wrt the PTE_CONT bit
>
> Ouch, totally missed that. Now that I've looked at the code properly, I wonder
> if it is worth supporting CONT_PMD mappings at level 2? That would be 32M for
> 4K pages, so is likely to actually get used. I think it would just be
> something like:
>
> u64 cmask = (level == 3) ? CONT_PTE_SIZE - 1 :
>             ((level == 2) ? CONT_PMD_SIZE - 1 : U64_MAX);
>

With this patch applied, it is guaranteed that cmask is only referenced if
level >= 2, so we can simplify this as

u64 cmask = (level < 3) ? CONT_PMD_SIZE - 1 : CONT_PTE_SIZE - 1;

(using the level < 3 for consistency with the protval assignment)

> Anyway, probably not worth the complexity for these boot-time tables, so
> regardless:
>

They are created at boot time but these are not 'boot-time tables' - they are
the ones that are used by the kernel at runtime. So I think it makes sense to
add this.

> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
>

Thanks.
diff --git a/arch/arm64/kernel/pi/map_range.c b/arch/arm64/kernel/pi/map_range.c
index 2b69e3beeef8..5778697f3062 100644
--- a/arch/arm64/kernel/pi/map_range.c
+++ b/arch/arm64/kernel/pi/map_range.c
@@ -45,12 +45,12 @@ void __init map_range(u64 *pte, u64 start, u64 end, u64 pa, pgprot_t prot,
 	 * clearing the mapping
 	 */
 	if (protval)
-		protval |= (level < 3) ? PMD_TYPE_SECT : PTE_TYPE_PAGE;
+		protval |= (level == 2) ? PMD_TYPE_SECT : PTE_TYPE_PAGE;
 
 	while (start < end) {
 		u64 next = min((start | lmask) + 1, PAGE_ALIGN(end));
 
-		if (level < 3 && (start | next | pa) & lmask) {
+		if (level < 2 || (level == 2 && (start | next | pa) & lmask)) {
 			/*
 			 * This chunk needs a finer grained mapping. Create a
 			 * table mapping if necessary and recurse.