Message ID | 20210715134612.809280-2-willy@infradead.org (mailing list archive) |
---|---|
State | Accepted, archived |
Headers | show |
Series | Make PMD_ORDER generically available | expand |
On Thu, Jul 15, 2021 at 02:46:10PM +0100, Matthew Wilcox (Oracle) wrote: > This is the order of the page table allocation, not the order of a PMD. > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> > --- > arch/arm/kernel/head.S | 34 +++++++++++++++++----------------- > 1 file changed, 17 insertions(+), 17 deletions(-) > > diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S > index 9eb0b4dbcc12..6da39a1d70ba 100644 > --- a/arch/arm/kernel/head.S > +++ b/arch/arm/kernel/head.S > @@ -38,10 +38,10 @@ > #ifdef CONFIG_ARM_LPAE > /* LPAE requires an additional page for the PGD */ > #define PG_DIR_SIZE 0x5000 > -#define PMD_ORDER 3 > +#define PMD_TABLE_ORDER 3 > #else > #define PG_DIR_SIZE 0x4000 > -#define PMD_ORDER 2 > +#define PMD_TABLE_ORDER 2 I think PMD_ENTRY_ORDER would make more sense here - this is the power-of-2 of an individual PMD entry, not of the entire table.
On Thu, Jul 15, 2021 at 05:47:41PM +0100, Russell King (Oracle) wrote: > On Thu, Jul 15, 2021 at 02:46:10PM +0100, Matthew Wilcox (Oracle) wrote: > > This is the order of the page table allocation, not the order of a PMD. > > -#define PMD_ORDER 3 > > +#define PMD_TABLE_ORDER 3 > > #else > > #define PG_DIR_SIZE 0x4000 > > -#define PMD_ORDER 2 > > +#define PMD_TABLE_ORDER 2 > > I think PMD_ENTRY_ORDER would make more sense here - this is the > power-of-2 of an individual PMD entry, not of the entire table. But ... we have two kinds of PMD entries. We have the direct entry that points to a 1-16MB sized chunk of memory, and we have the table entry that points to a 4k-32k chunk of memory that contains PTEs. So I don't think calling it 'entry' order actually disambiguates anything. That's why I went with 'table' -- I can't think of anything else to call it! PMD_PTE_ARRAY_ORDER doesn't seem like an improvement to me ...
On Thu, Jul 15, 2021 at 07:10:54PM +0100, Matthew Wilcox wrote: > On Thu, Jul 15, 2021 at 05:47:41PM +0100, Russell King (Oracle) wrote: > > On Thu, Jul 15, 2021 at 02:46:10PM +0100, Matthew Wilcox (Oracle) wrote: > > > This is the order of the page table allocation, not the order of a PMD. > > > -#define PMD_ORDER 3 > > > +#define PMD_TABLE_ORDER 3 > > > #else > > > #define PG_DIR_SIZE 0x4000 > > > -#define PMD_ORDER 2 > > > +#define PMD_TABLE_ORDER 2 > > > > I think PMD_ENTRY_ORDER would make more sense here - this is the > > power-of-2 of an individual PMD entry, not of the entire table. > > But ... we have two kinds of PMD entries. We have the direct entry that > points to a 1-16MB sized chunk of memory, and we have the table entry that > points to a 4k-32k chunk of memory that contains PTEs. So I don't think > calling it 'entry' order actually disambiguates anything. That's why > I went with 'table' -- I can't think of anything else to call it! > PMD_PTE_ARRAY_ORDER doesn't seem like an improvement to me ... There may be two kinds of PMD entries, but that isn't relevant here. Going back to the original terminology, 1 << PMD_ORDER here is the size of each PMD entry. It doesn't have anything to do with how much memory is being mapped by each entry. I think what is confusing you is stuff like: add r0, r4, #KERNEL_OFFSET >> (SECTION_SHIFT - PMD_ORDER) r4 is the base address of the page tables, and r0 is the address of the entry we want to manipulate for "KERNEL_OFFSET" - which is the virtual address. 1 << SECTION_SHIFT is how much memory each entry maps (and this is fixed here - there's no variability as you suggest above.) Effectively, the calculation above is: index = KERNEL_OFFSET >> SECTION_SHIFT; pmd_entry_size = 1 << PMD_ORDER; r0 = base + index * pmd_entry_size; but in a single instruction as we can be sure that KERNEL_OFFSET will have zeros as the low bits after shifting by SECTION_SHIFT - PMD_ORDER. Hope this helps to explain what this PMD_ORDER is actually doing here.
On Thu, Jul 15, 2021 at 07:37:27PM +0100, Russell King (Oracle) wrote: > On Thu, Jul 15, 2021 at 07:10:54PM +0100, Matthew Wilcox wrote: > > On Thu, Jul 15, 2021 at 05:47:41PM +0100, Russell King (Oracle) wrote: > > > On Thu, Jul 15, 2021 at 02:46:10PM +0100, Matthew Wilcox (Oracle) wrote: > > > > This is the order of the page table allocation, not the order of a PMD. > > > > -#define PMD_ORDER 3 > > > > +#define PMD_TABLE_ORDER 3 > > > > #else > > > > #define PG_DIR_SIZE 0x4000 > > > > -#define PMD_ORDER 2 > > > > +#define PMD_TABLE_ORDER 2 > > > > > > I think PMD_ENTRY_ORDER would make more sense here - this is the > > > power-of-2 of an individual PMD entry, not of the entire table. > > > > But ... we have two kinds of PMD entries. We have the direct entry that > > points to a 1-16MB sized chunk of memory, and we have the table entry that > > points to a 4k-32k chunk of memory that contains PTEs. So I don't think > > calling it 'entry' order actually disambiguates anything. That's why > > I went with 'table' -- I can't think of anything else to call it! > > PMD_PTE_ARRAY_ORDER doesn't seem like an improvement to me ... > > There may be two kinds of PMD entries, but that isn't relevant here. > Going back to the original terminology, 1 << PMD_ORDER here is the > size of each PMD entry. It doesn't have anything to do with how much > memory is being mapped by each entry. Oh. Oh! So, 'order' is usually a shift that is _added on to_ the PAGE_SHIFT in order to find how many bytes are in question. See include/asm-generic/getorder.h. Now, PMD_SHIFT is already in use, but perhaps what is meant here is PMD_ENTRY_SHIFT? > I think what is confusing you is stuff like: > > add r0, r4, #KERNEL_OFFSET >> (SECTION_SHIFT - PMD_ORDER) > > r4 is the base address of the page tables, and r0 is the address of > the entry we want to manipulate for "KERNEL_OFFSET" - which is the > virtual address. 1 << SECTION_SHIFT is how much memory each entry > maps (and this is fixed here - there's no variability as you suggest > above.) (the variability I intended above was more to accommodate architectural differences; I hate to use x86-specific numbers like 4KiB and 2MiB) > Effectively, the calculation above is: > > index = KERNEL_OFFSET >> SECTION_SHIFT; > pmd_entry_size = 1 << PMD_ORDER; > r0 = base + index * pmd_entry_size; > > but in a single instruction as we can be sure that KERNEL_OFFSET will > have zeros as the low bits after shifting by SECTION_SHIFT - PMD_ORDER. > > Hope this helps to explain what this PMD_ORDER is actually doing here. Thank you, yes, I was terminally confused.
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S index 9eb0b4dbcc12..6da39a1d70ba 100644 --- a/arch/arm/kernel/head.S +++ b/arch/arm/kernel/head.S @@ -38,10 +38,10 @@ #ifdef CONFIG_ARM_LPAE /* LPAE requires an additional page for the PGD */ #define PG_DIR_SIZE 0x5000 -#define PMD_ORDER 3 +#define PMD_TABLE_ORDER 3 #else #define PG_DIR_SIZE 0x4000 -#define PMD_ORDER 2 +#define PMD_TABLE_ORDER 2 #endif .globl swapper_pg_dir @@ -237,7 +237,7 @@ __create_page_tables: mov r6, r6, lsr #SECTION_SHIFT 1: orr r3, r7, r5, lsl #SECTION_SHIFT @ flags + kernel base - str r3, [r4, r5, lsl #PMD_ORDER] @ identity mapping + str r3, [r4, r5, lsl #PMD_TABLE_ORDER] @ identity mapping cmp r5, r6 addlo r5, r5, #1 @ next section blo 1b @@ -247,13 +247,13 @@ __create_page_tables: * set two variables to indicate the physical start and end of the * kernel. */ - add r0, r4, #KERNEL_OFFSET >> (SECTION_SHIFT - PMD_ORDER) + add r0, r4, #KERNEL_OFFSET >> (SECTION_SHIFT - PMD_TABLE_ORDER) ldr r6, =(_end - 1) adr_l r5, kernel_sec_start @ _pa(kernel_sec_start) str r8, [r5] @ Save physical start of kernel orr r3, r8, r7 @ Add the MMU flags - add r6, r4, r6, lsr #(SECTION_SHIFT - PMD_ORDER) -1: str r3, [r0], #1 << PMD_ORDER + add r6, r4, r6, lsr #(SECTION_SHIFT - PMD_TABLE_ORDER) +1: str r3, [r0], #1 << PMD_TABLE_ORDER add r3, r3, #1 << SECTION_SHIFT cmp r0, r6 bls 1b @@ -269,14 +269,14 @@ __create_page_tables: mov r3, pc mov r3, r3, lsr #SECTION_SHIFT orr r3, r7, r3, lsl #SECTION_SHIFT - add r0, r4, #(XIP_START & 0xff000000) >> (SECTION_SHIFT - PMD_ORDER) - str r3, [r0, #((XIP_START & 0x00f00000) >> SECTION_SHIFT) << PMD_ORDER]! + add r0, r4, #(XIP_START & 0xff000000) >> (SECTION_SHIFT - PMD_TABLE_ORDER) + str r3, [r0, #((XIP_START & 0x00f00000) >> SECTION_SHIFT) << PMD_TABLE_ORDER]! ldr r6, =(_edata_loc - 1) - add r0, r0, #1 << PMD_ORDER - add r6, r4, r6, lsr #(SECTION_SHIFT - PMD_ORDER) + add r0, r0, #1 << PMD_TABLE_ORDER + add r6, r4, r6, lsr #(SECTION_SHIFT - PMD_TABLE_ORDER) 1: cmp r0, r6 add r3, r3, #1 << SECTION_SHIFT - strls r3, [r0], #1 << PMD_ORDER + strls r3, [r0], #1 << PMD_TABLE_ORDER bls 1b #endif @@ -286,10 +286,10 @@ __create_page_tables: */ mov r0, r2, lsr #SECTION_SHIFT cmp r2, #0 - ldrne r3, =FDT_FIXED_BASE >> (SECTION_SHIFT - PMD_ORDER) + ldrne r3, =FDT_FIXED_BASE >> (SECTION_SHIFT - PMD_TABLE_ORDER) addne r3, r3, r4 orrne r6, r7, r0, lsl #SECTION_SHIFT - strne r6, [r3], #1 << PMD_ORDER + strne r6, [r3], #1 << PMD_TABLE_ORDER addne r6, r6, #1 << SECTION_SHIFT strne r6, [r3] @@ -308,7 +308,7 @@ __create_page_tables: addruart r7, r3, r0 mov r3, r3, lsr #SECTION_SHIFT - mov r3, r3, lsl #PMD_ORDER + mov r3, r3, lsl #PMD_TABLE_ORDER add r0, r4, r3 mov r3, r7, lsr #SECTION_SHIFT @@ -338,7 +338,7 @@ __create_page_tables: * If we're using the NetWinder or CATS, we also need to map * in the 16550-type serial port for the debug messages */ - add r0, r4, #0xff000000 >> (SECTION_SHIFT - PMD_ORDER) + add r0, r4, #0xff000000 >> (SECTION_SHIFT - PMD_TABLE_ORDER) orr r3, r7, #0x7c000000 str r3, [r0] #endif @@ -348,10 +348,10 @@ __create_page_tables: * Similar reasons here - for debug. This is * only for Acorn RiscPC architectures. */ - add r0, r4, #0x02000000 >> (SECTION_SHIFT - PMD_ORDER) + add r0, r4, #0x02000000 >> (SECTION_SHIFT - PMD_TABLE_ORDER) orr r3, r7, #0x02000000 str r3, [r0] - add r0, r4, #0xd8000000 >> (SECTION_SHIFT - PMD_ORDER) + add r0, r4, #0xd8000000 >> (SECTION_SHIFT - PMD_TABLE_ORDER) str r3, [r0] #endif #endif
This is the order of the page table allocation, not the order of a PMD. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> --- arch/arm/kernel/head.S | 34 +++++++++++++++++----------------- 1 file changed, 17 insertions(+), 17 deletions(-)