diff mbox series

[v5,2/4] mm: support Svnapot in physical page linear-mapping

Message ID 20221003134721.1772455-3-panqinglin2020@iscas.ac.cn (mailing list archive)
State Superseded
Series riscv, mm: detect svnapot cpu support at runtime

Commit Message

Qinglin Pan Oct. 3, 2022, 1:47 p.m. UTC
From: Qinglin Pan <panqinglin2020@iscas.ac.cn>

Svnapot is powerful when a physical region is going to be mapped to a
virtual region. The kernel does this when mapping all allocatable
physical pages into kernel vm space. This commit modifies the
create_pte_mapping function used in the linear-mapping procedure, so the
kernel is able to use Svnapot when both the address and the length of
the physical region are 64KB aligned. The code here is executed only
when no other huge page size is suitable, so it can be an addition to
PMD_SIZE and PUD_SIZE mapping.

This commit also modifies the best_map_size function to compute map_size
on every iteration instead of only once, so a memory region can be mapped
with both PMD_SIZE and 64KB napot sizes.

It was tested by setting qemu's memory to a 262272k region, and the
kernel boots successfully.

Currently, the modified create_pte_mapping will never make use of SVNAPOT,
because this extension is detected in riscv_fill_hwcap and enabled in
apply_boot_alternatives (called from setup_arch), which is called
after setup_vm_final. We will need to support a function like
riscv_fill_hwcap_early to fill hardware capabilities earlier, and
try to enable SVNAPOT earlier in apply_early_boot_alternatives,
so that we can determine SVNAPOT's presence during setup_vm_final.

Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn>
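
To make the effect of the per-iteration best_map_size() call concrete, here is a minimal user-space sketch of the selection order described above. It is not kernel code: the sk_ names are hypothetical, and it assumes 4 KiB base pages, 2 MiB PMD mappings, and a single contiguous region. The size tests mirror the patched best_map_size(), and the loop mirrors the patched setup_vm_final() loop.

```c
#include <stdint.h>

/* Illustrative sizes (assumptions): 4 KiB pages, 64 KiB NAPOT, 2 MiB PMD. */
#define SK_PAGE_SIZE (UINT64_C(1) << 12)
#define SK_NAPOT_64K (UINT64_C(1) << 16)
#define SK_PMD_SIZE  (UINT64_C(1) << 21)

struct sk_map_counts {
	uint64_t pmd;   /* number of 2 MiB mappings */
	uint64_t napot; /* number of 64 KiB NAPOT mappings */
	uint64_t page;  /* number of 4 KiB mappings */
};

/* Same preference order as the patch: PMD first, then 64 KiB NAPOT, then a page. */
static uint64_t sk_best_map_size(uint64_t base, uint64_t size, int has_svnapot)
{
	if (!(base & (SK_PMD_SIZE - 1)) && size >= SK_PMD_SIZE)
		return SK_PMD_SIZE;

	if (has_svnapot && !(base & (SK_NAPOT_64K - 1)) && size >= SK_NAPOT_64K)
		return SK_NAPOT_64K;

	return SK_PAGE_SIZE;
}

/* Walk [start, end) and re-evaluate the mapping size at every step. */
static struct sk_map_counts sk_count_mappings(uint64_t start, uint64_t end,
					      int has_svnapot)
{
	struct sk_map_counts c = { 0, 0, 0 };
	uint64_t pa = start;

	while (pa < end) {
		uint64_t sz = sk_best_map_size(pa, end - pa, has_svnapot);

		if (sz == SK_PMD_SIZE)
			c.pmd++;
		else if (sz == SK_NAPOT_64K)
			c.napot++;
		else
			c.page++;
		pa += sz;
	}
	return c;
}
```

Since 262272k is 256 MiB + 128 KiB, calling sk_count_mappings(0x80000000, 0x80000000 + 262272 * 1024, 1) yields 128 PMD mappings, 2 NAPOT mappings and 0 single pages. With has_svnapot set to 0, the 128 KiB tail becomes 32 single pages instead. The 0x80000000 start address is an assumption chosen because it is 2 MiB aligned; it is not taken from the patch.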

Comments

Conor Dooley Oct. 4, 2022, 6:40 p.m. UTC | #1
Hey Qinglin Pan,

Other comments aside, it'd be good to add previous reviewers to the CC
list on follow-up versions.

On Mon, Oct 03, 2022 at 09:47:19PM +0800, panqinglin2020@iscas.ac.cn wrote:
> From: Qinglin Pan <panqinglin2020@iscas.ac.cn>
> mm: modify pte format for Svnapot

"riscv: mm: foo" please.

> 
> Svnapot is powerful when a physical region is going to mapped to a
> virtual region. Kernel will do like this when mapping all allocable
> physical pages to kernel vm space. This commit modifies the

s/This commit modifies/Modify

> create_pte_mapping function used in linear-mapping procedure, so the
> kernel can be able to use Svnapot when both address and length of
> physical region are 64KB align. Code here will be executed only when
> other size huge page is not suitable, so it can be an addition of
> PMD_SIZE and PUD_SIZE mapping.
> 
> This commit also modifies the best_map_size function to give map_size

s/This commit also modifies/Modify/

Although, with the "also" should this be two patches or is there a
compile time dependency?

> many times instead of only once, so a memory region can be mapped by
> both PMD_SIZE and 64KB napot size.
> 
> It is tested by setting qemu's memory to a 262272k region, and the
> kernel can boot successfully.
> 
> Currently, the modified create_pte_mapping will never take use of SVNAPOT,
> because this extension is detected in riscv_fill_hwcap and enabled in
> apply_boot_alternatives(called from setup_arch) which is called
> after setup_vm_final. We will need to support function like

Out of curiosity, why doesn't this series add the support?
Do you intend sending a follow up series?

> riscv_fill_hwcap_early to fill hardware capabilities more earlier, and
> try to enable SVNAPOT more earlier in apply_early_boot_alternatives,
> so that we can determine SVNAPOT's presence during setup_vm_final.
> 
> Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn>
> 
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index b56a0a75533f..76317bb28f29 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -373,9 +373,21 @@ static void __init create_pte_mapping(pte_t *ptep,
>  				      phys_addr_t sz, pgprot_t prot)
>  {
>  	uintptr_t pte_idx = pte_index(va);
> +#ifdef CONFIG_RISCV_ISA_SVNAPOT

Would using IS_ENABLED() cause problems here?

> +	pte_t pte;
> +
> +	if (has_svnapot() && sz == NAPOT_CONT64KB_SIZE) {
> +		do {
> +			pte = pfn_pte(PFN_DOWN(pa), prot);
> +			ptep[pte_idx] = pte_mknapot(pte, NAPOT_CONT64KB_ORDER);
> +			pte_idx++;
> +			sz -= PAGE_SIZE;
> +		} while (sz > 0);
> +		return;
> +	}
> +#endif
>  
>  	BUG_ON(sz != PAGE_SIZE);
> -
>  	if (pte_none(ptep[pte_idx]))
>  		ptep[pte_idx] = pfn_pte(PFN_DOWN(pa), prot);
>  }
> @@ -673,10 +685,18 @@ void __init create_pgd_mapping(pgd_t *pgdp,
>  static uintptr_t __init best_map_size(phys_addr_t base, phys_addr_t size)
>  {
>  	/* Upgrade to PMD_SIZE mappings whenever possible */
> -	if ((base & (PMD_SIZE - 1)) || (size & (PMD_SIZE - 1)))
> +	base &= PMD_SIZE - 1;
> +	if (!base && size >= PMD_SIZE)
> +		return PMD_SIZE;
> +
> +	if (!has_svnapot())
>  		return PAGE_SIZE;
>  
> -	return PMD_SIZE;
> +	base &= NAPOT_CONT64KB_SIZE - 1;
> +	if (!base && size >= NAPOT_CONT64KB_SIZE)
> +		return NAPOT_CONT64KB_SIZE;
> +
> +	return PAGE_SIZE;
>  }
>  
>  #ifdef CONFIG_XIP_KERNEL
> @@ -1111,9 +1131,9 @@ static void __init setup_vm_final(void)
>  		if (end >= __pa(PAGE_OFFSET) + memory_limit)
>  			end = __pa(PAGE_OFFSET) + memory_limit;
>  
> -		map_size = best_map_size(start, end - start);
>  		for (pa = start; pa < end; pa += map_size) {
>  			va = (uintptr_t)__va(pa);
> +			map_size = best_map_size(pa, end - pa);
>  
>  			create_pgd_mapping(swapper_pg_dir, va, pa, map_size,
>  					   pgprot_from_va(va));
> -- 
> 2.35.1
Qinglin Pan Oct. 5, 2022, 2:43 a.m. UTC | #2
Hi Conor,

On 10/5/22 2:40 AM, Conor Dooley wrote:
> 
> Hey Qinglin Pan,
> 
> Other comments aside, it'd be good to add previous reviewers to the CC
> list on follow-up versions.
> 

Will do it in the next version:)

> On Mon, Oct 03, 2022 at 09:47:19PM +0800, panqinglin2020@iscas.ac.cn wrote:
>> From: Qinglin Pan <panqinglin2020@iscas.ac.cn>
>> mm: modify pte format for Svnapot
> 
> "riscv: mm: foo" please.
> 
>>
>> Svnapot is powerful when a physical region is going to mapped to a
>> virtual region. Kernel will do like this when mapping all allocable
>> physical pages to kernel vm space. This commit modifies the
> 
> s/This commit modifies/Modify
> 
>> create_pte_mapping function used in linear-mapping procedure, so the
>> kernel can be able to use Svnapot when both address and length of
>> physical region are 64KB align. Code here will be executed only when
>> other size huge page is not suitable, so it can be an addition of
>> PMD_SIZE and PUD_SIZE mapping.
>>
>> This commit also modifies the best_map_size function to give map_size
> 
> s/This commit also modifies/Modify/
> 
> Although, with the "also" should this be two patches or is there a
> compile time dependency?
> 

The above typo will be repaired in the next version.

>> many times instead of only once, so a memory region can be mapped by
>> both PMD_SIZE and 64KB napot size.
>>
>> It is tested by setting qemu's memory to a 262272k region, and the
>> kernel can boot successfully.
>>
>> Currently, the modified create_pte_mapping will never take use of SVNAPOT,
>> because this extension is detected in riscv_fill_hwcap and enabled in
>> apply_boot_alternatives(called from setup_arch) which is called
>> after setup_vm_final. We will need to support function like
> 
> Out of curiosity, why doesn't this series add the support?

Because I am not familiar with parsing fdt without memory allocation:(
It may delay the merging of this patchset. I will try to do this in
a follow up series:)

> Do you intend sending a follow up series?
> 
>> riscv_fill_hwcap_early to fill hardware capabilities more earlier, and
>> try to enable SVNAPOT more earlier in apply_early_boot_alternatives,
>> so that we can determine SVNAPOT's presence during setup_vm_final.
>>
>> Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn>
>>
>> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
>> index b56a0a75533f..76317bb28f29 100644
>> --- a/arch/riscv/mm/init.c
>> +++ b/arch/riscv/mm/init.c
>> @@ -373,9 +373,21 @@ static void __init create_pte_mapping(pte_t *ptep,
>>   				      phys_addr_t sz, pgprot_t prot)
>>   {
>>   	uintptr_t pte_idx = pte_index(va);
>> +#ifdef CONFIG_RISCV_ISA_SVNAPOT
> 
> Would using IS_ENABLED() cause problems here?

Yes, will do it in next version.

Qinglin
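
For context on the IS_ENABLED() question: with #ifdef, the preprocessor removes the disabled branch before the compiler sees it. An IS_ENABLED()-style constant keeps the branch visible, so it is still type-checked and then discarded as dead code. The cost is that every name used in the branch (here has_svnapot(), NAPOT_CONT64KB_SIZE and pte_mknapot()) must be declared even when the option is off. The sketch below is a simplified user-space stand-in for the kernel macro, not the kernel's definition: all sk_ and SK_ names are hypothetical, and it omits the CONFIG_..._MODULE check that the real IS_ENABLED() also performs.

```c
#include <stdint.h>

/*
 * Simplified stand-in for the kernel's IS_ENABLED(): expands to 1 when the
 * option macro is defined as 1, and to 0 when it is not defined at all.
 */
#define SK_ARG_PLACEHOLDER_1 0,
#define sk_take_second_arg(ignored, val, ...) val
#define sk_is_enabled(option) sk_is_enabled2(option)
#define sk_is_enabled2(val) sk_is_enabled3(SK_ARG_PLACEHOLDER_##val)
#define sk_is_enabled3(arg1_or_junk) sk_take_second_arg(arg1_or_junk 1, 0, 0)

#define SK_CONFIG_SVNAPOT 1 /* pretend the option is switched on */
/* SK_CONFIG_OTHER is deliberately left undefined. */

#define SK_NAPOT_64K (UINT64_C(1) << 16)

/* Hypothetical runtime check standing in for has_svnapot(). */
static int sk_has_svnapot(void)
{
	return 1;
}

/* Returns 1 when the NAPOT path would be taken for this mapping size. */
static int sk_use_napot(uint64_t sz)
{
	/* Compiled in both configurations; folded away when the option is off. */
	if (sk_is_enabled(SK_CONFIG_SVNAPOT) && sk_has_svnapot() &&
	    sz == SK_NAPOT_64K)
		return 1;

	return 0;
}
```

Here sk_is_enabled(SK_CONFIG_SVNAPOT) evaluates to 1 and sk_is_enabled(SK_CONFIG_OTHER) to 0, so the condition is an ordinary C expression rather than a preprocessor block.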
Andrew Jones Oct. 5, 2022, 11:19 a.m. UTC | #3
On Mon, Oct 03, 2022 at 09:47:19PM +0800, panqinglin2020@iscas.ac.cn wrote:
> From: Qinglin Pan <panqinglin2020@iscas.ac.cn>
> 
> Svnapot is powerful when a physical region is going to mapped to a
> virtual region. Kernel will do like this when mapping all allocable
> physical pages to kernel vm space. This commit modifies the
> create_pte_mapping function used in linear-mapping procedure, so the
> kernel can be able to use Svnapot when both address and length of
> physical region are 64KB align. Code here will be executed only when
> other size huge page is not suitable, so it can be an addition of
> PMD_SIZE and PUD_SIZE mapping.
> 
> This commit also modifies the best_map_size function to give map_size
> many times instead of only once, so a memory region can be mapped by
> both PMD_SIZE and 64KB napot size.

I'd prefer to see the best_map_size() change for PMD_SIZE and PAGE_SIZE
be a separate patch. Then, the NAPOT_CONT64KB_SIZE support can be added
on to a ready best_map_size(). In fact, I'd prefer this patch be dropped
from this series and the best_map_size() for PMD_SIZE and PAGE_SIZE patch
be either posted as a first patch of the "use svnapot in early boot"
series or be posted alone, as it's already applicable.

> 
> It is tested by setting qemu's memory to a 262272k region, and the
> kernel can boot successfully.
> 
> Currently, the modified create_pte_mapping will never take use of SVNAPOT,
> because this extension is detected in riscv_fill_hwcap and enabled in
> apply_boot_alternatives(called from setup_arch) which is called
> after setup_vm_final. We will need to support function like
> riscv_fill_hwcap_early to fill hardware capabilities more earlier, and
> try to enable SVNAPOT more earlier in apply_early_boot_alternatives,
> so that we can determine SVNAPOT's presence during setup_vm_final.

Thanks,
drew
Qinglin Pan Oct. 5, 2022, 12:45 p.m. UTC | #4
Hi Andrew,

On 10/5/22 7:19 PM, Andrew Jones wrote:
> On Mon, Oct 03, 2022 at 09:47:19PM +0800, panqinglin2020@iscas.ac.cn wrote:
>> From: Qinglin Pan <panqinglin2020@iscas.ac.cn>
>>
>> Svnapot is powerful when a physical region is going to mapped to a
>> virtual region. Kernel will do like this when mapping all allocable
>> physical pages to kernel vm space. This commit modifies the
>> create_pte_mapping function used in linear-mapping procedure, so the
>> kernel can be able to use Svnapot when both address and length of
>> physical region are 64KB align. Code here will be executed only when
>> other size huge page is not suitable, so it can be an addition of
>> PMD_SIZE and PUD_SIZE mapping.
>>
>> This commit also modifies the best_map_size function to give map_size
>> many times instead of only once, so a memory region can be mapped by
>> both PMD_SIZE and 64KB napot size.
> 
> I'd prefer to see the best_map_size() change for PMD_SIZE and PAGE_SIZE
> be a separate patch. Then, the NAPOT_CONT64KB_SIZE support can be added
> on to a ready best_map_size(). In fact, I'd prefer this patch be dropped
> from this series and the best_map_size() for PMD_SIZE and PAGE_SIZE patch
> be either posted as a first patch of the "use svnapot in early boot"
> series or be posted alone, as it's already applicable.
> 

I agree with you. I will drop this commit from the series and post the 
modification on best_map_size for PMD_SIZE and PAGE_SIZE as a single 
patch alone. I will do this in this series' v7.

Thanks,
Qinglin

>>
>> It is tested by setting qemu's memory to a 262272k region, and the
>> kernel can boot successfully.
>>
>> Currently, the modified create_pte_mapping will never take use of SVNAPOT,
>> because this extension is detected in riscv_fill_hwcap and enabled in
>> apply_boot_alternatives(called from setup_arch) which is called
>> after setup_vm_final. We will need to support function like
>> riscv_fill_hwcap_early to fill hardware capabilities more earlier, and
>> try to enable SVNAPOT more earlier in apply_early_boot_alternatives,
>> so that we can determine SVNAPOT's presence during setup_vm_final.
> 
> Thanks,
> drew

Patch

diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index b56a0a75533f..76317bb28f29 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -373,9 +373,21 @@  static void __init create_pte_mapping(pte_t *ptep,
 				      phys_addr_t sz, pgprot_t prot)
 {
 	uintptr_t pte_idx = pte_index(va);
+#ifdef CONFIG_RISCV_ISA_SVNAPOT
+	pte_t pte;
+
+	if (has_svnapot() && sz == NAPOT_CONT64KB_SIZE) {
+		do {
+			pte = pfn_pte(PFN_DOWN(pa), prot);
+			ptep[pte_idx] = pte_mknapot(pte, NAPOT_CONT64KB_ORDER);
+			pte_idx++;
+			sz -= PAGE_SIZE;
+		} while (sz > 0);
+		return;
+	}
+#endif
 
 	BUG_ON(sz != PAGE_SIZE);
-
 	if (pte_none(ptep[pte_idx]))
 		ptep[pte_idx] = pfn_pte(PFN_DOWN(pa), prot);
 }
@@ -673,10 +685,18 @@  void __init create_pgd_mapping(pgd_t *pgdp,
 static uintptr_t __init best_map_size(phys_addr_t base, phys_addr_t size)
 {
 	/* Upgrade to PMD_SIZE mappings whenever possible */
-	if ((base & (PMD_SIZE - 1)) || (size & (PMD_SIZE - 1)))
+	base &= PMD_SIZE - 1;
+	if (!base && size >= PMD_SIZE)
+		return PMD_SIZE;
+
+	if (!has_svnapot())
 		return PAGE_SIZE;
 
-	return PMD_SIZE;
+	base &= NAPOT_CONT64KB_SIZE - 1;
+	if (!base && size >= NAPOT_CONT64KB_SIZE)
+		return NAPOT_CONT64KB_SIZE;
+
+	return PAGE_SIZE;
 }
 
 #ifdef CONFIG_XIP_KERNEL
@@ -1111,9 +1131,9 @@  static void __init setup_vm_final(void)
 		if (end >= __pa(PAGE_OFFSET) + memory_limit)
 			end = __pa(PAGE_OFFSET) + memory_limit;
 
-		map_size = best_map_size(start, end - start);
 		for (pa = start; pa < end; pa += map_size) {
 			va = (uintptr_t)__va(pa);
+			map_size = best_map_size(pa, end - pa);
 
 			create_pgd_mapping(swapper_pg_dir, va, pa, map_size,
 					   pgprot_from_va(va));
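
The NAPOT branch in create_pte_mapping() above does not advance pa inside its loop, so it writes the same PTE value into all 16 slots of the 64KB group. That matches how the Svnapot extension encodes a 64 KiB region: the N bit (PTE bit 63) is set and the low four PPN bits hold the pattern 1000, so every PTE in the group carries an identical value. The sketch below models that encoding in user space. The sk_ names are hypothetical and this is not the series' pte_mknapot(), which is defined in a patch not shown here; it also assumes the protection bits occupy the low 10 bits of the PTE.

```c
#include <stdint.h>

#define SK_PTE_PFN_SHIFT   10                  /* PPN field starts at PTE bit 10 */
#define SK_PTE_N           (UINT64_C(1) << 63) /* Svnapot N bit */
#define SK_NAPOT_64K_ORDER 4                   /* 2^4 = 16 base pages per group */

/*
 * Build a 64 KiB NAPOT PTE for any pfn inside the group. The low
 * SK_NAPOT_64K_ORDER bits of the PPN are replaced by 0b1000, which encodes
 * the 64 KiB size, so the result is the same for all 16 pfns.
 */
static uint64_t sk_mk_napot_64k(uint64_t pfn, uint64_t prot_bits)
{
	uint64_t low_mask = (UINT64_C(1) << SK_NAPOT_64K_ORDER) - 1;
	uint64_t size_code = UINT64_C(1) << (SK_NAPOT_64K_ORDER - 1);
	uint64_t napot_ppn = (pfn & ~low_mask) | size_code;

	return SK_PTE_N | (napot_ppn << SK_PTE_PFN_SHIFT) | prot_bits;
}
```

For example, sk_mk_napot_64k(0x80010, prot) and sk_mk_napot_64k(0x8001f, prot) return the same value, with PPN field 0x80018.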