Message ID: 1525906703-28481-1-git-send-email-nicoleotsuka@gmail.com (mailing list archive)
State: New, archived
On 09/05/18 23:58, Nicolin Chen wrote:
> The iomem_resource.end is -1 by default and should be updated in
> arch-level code.
>
> ARM64 so far hasn't updated it, while core kernel code (mm/hmm.c)
> has started to use iomem_resource.end for boundary checks. So it'd
> be better to assign iomem_resource.end a valid value, the end of
> the physical address space for example, because iomem_resource.end
> in theory should reflect that.
>
> However, VA_BITS might be smaller than PA_BITS on ARM64, so using
> the end of the physical address space doesn't make a lot of sense
> in this case, or could even be harmful since virtual addresses
> cannot reach that memory region.

Why? There's plenty of stuff in the physical address space that will
only ever be accessed via ioremap/memremap. There's no reason you
shouldn't be able to run a VA_BITS < 48 kernel on a Cavium ThunderX
where *all* the I/O is in the top half of the PA space. We already
constrain RAM in this very function to those regions which fit into
the linear map, and if you're accessing anything other than RAM
through the linear map you're probably doing something wrong.
Furthermore, the physical region covered by the linear map doesn't
necessarily start at physical address 0 anyway - see PHYS_OFFSET.

Robin.

> Furthermore, the linear region of
> ARM64 only covers half of the virtual memory size -- "VA_BITS - 1".
>
> So this patch updates iomem_resource.end using the end of the
> physical address space, or the end of the linear mapping region
> when (VA_BITS - 1) is smaller than PA_BITS.
>
> Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
> ---
>  arch/arm64/mm/init.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index f48b194..22015af 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -377,6 +377,12 @@ void __init arm64_memblock_init(void)
>  	BUILD_BUG_ON(linear_region_size != BIT(VA_BITS - 1));
>
>  	/*
> +	 * Update iomem_resource.end based on size of physical address space,
> +	 * or linear region size when (VA_BITS - 1) is smaller than PA_BITS.
> +	 */
> +	iomem_resource.end = BIT(min(PHYS_MASK_SHIFT, VA_BITS - 1)) - 1;
> +
> +	/*
>  	 * Select a suitable value for the base of physical memory.
>  	 */
>  	memstart_addr = round_down(memblock_start_of_DRAM(),
Thanks for the comments, Robin.

On Thu, May 10, 2018 at 06:45:59PM +0100, Robin Murphy wrote:
> On 09/05/18 23:58, Nicolin Chen wrote:
> > The iomem_resource.end is -1 by default and should be updated in
> > arch-level code.
> >
> > ARM64 so far hasn't updated it, while core kernel code (mm/hmm.c)
> > has started to use iomem_resource.end for boundary checks. So it'd
> > be better to assign iomem_resource.end a valid value, the end of
> > the physical address space for example, because iomem_resource.end
> > in theory should reflect that.
> >
> > However, VA_BITS might be smaller than PA_BITS on ARM64, so using
> > the end of the physical address space doesn't make a lot of sense
> > in this case, or could even be harmful since virtual addresses
> > cannot reach that memory region.
>
> Why? There's plenty of stuff in the physical address space that will
> only ever be accessed via ioremap/memremap. There's no reason you
> shouldn't be able to run a VA_BITS < 48 kernel on a Cavium ThunderX

I'm running VA_BITS_39 and PA_BITS_48 on a Tegra 210. That hasn't
been a problem so far; however, with hmm.....

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/mm/hmm.c#n1144

This hmm_devmem_add() requests a region with PFNs outside of the
linear region in the ARM64 case, which without this patch takes
MAX_PHYSMEM_BITS (48 bits) over iomem_resource.end. Then, when
dealing with page structures in the vmemmap region from a given PFN
directly (CONFIG_SPARSEMEM_VMEMMAP=y), and the given PFN is the
last one based on the physical region (48 bits), the address of its
page structure will go beyond the vmemmap region. Does this sound
like a problem?

> where *all* the I/O is in the top half of the PA space. We already
> constrain RAM in this very function to those regions which fit into
> the linear map, and if you're accessing anything other than RAM
> through the linear map you're probably doing something wrong.

If I understand this part correctly, since ARM64 has applied the
memory limit already, does it mean that we should probably fix
something in region_intersects() or add an extra check in
hmm_devmem_add(), instead of limiting the iomem_resource?

> Furthermore, the physical region covered by the linear map doesn't
> necessarily start at physical address 0 anyway - see PHYS_OFFSET.

Hmm...okay...but there still should be a protection somewhere if
something happens to access a page structure via pfn_to_page() while
the PFN is not covered by the vmemmap linear mapping, right?

Thanks!
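[Editorial illustration, not part of the thread: a minimal standalone
C sketch of the arithmetic behind the overflow Nicolin describes,
assuming VA_BITS = 39, a 48-bit PA space, PAGE_SHIFT = 12, and a
64-byte struct page -- typical arm64/4K values, not read from any
kernel.]

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void)
{
	/* Assumed values, not read from any kernel */
	const uint64_t va_bits = 39;
	const uint64_t pa_bits = 48;
	const uint64_t page_shift = 12;
	const uint64_t struct_page_size = 64;

	/*
	 * The vmemmap window only needs to cover the linear region,
	 * which spans 2^(VA_BITS - 1) bytes of physical address space.
	 */
	uint64_t covered_pfns = 1ULL << (va_bits - 1 - page_shift);
	uint64_t vmemmap_size = covered_pfns * struct_page_size;

	/* Last PFN of a full 48-bit physical address space */
	uint64_t last_pfn = (1ULL << (pa_bits - page_shift)) - 1;
	uint64_t offset = last_pfn * struct_page_size;

	printf("vmemmap window: %" PRIu64 " GiB\n", vmemmap_size >> 30);
	printf("page struct offset of last 48-bit PFN: %" PRIu64 " GiB\n",
	       offset >> 30);
	return 0;
}

Under those assumptions, the page struct for the last 48-bit PFN sits
about 4 TiB past the start of a vmemmap window that is only 4 GiB wide.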
On 10/05/18 23:29, Nicolin Chen wrote:
> Thanks for the comments, Robin.
>
> On Thu, May 10, 2018 at 06:45:59PM +0100, Robin Murphy wrote:
>> On 09/05/18 23:58, Nicolin Chen wrote:
>>> The iomem_resource.end is -1 by default and should be updated in
>>> arch-level code.
>>>
>>> ARM64 so far hasn't updated it, while core kernel code (mm/hmm.c)
>>> has started to use iomem_resource.end for boundary checks. So it'd
>>> be better to assign iomem_resource.end a valid value, the end of
>>> the physical address space for example, because iomem_resource.end
>>> in theory should reflect that.
>>>
>>> However, VA_BITS might be smaller than PA_BITS on ARM64, so using
>>> the end of the physical address space doesn't make a lot of sense
>>> in this case, or could even be harmful since virtual addresses
>>> cannot reach that memory region.
>>
>> Why? There's plenty of stuff in the physical address space that will
>> only ever be accessed via ioremap/memremap. There's no reason you
>> shouldn't be able to run a VA_BITS < 48 kernel on a Cavium ThunderX
>
> I'm running VA_BITS_39 and PA_BITS_48 on a Tegra 210. That hasn't
> been a problem so far; however, with hmm.....
>
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/mm/hmm.c#n1144
>
> This hmm_devmem_add() requests a region with PFNs outside of the
> linear region in the ARM64 case, which without this patch takes
> MAX_PHYSMEM_BITS (48 bits) over iomem_resource.end. Then, when
> dealing with page structures in the vmemmap region from a given PFN
> directly (CONFIG_SPARSEMEM_VMEMMAP=y), and the given PFN is the
> last one based on the physical region (48 bits), the address of its
> page structure will go beyond the vmemmap region. Does this sound
> like a problem?

Yes, but as far as we're concerned here it's not a problem with arm64:

config ARCH_HAS_HMM
	...
	depends on (X86_64 || PPC64)
	depends on ZONE_DEVICE
	...
	depends on MEMORY_HOTPLUG
	depends on MEMORY_HOTREMOVE
	...

Whatever out-of-tree changes you have to address all of those are
clearly implemented incorrectly; *that's* your problem.

>> where *all* the I/O is in the top half of the PA space. We already
>> constrain RAM in this very function to those regions which fit into
>> the linear map, and if you're accessing anything other than RAM
>> through the linear map you're probably doing something wrong.
>
> If I understand this part correctly, since ARM64 has applied the
> memory limit already, does it mean that we should probably fix
> something in region_intersects() or add an extra check in
> hmm_devmem_add(), instead of limiting the iomem_resource?

It means we should implement memory hotplug correctly. Which,
unfortunately, I happen to know is really hard (it's something I've
been looking at from the device-DAX angle).

>> Furthermore, the physical region covered by the linear map doesn't
>> necessarily start at physical address 0 anyway - see PHYS_OFFSET.
>
> Hmm...okay...but there still should be a protection somewhere if
> something happens to access a page structure via pfn_to_page() while
> the PFN is not covered by the vmemmap linear mapping, right?

There already is: pfn_valid() will return false for anything outside
the intersection of memblock regions and the linear map region as
calculated by arm64_memblock_init(); anything calling pfn_to_page()
without checking pfn_valid() first is fundamentally broken. Or if you
have out-of-tree changes to the pfn_valid() implementation then all
bets are off, and it's not something for mainline to work around.

Robin.
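[Editorial illustration, not from the thread: the guard Robin
describes, as a kernel-style sketch. pfn_valid() and pfn_to_page()
are the real kernel interfaces; the wrapper is hypothetical and the
fragment is illustrative rather than a complete, buildable unit.]

#include <linux/mm.h>	/* pfn_valid(), pfn_to_page() */

/* Hypothetical helper showing the required guard */
static struct page *checked_pfn_to_page(unsigned long pfn)
{
	/*
	 * pfn_valid() is false for PFNs outside the intersection of
	 * memblock regions and the linear map, so the vmemmap entry
	 * for such a PFN may not be backed at all.
	 */
	if (!pfn_valid(pfn))
		return NULL;

	return pfn_to_page(pfn);
}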
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index f48b194..22015af 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -377,6 +377,12 @@ void __init arm64_memblock_init(void)
 	BUILD_BUG_ON(linear_region_size != BIT(VA_BITS - 1));
 
 	/*
+	 * Update iomem_resource.end based on size of physical address space,
+	 * or linear region size when (VA_BITS - 1) is smaller than PA_BITS.
+	 */
+	iomem_resource.end = BIT(min(PHYS_MASK_SHIFT, VA_BITS - 1)) - 1;
+
+	/*
 	 * Select a suitable value for the base of physical memory.
 	 */
 	memstart_addr = round_down(memblock_start_of_DRAM(),
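[Editorial illustration, not part of the patch: a small userspace
sketch of what the new line evaluates to, with BIT() and min()
reimplemented locally and PHYS_MASK_SHIFT assumed to be 48.]

#include <stdio.h>

#define BIT(n)		(1ULL << (n))
#define min(a, b)	((a) < (b) ? (a) : (b))

int main(void)
{
	const int phys_mask_shift = 48;		/* assumed PA space size */
	const int va_bits_options[] = { 39, 48 };

	for (int i = 0; i < 2; i++) {
		int va_bits = va_bits_options[i];
		unsigned long long end =
			BIT(min(phys_mask_shift, va_bits - 1)) - 1;

		printf("VA_BITS=%d -> iomem_resource.end = 0x%llx\n",
		       va_bits, end);
	}
	return 0;
}

This prints 0x3fffffffff for VA_BITS=39 and 0x7fffffffffff for
VA_BITS=48; note that even in the latter case the cap comes from the
linear region size (VA_BITS - 1 = 47), not from the PA space.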
The iomem_resource.end is -1 by default and should be updated in
arch-level code.

ARM64 so far hasn't updated it, while core kernel code (mm/hmm.c)
has started to use iomem_resource.end for boundary checks. So it'd
be better to assign iomem_resource.end a valid value, the end of
the physical address space for example, because iomem_resource.end
in theory should reflect that.

However, VA_BITS might be smaller than PA_BITS on ARM64, so using
the end of the physical address space doesn't make a lot of sense
in this case, or could even be harmful since virtual addresses
cannot reach that memory region. Furthermore, the linear region of
ARM64 only covers half of the virtual memory size -- "VA_BITS - 1".

So this patch updates iomem_resource.end using the end of the
physical address space, or the end of the linear mapping region
when (VA_BITS - 1) is smaller than PA_BITS.

Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
---
 arch/arm64/mm/init.c | 6 ++++++
 1 file changed, 6 insertions(+)