[v2,4/4] drm/panfrost: Handle non-aligned lock addresses

Message ID	20210824173028.7528-5-alyssa.rosenzweig@collabora.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=+0eo=NP=lists.freedesktop.org=dri-devel-bounces@kernel.org> DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org E39B360E8B sender: alyssa) by bhuna.collabora.co.uk (Postfix) with ESMTPSA id 109241F41A29; Tue, 24 Aug 2021 18:31:21 +0100 (BST) From: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> To: dri-devel@lists.freedesktop.org Cc: Rob Herring <robh@kernel.org>, Tomeu Vizoso <tomeu.vizoso@collabora.com>, Steven Price <steven.price@arm.com>, Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>, David Airlie <airlied@linux.ie>, Daniel Vetter <daniel@ffwll.ch>, linux-kernel@vger.kernel.org Subject: [PATCH v2 4/4] drm/panfrost: Handle non-aligned lock addresses Date: Tue, 24 Aug 2021 13:30:28 -0400 Message-Id: <20210824173028.7528-5-alyssa.rosenzweig@collabora.com> In-Reply-To: <20210824173028.7528-1-alyssa.rosenzweig@collabora.com> References: <20210824173028.7528-1-alyssa.rosenzweig@collabora.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: list Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>
Series	drm/panfrost: Bug fixes for lock_region \| expand [v2,0/4] drm/panfrost: Bug fixes for lock_region [v2,1/4] drm/panfrost: Simplify lock_region calculation [v2,2/4] drm/panfrost: Use u64 for size in lock_region [v2,3/4] drm/panfrost: Clamp lock region to Bifrost minimum [v2,4/4] drm/panfrost: Handle non-aligned lock addresses

Message ID

20210824173028.7528-5-alyssa.rosenzweig@collabora.com (mailing list archive)

State

New, archived

Headers

DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org E39B360E8B
From: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
To: dri-devel@lists.freedesktop.org
Cc: Rob Herring <robh@kernel.org>, Tomeu Vizoso <tomeu.vizoso@collabora.com>,
 Steven Price <steven.price@arm.com>,
 Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>,
 David Airlie <airlied@linux.ie>, Daniel Vetter <daniel@ffwll.ch>,
 linux-kernel@vger.kernel.org
Subject: [PATCH v2 4/4] drm/panfrost: Handle non-aligned lock addresses
Date: Tue, 24 Aug 2021 13:30:28 -0400
Message-Id: <20210824173028.7528-5-alyssa.rosenzweig@collabora.com>
In-Reply-To: <20210824173028.7528-1-alyssa.rosenzweig@collabora.com>
References: <20210824173028.7528-1-alyssa.rosenzweig@collabora.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: list
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>

Series

drm/panfrost: Bug fixes for lock_region | expand

Commit Message

Alyssa Rosenzweig Aug. 24, 2021, 5:30 p.m. UTC

When locking memory, the base address is rounded down to the nearest
page. The current code does not adjust the size in this case,
truncating the lock region:

	Input:      [----size----]
	Round: [----size----]

To fix the truncation, extend the lock region by the amount rounded off.

	Input:      [----size----]
	Round: [-------size------]

This bug is difficult to hit under current assumptions: since the size
of the lock region is stored as a ceil(log2), the truncation must cause
us to cross a power-of-two boundary. This is possible, for example if
the caller tries to lock 65535 bytes starting at iova 0xCAFE0010. The
existing code rounds down the iova to 0xCAFE0000 and rounds up the lock
region to 65536 bytes, locking until 0xCAFF0000. This fails to lock the
last 15 bytes.

In practice, the current callers pass PAGE_SIZE aligned inputs, avoiding
the bug. Therefore this doesn't need to be backported. Still, that's a
happy accident and not a precondition of lock_region, so we let's do the
right thing to future proof.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reported-by: Steven Price <steven.price@arm.com>
---
 drivers/gpu/drm/panfrost/panfrost_mmu.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Steven Price Aug. 25, 2021, 9:04 a.m. UTC | #1

On 24/08/2021 18:30, Alyssa Rosenzweig wrote:
> When locking memory, the base address is rounded down to the nearest
> page. The current code does not adjust the size in this case,
> truncating the lock region:
> 
> 	Input:      [----size----]
> 	Round: [----size----]
> 
> To fix the truncation, extend the lock region by the amount rounded off.
> 
> 	Input:      [----size----]
> 	Round: [-------size------]
> 
> This bug is difficult to hit under current assumptions: since the size
> of the lock region is stored as a ceil(log2), the truncation must cause
> us to cross a power-of-two boundary. This is possible, for example if
> the caller tries to lock 65535 bytes starting at iova 0xCAFE0010. The
> existing code rounds down the iova to 0xCAFE0000 and rounds up the lock
> region to 65536 bytes, locking until 0xCAFF0000. This fails to lock the
> last 15 bytes.
> 
> In practice, the current callers pass PAGE_SIZE aligned inputs, avoiding
> the bug. Therefore this doesn't need to be backported. Still, that's a
> happy accident and not a precondition of lock_region, so we let's do the
> right thing to future proof.

Actually it's worse than that due to the hardware behaviour, the spec
states (for LOCKADDR_BASE):

> Only the upper bits of the address are used. The address is aligned to a
> multiple of the region size, so a variable number of low-order bits are
> ignored, depending on the selected region size. It is recommended that software
> ensures that these low bits in the address are cleared, to avoid confusion.

It appears that indeed this has caused confusion ;)

So for a simple request like locking from 0xCAFE0000 - 0xCB010000 (size
= 0x30000) the region width gets rounded up (to 0x40000) which causes
the start address to be effectively rounded down (by the hardware) to
0xCAFC0000 and we fail to lock 0xCB000000-0xCB010000.

Interestingly (unless my reading of this is wrong) that means to lock
0xFFFF0000-0x100010000 (i.e. crossing the 4GB boundary) requires locking
*at least* 0x00000000-0x200000000 (i.e. locking the first 8GB).

This appears to be broken in kbase (which actually does zero out the low
bits of the address) - I've raised a bug internally so hopefully someone
will tell me if I've read the spec completely wrong here.

Steve

> Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> Reported-by: Steven Price <steven.price@arm.com>
> ---
>  drivers/gpu/drm/panfrost/panfrost_mmu.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index dfe5f1d29763..14be32497ec3 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -63,6 +63,9 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
>  	u8 region_width;
>  	u64 region = iova & PAGE_MASK;
>  
> +	/* After rounding the address down, extend the size to lock the end. */
> +	size += (region - iova);
> +
>  	/* The size is encoded as ceil(log2) minus(1), which may be calculated
>  	 * with fls. The size must be clamped to hardware bounds.
>  	 */
>

Alyssa Rosenzweig Aug. 25, 2021, 2:07 p.m. UTC | #2

> > In practice, the current callers pass PAGE_SIZE aligned inputs, avoiding
> > the bug. Therefore this doesn't need to be backported. Still, that's a
> > happy accident and not a precondition of lock_region, so we let's do the
> > right thing to future proof.
> 
> Actually it's worse than that due to the hardware behaviour, the spec
> states (for LOCKADDR_BASE):
> 
> > Only the upper bits of the address are used. The address is aligned to a
> > multiple of the region size, so a variable number of low-order bits are
> > ignored, depending on the selected region size. It is recommended that software
> > ensures that these low bits in the address are cleared, to avoid confusion.
> 
> It appears that indeed this has caused confusion ;)
> 
> So for a simple request like locking from 0xCAFE0000 - 0xCB010000 (size
> = 0x30000) the region width gets rounded up (to 0x40000) which causes
> the start address to be effectively rounded down (by the hardware) to
> 0xCAFC0000 and we fail to lock 0xCB000000-0xCB010000.
> 
> Interestingly (unless my reading of this is wrong) that means to lock
> 0xFFFF0000-0x100010000 (i.e. crossing the 4GB boundary) requires locking
> *at least* 0x00000000-0x200000000 (i.e. locking the first 8GB).
> 
> This appears to be broken in kbase (which actually does zero out the low
> bits of the address) - I've raised a bug internally so hopefully someone
> will tell me if I've read the spec completely wrong here.

Horrifying, and not what I wanted to read my last day before 2 weeks of
leave. Let's drop this patch, hopefully by the time I'm back, your
friends in GPU can confirm that's a spec bug and not an actual
hardware/driver one...

Can you apply the other 3 patches in the mean time? Thanks :-)

Steven Price Aug. 25, 2021, 2:34 p.m. UTC | #3

On 25/08/2021 15:07, Alyssa Rosenzweig wrote:
>>> In practice, the current callers pass PAGE_SIZE aligned inputs, avoiding
>>> the bug. Therefore this doesn't need to be backported. Still, that's a
>>> happy accident and not a precondition of lock_region, so we let's do the
>>> right thing to future proof.
>>
>> Actually it's worse than that due to the hardware behaviour, the spec
>> states (for LOCKADDR_BASE):
>>
>>> Only the upper bits of the address are used. The address is aligned to a
>>> multiple of the region size, so a variable number of low-order bits are
>>> ignored, depending on the selected region size. It is recommended that software
>>> ensures that these low bits in the address are cleared, to avoid confusion.
>>
>> It appears that indeed this has caused confusion ;)
>>
>> So for a simple request like locking from 0xCAFE0000 - 0xCB010000 (size
>> = 0x30000) the region width gets rounded up (to 0x40000) which causes
>> the start address to be effectively rounded down (by the hardware) to
>> 0xCAFC0000 and we fail to lock 0xCB000000-0xCB010000.
>>
>> Interestingly (unless my reading of this is wrong) that means to lock
>> 0xFFFF0000-0x100010000 (i.e. crossing the 4GB boundary) requires locking
>> *at least* 0x00000000-0x200000000 (i.e. locking the first 8GB).
>>
>> This appears to be broken in kbase (which actually does zero out the low
>> bits of the address) - I've raised a bug internally so hopefully someone
>> will tell me if I've read the spec completely wrong here.
> 
> Horrifying, and not what I wanted to read my last day before 2 weeks of
> leave. Let's drop this patch, hopefully by the time I'm back, your
> friends in GPU can confirm that's a spec bug and not an actual
> hardware/driver one...
> 
> Can you apply the other 3 patches in the mean time? Thanks :-)
> 

Yeah, sure. I'll push the first 3 to drm-misc-next-fixes (should land in
v5.15).

It's interesting that if my (new) reading of the spec is correct then
kbase has been horribly broken in this respect forever. So clearly it
can't be something that crops up very often. It would have been good if
the spec could have included wording such as "naturally aligned" if
that's what was intended.

Enjoy your holiday!

Thanks,
Steve

Alyssa Rosenzweig Aug. 25, 2021, 3:05 p.m. UTC | #4

> > Horrifying, and not what I wanted to read my last day before 2 weeks of
> > leave. Let's drop this patch, hopefully by the time I'm back, your
> > friends in GPU can confirm that's a spec bug and not an actual
> > hardware/driver one...
> > 
> > Can you apply the other 3 patches in the mean time? Thanks :-)
> > 
> 
> Yeah, sure. I'll push the first 3 to drm-misc-next-fixes (should land in
> v5.15).
> 
> It's interesting that if my (new) reading of the spec is correct then
> kbase has been horribly broken in this respect forever. So clearly it
> can't be something that crops up very often. It would have been good if
> the spec could have included wording such as "naturally aligned" if
> that's what was intended.

Indeed. Fingers crossed this is a mix-up. Although the text you quoted
seems pretty clear unfortunately :|

> Enjoy your holiday!

Thanks!

diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index dfe5f1d29763..14be32497ec3 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -63,6 +63,9 @@  static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
 	u8 region_width;
 	u64 region = iova & PAGE_MASK;
 
+	/* After rounding the address down, extend the size to lock the end. */
+	size += (region - iova);
+
 	/* The size is encoded as ceil(log2) minus(1), which may be calculated
 	 * with fls. The size must be clamped to hardware bounds.
 	 */

[v2,4/4] drm/panfrost: Handle non-aligned lock addresses

Commit Message

Comments

Patch